I repeated my tests on the Sylphyo latency with more care and double-checking. I also did each test three times to get averages and an indication of variability. I think (hope) I've accounted for every latency-producing element in the signal chain when making final calcs.
I have interspersed ideas and thoughts in italics.
Caveat: I am rather new to this kind of equipment testing, and may have screwed up at any point ... independent measurement of these types of numbers would be helpful and reassuring ... Your Mileage May Vary!
General Results and Thoughts
I measured about 11 msec from the time I initiate breath into the Sylphyo until USB-based MIDI arrives from the Link over the radio link. This is using the Link's {USB Power Supply} port.
If you need MIDI over USB, latency will be lower when using the {USB Power Supply} port on the Link rather than the {MIDI DIN Out} port. The difference is about 2 msec (plus the latency introduced by any DIN-to-USB conversion that is needed).
If you need MIDI-over-DIN and you have a MIDI USB-to-DIN converter that is faster than 2 msec, you will lower latency by routing the Link {USB Power Supply} => MIDI USB-to-DIN converter rather than coming straight out of the Link {MIDI DIN Out} port.
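As a rough illustration of that rule of thumb, here is a minimal Python sketch. The 1.94 msec threshold is the measured average difference between the two Link ports reported in the summary; the function name and the two example converter latencies are made up for illustration and are not measurements.

```python
# Measured average gap between the Link's DIN and USB MIDI outputs (msec).
DIN_MINUS_USB_MSEC = 1.94


def faster_din_route(converter_msec: float) -> str:
    """Return the route expected to deliver DIN MIDI sooner, on average."""
    if converter_msec < DIN_MINUS_USB_MSEC:
        return "USB Power Supply port + USB-to-DIN converter"
    return "MIDI DIN Out port"


# Hypothetical converter latencies, not measurements.
for latency in (1.0, 3.0):
    print(f"{latency:.1f} msec converter -> {faster_din_route(latency)}")
```

Given the variability in the measurements, a converter close to the 2 msec mark is probably a toss-up.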
If you are using the Sylphyo's internal sounds, the Link (over the radio link) produces sound on the Line Out ports after about 18 msec. You can save roughly 5 msec of that latency by using a wired connection to the Sylphyo headphone port rather than the Link, if that fits your situation.
Of course, from there a lot of things may get added on to the latency of the Sylphyo when rendering sound in a real-world MIDI-based system. My current rig has a latency of (very) roughly 35 msec, from breath initiation to the point where the analog signal is sent to the house system.
I have found no research or discussion of how long it takes to initiate sound from real-world wind instruments. It must be quite "a while" for instruments like oboes and low tones on pipe organs.
One counterpoint for comparison would be to measure the latency on an analog (control-voltage-based) wind controller. Matt Traum comments that "Playing the analog synth that is connected via real control voltage feels to me like the synth is IN MY MOUTH ..." (https://www.patchmanmusic.com/WindControllerFAQ.html).
Summary of the Numbers
The average net time from breath initiation to sound coming out of the Sylphyo was measured at 12.08 msec (range 9.68-14.99).
The average net time from breath initiation to sound coming out of the Link {Line Out} was measured at 17.76 msec (range 16.70-19.59).
The time delta between the beginning of sound out of the Sylphyo headphone port and the beginning of sound out of the Link {Line Out} ports was highly variable. The average was 5.68 msec, with a range of 2.00-8.03. If we assume that the time to render sound is the same on the Sylphyo and the Link, this gives us insight into the speed (and variability) of the radio link. One thing that may affect the radio link is the dense wireless IP environment that these tests were performed in.
The average net time from breath initiation to the appearance of MIDI on the Link {USB Power Supply} port was measured at 11.04 msec (range 8.59-12.50).
The average net time from breath initiation to the appearance of MIDI on the Link {MIDI DIN Out} port was measured at 12.98 msec (range 11.59-14.52).
This means that, on average, the MIDI signal appears on the {MIDI DIN Out} port of the Link 1.94 msec after the {USB Power Supply} port.
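The two differences quoted in this summary can be re-derived from the four averages. A minimal Python sketch (the dictionary keys are just labels I chose for illustration):

```python
# Average net latencies from the summary above, in msec.
net_msec = {
    "sylphyo_headphone": 12.08,
    "link_line_out": 17.76,
    "link_usb_midi": 11.04,
    "link_din_midi": 12.98,
}

# Extra delay of the Link's audio output relative to the wired headphone port.
radio_delta = round(net_msec["link_line_out"] - net_msec["sylphyo_headphone"], 2)

# Extra delay of the DIN MIDI port relative to the USB MIDI port.
din_delta = round(net_msec["link_din_midi"] - net_msec["link_usb_midi"], 2)

print(f"Link Line Out vs. headphone port: {radio_delta:.2f} msec")
print(f"DIN MIDI vs. USB MIDI: {din_delta:.2f} msec")
```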
The Details
Testing was done 11/26/2021. The primary host is an Intel Core i7-6600U Toshiba 2016 laptop, Win10x64.
The DAW is Cantabile 3.3694 [6Aug2021] x64.
The audio interface is an RME UCX II using the RME ASIO driver v1.212 [4May2021].
All testing done at 44.1 kHz sample rate, with a 48 sample buffer size. Note that there is an additional 32 sample buffer internally in the UCX II.
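To put those buffer sizes in time terms, here is a small Python sketch that converts a sample count to milliseconds at the 44.1 kHz rate used here. The helper name is my own; it only does the arithmetic and says nothing about how the driver or the UCX II actually schedules its buffers.

```python
SAMPLE_RATE_HZ = 44_100  # sample rate used for all tests


def samples_to_msec(samples: int, rate_hz: int = SAMPLE_RATE_HZ) -> float:
    """Duration of a run of samples, in milliseconds."""
    return samples / rate_hz * 1000.0


asio_buffer = samples_to_msec(48)      # ASIO buffer size used in the tests
internal_buffer = samples_to_msec(32)  # additional buffer inside the UCX II

print(f"48 samples = {asio_buffer:.2f} msec")
print(f"32 samples = {internal_buffer:.2f} msec")
```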
The Hub is an Atolla 7-port powered hub.
Post-processing and waveform analysis done in Reaper v6.21 [23Jan2021].
I selected the #31 - El Harrachi Phi sound on the Sylphyo for its relatively sharp attack. This sound was also used in similar testing by Peter Ostry.
I took time measurements on these points in the sequence:
- The "initiation" of my breath into the Sylphyo (more on this below).
- The beginning of sound out of the Sylphyo headphone port.
- The beginning of sound out of the Link {Line Out} ports.
- The appearance of MIDI on the Link {USB Power Supply} port.
- The appearance of MIDI on the Link {MIDI DIN Out} port.
I was not able to capture the appearance of MIDI on the Link {USB Host} port. (I use a Sevilla Soft MIDI USB-USB device for these purposes. The Sevilla does not transfer USB data from the USB Host port on the Link ... not sure why ... maybe because there is no way to select the USB port from the Link?)
I recorded my breath input using a mic at my mouth to capture the sound when I breathed into the Sylphyo. I pre-pressured my mouth using my tongue to block the air escape, and then removed my tongue to get a "puff" of air.
I could find no hard data on the transient response time of various microphones. However, based on newsgroup discussions and informative articles (see, for example, https://www.quora.com/What-type-of-microphone-has-the-best-transient-response and https://www.pro-tools-expert.com/production-expert-1/2018/8/22/do-you-ever-think-about-how-fast-a-microphone-is-and-how-that-affects-the-sound), I chose an Oktava MK-012A-01 small-diaphragm condenser with a cardioid capsule and no pop filter.
Measuring the "initiation" of the breath was challenging and somewhat variable. The image below shows how I made the selection. The waveforms are greatly expanded vertically, so the clipping is only in Reaper, not the original waveform.
I chose a point (marked in this example case with a vertical bar) that seemed to represent an initial movement of air rather than the sound of my lips. Selecting the initiation points of the other audio signals and the MIDI initiation points was far easier and could be done precisely to the sample in Reaper.
All calculations were done based on the sample numbers in the audio streams.
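For anyone wanting to repeat this, the sample-number arithmetic can be sketched in Python as below. The sample positions in `takes` are hypothetical placeholders, not my recorded values; only the 44.1 kHz rate and the idea of averaging three takes come from this post.

```python
from statistics import mean

SAMPLE_RATE_HZ = 44_100  # sample rate used for all tests


def latency_msec(start_sample: int, end_sample: int) -> float:
    """Elapsed time between two sample positions, in msec."""
    return (end_sample - start_sample) / SAMPLE_RATE_HZ * 1000.0


# Hypothetical (breath onset, sound onset) sample positions for three takes.
takes = [(10_000, 10_441), (52_300, 52_829), (97_150, 97_723)]

latencies = [latency_msec(start, end) for start, end in takes]
print(f"average {mean(latencies):.2f} msec, "
      f"range {min(latencies):.2f}-{max(latencies):.2f}")
```

At 44.1 kHz one sample is about 0.023 msec, so picking an onset "to the sample" is far finer than the take-to-take variation reported here.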
I have previously measured the delays from these elements in the signal chain:
- A/D and D/A transfers in the UCX II at 1.81 msec.
- Transfers on the Atolla hub at 0.11 msec.
- Conversion from DIN MIDI to USB MIDI was done with an M-Audio MIDISport 2x2 Anniversary Edition. I have measured a transfer through the hub plus the conversion of the MIDISport at 1.00 msec.
The raw time from breath initiation to the beginning of sound out of the Sylphyo headphone port was 12.08 msec (range 9.68-14.99). Both the mic that records breath initiation (on the UCX Mic 1 Input port) and the line inputs from the Sylphyo headphone port (on unbalanced stereo cables to the UCX Line 3-4 Input ports) experience the A/D conversion so they cancel. There is additional delay from the actuation of the microphone itself, but that is unknown and should be minimal. So, ...
The average net time from breath initiation to sound coming out of the Sylphyo was measured at 12.08 msec (range 9.68-14.99).
The raw time from breath initiation to the beginning of sound out of the Link {Line Out} ports was 17.76 msec (range 16.70-19.59). As with the prior example, the A/D conversions cancel out, so ...
The average net time from breath initiation to sound coming out of the Link {Line Out} was measured at 17.76 msec (range 16.70-19.59).
The raw (and net) time delta between the beginning of sound out of the Sylphyo headphone port and the beginning of sound out of the Link {Line Out} ports was highly variable. The average was 5.68 msec, with a range of 2.00 - 8.03.
There are two elements that can affect latency on these two paths: the difference in time to render sound on the Sylphyo vs. the Link, and the radio transmission to the Link. If we assume that the time to render sound is the same on the Sylphyo and the Link, this gives us insight into the speed (and variability) of the radio link. One thing that may affect the radio link is the dense wireless IP environment that these tests were performed in.
The raw time from breath initiation to the appearance of MIDI on the Link {USB Power Supply} port was 12.34 msec (range 10.29-14.20). The mic that records breath initiation (on the UCX Mic 1 Input port) experiences a measured delay of 1.81 msec from the A/D conversion. The Link {USB Power Supply} port was sent through the Atolla hub, adding 0.11 msec to the latency. This gives a net bias of (1.81-0.11) = 1.70 msec. So, ...
The average net time from breath initiation to the appearance of MIDI on the Link {USB Power Supply} port was measured at 11.04 msec (range 8.59-12.50).
The raw time from breath initiation to the appearance of MIDI on the Link {MIDI DIN Out} port was 13.68 msec (range 12.29-15.22). The mic that records breath initiation (on the UCX Mic 1 Input port) experiences a measured delay of 1.81 msec from the A/D conversion. The Link {MIDI DIN Out} port was sent through the MIDISport unit and the Atolla hub, adding (1.00+0.11) = 1.11 msec to the latency. This gives a net bias of (1.81-1.11) = 0.70 msec. So, ...
The average net time from breath initiation to the appearance of MIDI on the Link {MIDI DIN Out} port was measured at 12.98 msec (range 11.59-14.52).
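The correction for the {MIDI DIN Out} path can be written out as a short Python sketch. It simply reproduces the arithmetic described above (raw time minus the net bias) using the delays listed earlier; the function and constant names are my own labels.

```python
AD_DELAY_MSEC = 1.81   # UCX II delay on the mic input
HUB_DELAY_MSEC = 0.11  # Atolla hub
MIDISPORT_MSEC = 1.00  # MIDISport figure from the list of measured delays


def net_latency(raw_msec: float, midi_path_delay_msec: float) -> float:
    """Raw time minus the net bias (mic A/D delay minus MIDI path delay)."""
    bias = AD_DELAY_MSEC - midi_path_delay_msec
    return round(raw_msec - bias, 2)


# MIDI DIN Out path: MIDISport plus hub, as added up in the text (1.11 msec).
din_path = MIDISPORT_MSEC + HUB_DELAY_MSEC

# Raw average, minimum, and maximum for the DIN path.
din_net = [net_latency(raw, din_path) for raw in (13.68, 12.29, 15.22)]
print(din_net)
```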