Latency Testing
-
I took a few hours yesterday to test specific latencies in the various rig setups I use. I was looking to compare things like Analog vs SPDIF, MIDI over DIN vs USB, USB direct vs through a Hub, etc. I also wanted to get real-world latency measures of the A/D conversions in my interface, my looper (and its own A/D conversions), and how my sound modules (XpressO and VL70-m) perform.
As part of this, I got some interesting measurements on the Sylphyo. These latencies were done by recording various audio and MIDI ports in Cantabile on a Win10x64 PC and comparing the waveforms and MIDI notes using Reaper.
I recorded both the direct output from the Sylphyo's headphone jack and the audio and MIDI output from the Link. I measured the sound produced by the Link to be 4.35 msec behind the Sylphyo sound. If the rendering engines in the Sylphyo and the Link take identical times (no way to know that), this implies that the radio link takes 4.35 msec (measurements were done with about a 2 foot distance between the Sylphyo and the Link).
It took another 7.35 msec for the Link to produce MIDI on the USB Power Supply port - i.e. Link MIDI appeared 7.35 msec after the Link had produced sound. Not sure why there is such a delay. I suspect that the MIDI appears on the DIN MIDI Out port earlier than on the USB Power Supply port (from some back calculation), but more tests would be needed.
(I now see this discussion - https://community.aodyo.com/topic/651/link-jitter-latency - and will look into re-doing these tests after re-pairing the Link and maybe switching my USB configuration)
I also attempted to measure the time from the initiation of a sharp breath input to my Aodyo Sylphyo electronic wind instrument to the production of MIDI. I used an external mic into Cantabile to capture the breath input. Of course the air has to go through the mouthpiece, into the inner chamber, hit the breath sensor, go out the Sylphyo on a radio link to the external Link box, and get turned into MIDI over USB, through a hub, and into Cantabile. I measured this at 30.32 msec.
I could not find any references that measured latency in physical instruments, but I suspect it is largely based on technique - an experienced sax player does not initiate sound by ramping up their breath - they have initial breath pressure in their mouth chamber and damp the reed with their tongue, then initiate the note by releasing their tongue (see https://www.saxontheweb.net/threads/getting-the-reed-to-vibrate.7885/)
More details on other aspects of my latency testing are in my post on the Cantabile forum: https://community.cantabilesoftware.com/t/latency-testing/7245
-
MIDI:
Sylphyo>USB and Wireless>Link>DIN MIDI: Link is 4.4 ms late (average).
It is sometimes more; I don't know why.
This fits your theory of 4.35 ms for the radio link.
Audio:
From Link and from Sylphyo to the interface: Link is 2.1 ms late.
Oops. The difference should be over 4 ms, shouldn't it? There are wireless MIDI and a synth in the path.
I tested audio with high notes from the "El Harrachi" sound. The attacks are clearly visible in the DAW and I measured a 102-sample difference at 48 kHz.
The radio link must be faster than 4 ms, rather around 2 ms. But I've never seen audio arriving faster than MIDI. Maybe my single measurement was not correct or the Link box eats up MIDI speed.
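For reference, the conversion from a sample offset to milliseconds behind the 2.1 ms figure can be sketched as follows. The function name is illustrative and not taken from any tool used in this thread.

```python
def samples_to_ms(sample_count: int, sample_rate_hz: float) -> float:
    """Convert a distance measured in samples to milliseconds."""
    return sample_count / sample_rate_hz * 1000.0


# 102 samples between the two attacks, recorded at 48 kHz
offset_ms = samples_to_ms(102, 48_000)
print(f"{offset_ms:.3f} ms")  # 2.125 ms
```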
-
I repeated my tests on the Sylphyo latency with more care and double-checking. I also did each test three times to get averages and an indication of variability. I think (hope) I've accounted for every latency-producing element in the signal chain when making final calcs.
I have interspersed ideas and thoughts in italics.
Caveat: I am rather new to this kind of equipment testing, and may have screwed up at any point ... independent measurement of these types of numbers would be helpful and reassuring ... Your Mileage May Vary!
General Results and Thoughts
I measured about 11 msec from the time I initiate breath into the Sylphyo till I received USB-based MIDI from the Link over the radio link. This is using the Link's {USB Power Supply} port.
If you need MIDI over USB, latency will be lower when using the {USB Power Supply} port on the Link rather than the {MIDI DIN Out} port. The difference is about 2 msec (plus the latency introduced by any DIN-to-USB conversion that is needed).
If you need MIDI-over-DIN and you have a MIDI USB-to-DIN converter that is faster than 2 msec, you will lower latency by routing the Link {USB Power Supply} => MIDI USB-to-DIN converter rather than coming straight out of the Link {MIDI DIN Out} port.
If you are using the Sylphyo's internal sounds, the Link (over the radio link) produces sound on the Line Out ports after about 18 msec. You can save roughly 5 msec of that latency by using a wired connection to the Sylphyo headphone port rather than the Link, if that fits your situation.
Of course, from there a lot of things may get added on to the latency of the Sylphyo when rendering sound in a real-world MIDI-based system. My current rig system has a latency of (very) roughly 35 msec, from breath initiation to when the analog signal is sent to the house system.
I have found no research or discussion of how long it takes to initiate sound from real-world wind instruments. It must be quite "a while" for instruments like oboes and low tones on pipe organs.
One counter-point of comparison would be to measure the latency on an analog (control voltage) - based wind controller. Matt Traum comments that "Playing the analog synth that is connected via real control voltage feels to me like the synth is IN MY MOUTH ..." (https://www.patchmanmusic.com/WindControllerFAQ.html).
Summary of the Numbers
The average net time from breath initiation to sound coming out of the Sylphyo was measured at 12.08 msec (range 9.68-14.99).
The average net time from breath initiation to sound coming out of the Link {Line Out} was measured at 17.76 msec (range 16.70-19.59).
The time delta between the beginning of sound out of the Sylphyo headphone port and the beginning of sound out of the Link {Line Out} ports was highly variable. The average was 5.68 msec, with a range of 2.00 - 8.03. If we assume that the time to render sound is the same on the Sylphyo and the Link this gives us insight into the speed (and variability) of the radio link. One thing that may affect the radio link is the dense wireless IP environment that these tests were performed in.
The average net time from breath initiation to the appearance of MIDI on the Link {USB Power Supply} port was measured at 11.04 msec (range 8.59-12.50).
The average net time from breath initiation to the appearance of MIDI on the Link {MIDI DIN Out} port was measured at 12.98 msec (range 11.59-14.52).
This means that, on average, the MIDI signal appears on the {MIDI DIN Out} port of the Link 1.94 msec after the {USB Power Supply} port.
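The break-even point in the routing advice from the general results follows directly from these two averages. A minimal sketch using the numbers above; the function name and the example converter latencies (1.0 and 3.0 msec) are hypothetical, so substitute a measurement of your own device.

```python
# Average net latencies from the summary above (msec).
LINK_USB_MS = 11.04  # breath -> MIDI on the Link {USB Power Supply} port
LINK_DIN_MS = 12.98  # breath -> MIDI on the Link {MIDI DIN Out} port


def faster_din_route(converter_ms: float) -> str:
    """Pick the lower-latency way to get DIN MIDI out of the Link.

    converter_ms is the latency of an external USB-to-DIN converter
    (a hypothetical input; measure your own device).
    """
    via_usb_ms = LINK_USB_MS + converter_ms
    return "usb+converter" if via_usb_ms < LINK_DIN_MS else "din-direct"


break_even_ms = round(LINK_DIN_MS - LINK_USB_MS, 2)
print(break_even_ms)          # 1.94
print(faster_din_route(1.0))  # usb+converter
print(faster_din_route(3.0))  # din-direct
```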
The Details
Testing was done 11/26/2021. The primary host is an Intel Core i7-6600U Toshiba 2016 laptop, Win10x64.
The DAW is Cantabile 3.3694 [6Aug2021] x64.
The audio interface is an RME UCX II using the RME ASIO driver v1.212 [4May2021].
All testing done at 44.1 kHz sample rate, with a 48 sample buffer size. Note that there is an additional 32 sample buffer internally in the UCX II.
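To put those buffer sizes in time terms, here is a small sketch that converts a buffer length in samples to milliseconds at the 44.1 kHz rate used here. It only shows the duration of one buffer; how many buffers the driver and interface actually chain together is not something this sketch assumes.

```python
SAMPLE_RATE_HZ = 44_100  # sample rate used for all testing


def buffer_ms(samples: int, sample_rate_hz: int = SAMPLE_RATE_HZ) -> float:
    """Duration of one buffer of the given length, in milliseconds."""
    return samples / sample_rate_hz * 1000.0


print(f"{buffer_ms(48):.3f}")  # 1.088  (ASIO buffer)
print(f"{buffer_ms(32):.3f}")  # 0.726  (internal UCX II buffer)
```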
The Hub is an Atolla 7-port powered hub.
Post-processing and waveform analysis done in Reaper v6.21 [23Jan2021].
I selected the #31 - El Harrachi Phi sound on the Sylphyo for its relatively sharp attack. This sound was also used in similar testing by Peter Ostry.
I took time measurements on these points in the sequence:
-
The "initiation" of my breath into the Sylphyo (more on this below).
-
The beginning of sound out of the Sylphyo headphone port.
-
The beginning of sound out of the Link {Line Out} ports.
-
The appearance of MIDI on the Link {USB Power Supply} port.
-
The appearance of MIDI on the Link {MIDI DIN Out} port.
I was not able to capture the appearance of MIDI on the Link {USB Host} port. (I use a Sevilla Soft MIDI USB-USB device for these purposes. The Sevilla does not transfer USB data from the USB Host port on the Link ... not sure why ... maybe because there is no way to select the USB port from the Link?)
I recorded my breath input using a mic at my mouth to capture the sound when I breathed into the Sylphyo. I pre-pressured my mouth using my tongue to block the air escape, and then removed my tongue to get a "puff" of air.
I could find no hard data on the transient response time of various microphones. However, based on newsgroup discussions and informative articles, a small-diaphragm condenser seemed the best choice (see, for example, https://www.quora.com/What-type-of-microphone-has-the-best-transient-response and https://www.pro-tools-expert.com/production-expert-1/2018/8/22/do-you-ever-think-about-how-fast-a-microphone-is-and-how-that-affects-the-sound).
Based on this, I chose an Oktava MK-012A-01 small-diaphragm condenser with a cardioid capsule and no pop filter.
Measuring the "initiation" of the breath was challenging and somewhat variable. The image below shows how I made the selection. The waveforms are greatly expanded vertically, so the clipping is only in Reaper, not the original waveform.
I chose a point (marked in this example case with a vertical bar) that seemed to represent an initial movement of air rather than the sound of my lips. Selecting the initiation points of the other audio signals and the MIDI initiation points was far easier and could be done precisely to the sample in Reaper.
All calculations were done based on the sample numbers in the audio streams.
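The onset points in these tests were picked by eye in Reaper. For anyone who wants to repeat the measurement programmatically, a simple threshold-based onset finder is one possible substitute. This is a sketch under that assumption, not the method used here; the threshold value and the synthetic tracks are made up for illustration.

```python
from typing import Optional, Sequence

SAMPLE_RATE_HZ = 44_100  # rate used for all recordings in this test


def first_onset(samples: Sequence[float], threshold: float) -> Optional[int]:
    """Index of the first sample whose magnitude exceeds threshold, or None."""
    for index, value in enumerate(samples):
        if abs(value) > threshold:
            return index
    return None


def delta_ms(start_index: int, end_index: int, rate_hz: int = SAMPLE_RATE_HZ) -> float:
    """Time between two sample positions, in milliseconds."""
    return (end_index - start_index) / rate_hz * 1000.0


# Synthetic tracks for illustration only (not data from these tests):
# silence followed by a burst, starting at sample 100 and sample 541.
mic_track = [0.0] * 100 + [0.5] * 10
synth_track = [0.0] * 541 + [0.8] * 10

mic_start = first_onset(mic_track, threshold=0.1)
synth_start = first_onset(synth_track, threshold=0.1)
print(mic_start, synth_start)                     # 100 541
print(f"{delta_ms(mic_start, synth_start):.2f}")  # 10.00
```

A fixed threshold works on clean attacks like the synth output; the breath track is noisier, which is why choosing its onset took judgment.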
I have previously measured the delays from these elements in the signal chain:
-
A/D and D/A transfers in the UCX II at 1.81 msec.
-
Transfers on the Atolla hub at 0.11 msec.
-
Conversion from DIN MIDI to USB MIDI was done with an M-Audio MIDISport 2x2 Anniversary Edition. I have measured a transfer through the hub plus the conversion of the MIDISport at 1.00 msec.
The raw time from breath initiation to the beginning of sound out of the Sylphyo headphone port was 12.08 msec (range 9.68-14.99). Both the mic that records breath initiation (on the UCX Mic 1 Input port) and the line inputs from the Sylphyo headphone port (on unbalanced stereo cables to the UCX Line 3-4 Input ports) experience the A/D conversion so they cancel. There is additional delay from the actuation of the microphone itself, but that is unknown and should be minimal. So, ...
The average net time from breath initiation to sound coming out of the Sylphyo was measured at 12.08 msec (range 9.68-14.99).
The raw time from breath initiation to the beginning of sound out of the Link {Line Out} ports was 17.76 msec (range 16.70-19.59). As with the prior example, the A/D conversions cancel out, so ...
The average net time from breath initiation to sound coming out of the Link {Line Out} was measured at 17.76 msec (range 16.70-19.59).
The raw (and net) time delta between the beginning of sound out of the Sylphyo headphone port and the beginning of sound out of the Link {Line Out} ports was highly variable. The average was 5.68 msec, with a range of 2.00 - 8.03.
There are two elements that can affect latency on these two paths: the difference in time to render sound on the Sylphyo vs. the Link, and the radio transmission to the link. If we assume that the time to render sound is the same on the Sylphyo and the Link this gives us insight into the speed (and variability) of the radio link. One thing that may affect the radio link is the dense wireless IP environment that these tests were performed in.
The raw time from breath initiation to the appearance of MIDI on the Link {USB Power Supply} port was 12.74 msec (range 10.29-14.20). The mic that records breath initiation (on the UCX Mic 1 Input port) experiences a measured delay of 1.81 msec from the A/D conversion. The Link {USB Power Supply} port was sent through the Atolla hub, adding 0.11 msec to the latency. This gives a net bias of (1.81-0.11) = 1.70 msec. So, ...
The average net time from breath initiation to the appearance of MIDI on the Link {USB Power Supply} port was measured at 11.04 msec (range 8.59-12.50).
The raw time from breath initiation to the appearance of MIDI on the Link {MIDI DIN Out} port was 13.68 msec (range 12.29-15.22). The mic that records breath initiation (on the UCX Mic 1 Input port) experiences a measured delay of 1.81 msec from the A/D conversion. The Link {MIDI DIN Out} port was sent through the MIDISport unit and the Atolla hub, adding (1.00+0.11) = 1.11 msec to the latency. This gives a net bias of (1.81-1.11) = 0.70 msec. So, ...
The average net time from breath initiation to the appearance of MIDI on the Link {MIDI DIN Out} port was measured at 12.98 msec (range 11.59-14.52).
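The MIDI corrections above can be replayed in a few lines. This sketch only reproduces the bookkeeping as described (net = raw minus the net bias). It takes that sign convention as given and does not independently verify how the host timestamps MIDI relative to audio. Function and constant names are illustrative.

```python
MIC_AD_MS = 1.81     # A/D delay on the mic input, as measured above
HUB_MS = 0.11        # Atolla hub, as measured above
MIDISPORT_MS = 1.00  # MIDISport figure, as used in the text above


def net_ms(raw_ms: float, midi_path_ms: float) -> float:
    """Apply the correction described above: net = raw - (mic A/D - MIDI path)."""
    bias_ms = MIC_AD_MS - midi_path_ms
    return round(raw_ms - bias_ms, 2)


usb_path_ms = HUB_MS                 # 0.11, bias 1.70
din_path_ms = MIDISPORT_MS + HUB_MS  # 1.11, bias 0.70

# {USB Power Supply} port: raw range 10.29-14.20
print(net_ms(10.29, usb_path_ms), net_ms(14.20, usb_path_ms))  # 8.59 12.5
# {MIDI DIN Out} port: raw average 13.68, raw range 12.29-15.22
print(net_ms(13.68, din_path_ms))                              # 12.98
print(net_ms(12.29, din_path_ms), net_ms(15.22, din_path_ms))  # 11.59 14.52
```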
-
-
I love these kinds of technical threads, and you've been doing a great job of documenting your experiments. It's really amazing!
I wanted to take a moment today to look in detail at your numbers, run similar experiments, and share my results (as well as details about how the Sylphyo and Link work internally and what the hard constraints are), because these numbers look way higher than what I remember from the tests we did when I was designing and developing the firmware, the internal synth, and the wireless system.
I'll take some time to leave a proper answer and try to contribute some numbers, but for now I just would like to point out two things to consider.
First, there are a few Sylphyo settings that can increase latency, and the factory settings aren't optimal when it comes to latency. In other words, we usually add latency to make the Sylphyo easier to play. So any measurement report should include which settings were used. You can remove the added latency by setting MIDI Velocity to a fixed value (not Dynamic), the Breath rate to 1 kHz, and the key debounce time to none. Maybe there are some other settings I forgot; I'll double check when writing a longer answer.
Also, El Harrachi is not a good choice for this kind of test, because it uses an internal delay/debounce (independent of the MIDI one; they don't add up) that waits a little bit for breath to build up so that a reliable "force" or velocity can be estimated when triggering the string excitation. A few other percussive/string sounds have this behavior (they all rely on a trigger that "fires" the sound and whose initial conditions must be known at that exact moment), and this results in more latency on these specific sounds. A much better choice would be a synth sound with minimal breath-controlled filtering, many harmonics (it is easier to see the beginning of a square wave than of a sine), and a very fast attack, maybe like MacGuffin, Sync Asset, Cheap Tunes, or any of the organ sounds. Again, I'm not sure which would be best, but it's definitely not El Harrachi, because it's designed to have more latency.
I'll post a longer reply soon. Great work!
-
@join
The built-in delay of the Harrachi sound did not matter in my test. I just recorded audio from the Sylphyo and, in parallel, via the Link, and measured the distance between the two attacks in a DAW. I just needed a sharp attack from both at the same time, no matter what they were thinking before (!) they sound.
There is, however, another uncertainty. For the first test I felt free to assume that both devices have the same synth and both synths are triggered by MIDI. But I do not know if this is true. The Sylphyo could get controlled directly by converted sensor data while MIDI generation happens somewhere else, in parallel or not. Then the Sylphyo may be at a disadvantage, timing-wise. Maybe you can provide a block/flow schema to help us understand the principle.
@Clint
Wow. What a tough job.
I will also need a while to study your text and possibly add comments.
-
@join
Thank you so much for the feedback. I am new to this kind of analysis / testing, but I really want to understand how this all works to get the best control over my sound ...
Settings are provided below. Key settings I guess are:
** MIDI Mapping
Breath rate: Low (125 Hz) -- Based on the XpressO and Ingo's advice
Velocity: Dynamic
Capture delay: 20ms
** Keys
Reaction time: 20ms
Breath rate is Low / 125Hz because of the XpressO, which was choking (for a while, at least) on Medium and above.
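To give a feel for what the breath rate setting means in time terms, here is a rough sketch. It assumes the setting is the number of breath updates sent per second and that updates are evenly spaced. Neither assumption is confirmed in this thread, and the result is the spacing between updates, not a measured latency.

```python
def update_period_ms(rate_hz: float) -> float:
    """Spacing between successive updates at the given rate, in milliseconds."""
    return 1000.0 / rate_hz


for label, rate_hz in (("Low", 125), ("1 kHz", 1000)):
    print(label, update_period_ms(rate_hz))
# Low 8.0
# 1 kHz 1.0
```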
Velocity = Dynamic because many of the VSTs I use depend on that. My VL70-m is fantastic, but I currently have a (very) hybrid rig (still in the expansion / exploration phase with my rig).
A key feature would be a way to re-configure the Sylphyo via MIDI program changes. I could then switch settings when switching sound rendering modules or VSTs. Without that, I am really stuck with the "least common denominator" across all my sound rendering tools.
I am not sure what your term "key debounce time" is ... is it Keys / Reaction time?
I will use 15 - Cheap Tunes in the future for testing ... (that sounds the most immediate to me of the choices you suggested).
Thanks again!!
I never considered that the sound triggering mechanism might be other than MIDI. But, of course, analog synths use CV and that's a whole 'nother world ... would love to understand how this actually works on the Sylphyo!
All the settings I used for latency tests:
Clint's Configuration Settings for the Aodyo Sylphyo
Firmware version: 1.4.8
As of 11/27/2021
* Sound
  Volume: 100%
  Reverb: 70%
* MIDI Mapping
  Breath: CC 2
  ...and ---
  ...and ---
  Breath rate: Low (125 Hz) -- Based on the XpressO and Ingo's advice
  ...per CC: On
  Delay notes if needed: On
  Velocity: Dynamic
  Capture delay: 20ms
  Max. velocity: 127
  Slider ctrl. CC 1
  Top slider ctrl. CC 12
  Btm. slider ctrl. CC 13
  Elevation ctrl. CC 75
  Roll ctrl. CC 76
  Compass ctrl. CC 77
  Key-bend ctrl. CC 78
  Program change > Send Bank Off
  Bank (MSB) ---
  Bank (LSB) ---
  Note-off delay 10ms
* Breath
  Minimum: 20
  Range: 600
  Curve: Log
  Filtering: On
* Keys
  Fingering: NativeFlute+
  Octaves: 5(+2) octaves
  Invert octaves: Disabled
  Left pinky: -1st: Disabled
  Right pinky: +1st: Disabled
  Replay same note: Disabled
  Reaction time: 20ms
  R.time octaves: 20ms (If [Reaction time] is 0ms, then this must be 0ms)
  More sensitive: Enabled
  Key-bend (BETA): PB+
  Key noise (BETA): Disabled
* Slider
  Function: Ctrl (latch)
  Top edge: Bend -
  Btm. edge: Ctrl.
  Edge size: 9mm
* Movement
  Shake to move: On
  Shake vibrato: On
  Range: 20%
  Sensitivity: 43%
  Threshold: 9%
  Elevation control: On
  Bidirectional: Off
  Absolute: Off
  Range: +-35 degrees
  Roll control: On
  Bidirectional: On
  Absolute: Off
  Range: +-20 degrees
  Compass ctrl. (BETA): On
  Absolute: Off
  Bidirectional: Off
  Range: +-20 degrees
  Always on: Off
----
* Base Key: A3
* Output: MIDI ch. 1
----
* Invert Display: Off
* Receiver device: Connected CH 23
* LED: Off
-
-