(This is a continuation from my previous blog post.)
Recap
I was trying to read in audio samples from my microphone using Rust’s cpal library. Somehow I couldn’t get 44100 samples per second even though I had verified that the sampling rate was 44100 Hz. I tried various things and nothing seemed to work.
More attempts to solve the mystery
Attempt 4
Cpal uses coreaudio-rs on OSX, which is a “Rust-esque” interface over Apple’s CoreAudio framework. Comparing portaudio and cpal, I picked out one difference that seemed important. When portaudio wants to ‘stop’ an input stream, it makes several calls, including AudioUnitReset, and cpal was not making that particular call. What’s more, coreaudio-rs doesn’t even have a way to call the AudioUnitReset routine. I downloaded and built my own copies of cpal and coreaudio-rs and made the necessary changes. This taught me how to modify Cargo.toml to change where a dependency is picked up from: the crate registry, a git repo, or a local directory. The change did not give me the desired result, though: I was still receiving somewhere between 40000 and 43000 samples per second.
Attempt 5
Since nothing seemed to help, I decided to see what happened if I kept the stream open for longer than 5 seconds. I added a 500 ms delay after the 5 seconds, like this:
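println!("start recording");
stream.play().expect("failed to start the input stream");
std::thread::sleep(std::time::Duration::from_secs(5));
// Extra wait so that late-arriving samples are still counted.
std::thread::sleep(std::time::Duration::from_millis(500));
println!("stop recording");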
For the first time I got close to 44100 x 5 samples. I started experimenting with the additional delay and found that with an extra 600 ms I got all 44100 x 5 samples.
The Aha! moment
This is when the solution finally occurred to me: all the samples do arrive as expected, but they can take a while to arrive. I was not waiting long enough. And why? Because I was doing the equivalent of looking at my watch and expecting an 8-hour job to be done in precisely 8 hours. Instead, I should have stopped looking at my watch and simply waited for the job to finish. In my particular case, I knew I should receive 44100 x 5 samples, so I should have waited until I had all of them.
The solution
Here is what I did - I added a variable delay based on the total number of samples received.
use std::sync::atomic::Ordering;

let max_samples: usize = 5 * 44100 * 2;
if let Ok(stream) = device.build_input_stream(&custom_config,
                                              catch_data,
                                              catch_error) {
    println!("start recording");
    stream.play().expect("failed to start the input stream");
    std::thread::sleep(std::time::Duration::from_secs(5));
    // Keep waiting until the callback has counted every expected sample.
    while num_samples2.load(Ordering::Relaxed) < max_samples {
        std::thread::sleep(std::time::Duration::from_millis(200));
    }
}
And this worked again and again. It worked for both debug and release builds. What’s more, I went back to the first thing I had done - adding a print statement in the callback function. And even then I did not miss a single sample.
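The same wait-until-complete idea can be tried without a microphone. What follows is only a minimal sketch, not the code from my project: the audio backend is simulated by a plain thread, and the names wait_for_samples and spawn_fake_stream are made up for illustration. It also assumes the counter is an Arc<AtomicUsize> shared with the callback, and it adds a timeout (not present in my loop above) so that the wait cannot hang forever if the stream dies.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;
use std::time::{Duration, Instant};

/// Poll `counter` until it reaches `target`.
/// Returns the final count, or None if `timeout` elapses first.
/// (Hypothetical helper; the timeout is an addition for safety.)
fn wait_for_samples(
    counter: &AtomicUsize,
    target: usize,
    poll: Duration,
    timeout: Duration,
) -> Option<usize> {
    let start = Instant::now();
    loop {
        let n = counter.load(Ordering::Relaxed);
        if n >= target {
            return Some(n);
        }
        if start.elapsed() >= timeout {
            return None;
        }
        thread::sleep(poll);
    }
}

/// Stand-in for an audio backend (an assumption, not cpal): after an
/// initial `latency`, it "delivers" `total` samples in chunks by bumping
/// the shared counter, the way a data callback would.
fn spawn_fake_stream(
    counter: Arc<AtomicUsize>,
    total: usize,
    chunk: usize,
    latency: Duration,
) -> thread::JoinHandle<()> {
    thread::spawn(move || {
        thread::sleep(latency);
        let mut sent = 0;
        while sent < total {
            let n = chunk.min(total - sent);
            counter.fetch_add(n, Ordering::Relaxed);
            sent += n;
            thread::sleep(Duration::from_millis(1));
        }
    })
}

fn main() {
    let target = 4_410; // scaled-down stand-in for 5 * 44100 * 2
    let counter = Arc::new(AtomicUsize::new(0));
    let producer =
        spawn_fake_stream(Arc::clone(&counter), target, 512, Duration::from_millis(50));

    // "Looking at the watch": a fixed sleep, then read whatever has arrived.
    thread::sleep(Duration::from_millis(5));
    let early = counter.load(Ordering::Relaxed);

    // "Waiting for the job to finish": block until every sample is counted.
    let done = wait_for_samples(
        &counter,
        target,
        Duration::from_millis(2),
        Duration::from_secs(5),
    );
    producer.join().unwrap();
    println!("after fixed sleep: {}, after waiting: {:?}", early, done);
}
```

In a real program the fake stream would be replaced by the cpal input stream, and the counter would be incremented inside the data callback. Whether a timeout is worth having depends on how the error callback is handled; I include it here only as one possible safeguard.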
Reflection
I realized that this solution was right in front of me all along, because this is how portaudio works. The callback function supplied to the portaudio stream is supposed to return 0 (paContinue) while it still expects more samples, and 1 (paComplete) once it has received all the samples it needs. The portaudio ‘paex_record.c’ example which I was running had this all along - in the 8 lines starting from here. A lot of my debugging attempts were unnecessary to solve the problem. They did help me learn a lot of Rust concepts, so I am not complaining!
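To make the portaudio convention concrete, here is a rough Rust rendering of the idea. It is a sketch under my own assumptions, not portaudio's or cpal's API: the Recorder type, its on_input method and the CallbackResult enum are hypothetical names, and only the numeric values 0 and 1 are taken from portaudio's paContinue and paComplete.

```rust
/// Mirrors the two portaudio result codes discussed above
/// (paContinue = 0, paComplete = 1). Hypothetical type for illustration.
#[derive(Debug, Clone, Copy, PartialEq)]
enum CallbackResult {
    Continue = 0,
    Complete = 1,
}

/// Hypothetical recorder that knows how many samples it wants in total.
struct Recorder {
    samples: Vec<f32>,
    max_samples: usize,
}

impl Recorder {
    fn new(max_samples: usize) -> Self {
        Recorder {
            samples: Vec::with_capacity(max_samples),
            max_samples,
        }
    }

    /// Called once per input buffer. Copies as many samples as still fit
    /// and tells the caller whether the recording is complete.
    fn on_input(&mut self, input: &[f32]) -> CallbackResult {
        let remaining = self.max_samples - self.samples.len();
        let take = remaining.min(input.len());
        self.samples.extend_from_slice(&input[..take]);
        if self.samples.len() >= self.max_samples {
            CallbackResult::Complete
        } else {
            CallbackResult::Continue
        }
    }
}

fn main() {
    let mut rec = Recorder::new(1000);
    let buffer = [0.0f32; 256]; // stand-in for one buffer from the device
    let mut calls = 0;
    // The "stream" keeps calling back until the callback says it is done,
    // regardless of how much wall-clock time has passed.
    loop {
        calls += 1;
        if rec.on_input(&buffer) == CallbackResult::Complete {
            break;
        }
    }
    println!("{} samples after {} callbacks", rec.samples.len(), calls);
}
```

The point of the design is that the stopping condition lives with the sample count rather than with a timer, which is exactly what my fixed 5-second sleep was missing.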