There are a lot of settings you can choose from, such as noise cancellation or auto gain control, but I've used only these three; the documentation lists plenty more, so you can go check it out, but this is what I needed.
Then, when you call this API, the user is prompted with a pop-up asking for permission to access the microphone. The user clicks yes and you get access to it.
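As a rough sketch, the permission flow looks something like this; the three constraints below are assumptions based on the standard MediaTrackConstraints, not necessarily the exact three used in the project:

```ts
// Hypothetical sketch: requesting the microphone with a few audio constraints.
// echoCancellation / noiseSuppression / autoGainControl are assumed names from
// the standard MediaTrackConstraints, not necessarily the talk's exact settings.
async function requestMicrophone(): Promise<MediaStream> {
  // Calling getUserMedia triggers the browser's permission pop-up; the promise
  // resolves with the live microphone stream once the user accepts.
  return navigator.mediaDevices.getUserMedia({
    audio: {
      echoCancellation: true,
      noiseSuppression: true,
      autoGainControl: false, // keep the raw input level for pitch analysis
    },
  });
}
```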
Then we have the Web Audio API, which exposes a lot of interfaces, one of which is the AudioContext, an interface that lets you decode and process the signal and perform calculations on it. On the AudioContext I've used the createAnalyser method, which returns a node that performs real-time time-domain and frequency-domain analysis, so you get access to the signal and you can analyze it.
As you can see in this image, you take the signal and insert the analyser node into the audio graph. You perform some calculation, some analysis, but the signal passes through unchanged: the node does not alter it, it is used only for analysis purposes.
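A minimal sketch of that graph, assuming a plain AudioContext and the standard createAnalyser call (not the talk's exact code):

```ts
// Sketch: tap the microphone signal with an AnalyserNode.
// The analyser only observes the samples; it never modifies the audio.
async function attachAnalyser(): Promise<AnalyserNode> {
  const audioContext = new AudioContext();
  const analyser = audioContext.createAnalyser();

  const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
  const source = audioContext.createMediaStreamSource(stream);

  // source -> analyser: the node sits on the graph purely for analysis
  source.connect(analyser);
  return analyser;
}
```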
The actual implementation is built with React and Next.js, with the following file structure. I have created three main files: BrowserAudio.ts, which accesses the user media, so all the stuff we have seen earlier; Tuner, the main component of the application, where all the logic between the components is orchestrated; and PitchDetector, a utility library used for the calculations: frequency estimation, pitch detection, and autocorrelation. A rough layout is sketched below.
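Only BrowserAudio.ts is named explicitly in the talk; the other two filenames are assumptions:

```
src/
├── BrowserAudio.ts   // wraps getUserMedia and the Web Audio interfaces
├── Tuner.tsx         // main component, orchestrates the other pieces
└── PitchDetector.ts  // frequency estimation, pitch detection, autocorrelation
```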
Let's look at BrowserAudio.ts. This class is very straightforward. It has two attributes, the AudioContext and the Analyser, the interfaces we are going to use to work with the user's microphone. Then we have a method that I call getMicStream: it uses the MediaDevices API to access the user media, and that's it. The user gets prompted and you get access to the signal.
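A sketch of what that class might look like, based on the description above; property names and the constraints passed to getUserMedia are assumptions:

```ts
// Hypothetical BrowserAudio.ts: holds the Web Audio interfaces and exposes
// a single method that asks the user for microphone access.
export class BrowserAudio {
  audioContext = new AudioContext();
  analyser: AnalyserNode = this.audioContext.createAnalyser();

  // Prompts the user and resolves with the raw microphone stream.
  getMicStream(): Promise<MediaStream> {
    return navigator.mediaDevices.getUserMedia({ audio: true });
  }
}
```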
Then, in Tuner, we create an instance of BrowserAudio, and we have this buffer, a Float32Array in which we store all the data related to the signal, because the AudioContext gives us the signal in this form, converted to numbers. After that, we instantiate the AudioContext and the Analyser to get access to those interfaces. Then we have a method called startTuner that uses getMicStream, the method we saw earlier: you get access to the mic stream and can perform all the calculations you want. After that, I set the source, a state variable holding the source created from the AudioContext, so I get access to the whole signal, and setListening is just a state variable used for display purposes in the UI. Then this effect performs the pitch estimation: it has an interval that runs every millisecond and calls source.connect to attach the Analyser, the one we saw in the image before, to the signal in order to perform the estimation and the time and frequency analysis. A rough sketch of this component follows.
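Putting that flow together, a rough sketch of the component might look like this; `autoCorrelate` stands in for the PitchDetector utility, and all names, types, and the exact interval are assumptions rather than the talk's actual code:

```tsx
import { useEffect, useRef, useState } from "react";
import { BrowserAudio } from "./BrowserAudio";
import { autoCorrelate } from "./PitchDetector"; // hypothetical helper

export function Tuner() {
  const browserAudio = useRef(new BrowserAudio());
  const buffer = useRef(new Float32Array(2048)); // time-domain samples from the analyser
  const [source, setSource] = useState<MediaStreamAudioSourceNode | null>(null);
  const [listening, setListening] = useState(false);
  const [frequency, setFrequency] = useState<number | null>(null);

  // Ask for the mic stream and wrap it as a source node in the audio graph.
  async function startTuner() {
    const stream = await browserAudio.current.getMicStream();
    setSource(browserAudio.current.audioContext.createMediaStreamSource(stream));
    setListening(true);
  }

  // Connect the analyser to the source, then poll it on an interval:
  // copy the waveform into the buffer and estimate the pitch from it.
  useEffect(() => {
    if (!source) return;
    const { analyser, audioContext } = browserAudio.current;
    source.connect(analyser);
    const id = setInterval(() => {
      analyser.getFloatTimeDomainData(buffer.current);
      setFrequency(autoCorrelate(buffer.current, audioContext.sampleRate));
    }, 1); // the talk mentions a one-millisecond interval
    return () => clearInterval(id);
  }, [source]);

  return (
    <button onClick={startTuner}>
      {listening ? `${frequency ?? 0} Hz` : "Start tuner"}
    </button>
  );
}
```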