Audio files must be in MP3 format. If your audio files are in another format, use this online tool to convert them. Then import them into Audacity (below) for post-processing.
Use a quality microphone
Pick a quiet place with little background noise and few disturbances. Both the recorder and the "talent" should remove any accessories that may interfere with the recording (e.g. jewelry, keys, phones).
Have the "talent" speak clearly and slowly: slower than feels natural, with good enunciation, discernible breaks between words, and plenty of pauses. Have them speak slightly louder than usual ("project"), but not so much that it sounds unnatural.
Record a short test take before starting to ensure equipment is functional and audio quality is good
Have all text to be recorded printed out and numbered in large font.
Begin recording. Have them read each phrase in order with a short pause after each (~3 seconds). Have them read the number (in English, if possible) before each phrase, with a short pause (~1 second) between the number and the phrase. The numbers will aid greatly in identifying which phrase is which, especially if they were recorded in a language other than your own.
Try to record all the phrases in one take (one audio file). Don't use a separate file for each phrase. If the recording is interrupted by background noise or the speaker makes a mistake, let the recorder keep running and continue when possible, starting again with that same phrase.
Do several takes.
Depending on the pronunciation of the 'talent' you may need to adjust the position of the recorder. Generally, a 45 degree angle downwards from the mouth works well. Additionally, a 1 inch gap between the mouth and the recorder is recommended.
If using Zoom H1
Carry extra batteries.
Turn On: Slide power switch down for 1 second
Turn Off: Slide power switch down for 1 second
Reducing Noise: Switch Low Cut to On (back of recorder) to reduce wind and low-frequency background noise.
Input Level: Can be automatic by switching Auto level – On (back of recorder). Or can be done manually (recommended settings to come)
Output Level: Volume that will come through your headphones. Manually adjusted near headphone input.
Recording Format: WAV format gives higher quality sound than MP3. Change the bit rate (recording format) using the arrows (this will only change if an SD card is inserted). Recommended settings are 48/16 or 48/24 (48 kHz sample rate at 16-bit or 24-bit depth). A higher setting results in higher quality recordings but decreases the recording time available on the card, so adjust it to suit the length of the recording.
Listening to Recordings: After recording, you can use the play button on the side to listen to any recording. The side arrows will allow you to choose which recording you would like to listen to. As the recording plays, the remaining time of the recording will be shown. The playback can be heard using your headphones or simply through the recorder.
Deleting Recordings: While the playback is running or when it is complete, you can press the trash key to delete. Press the Record button for confirmation of the deletion. A message ‘Done’ should appear once the deletion is complete.
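To make the quality versus recording time trade-off under Recording Format concrete, here is a rough Python sketch of the arithmetic for uncompressed WAV. It assumes stereo recording (two channels) and a hypothetical 2 GB card, and it ignores the small WAV file header; adjust the numbers to your own setup.

```python
def wav_bytes_per_second(sample_rate_hz: int, bit_depth: int, channels: int = 2) -> int:
    """Data rate of uncompressed PCM audio, in bytes per second."""
    return sample_rate_hz * (bit_depth // 8) * channels


def recording_minutes(card_bytes: int, sample_rate_hz: int, bit_depth: int,
                      channels: int = 2) -> float:
    """Approximate minutes of audio that fit in card_bytes (header overhead ignored)."""
    return card_bytes / wav_bytes_per_second(sample_rate_hz, bit_depth, channels) / 60


# Hypothetical card size, used for illustration only.
CARD_BYTES = 2 * 10**9

for label, rate, depth in [("48/16", 48_000, 16), ("48/24", 48_000, 24)]:
    minutes = recording_minutes(CARD_BYTES, rate, depth)
    print(f"{label}: {wav_bytes_per_second(rate, depth)} bytes/s, about {minutes:.0f} minutes")
```

Under these assumptions, moving from 16-bit to 24-bit increases the data rate by half, so the same card holds two thirds as much audio.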
Extracting and Splitting
Configure the MP3 encoding settings in Audacity (Edit -> Preferences -> File Formats -> Bit Rate). For speech, 64 kbit/s mono encoding should be adequate. If the audio contains other noises or music, consider 96 kbit/s mono. For very high quality applications (where, at a minimum, the CommCare user will be using headphones), use 128 kbit/s stereo. 64 kbit/s mono requires about 8 KB per second of audio. We use variable bit-rate encoding (better quality for a given file size).
Extract the recordings from the recording device.
Convert the recordings into .wav format if they are not already in that format (most mp3 players have an option to create .wav files).
Open each .wav file in Audacity (an audio editing program).
You should be able to see the numbers and phrases clearly. For each phrase, select the portion you want to extract as the audio clip (with a slight pause both before and after the speaking). Skip attempts that are not usable. Among all the takes for each phrase, choose the best one and discard the rest.
Once selected, choose File --> Export Selection as MP3. Save the clip under the new file name that you want. That's it!
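When choosing among the bit rates mentioned in the encoding step above, it can help to estimate how much space the exported clips will take. The following Python sketch does the arithmetic for a constant bit rate; variable bit-rate files may come out somewhat smaller. The clip length and clip count are hypothetical values chosen for illustration.

```python
def mp3_size_bytes(bitrate_kbps: float, seconds: float) -> float:
    """Approximate MP3 size at a constant bit rate (1 kbit = 1000 bits)."""
    return bitrate_kbps * 1000 / 8 * seconds


# Hypothetical deployment: 200 clips averaging 5 seconds each.
CLIP_SECONDS = 5
CLIP_COUNT = 200

for bitrate in (64, 96, 128):
    per_clip = mp3_size_bytes(bitrate, CLIP_SECONDS)
    total_mb = per_clip * CLIP_COUNT / 1_000_000
    print(f"{bitrate} kbit/s: about {per_clip / 1000:.0f} KB per clip, {total_mb:.0f} MB in total")
```

At 64 kbit/s this works out to 8,000 bytes per second of audio, so doubling the bit rate to 128 kbit/s doubles the storage needed on the phone.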
This step makes all the recordings approximately the same volume. First download and install MP3Gain.
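Open MP3Gain, choose "open file/folder", and open all the clips you want to use. Then do Modify Gain --> Apply Track Gain to bring each clip to the same target volume. (Apply Constant Gain shifts every file by the same amount, so use it only for an overall boost after the clips have been equalized.) Configure and tweak as needed.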
Command Line Instructions
mp3gain -r -c -d 10 *.mp3 (assuming all the mp3s are in the current directory)
The -d 10 is a volume boost (here, 10 dB) applied to all files after they have been normalized to the same volume. This is needed because the default volume level tends to sound quiet on the phones. Tailor the amount of boost to your deployment and the devices you will use. Each 10 dB of boost approximately doubles perceived loudness.
Don't boost too much or clipping will occur (the strength of the signal is boosted beyond the maximum that the sound file can represent; the rest is 'clipped' off). Excessive clipping sounds harsh and severely degrades sound quality. You can view the amount of clipping in Audacity.
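The relationship between a dB boost and clipping can be sketched with a little arithmetic. The Python below is illustrative only and is not part of MP3Gain: it assumes you know a clip's peak level as a fraction of full scale (0.0 to 1.0), and the function names are hypothetical. A gain of g dB multiplies the signal amplitude by 10^(g/20), and clipping occurs once the boosted peak exceeds full scale.

```python
import math


def db_to_amplitude_factor(gain_db: float) -> float:
    """Amplitude multiplier corresponding to a gain in decibels."""
    return 10 ** (gain_db / 20)


def will_clip(peak: float, gain_db: float) -> bool:
    """True if a peak (fraction of full scale, 0..1) exceeds full scale after the boost."""
    return peak * db_to_amplitude_factor(gain_db) > 1.0


def max_safe_gain_db(peak: float) -> float:
    """Largest boost in dB that keeps the given peak at or below full scale."""
    return -20 * math.log10(peak)


# Hypothetical clip whose loudest sample reaches half of full scale.
peak = 0.5
print(f"10 dB multiplies amplitude by {db_to_amplitude_factor(10):.2f}")
print(f"Clips at +10 dB: {will_clip(peak, 10)}")
print(f"Headroom before clipping: {max_safe_gain_db(peak):.1f} dB")
```

Note that the amplitude factor for 10 dB is roughly 3.16 even though the perceived loudness only about doubles, which is why a boost that sounds modest can still push peaks into clipping.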