General Audio Transcription Training

Training Curriculum

This section briefly outlines the training curriculum for the general audio transcription program.

This training takes you from understanding what general transcription is to knowing how to complete the required job tasks.

We will teach you general audio transcription, which simply means listening to general audio and typing what you hear as text.

The other common transcription jobs are “medical” and “legal.” They use the same concept, but those two types of transcribing require medical coding knowledge and legal experience, respectively.

You will be considered a “General Audio Transcriber” by the end of this program.

Here is the Training Curriculum:

1. What is Transcription
2. Hardware and Software Commonly Used
3. Style Choices For Transcription
4. How the Audio and Video is Collected to be Transcribed
5. What Formats are Used to Collect Audio/Video
6. Free Audio Transcribing Software
7. Transcription Formats and Usages
8. Practice Assignments
9. The Job Resource Categories

1. What is Transcription?

The actual dictionary definition reads as follows:

Transcription (linguistics), the conversion of spoken words into written language. Also the conversion of handwriting, or a photograph of text into pure text.

Put simply, you will take recorded audio or video and type it as text in a computer document, such as Microsoft Word or Excel. The most important skill to learn is how to format this text, which we will show you.

You may have already done transcription and didn’t even know it; for example, maybe when you were younger, you had a favorite song and you wanted to learn the words.

So you put the digital media player, cassette or CD on play and jotted down the exact words of the song by starting and stopping the song so as to gather all the words. Yes, this would be transcribing.

The only difference now is that you will type the words into a computer document using particular formats.


2. Hardware and Software Commonly Used

This section lists the hardware and software a transcriber needs. We explain what you need to get started, what you may need later, and which optional equipment, guides, and software can help you work more efficiently.

Hardware and Software Needed:

Computer/Mouse/Keyboard – If you are doing this program, then you obviously have this already. You don’t need to have a state-of-the-art computer, as long as it has enough hard drive space and memory to handle a small work load.

Headphones or Speakers – If you don’t have headphones that is okay as long as you have speakers on your computer. If you don’t have speakers then you will need headphones. I don’t think we need to explain why you need some kind of listening device to do audio transcribing. It is ideal to use headphones if possible.

If you would like to see headsets commonly used by transcribers Go Here

Operating System (OS) – We suggest you use Windows XP or higher, which includes Windows Vista, Windows 7 and Windows 8. You can also use any Mac OS X version or Linux. We provide transcription software for all of these operating systems.

Internet Connection – You do not strictly need high-speed Internet; you can get by with dial-up. However, some companies require high-speed Internet for connecting to their servers to upload and download files for transcription.

Transcribing Software – Don’t worry, we have you covered on this. We will give you 100% free transcription software that you can use to get started.

Word Processing Software (Microsoft Office, Corel, Lotus, etc.) – We suggest you use an advanced word processor such as Microsoft Office, Corel Office, Lotus or Open Office. Again, we have you covered: you can download the free Open Office word processor we provide in the “Resources” section.

A Good Idea to Have, But Not Required:

CD Player – This may not come into play so much, but might be something you may need if the audio for transcribing is mailed to you on CD. You can use any CD player whether it is on your computer or an external unit. If you don’t have a CD player no worries as most audio transcribing has already been formatted into digital files for transcribing.

Printer – Printers are being phased out because most transcribed documents are uploaded directly to the client or company you do the jobs for. If you have one, great; if you don’t, there is no need to rush out and buy one.

DVD Player – A DVD player is not required; however, some video transcription assignments may arrive on DVD in the mail. This falls into the same category as a CD player because most audio and video is now transferred as digital files such as MP3, MP4 and AVI, to name a few of the digital formats.

Hardware and Software to Consider in the Future:

Advanced Transcribing Software – The transcribing software we give you for free is very good, but there are more advanced transcribing programs that can really accelerate your work. We suggest looking at more advanced software once you start doing many assignments.

Pedal Foot Controller – A transcribing pedal foot controller lets you start and stop the audio while you are typing, without interrupting your typing flow. Pedal foot controllers are not too expensive, and one will pay for itself in no time through the time it saves you.

Here are some options for a Pedal Foot Controller:

Option 1 Professional Three Pedal Foot Controller


NCH Swift Sound offers two pedals that you can purchase online, available for both serial and USB ports. Both USB foot pedals are compatible with Windows and Mac OS X. These are high-quality three-pedal controllers made for professional transcription work that cost around US$70 and can be shipped worldwide. The VEC foot pedal is also a suitable alternative for other transcription players.

Option 2 Game Controller Pedal


These are pedals made for games. The advantage is that they are common and available at most computer stores, so if you really need a foot control today, this may be your only option. They should cost around US$70. The disadvantages are that they are only two-pedal controllers and that you may need to keep a steering wheel or joystick under your desk that is wired to the foot pedals!

What we have just mentioned will give you a good idea of what will be needed now and in the future to do transcription.


3. Style Choices for Transcription

There are a few different ways to transcribe audio or video into a typed document. The ones to be aware of are:

Exact Verbatim
Also known as “Script Transcription.” At this level of detail, you transcribe everything that the speakers say. Stuttering and unfinished words or sentences are transcribed, as well as superfluous speech such as “umm”, “ah”, “you know”, and nonverbal sounds (laughter, sighing).

Any pauses or interruptions will be included, as well as remarks (yeah, OK) by the interviewer that the interviewee does not respond to.

This level of detail is usually required only by researchers interested in speech patterns.

This is commonly used in:

Research study (scientific, medical, behavioral)

Here is an example of Exact Verbatim:


Near Verbatim
Also known as “Smooth Verbatim” or “Magazine Transcript.” This is the most popular type of transcription. Many clients are interested in some of the verbatim content, but not all of it. You can omit or include whichever details are of use to their project.

Typically, at this level you would include nonverbal sounds such as laughter, slang words, improper grammar, most superfluous speech and some indication of unfinished words or stuttering. However, you would not transcribe noises such as coughs, every “umm” or “ahh”, or superfluous remarks from people other than the speaker.

This level of detail is used by most clients who require accurate recordings of their subjects, yet are not studying exact speech patterns.

This is commonly used in:

Interviews and focus groups (primarily the content is of interest to the researcher)
Business Meetings
Presentations with discussion, questions and answers (typically for publication)
Conferences (several speakers, moderated dialog)
Q/A- sessions (questions and comments from the audience)

Here is an example of Near Verbatim:


Content Only
This is an expansion of the near verbatim style. You edit as you transcribe. You correct grammar, eliminate interrupting comments from the interviewer, correct slang, and omit personal comments. This level of detail is appropriate for any project where the content of the speech is the main priority. At this level of detail, anything superfluous to the content will be ignored, including most non-verbal sounds, stuttering, incomplete or revised sentences and superfluous speech by the speaker or others.

In many cases you would also edit the text to remove incorrect grammar or slang words (editing “yup” into “yes,” or correct word use). This style is often used by those publishing their transcript, such as a conference lecture or a Congressional hearing.

Those clients are looking for a transcript of the content of the talk and are not interested in how exactly it was presented; this level is appropriate for their project. Content only is mostly used when the “gist” of the recording is desired, perhaps for the purpose of summarizing a discussion.

Rough Draft
You type what you hear as you listen to the audio just once. You do not stop to rewind and re-listen to pick up inaudible passages.

The client would have to be prepared to spend considerable time to review the audio and correct the document. This is a fairly uncommon type of assignment, but at least you will know what to expect if you receive a rough draft assignment.

Time Coding
Time coding is a transcription format designed for audio and video editing. Time codes are inserted at every speaker change using a particular time-stamping format that we will teach you how to use.

Here is an example of Time Coding:
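
As a rough sketch of how time codes at speaker changes could be generated, the Python snippet below assumes a bracketed [HH:MM:SS] stamp placed before the speaker name. That stamp layout, the function names, and the sample dialogue are illustrative assumptions; clients and transcription software may call for a different format.

```python
def format_timecode(seconds):
    """Format an offset in seconds as a bracketed [HH:MM:SS] time code."""
    total = int(seconds)  # drop fractional seconds
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"[{hours:02d}:{minutes:02d}:{secs:02d}]"


def stamp_turns(turns):
    """Prefix each (offset_seconds, speaker, text) turn with its time code."""
    return [
        f"{format_timecode(offset)} {speaker}: {text}"
        for offset, speaker, text in turns
    ]


# Hypothetical sample: one entry per speaker change.
turns = [
    (0, "Interviewer", "Thanks for joining us today."),
    (7.4, "Guest", "Happy to be here."),
    (3725, "Interviewer", "Let's wrap up."),
]

for line in stamp_turns(turns):
    print(line)
# [00:00:00] Interviewer: Thanks for joining us today.
# [00:00:07] Guest: Happy to be here.
# [01:02:05] Interviewer: Let's wrap up.
```

Each tuple holds the offset in seconds at which a speaker starts talking, so a new stamp appears only when the speaker changes.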


Here is a list of the common types of transcription assignments:

Technology podcast shows, daily transcription
Business meetings transcription
Inspirational teleseminars transcription
Conference calls transcription
Training cassette transcription
Business leaders interview transcription for publishing
Focus group interview transcription
Church Sermon transcription
TV show transcription
Oral history transcription
Quarterly Earnings conference calls transcription
Research interview transcription
Customer service calls transcription
Entertainment industry
and much more…

You will transcribe audio to text (most common) and video to text.

4. How the Audio and Video is Collected to be Transcribed

There are three ways used to record data to be transcribed. They are as follows:

Hand-held dictation device. You can use any hand-held recording device, such as VHS for video, a cassette, a micro-cassette or, now most commonly, a digital recording device. The easiest way is simply to use a hand-held digital recorder for audio/video; it is much cheaper and more effective than using a cassette or VHS. These hand-held units come from a number of suppliers and offer a wide variety of options for audio quality and file length. The recorder is then connected to a PC and the audio file, usually in “.wav” format, is uploaded into a designated directory on the PC.

These files can be played back with Windows Media Player or Winamp (which we give you). The recorded digital file is then sent directly to you by e-mail, or you access it on the client’s server, if requested, with a one-button click. If a cassette is used, the client has to mail the original copy to you (the transcriber) for transcription; that is why digital recording is now used more often, and with better quality. We will explain a little later in the training how to use the different digital formats.

Dictation via telephone. Some companies will allow clients to use dial-in dictation from any remote phone via a local or 800 number. This allows dictation to be done on a 24/7 basis. The dictation is recorded as digital audio, compressed, encrypted, and immediately placed in an electronic queue for professional transcribers.

Digital dictation, in general, eliminates the cost of pickup and delivery of analog tapes resulting in huge cost savings and quick turnaround. Unless you have your own transcription business, which we will show you how to do later in the program, you will not need to worry about this. The companies that will outsource the assignments to you will do this for you and send you the digital file ready for you to transcribe.

Dictation support via PC
Another effective option is to utilize the PC or a laptop. All reasonably current PCs and laptops come with an audio function built in. All it needs is a low cost microphone to be plugged into the “MIC” jack. Windows comes standard with the sound recorder, usually found in the Accessories section.

There is, however, a downside to using such a system. Files are sometimes delivered in digital formats such as “.wav”, “.mp3”, “.mp4” or “.wmv”. Direct PC recordings, however, usually default to uncompressed PCM audio, which results in very large files that are not well suited to efficient Internet transmission. There are solutions to this problem that you will not need to worry about as the transcriber. The companies that collect the recorded audio will handle this for you and transfer the file to you ready to be transcribed.

General Recording Tips to Give Your Clients for Transcription
It will be easier for you to transcribe if the person who is recording follows some simple tips when setting up recordings. If you are only going to do outsource assignments and NOT own your own transcription company, then you will not be responsible for the recording quality. The company outsourcing the work to you should make sure the quality is good enough to get an accurate transcription.

However, even if you are only going to be doing outsource assignments, you can always make suggestions to the company that will be sending you the assignments if you continually get bad quality files.

Here are some good tips (Courtesy of Wordworth Typing and Transcription) to give your clients:

Ask participants to avoid talking at the same time

Before you start the event, a sound check (where you record a few words from each subject and then listen to make sure the result is clear) is helpful. When doing a sound check, make sure each person speaks at the distance from the mike that he or she will be at during the entire interview.

Try to minimize background noise. Some common sources of background noise include:

-Traffic, construction, and other street noise coming through open (or even closed) windows.
-Noise from other rooms or hallways coming through open doors.
-Machinery running in the background, e.g., fans or air conditioners.
-TV sets and radios.
-People making noise in the background.
-Pets or other animals.
-Clocks that chime (especially those that do so every fifteen minutes).
-Doors shutting or slamming.
-Coughs, sneezes, etc.

If anyone is leaving or entering the room during the conversation, encourage them to close the door softly and encourage speakers to pause while the door is being opened.

Ensure that a microphone is close to the person speaking. One mike per person is ideal.

Try to place microphones quite close to the speaker and pointing directly toward him or her.

If in an interview there is only one microphone, direct the mike to the interviewee as it will be less of a concern to miss out on transcription of the questions than the answers.

If you have a choice of microphones and do not have one mike per speaker available, or if a speaker will be moving around during the event you might prefer an omnidirectional mike (which picks up sounds from all directions). Conversely, directional mikes work best if you have one mike per speaker and the speakers will not be moving much.

If you use lapel mikes, make sure they won’t be rubbed by a piece of clothing and that they pick up the speaker’s voice when his or her head is turned.

If possible, encourage speakers to make some verbal reference to things they may be indicating visually.

If it’s important to get down references to people, places, Web sites, organizations, etc. that the transcriber might not know or be able to easily distinguish, it’s ideal to repeat them clearly or even spell them out.

Alternatively, if your project involves a lot of jargon or technical terminology, consider sending the transcriptionists a list of terms likely to have been used. The more context the transcriptionist has, the more accurate his or her work.

If you feel comfortable that the recording is quite clear, you may wish to urge an interviewer not to repeat back what the respondents say, as some interviewers are inclined to do. Alternatively, you might direct that the transcription leave out such repetitions.

However, if you’re concerned about sound quality of a recorded interview, you might prefer to have an interviewer repeat important responses.

If an interviewer is using a standard list of questions, you may want to provide that list with the recorded interview.

Recording Tips to give clients for Multiple Speakers

The following additional guidelines are useful for events where there are more than two people involved:

It is very important to have a microphone for each speaker. This is commonly done in conferences, but often overlooked in focus groups, group interviews, or other smaller settings. Having a speaker some distance from a microphone almost guarantees that their contributions will disappear behind background noise.

If you have multiple speakers, it’s ideal to be able to identify each speaker each time she or he speaks. If that is not possible, it’s helpful for the speakers to introduce themselves at the beginning in their own voice.

If you use a mike for an audience or other large group, such as a mike in the aisle for questions, it helps cut down on noise (e.g., coughing) if you turn that mike on only when, for example, someone in that group is asking a question.

If you have an audience asking questions but don’t use a separate mike for them, you can ask your speakers to repeat the question that has been asked, before answering it. This is also sometimes helpful if other audience members may not have heard the question.

When recording an interview, meeting, lecture or other event with the intention of having it transcribed later, you can help make the transcription process as efficient and accurate as possible. While it’s not always possible to follow all of these tips, taking them into account can help ensure better transcription by improving sound quality and minimizing incidental noise.

The better the recording, the more accurate and cost-effective the transcription will be.

As much as possible, ask the client to try to follow these guidelines for best results, which will make your job as the transcriber much easier.

Poor Quality Recordings
In some cases clients may still want poor-quality dictation transcribed because the content is essential. The outsource company or you will review each case individually to let the client know what can be done to provide the best quality transcription possible. Poor-quality dictation includes recordings that are noisy or muffled, that contain simultaneous overlapping conversations, or in which two or more speakers are recorded at greatly different volumes. In some cases the outsource company or you will be able to digitize and enhance the audio to remove noise or clarify the speakers.

In such cases you will work to understand how many “inaudible” sections are permissible. This work is billed at an hourly rate depending on the services needed. Rush service for poor-quality tapes, if offered, is billed at rates higher than normal rush service because of the huge effort and possible transcriber fatigue involved.

In large jobs where you encounter a poor-quality tape or digital file, you can often choose to not transcribe the particular recording until the client is contacted for guidance. In rare instances you may refuse to transcribe very poor audio because of the likely poor quality of the resulting transcription or fatigue on the transcriber (yourself).

More than likely, the outsource company sending you work will weed out these assignments. However, you can offer to try certain assignments and thus build a good relationship with the outsource company. Be aware that this can lead to many future projects with poor-quality recordings. The upside is that these will pay the best; the downside is that you have to work a little harder (maybe a lot harder in some cases!) to decipher the recording.



5. What Formats are Used to Collect Audio/Video

Here we will teach more about the devices used in the Voicescriber method to gather the recordings. The two methods used today are Tape and Digital.

We will not go over tape in detail because tape recordings are pretty much a thing of the past. Instead, we will spend the time going over digital audio files.

Recording Formats
Most popular recorders use a single track of audio. Some of them have two speeds at which the audio can be recorded. Recording on the fastest speed produces higher quality dictation, but provides less recording time on the digital tape or digital file.

Multiple-track recorders are typically used in settings that require very accurate transcriptions and have multiple persons that might speak simultaneously. For instance, courtroom transcripts are often taken by a four-track recorder with each person wearing a separate microphone and recording on a different track of the tape: judge, two lawyers, witness.

Multiple-track recorders are rare outside of the courtroom setting. However, they provide superior transcripts because the transcribing equipment allows the transcriber to listen to each track individually or to all tracks at once.

At this time you should not concern yourself too much with cassette recordings because, as already mentioned, these are a thing of the past.

Digital Audio Files
As mentioned earlier, the latest way of transcribing is from digital files, such as MP3 or MP4. In the next few years almost all transcription work will be through computers, using specific file formats on the computer. This is why now is the best time to take this opportunity to learn this new way of transcribing.

Many of the large transcription companies who have been using the old methods such as Dictaphone and tape transcription are struggling to get up-to-date with the newest technology. This is why we will focus this training on the latest and greatest transcribing – DIGITAL!

With the advent of multimedia computers (audio, video, etc.), more material is being generated in the form of digital computer files. Digital hand held dictation devices are now available that record to a memory card and can generate audio files you can place on disk or send over the Internet. You will have the ability to transcribe such files that come in a variety of formats.

We are only going to teach you the terms and some brief definitions. DON’T get overwhelmed with this information or feel you need to become an expert on computer audio formats. We are only providing you with terminology so if someone refers to certain terms, you will know they are referring to an audio format. You will not be expected to supply technical information on these terms.


Some of the existing formats for digital audio files are:

File Extension (Creation Company) – Description

.3gp – Multimedia container format that can contain proprietary formats such as AMR, AMR-WB or AMR-WB+, but also some open formats.
.act – ACT is a lossy ADPCM 8 kbit/s compressed audio format recorded by most Chinese MP3 and MP4 players with a recording function, and by voice recorders.
.aiff (Apple) – Standard audio file format used by Apple. It could be considered the Apple equivalent of wav.
.aac – The Advanced Audio Coding format is based on the MPEG-2 and MPEG-4 standards. aac files are usually ADTS or ADIF containers.
.amr – AMR-NB audio, used primarily for speech.
.au (Sun Microsystems) – The standard audio file format used by Sun, Unix and Java. The audio in au files can be PCM or compressed with the μ-law, a-law or G729 codecs.
.awb – AMR-WB audio, used primarily for speech; the same as the ITU-T’s G.722.2 specification.
.dct (NCH Software) – A variable codec format designed for dictation. It has dictation header information and can be encrypted (as may be required by medical confidentiality laws). A proprietary format of NCH Software.
.dss (Olympus) – An Olympus proprietary format. It is a fairly old and poor codec; gsm or mp3 are generally preferred where the recorder allows. It allows additional data to be held in the file header.
.dvf (Sony) – A Sony proprietary format for compressed voice files; commonly used by Sony dictation recorders.
.flac – File format for the Free Lossless Audio Codec, a lossless compression codec.
.gsm – Designed for telephony use in Europe, gsm is a very practical format for telephone-quality voice. It makes a good compromise between file size and quality. Note that wav files can also be encoded with the gsm codec.
.iklax (iKlax) – An iKlax Media proprietary format. The iKlax format is a multi-track digital audio format allowing various actions on musical data, for instance mixing and volume arrangements.
.ivs (3D Solar UK Ltd) – A proprietary version with Digital Rights Management developed by 3D Solar UK Ltd for use in music downloaded from their Tronme Music Store and interactive music and video player.
.m4a – An audio-only MPEG-4 file, used by Apple for unprotected music downloaded from their iTunes Music Store. Audio within the m4a file is typically encoded with AAC, although lossless ALAC may also be used.
.m4p (Apple) – A version of AAC with proprietary Digital Rights Management developed by Apple for use in music downloaded from their iTunes Music Store.
.mmf (Samsung) – A Samsung audio format that is used in ringtones.
.mp3 – MPEG Layer III Audio. It is the most common sound file format used today.
.mpc – Musepack or MPC (formerly known as MPEGplus, MPEG+ or MP+) is an open source lossy audio codec, specifically optimized for transparent compression of stereo audio at bitrates of 160–180 kbit/s.
.msv (Sony) – A Sony proprietary format for Memory Stick compressed voice files.
.ogg, .oga (Xiph.Org Foundation) – A free, open source container format supporting a variety of formats, the most popular of which is the audio format Vorbis. Vorbis offers compression similar to MP3 but is less popular.
.opus (Internet Engineering Task Force) – A lossy audio compression format developed by the Internet Engineering Task Force (IETF) and made especially suitable for interactive real-time applications over the Internet. As an open format standardised through RFC 6716, a reference implementation is provided under the 3-clause BSD license.
.ra, .rm (RealNetworks) – A RealAudio format designed for streaming audio over the Internet. The .ra format allows files to be stored in a self-contained fashion on a computer, with all of the audio data contained inside the file itself.
.raw – A raw file can contain audio in any format but is usually used with PCM audio data. It is rarely used except for technical tests.
.sln – Signed Linear PCM format used by Asterisk. Prior to v.10 the standard formats were 16-bit Signed Linear PCM sampled at 8 kHz and at 16 kHz. With v.10 many more sampling rates were added.
.tta – The True Audio, real-time lossless audio codec.
.vox – The vox format most commonly uses the Dialogic ADPCM (Adaptive Differential Pulse Code Modulation) codec. Similar to other ADPCM formats, it compresses to 4 bits. Vox files are similar to wave files except that they contain no information about the file itself, so the codec sample rate and number of channels must first be specified in order to play a vox file.
.wav – Standard audio file container format used mainly on Windows PCs. Commonly used for storing uncompressed (PCM), CD-quality sound files, which means that they can be large in size, around 10 MB per minute. Wave files can also contain data encoded with a variety of (lossy) codecs to reduce the file size (for example the GSM or MP3 formats). Wav files use a RIFF structure.
.wma (Microsoft) – Windows Media Audio format, created by Microsoft. Designed with Digital Rights Management (DRM) abilities for copy protection.
.wv – Format for WavPack files.


The latest MP4 technology – This is now the most common transcription format.
Since the latest technological advances have resulted in using MP4 and digital recordings when doing interviews, we will explain why researchers (clients needing transcription work) are switching to digital MP4 technology, and how to use this technology for transcribing. This is a huge plus when it comes to applying for online assignments from these companies.

You will be able to show up-to-date knowledge of MP4 and digital transcribing. Digital recordings are by far the clearest and most easily transcribed recordings for any of the scores of transcription projects you will work on.

What is MP4 and what does it have to do with research and transcription?

A very helpful and clear description of these recording methods is given by Nicholas Sheon. Analog recorders store sound as magnetic patterns in a coating on thin plastic tape, usually housed inside a cassette. Digital recordings store sound as a series of numbers; each number represents a sample, or cross section, of the continuous sound vibration. In order to fool the ear into hearing a continuous sound wave, digital recordings capture thousands of samples per second. This results in rather large computer files.
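
To see why uncompressed digital audio produces large files, the short Python sketch below multiplies out the samples. It assumes CD-quality settings (44,100 samples per second, 16 bits per sample, two channels); the function name is hypothetical, and other recorders may use different settings.

```python
def pcm_bytes_per_minute(sample_rate_hz=44_100, bits_per_sample=16, channels=2):
    """Return the size in bytes of one minute of uncompressed PCM audio."""
    bytes_per_sample = bits_per_sample // 8
    bytes_per_second = sample_rate_hz * bytes_per_sample * channels
    return bytes_per_second * 60


cd_quality = pcm_bytes_per_minute()
print(cd_quality)                        # 10584000 (bytes per minute)
print(round(cd_quality / 1_000_000, 1))  # 10.6 (megabytes per minute)
```

That is in line with the figure of around 10 MB per minute quoted for .wav files in the format list above.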

For example, music stored on CDs that we buy at the record store consists of large .WAV files of about 10 megabytes per minute of music. These .WAV files are so large that they quickly become unwieldy when stored on a computer. MP3, and the newer MP4 files that typically carry AAC audio, are standard ways to compress these large digital audio files.

The growth of the Internet and increasingly powerful home computers have made MP3 a very popular format for storing, organizing, and distributing music files over computer networks. MP3 files sound very similar in quality to the standard .WAV format that comes on a music CD, but the compression software removes data from the original audio that your ears cannot really hear, such as very high or low frequencies.

By removing these frequencies and compressing what is left, the compression software can reduce audio files to one tenth of their original size before you can really notice a difference in audio quality. This compression process is often called “ripping” a CD. You can make the files even smaller by reducing the number of samples per second or the bit rate.

Why use digital audio for data analysis?

This is why researchers who frequently use transcription services are switching to MP4 formats instead of the previous analog cassette. The MP4 format not only sounds a lot better than traditional audio-cassettes, but it is easier to store and takes less time to duplicate and transfer to a computer.

When a researcher records interviews or counseling sessions using MP4, it results in a much clearer and better sound which means less time and fatigue involved in transcription and analysis. Gone is the sense that you are listening to your informant’s voice through a wall of static and rumbling bass that is typical with analog tape recorders. MP4 files can be stored on an MP4 player, a CD-R, or a computer hard drive.

For example, if a standard music CD can hold 80 minutes of music in .WAV format, a researcher can fit ten times more, i.e. 800 minutes or about 13 hours of MP3 audio, onto one CD-R. Digital files can be transferred onto CD-Rs at very high rates without any loss of sound quality. Having multiple backup copies ensures researchers won’t lose their data and that they have it with them when they need it.
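The capacity estimate above is simple arithmetic, sketched here in Python. The 10:1 ratio is the rough compression figure quoted earlier, and the function name is ours.

```python
def compressed_capacity(cd_minutes=80, compression_ratio=10):
    """Return (total_minutes, hours, leftover_minutes) of compressed audio."""
    total_minutes = cd_minutes * compression_ratio
    hours, minutes = divmod(total_minutes, 60)
    return total_minutes, hours, minutes

total, hours, minutes = compressed_capacity()
print(f"{total} minutes = {hours} hours {minutes} minutes")
```

With the default values this works out to 800 minutes, or 13 hours and 20 minutes, which the text rounds to 13 hours.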

Each time an analog tape is dubbed, e.g., to send to a transcriptionist, some of the audio quality is lost. This is not a problem with copies of digital files because each copy is essentially identical to the original. Digital files can also facilitate data security. CD-Rs and computer folders containing MP3 files can be password protected or even encrypted so that only the researcher can access the audio data.

The researcher can also “bleep” out identifying information from an interview (e.g. “My boyfriend’s name is bleeeeep. He lives on bleeeep Street.”)

A medical researcher, for instance, can edit out personal health information so that their project complies with HIPAA regulations. They can also change the pitch of a voice while retaining the speed of the recording, effectively altering the sound of a voice without slowing it down. This is useful for presentations where you want to play a segment of audio but make the voice unrecognizable.

Besides the better sound quality and ease of duplication and storage already mentioned, digital audio has some other distinct advantages over analog tapes (micro cassettes). Because MP3 files are recorded at a fixed bit rate, they always play back at the same speed at which they were recorded.

Changes in pitch are a common problem with tapes, in that they may record at one speed on one tape machine and then when played on another machine, they will play either faster or slower. The resulting change in playback speed alters the pitch and the feel of the voice.

Even the slightest drop in pitch or speed can lead a transcriber to interpret what they are hearing differently. Slower playback makes speakers sound drunk, and faster playback makes them sound nervous or on speed. Although MP3 files record and play at a fixed speed, playback speed can be manipulated to make a recording play at a slower or faster tempo while preserving the pitch.

This is very useful for transcription. Another advantage of a constant playback and record rate is the ability to precisely index or bookmark a selection in an audio file.

For example, a researcher has noted in their transcript that there is an unintelligible utterance at 2 minutes and 32.457 seconds in the recording. They want to ask someone else to listen to it and see if they can make sense of it. If the researcher gives them a tape, they will have trouble finding it because the other person’s tape player may play at a different speed; what happens at 2:32.457 on one person’s machine may be different than with another tape player.

This has been a problem for years with researchers sending cassette tapes to transcribers. However, this segment will always be found at a precise time on a digital recording, no matter what computer or MP3 player you are using to listen to it. The other big advantage is that a researcher could store their audio data on a server or in computer files so a transcriptionist could download it.
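To see why a time position is exact on a digital recording, note that a time maps to a fixed sample number once the sampling rate is known. The sketch below is only an illustration; the 44,100 samples-per-second rate and the function name are our assumptions.

```python
def timestamp_to_sample(minutes, seconds, sample_rate=44_100):
    """Convert a position such as 2 min 32.457 s into a sample number."""
    total_seconds = minutes * 60 + seconds
    return round(total_seconds * sample_rate)

# The unintelligible utterance from the example at 2:32.457
print(timestamp_to_sample(2, 32.457))
```

The same calculation gives the same sample number on any computer, which is what makes the position repeatable.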

They can also play segments of their data in a presentation or make it available on the web so that readers can hear what the transcript is attempting to represent. The difficulty of working with cassettes for analysis has led many qualitative researchers to rely heavily on text transcriptions as a basis for their analysis.

This reflects a particular view of spoken language as merely a window into concepts, beliefs, and experiences that are seen to exist in the mind of the speaker. According to this view, there is little difference between written words and the spoken words they represent.

This is a bit like saying a dead butterfly specimen pinned down in a display case is equivalent to watching a live specimen fluttering and interacting in its environment. There is so much detail that tends to be lost when analog recording technology is used or when spoken discourse is transcribed, i.e. translated into written text.

Certain software programs used for qualitative analysis, like Transcriber (which we will be giving you for free a little later), will allow you to synchronize transcripts to a digital audio file, much like a Karaoke machine. Such synchronizing and linking is not possible with cassette tapes. Finally, if you want to measure how long a particular phenomenon takes, digital audio and video allow you to precisely measure the temporal unfolding of social interaction.

For example, a researcher may want to measure and tabulate the amount of time counselors spend discussing various issues with their clients. By indexing the counseling sessions in this way, they can also retrieve related segments for further comparison and analysis. They can even create time lines that chart the sequence and distribution of various communication formats over the course of the session.
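As a rough illustration of tabulating time from an indexed session, here is a minimal Python sketch. The topics and times are made up for the example, and the (topic, start, end) layout is just one possible way to keep such an index.

```python
from collections import defaultdict

# Hypothetical index of one counseling session: (topic, start_s, end_s)
segments = [
    ("risk assessment", 0, 95),
    ("test results", 95, 260),
    ("risk assessment", 260, 330),
    ("referrals", 330, 400),
]

def time_per_topic(segments):
    """Add up the seconds spent on each topic."""
    totals = defaultdict(int)
    for topic, start, end in segments:
        totals[topic] += end - start
    return dict(totals)

for topic, seconds in time_per_topic(segments).items():
    print(f"{topic}: {seconds} seconds")
```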

What can be done with Audio files to enhance the recording?

Digital audio files can be enhanced either to improve poor-quality sound or by adding various special effects.

* Uneven speaker volumes can be adjusted so that low volume speakers can be heard.

* One speaker can be increased/decreased in volume to generate a sense of distance or depth.

* Many constant background noises can be eliminated without distorting the speech.

* A large number of special effects can be added to all or parts of the recording.

Such services are typically charged additionally at an hourly rate.

About audio files from video
Video may now refer to videotape or electronic video files. Digital audio can usually be extracted from video files and transcribed as noted above. Videotape transcription requires making an intermediate audiotape that can be more easily transcribed.

You also will be able to transcribe audio from any source on the World Wide Web if you can access it with a standard browser or program. You could transcribe an online video or podcast directly online.

6. Free Audio Transcribing Software



Express Scribe Transcription Player Publisher’s Description:

Express Scribe is professional audio-playback-control software designed to assist the transcription of audio recordings. It is installed on the typist’s computer and can be controlled using hot keys or a foot pedal. Features: plays compressed .WAV files; variable speed (constant pitch) playback; can use computer rudder pedals (or some other specialist transcription pedals) to control playback; docks portable recorders to load recordings; uses system-wide hot keys.


Variable speed playback (constant pitch).
Supports many professional foot pedals, which connect to the game, serial or USB port to control playback.
Uses ‘hot’ keys to control playback instead of foot pedals or when using other software (e.g., Word).
Ability to dock both analog and digital portable recorders (such as an external tape deck) to load recordings.
Works with Microsoft Word and all major word-processor applications.
Automatically receives and loads files by Internet, email or over a local computer network.
Automatically sends typing to the person who dictated the work.
Works with speech recognition software such as Dragon Naturally Speaking to automatically convert speech to text.
Loads CD audio directly – listen as it loads in the background.
Works with FastFox typing utility to turn difficult medical/legal phrases and common terms into mere shortcuts from your keyboard.
Express Scribe is 100% free for you to use for your transcription work.

Supported File Formats:

Ability to play most audio file formats (including encrypted dictation files) including:

wav, mp3, mp4, au, aif, vox
dct (encrypted dictation)
Windows Media, VoiceIt (sri)
RealAudio (ra and rm)
Olympus, Lanier & Grundig (dss)
Sony Recorder formats (msv, dvf)
Philips Digital Recorder format
Sanyo Digital Recorder format
DSP TrueSpeech, GSM 6.10
mp2, PCM, uLaw

System Requirements:

Works on Windows 2000/XP/2003/Vista/2008/7/8
Mac OS X – 10.1 or above

Express Scribe – Download, Install and Run Express Scribe

Click on the Icon of Your Operating System to Download Express Scribe Below:

Windows / Mac / Linux

The following screen will be displayed when the download begins:


We recommend that you click Run. If you would rather save the setup file to your hard drive, choose Save instead.


Step 2 of 4 Open Setup File
If you clicked Run in Step 1, you may receive a security warning. If you do, click Run again and go on to Step 2B.


Step 2B: If you clicked Save in Step 1 or if you are using Firefox or Chrome, you need to double click on the file “essetup.exe” in your downloads list for the browser you are using. Click Run when you see the screen below:



Step 3 of 4 License Terms

After you have clicked Run, the Express Scribe Software License Terms window will be displayed as below. Read the license terms and, if you agree with them, select “I agree with these terms” and click Next.


After you accept the license terms, a Related Programs dialog will appear. Click on a related product to see its description and decide whether you would like to install it.

None of the options is required for this training, but you may find some nice added features to try. If you do not want to try any of the related programs, make sure all boxes are unticked. Click Next to finish the install.


Step 4 of 4 Installation Complete
You have finished installation. Express Scribe is now ready to use and you can proceed to the next part of the online tutorial.




Express Scribe – Basic Operation

Before continuing, please look at the screen shot below and locate the positions of all basic controls:



Step 1 of 6 Loading a Recording
Express Scribe displays recordings that are similar to tapes. Each recording is listed with details of who created the file, when it was created, the length of the recording, notes, and the current location.

To load a recording file, press Ctrl+L (hold down the Control key and press L) or click on the Load button on Express Scribe.


There are three options to load audio files into your Express Scribe program.

Automatic/Sync: You can set Express Scribe to check an FTP server, local network or computer folder and Express Delegate server for new dictations at timed intervals. The program will automatically load new dictations found in the specified path. Set up automatic loading from the Incoming tab in the Options dialog.

Manual Load: Load from a local computer folder, network or CD by clicking on the Load button on the main interface. You will be able to browse your computer and network connections for your audio file.

Dock: Transfer audio from a portable device directly to Express Scribe. There are two options, an Audio Cable Method which allows you to record audio from an analog source such as a cassette tape player, or Audio File Transfer for devices that can be connected by USB to your computer.


Step 2 of 6 – Playing a Recording

To play a recording, do one of the following:

Press the F9 key to start playing the file, and press the F4 key when you want to stop.
Click the play button at the base of your screen (a green triangle) to start playing the file, and click the stop button (a black square) when you want to stop.

If using a foot pedal – Press the middle pedal of your Foot Pedal control to start playing the audio file. Release or quick tap to stop the audio.

Try playing a recording now. Use your mouse to highlight the “Welcome” recording and press the play button. If you cannot hear anything then open Options, go to the Playback tab and check that the correct sound card is selected and increase the volume. If you still cannot hear anything then ask your local computer technician to check the sound card on your computer.


Step 3 of 6 – Moving Back and Forward within the recording

Keyboard and Hot Key Control:

To rewind the recording, press and hold down F7
To fast forward the recording, press and hold down F8
To move directly to the start, press the Home key
To move directly to the end, press the End key
Rewind now by pressing and holding down F7.


Step 4 of 6 – Resizing the Window

Practice resizing the window. This can be done in three ways: by “dragging”, by opening the “View” menu, or by pressing one of the buttons at the top right-hand side of the window.

Express Scribe offers several different views of the application.

Full Size: The Main and Small toolbar icons, the Dictation file list view, the Play Back controls and the Notes section are all present.
Play Control: By dragging the lower right corner of the application up towards the top left corner you can create a view that does not include the Main and Small toolbars or the Notes section. This leaves you with the Dictation list view and the Play Controls.

Mini: Click on the Scribe Mini icon on the Main Toolbar to reduce Express Scribe to about an inch in size.
Customized: You can customize your view of Express Scribe by going to the View menu and selecting or deselecting the icons, buttons and toolbars that you would like to see or remove.


Step 5 of 6 – Typing the Recording

Select the recording you wish to type out, and press the play button. If the typing pad is not visible then select the “View” menu at the top of your screen, and left click “Show Typing Pad”. Click the left mouse button in the typing pad to begin.

You can also type using any Windows word processor, including Microsoft Word, Corel WordPerfect, Lotus Word Pro, OpenOffice and others, while Express Scribe runs in the background.

If you use a word processor, you should use system-wide hot keys so that Express Scribe can be controlled while it is in the background. Alternatively, you can use a foot pedal control.


Step 6 of 6 – Dispatch Typing

When you have completed the typing for a file, you can either Dispatch the typing by email to the person who dictated the file or you can simply mark the file as Done.

To Dispatch a file by email either:

1. Press Ctrl+D to Dispatch the file by email to the sender.
2. Click on the Dispatch icon on the Small Toolbar.

If you have typed the document using a word processor (e.g., Microsoft Word), you can attach the document file to the email (click Browse).

To remove the file from the current files list (without reply by email to the sender) either:

1. Press Ctrl+N for Done
2. Click on the Done icon on the Small Toolbar

If you need to recover the file to amend it either:

1. Press Ctrl+O.
2. Click on the Recover Old Dictations icon on the Small Toolbar


Express Scribe – Advanced Features


Step 1 of 4 Open Help

You can view the manual at any time by pressing the F1 key or selecting Help Contents from the Help menu in Express Scribe.



Step 2 of 4 Help Topics

Once the help manual is open, you will see the list of all available topics. Each topic explains in detail how to use the different functions of Express Scribe. If you want to read any topic, just click on it.



Step 3 of 4 Finding keywords

You can also search for a keyword by selecting the Find tab in the Help Manual, typing the word that you are looking for and clicking Display.



Step 4 of 4 Technical Support

If you have searched the manual and still cannot find the solution to your problem, please visit the Express Scribe technical support page and follow the instructions there.

Our Advice:
We suggest you type the transcription in a Microsoft Word or OpenOffice document and then save it to your computer. If you are not familiar with Express Scribe’s features, it is easier to print the document or send it to the client through your own e-mail.

Feel free to practice with the Express Scribe sending features to get it figured out, and that can be an easier way to send assignments in the long run.

Feel free to start getting to know Express Scribe. You can load MP3 song files, listen to music, and practice typing the lyrics while using the hot keys to start, stop, pause, etc.

You are not going to hurt anything by using this software. The better you get to know it, the easier it will be when it comes time to do your transcription for clients.

7. Transcription Formats and Usages


In this section we cover the types of formats you will be using.

We will break this down into two general formats:

1. Formatting Standards – information specific to transcription assignments and the styles we spoke of earlier (Exact Verbatim, Near Verbatim and Content Only), such as standards for speaker identification, document formatting, time coding, and so on. (For example, when doing an interview: how the transcription will be formatted to identify speakers, the time of the recording, etc.; or, if you are transcribing a document, how to format it into proper text using the usage standards.)

2. Usage Standards – commonly used standards that appear in The AP Stylebook and The Chicago Manual of Style, as well as exceptions and additional items that apply to transcriptions. These govern how we use punctuation and grammar. Transcribers use a special style when typing their text that meets these standards. We are going to give you the style guide to follow when typing your transcripts.

This will make more sense as we explain and show you examples throughout this section.

Note: We will be using an interview that was recorded by cassette tape for our example. Transcribing an interview from cassette tape will be the hardest assignment you will get. So if you can understand this example we will be showing you in training, the rest will be easy.

1. Formatting Standards

Formatting a Transcript

If using Microsoft Word, turn off all of Word’s “Auto Format As You Type” features. To do so, in Word, go to the Tools Menu, then Auto Correct, then the “Auto Format As You Type” tab. Uncheck all auto-formatting options.

Always double space between speakers or new paragraphs. Do not use your word processing software’s double-space or space-before-paragraph feature to do this; use hard returns instead. This is important for the document to format properly when prepared for the client. Long passages should be broken into new paragraphs to enhance readability. When starting a new paragraph, indent the first line using a single tab.

Beginning/Ending a Transcript

Title Page
You will need to start every transcription with a separate title page. On this page:

The first line of the document should contain the transcript filename (e.g., “TRANSCRIPT CASE – AND ELLENE VAN WYK” if the audio pertained to a law case involving Ellene Van Wyk). If the assignment requires a case number, you can add that as well. It is not necessary to include your name. Always make the first title line two font sizes bigger than the following lines, and center it on the page.

The second line of the title page is where you will want to name the interviewees. If this is a one-on-one interview, the name usually will be known so you may identify it. If you are transcribing a Focus group, you can put “Focus group participants” or whatever is designated by the interviewer to identify the group.

The third line of the title page will be the name of the interviewer.

The fourth line of the title page is where you will want to place the date of the interview (if applicable). This will be the date of the actual interview, which should be labeled on the tape or digital recording you are transcribing.

The fifth line of the title page is the location where the interview took place (if applicable).

The final line of the title page will give the audio recording details – include the type of audio and the length of recording (e.g., 2 cassettes; approximately 120 minutes – OR – Digital MP4 recording; 121 minutes). Display the length in minutes; don’t use hours or seconds, and round to the nearest minute.

Lines two through six should be two font sizes smaller than the first title line and aligned to the left of the page.
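If your player shows the recording length in hours, minutes and seconds, the conversion to whole minutes for the final title line can be sketched like this. The function name is ours, and rounding a half minute upward is our assumption, since the guideline only says to round to the nearest minute.

```python
def recording_length_label(hours=0, minutes=0, seconds=0):
    """Return the recording length as whole minutes, rounded to the nearest minute."""
    total_seconds = hours * 3600 + minutes * 60 + seconds
    total_minutes = (total_seconds + 30) // 60  # 30 seconds or more rounds up
    return f"{total_minutes} minutes"

# A recording that runs 2 hours, 0 minutes and 40 seconds
print("Digital MP4 recording; " + recording_length_label(2, 0, 40))
```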





Start of transcription
You will start the transcription on the second page, the first being the title page.

Set margins: Top – 1.0″; Bottom – 1.0″; Right – 1.0″; Left – 1.5″. These specifications will provide even margins and allow the transcript to be bound.

Begin all transcriptions with the notation [Beginning of recorded material] or, if appropriate, [Abrupt beginning of recorded material].

Occasionally, you might receive an assignment specifying sections of recorded material to transcribe. In these instances, begin the transcription with

[Recorded material beginning at minute hh:mm:ss]
and end the transcription with
[End of recorded material at minute hh:mm:ss]
substituting the correct times for hh:mm:ss.
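As a small illustration, the two notations can be filled in from start and end positions given in seconds. This is only a sketch; the helper names are ours.

```python
def hms(total_seconds):
    """Format a number of seconds as hh:mm:ss."""
    hours, remainder = divmod(int(total_seconds), 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"

def section_markers(start_seconds, end_seconds):
    """Return the opening and closing notations for a partial transcription."""
    return (
        f"[Recorded material beginning at minute {hms(start_seconds)}]",
        f"[End of recorded material at minute {hms(end_seconds)}]",
    )

opening, closing = section_markers(754, 3725)
print(opening)
print(closing)
```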












Transcription pages

Page numbers should be located in the upper right hand corner starting on the second actual page of the interview, after title page and first page. (No number should be printed on the first page of the interview.)

Indent each time a new speaker enters in. Use the whole name the first time the speaker appears; then use initials each time thereafter.

If a cassette tape is being used, indicate the beginning of a new side of the tape or a new reel by starting a new page and typing “START OF TAPE 1, SIDE B” (or whatever is appropriate).

Indicate the end of the side of a tape by typing “END OF TAPE 1, SIDE B” (or whatever is appropriate).

Indicate when the interview is finished with “END OF INTERVIEW.” When digital MP3 audio files are used you do not need to label anything on the page.

End of Transcription
At the end of the transcribed document, type [End of recorded material] or, if appropriate, [Abrupt end of recorded material].

Research/Focus Group Transcripts – When transcribing a research/focus group use the following guidelines. Most focus groups will have an introduction at the beginning where the facilitator explains how the group will interact and the purpose of the meeting, along with some additional comments. There is no need to transcribe this material unless you have been instructed to do so.

Instead, begin transcribing when the first participant identifies himself or herself or when the conversation is obviously beginning on the topic. Identify the focus group leader as “Facilitator:” and use only “Male Voice:” and “Female Voice:” to identify participants. Trying to identify each speaker by name over the course of a long focus group gets too confusing. Consistent usage of “Male Voice:” and “Female Voice:” is best, unless the assignment specifically mentions that you should name the participants. Do not add numbers to the identification.

Speaker Identification – Separate the speaker identification from transcription text with a colon (:) followed by a tab. Do not use spaces. Your final document will be formatted using a standard template that relies on use of the colon and tab to produce the final product for the customer. Use only the following speaker identification formats, unless otherwise instructed in a work order:

Male Voice: (Use initial caps on both words.)
Female Voice: (Use initial caps on both words.)
Facilitator: (For focus groups.)
John: (When you can only identify a speaker’s first name.)
John Smith: (When you are able to identify a speaker’s first and last name.)
Dr. John Smith: (When you know a speaker’s name and title in medical transcripts.)

If the transcript is titled “John Smith,” and there is only an interviewer and a respondent, the respondent is obviously “John Smith.” Use your best judgment, but go ahead and name the respondent as “John Smith” if it seems appropriate.

“Respondent” is always better than “Male Voice.”

Don’t use the following identifications: Speaker, Another Female Voice, Second Male Voice, Presenter, Moderator, Person, Child, Voice, or any other convention not listed above.

Don’t number speakers, such as Male Voice 1, Female Voice 2, and so on. This might be useful as you start the transcript if you think you’ll be able to identify the speaker later and then search and replace to update the identification through out the document. However, if this does not happen, remove the numbering before submitting your final transcript.
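The colon-and-tab rule and the clean-up of temporary numbering can be sketched in a few lines of Python. This is only an illustration of the rules above; the function names are ours and are not part of any transcription tool.

```python
import re

def speaker_line(speaker, text):
    """Join the speaker identification and the text with a colon and one tab."""
    return f"{speaker}:\t{text}"

def strip_speaker_number(label):
    """Turn a temporary label such as 'Male Voice 1' back into 'Male Voice'."""
    return re.sub(r"^((?:Male|Female) Voice) \d+$", r"\1", label)

label = strip_speaker_number("Female Voice 2")
print(repr(speaker_line(label, "I agree with that.")))
```

Printing with repr shows the tab as \t, which makes it easy to confirm that no spaces slipped in after the colon.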

Each company may use its own format for identifying speakers.

Verbatim Transcripts – When an assignment specifies a verbatim transcript, try to
capture every word spoken on the recording, including stutters, false starts, and
exclamations. For consistency, use only the following for exclamations:

Do not use ah, oh, er, and so forth. Pick from the list above and use what seems closest to what is being uttered. The transcriber is expected to proofread each page of manuscript for mistakes in spelling and/or typing. Refer to the “Usage Style Guide” for more details; we will list it in the next section of training and also make it available to download for reference.

Time Coding – When time coding a transcript is called for, use the following format:

Speaker identification:
Always use three sets of numbers for hours, minutes, and seconds. Add a leading “00:” if necessary.

If transcribing a video, unless otherwise specified, use the time code displayed in the video itself, not the time shown as elapsed in your player. For example, the time code in the video might start at 04:00:00 rather than 00:00:00.

When time coding an interview, only the respondent’s answers need to be coded. For long answers, place a new time code every 30 seconds. However, time codes should only be inserted at the beginning of a sentence, so this is not a precise measurement. Place a new time code at 30-second intervals, or as close as you can get without breaking up a sentence.
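The 30-second rule can be sketched as follows. We assume you have noted the second at which each sentence of an answer begins. The bracketed [hh:mm:ss] layout, the function names and the sample sentences are our assumptions for illustration; always follow the layout your assignment specifies. The offset argument covers the case where a video's own time code starts at something other than 00:00:00.

```python
def format_time_code(total_seconds):
    """Format seconds as hh:mm:ss, always with three sets of numbers."""
    hours, remainder = divmod(int(total_seconds), 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"

def place_time_codes(sentences, interval=30, offset=0):
    """Prefix a time code to the first sentence and then to the first
    sentence that starts at least `interval` seconds after the last code.

    sentences: list of (start_seconds, text) pairs in playback order.
    """
    lines = []
    next_due = None
    for start, text in sentences:
        if next_due is None or start >= next_due:
            lines.append(f"[{format_time_code(start + offset)}] {text}")
            next_due = start + interval
        else:
            lines.append(text)
    return lines

answer = [
    (0, "Well, I started in 1985."),
    (12, "It was a small office then."),
    (31, "We moved downtown a few years later."),
    (45, "That changed everything."),
    (64, "By then we had twenty staff."),
]
for line in place_time_codes(answer):
    print(line)
```

Because codes are only placed where a sentence begins, the gaps between them come out slightly longer than 30 seconds, as the guideline allows.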










Video Recording Transcripts – In a video or talking-head interview, occasionally there is discussion of camera angles, noises in the room, interviewee coaching, or other technical adjustments that must be made.

Rather than transcribing such off topic conversation, simply identify it as [Director’s comments] in the transcript. Silent footage of scenery, landscapes, crowds, etc. should be marked as [B-roll]. Please ask if you have any questions about what material should or should not be transcribed.

Transcribe obvious questions and answers only. Often these kinds of transcripts will be shown on the assignment as “Summarize ???s, Verbatim answers.” In this case, summarize the interviewer’s questions, but be certain to capture the respondent’s reply word-for-word.

Other Types of Transcription – In our demonstration we used the transcription of an interview, which is one of the most common types of transcription assignments you will see.

You may also see many other transcription assignments, such as research studies, theses, etc. Here you will need to listen to the audio and transcribe it into proper text.

You would not be identifying the speakers as you do in an interview. The only time you would identify anyone is on the title page. Instead of “INTERVIEWER:” you would name the person presenting the research audio as “RESEARCHER:”



2. Usage Standards

NOTE: This section (Transcribing Style Guide) is for reference only and you are not expected to memorize this. Take a brief look below to see the style used to transcribe documents into a certain format.

For general reference, the guide to transcription style you use will be one of two popular styles for usage.

The Associated Press Stylebook

The Chicago Manual of Style.

These are the ways you will transcribe the audio you will receive, whether you are transcribing a focus group, interview, research, thesis, etc. This is your reference guide of usage, and is the most important part of this program. You will know exactly how to transcribe anything that comes your way and how to use common words, phrases, symbols, etc., by using these usage guides.

The most commonly used style is the Chicago Manual of Style. Below is a complete style guide, adapted from the Baylor University Institute for Oral History.

We would not expect anyone to memorize this right away; however, the more you use it you will begin to memorize it. It will make your assignments go much quicker if you seldom have to refer to this guide.



A transcript should reflect as closely as possible the actual words, speech patterns, and thought patterns of the interviewee. The narrator’s word choice, including his/her grammar, and speech patterns should be accurately represented. This is not an exercise in literary composition; the transcriber should avoid value judgments about the grammar or vocabulary of an interviewee. To retain validity in transcripts, most of the editing should be done by the interviewee.

A transcript is at best an imperfect representation of an oral interview. The transcriber’s most important task is to render as close a replica to the actual event as possible. Accuracy, not speed, is the transcriber’s goal.

Although the final product may not closely resemble the tape, because of many changes, the transcriber serves as first editor by putting words on paper. A good transcript is very valuable.

The transcriber will use a style guide to assist in transposing the spoken word into written language. This style guide is adapted from the Chicago Manual of Style, 14th edition. Transcribers and editors needing information on matters of spelling, punctuation, and usage not covered in the style guide should refer to the Chicago Manual of Style.


Scroll through the Style Guide below for specific information.


In general, avoid abbreviations in transcripts. One general rule requires that a civil or military title appearing before a surname only should be spelled out, but it should be abbreviated before a given name and/or initial(s) plus surname. (See example e below.)

Do not abbreviate:

a) Okay
b) et cetera
c) Names of countries, territories, provinces, states, or counties
d) Doctor when used without an accompanying name
e) Senator, Judge, Bishop, General, Professor, Brother, or any other political, academic, civic, judicial, religious, or military title when it is used alone or when it precedes a surname alone, i.e., Judge McCall
f) The Reverend, or the Honorable, when the is part of the title preceding the name
g) Books of the Bible
h) Names of the months and days
i) Terms of dimension, measurement, weight, degree, depth, et cetera (i.e., inch, foot, mile)
j) Part of a book: “Chapter 3,” “Section A,” “Table 7”
k) Word elements of addresses used in text: Avenue, Building, North, South, except NW, NE, SE, and SW
l) Portions of company names: Brother, Brothers, Company, Corporation, Incorporated, Limited, or Railroad, unless actual company name uses the abbreviation
m) Senior or Junior when following partial names: Mr. Miller, Junior or Mr. Toland, Senior. (See b in next section.)


Do abbreviate:

a) When preceding the given name and/or initial(s) plus surname:
Bro., M., Ms., Sr., Dr., Messrs., Mmes., Rev., Sra., Fr., Mlle., Mr., Rt. Rev., Srta., Hon., MM., Mrs., Msgr., Very Rev.
b) Jr. or Sr. after given name and/or initial(s) plus surname: John H. Smith, Jr.
c) NE, NW, SE, SW in addresses given in text
d) Points of the compass: N, E, S, W, NE, SE, NNW, WSW, et cetera
e) Era designations: A.D. 70, 753 B.C.
f) Time designations: A.M., M., P.M.

Initials only, initialisms, acronyms, reverse acronyms:

a) Celebrated persons are often referred to by a full set of initials that represent the full name, often without periods: JFK, LBJ, and HST
b) Agencies and various types of organizations in government, industry, and education often are referred to by acronyms or initialisms: avoid periods, as in AMA, IOOF, NATO, UN, USMC, USAF, USN, FDIC, SEC, AFL-CIO, or AF of L-CIO, and especially SMU, Texas A&M


Nonverbal sounds which occur on tape are noted and enclosed in parentheses. For such notations use no capital letters, unless for proper nouns or proper adjectives, and no ending punctuation. Reserve the use of parentheses for such activity notes.

Descriptive terms: (laughs) when speaker laughs, or (Jeffrey laughs) when person other than speaker laughs, or (laughter) or (both laugh) when more than one laughs. Use (both talking at once) or (speaking at same time)–NOT (interrupts). Other examples: (unintelligible), (telephone rings), (truck passing by). When these occur at the end of a sentence or a clause, position them after the punctuation. Avoid editorializing; just put (laughs), not (laughs rudely)!


Brackets [ ] are reserved for the use of editors for notes and words not present on the tape and added to the transcript. The interviewee is free to add or delete material at his/her discretion on the first transcript. Such material is incorporated into the final text as indicated by the interviewee and does not appear in the first draft transcript unless indicated on a word list provided by the interviewer/first editor.


A rule of thumb: When in doubt, don’t. Proper names of institutions, organizations, persons, places, and things follow the forms of standard English practices. When in doubt, consult the dictionary. If still in doubt, don’t capitalize. Partial names of institutions, organizations, or places are usually treated in lower case.

Capitalize — See examples below

a) Names of particular persons, places, organizations, historical time periods, historical events, biblical events and concepts, movements, calendar terms referring to specific days, months, and oriental years
b) Titles of written books
c) Hyphenated compounds in titles, as in Twentieth-Century Authors
d) Generic references to members of athletic, national, political, regional, religious, and social groups–for instance: Bears, John Bulls, Democrats, Masons, Fundamentalists
e) Time designations: A.M., M., and P.M.

Don’t capitalize — See also examples below

a) Oh, except at beginning of sentence or response
b) Incomplete titles of persons
c) Names of dances other than names of dancing events, such as Society Ball
d) Pronouns referring to deities, such as God in his mercy.

Examples: Capitalize/Lower case

Board of Trustees of Mythical University, but board of trustees, the board, the trustees
The University of Texas, but the university
Department of History, but history department
“History of Texas” or History 1301, but a course in Texas history
study French and Spanish, but study history, economics, philosophy
Maricopa County, but Tempe was in this county
City of Tempe (if government), but I live in the city of Tempe
the State (if government) rests its case, but the state’s wild flower
New York Times, Times, but the newspaper
the West, in the Southwest, but to go west, to face southwest
an Easterner, Western American history, but a western university
West Coast, Gulf Coast, but the coast
Interstate 35, I.H. 35 or I-35, but the interstate, the highway
Eighth Street, but the street
Bible, but biblical work
Scripture(s), but scriptural passage
Veterans Administration, but the university administration
Veterans Administration Hospital, but oral history office
the Word of God, but the words of the song
the Fall (of Man), but the fall of 1992
the Gospel of Luke, but the gospel
the Book of Daniel, but a book of poetry
McLennan County Court, but county court
Washington Street Bridge, but the bridge
American Revolution, but the revolution of the colonies
World War I, First World War, but the war
General of the Army Douglas MacArthur, but MacArthur, a general, U.S. Army
President Harry Truman, but the president of the USA, presidency
the Bronze Age, but the third of the four ages of man
the Democratic party, but the party that won in that precinct
the Democrats (party members), but democracy
Great Depression (referring to 1930s), the Depression, but a recession, fifth century, B.C.
Sherman Antitrust Act, but an act of Congress
Bro. Adam Smith, Brother Smith, Aunt Kathryn, but my brother, Bob; Kathryn, my aunt
Grandmother, Grandpa Smith, Dad, (substitute for given name), but my grandmother, Elizabeth; my mother
U.S. Senate, but senate (in reference to state)
Capitol (referring to building), but the capital of Texas (meaning the city)


The em dash (—) is used in BUIOH memoirs without preceding or following blank spaces or punctuation to indicate:

1. A hanging phrase resulting in an incomplete sentence (do not use ellipses)
2. A parenthetic expression or statement
3. An interruption by another speaker
4. Resumption of a statement after an interruption
5. A meaningful pause on the part of the speaker


In the heading on the first page of a transcript, use the European style (e.g., 4 July 1776). Elsewhere in the transcript, typing dates conforms to the rules for typing numbers:

Use numerals for years (1996) except when a sentence begins with a year: Nineteen sixty-two was an important year for me.

Use numerals for days when they follow the name of the month and precede the year: I was born on August 5, 1987.

Spell out the words for the day when the year is not expressed and the speaker uses the ordinal number: My birthday is August fifth. My birthday is August the fifth.

Spell out the word for the day when the day precedes the month: the fifth of August
Other examples: 1930s; the thirties; 1989 or ’90; midsixties; mid-1960s.
When spelling out 1906, use Nineteen oh-six or Nineteen aught-six.
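The two numeric date styles above (heading versus running text) can be sketched in Python. This is only an illustration of the rules, not part of the style guide itself; the function names `heading_date` and `text_date` are made up for this example, and the spelled-out cases (such as August fifth) are not covered.

```python
import calendar
from datetime import date


def heading_date(d: date) -> str:
    """Format a date for the transcript heading, European style: 4 July 1776."""
    # Day first, no leading zero, no commas.
    return f"{d.day} {calendar.month_name[d.month]} {d.year}"


def text_date(d: date) -> str:
    """Format a full date inside the transcript text: August 5, 1987."""
    # Numerals for the day when it follows the month and precedes the year.
    return f"{calendar.month_name[d.month]} {d.day}, {d.year}"


print(heading_date(date(1776, 7, 4)))  # 4 July 1776
print(text_date(date(1987, 8, 5)))     # August 5, 1987
```

The month names come from Python's `calendar` module, which returns English names unless the program has changed its locale.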

DIRECT ADDRESS: Set off by commas: Pam, I know you will enjoy this.


Hyphenation at the ends of lines is not a concern for the first draft transcript. Later editors should be aware of the following rules and should double-check any computer-generated hyphenation to conform to these rules. Words at the ends of lines should be divided according to syllabifications prescribed in any standard dictionary.

Don’t divide:

a) A syllable
b) A numeral, including numeric representations of money
c) A number from a measurement word or symbol
d) A one- or two-letter syllable from a word
e) The combination ble from a word without preceding it by a vowel, such as able or ible, except for assembling, assembled, and assemble
f) At the ends of three lines in succession
g) Proper names
h) Hyphenated words other than at the hyphens
i) Words of one syllable
j) The following word endings: -ceous -cious -gious -tial -cial -ciple -sial -tion -cier -geous -sible -tious -cion -gion -sion -tite
k) A single vowel syllable from the first part of the word unless it belongs with ble
l) Words having a misleading appearance when divided
m) Initials used in place of given names from surname
n) Capital letters used as abbreviations or acronyms
o) Abbreviations for academic degrees
p) Divisional marks, e.g., a), (1), (i), from material to which they pertain
q) Dates

ELLIPSES: Do not use ellipses (. . .) in transcribing oral history tapes because they would give the appearance that material was left out.


A false start may be anything from a syllable to a sentence. Repeated words, phrases, or syllables are at times indicative of a person’s thought patterns, overall speech patterns, personality patterns, or of a speaker’s effort to emphasize an element of communication.

Sometimes an interviewee may be deliberately ambiguous or even turgid in meaning for reasons of his own. Where to draw the line in deleting false-start material from the transcript is a difficult decision. We strive to follow a middle course leaving in enough to indicate individual speech patterns.

If repetition is for emphasis as reflected in the voice of the interviewee, the repetition is always retained. Do not try to indicate stuttering unless it is intentional.

FEEDBACK WORDS AND SOUNDS (crutch words, encouraging words, and guggles)

While there is some merit in having an absolutely verbatim tape, which includes all the feedbacks (such as Um-hm and Yeah), too many interruptions in the flow of the interviewee’s remarks make for tedious transcribing now and exhausting reading later.

Knowing when to include feedback sounds and when to omit them calls for very careful judgment. Usually the interviewer’s noises are intended to encourage the interviewee to keep talking. Look at your transcript. If every other line or so is an interviewer’s feedback, go back and carefully evaluate the merit of each feedback.

Don’t include every feedback, especially if it interrupts the interviewee’s comments in midstream. Only if the feedback is a definite response to a point being made by the interviewee should you include it. When in doubt, ask.

Type no more than two crutch words per occurrence. Crutch words are words, syllables, or phrases of interjection that signal hesitation and are characteristically used instead of pauses to give the speaker time to think. They also may be used to elicit supportive feedback or simple response from the listener, such as: you know, see?, or understand?

Use of Uh: The most common word used as a crutch word is uh. When uh is used by the narrator as a stalling device or a significant pause, then type uh. But sometimes a person will repeatedly enunciate words ending with the hard consonants with an added “uh,” as in and-uh, at-uh, did-uh, that-uh, in-uh. Other examples are to-uh, of-uh, they-uh. In these instances, do not type uh.

Guggles are words or syllables used to interrupt, foreshorten, or end responses, and also as sounds of encouragement. Guggles are short sounds, often staccato, uttered by the interviewer to signal his desire to communicate. They may be initial syllables of words or merely oh, uh, ah, or er. Spelling of specific guggles: Agreement or affirmation: uh-huh, um-hm; Disagreement: unh-uh

GRADES, ACADEMIC: Set letter grades in capital letters, no period following, no italics, no quotation marks. Show number grades in Arabic numerals with no quotation marks and no following period. Plural should have an apostrophe: I made all A’s by earning 100’s on all my exams.


To determine use of hyphens, especially for compound words, first check the unabridged dictionary, then check Table 6.1 in the Chicago Manual of Style, 14th ed.


1. To indicate division or separation in the following:

a) Division of words into syllables, as in syl-la-ble
b) Spelling out a name or words, as in H-o-r-a-c-e. Capitalize only where appropriate.
c) Separation of numerator from denominator in a fraction expressed in words unless the numerator or the denominator is hyphenated. In that case, use / to separate numerator from denominator. Examples: one-fifth; three/thirty-seconds

2. To indicate unification or combination as follows:

a) Nouns made up of two or more nouns which imply the combination or unification of two or more linked things, functions, or characteristics, as in AFL-CIO, astronaut-scientist
b) Modifiers and adjectival compounds when used before the noun being modified, not after, including those formed with numbers: a one-of-a-kind student

3. To indicate an infrequent pronunciation or meaning of a word: re-creation, recreation; re-cover, recover; re-form, reform

4. To indicate clear meaning when possible confusion could result from adding a prefix to a word starting with a vowel, as in co-op–usually, this convention operates with doubled vowels.
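
The fraction rule in item 1c above is mechanical enough to sketch in a few lines of Python. This is a minimal illustration; the function name `spell_fraction` is hypothetical, and it assumes the numerator and denominator have already been spelled out as words.

```python
def spell_fraction(numerator: str, denominator: str) -> str:
    """Join the spelled-out parts of a fraction following rule 1c."""
    # Use a slash instead of a hyphen when either part is itself hyphenated.
    separator = "/" if "-" in numerator or "-" in denominator else "-"
    return f"{numerator}{separator}{denominator}"


print(spell_fraction("one", "fifth"))             # one-fifth
print(spell_fraction("three", "thirty-seconds"))  # three/thirty-seconds
```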

Do not hyphenate

1. A noun compound of a spelled-out number and prefix, as in mideighties (but do hyphenate prefix plus numerals, as in mid-1980s).
2. Chemical terms, as in: sodium nitrate, sodium silicate, or bismuth oxychloride
3. A compound modifier that follows the noun it modifies, unless hyphenated in the dictionary. Examples: Her argument was well balanced. She was good-natured.
4. A compound modifier that includes an adverb ending in -ly
5. A hyphenated word at the end of a line other than at the hyphen
6. A proper noun except when absolutely unavoidable
7. Contractions, such as: can’t, wouldn’t, don’t, didn’t, wasn’t, he’ll, they’re, she’d

INCOMPLETE SENTENCES: Incomplete sentences are familiar occurrences in oral history because of its conversational nature. They are best ended with an em dash (—).

ITALICS: See also QUOTATION MARKS for titles not in italics.


Italicize:

1. Titles of whole published works, such as Plain Speaking
2. Titles of books, bulletins, periodicals, pamphlets
3. Titles of long poems
4. Titles of plays and motion pictures
5. Titles of long musical compositions: operas, operettas, musical comedies, oratorios, ballets, tone poems, concertos, sonatas, concerti grossi, symphonies, and suites, but not descriptive titles or attributed titles
6. Titles, actual titles, rather than descriptive or attributed titles, of paintings, sculptures, drawings, mobiles; for instance, da Vinci’s Mona Lisa is actually La Gioconda
7. Names of spacecraft, aircraft, and ships, except for abbreviations preceding the names, such as designations of class or manufacture, as follows: S.S. Olympic, H.M.S. Queen Elizabeth, U.S.S. Lexington, Friendship VII
8. Foreign words and phrases that are not in common currency; when in doubt, don’t italicize. Consult the dictionary; don’t italicize a quotation in a foreign language
9. A foreign word or phrase when translation follows that foreign word or phrase; enclose translation in quotation marks and precede translation by a comma
10. For emphasis (use sparingly)
11. References to words as words, phrases as phrases, or letters as letters: “Often is a word I seldom use.”
12. In indexes, the cross-reference terms, See and See also
13. Titles of legal cases, except in footnotes where only ex parte, ex rel., and in re are italicized along with other Latin words
14. Enumeration letters referring to subdivisions within a sentence or within a paragraph as well as those appearing in lists, when such letters are in lower case, such as a, b, or c
15. Newspaper names and the city names that accompany them: New York Times. Note: Do not italicize any articles preceding a newspaper name. Example: the Times.

LEGAL CASES: Italicize titles of legal cases, with v. for versus: Brown v. Board of Education of Topeka, Kansas

NAMES: The spelling of proper names of persons or locations is one of the transcriber’s most difficult tasks. The office has many reference works that contain names and places. Ask for help. See also ABBREVIATIONS; CAPITALIZATION; ITALICS; QUOTATION MARKS


In text, spell out all numbers one hundred and under, whether cardinal or ordinal, and anything above that which can be expressed in two words (even hyphenated ones) or fewer. Examples: sixty-nine; seventy-fifth; twenty-two hundred, but 2,367
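
As a rough illustration of the spell-out rule above, here is a minimal Python sketch for cardinal whole numbers. The names `small_words` and `transcript_number` are made up for this example. It treats a hyphenated word as one word, as the rule does, and it assumes that "two words" above one hundred means a count from one to ninety-nine followed by hundred, thousand, or million. Ordinals and the special cases that always take numerals (addresses, telephone numbers, percentages, and so on) are not handled.

```python
ONES = [
    "zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
    "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
    "sixteen", "seventeen", "eighteen", "nineteen",
]
TENS = ["", "", "twenty", "thirty", "forty", "fifty",
        "sixty", "seventy", "eighty", "ninety"]


def small_words(n: int) -> str:
    """Spell out a whole number from 0 to 99 as a single (possibly hyphenated) word."""
    if n < 20:
        return ONES[n]
    tens, ones = divmod(n, 10)
    return TENS[tens] if ones == 0 else f"{TENS[tens]}-{ONES[ones]}"


def transcript_number(n: int) -> str:
    """Spell out n if the rule allows it; otherwise return numerals with commas."""
    if n < 0:
        raise ValueError("this sketch only handles non-negative whole numbers")
    if n < 100:
        return small_words(n)
    # Two-word forms: a count of 1 to 99 followed by a single unit word.
    for size, name in ((1_000_000, "million"), (1_000, "thousand"), (100, "hundred")):
        count, remainder = divmod(n, size)
        if remainder == 0 and 1 <= count <= 99:
            return f"{small_words(count)} {name}"
    return f"{n:,}"


print(transcript_number(69))    # sixty-nine
print(transcript_number(2200))  # twenty-two hundred
print(transcript_number(2367))  # 2,367
```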


Use numerals for:

1. All street address numbers, all intrabuilding numbers, all highway numbers
2. Telephone numbers
3. Fractional sums of money above one dollar: $2.98
4. Dates: See also DATES below:

735 B.C.            mid-1950s
A.D. 1066           the midfifties
1990s               midfifties fashions
24 February 1997    July 1997 (no comma)
’99                 1979-80

5. Time of day–use numerals when A.M. or P.M. follows or when typing a whole plus a fraction of an hour: 8:20 P.M., 7:30. Otherwise spell it out: four o’clock, seven in the morning
6. Number elements in names of government bodies and subdivisions of 100th and higher, all union locals and lodges, as in Thirty-sixth Infantry; 139th Tactical Wing
7. Parts of a book, such as chapter numbers, verse numbers
8. Percentages, as in 50 percent

For consistency any sentence which contains numerals pertaining to the same category should have all numerals. Example: The report stated that 7 [instead of seven] out of 265 students voted in the campus elections.


Exceptions:

a. The sentence begins with a number: Seven out of 265 students voted.
b. Numbers representing different categories: In the past ten years five new buildings of over 125 stories have been erected in the city.

Numbers as numbers: When spoken of or referred to as numbers, they may be enclosed in quotation marks or italicized; either is acceptable.

Plurals of numbers:

Spelled-out numbers form plurals like any other noun: the twenties and thirties
Numerals form plurals by adding s alone, with no apostrophe: 1920s and 1930s

Prefixes and suffixes with numbers: When connecting figures with a prefix or suffix, add the hyphen in the appropriate place if the compound word is adjectival. Connect numbers expressed in words to a prefix or suffix with a hyphen, except for -fold when forming adjectival compounds, such as twenty-odd.


In final copies of memoirs, lower-case Roman numerals are used on auxiliary pages preceding the main text. Title page is considered to be page i, but is not marked.

For text, appendix, and index pages, center the page numbers (in Arabic figures) one-half inch from the top edge of the paper. Number appendix and index in sequence with the text pages and place the appendix pages between the end of the text and the index.

PARAGRAPHING: Indent for paragraphs where topics change, where subtopics are introduced, or where other dialogue is introduced. This may be very difficult to judge as you are typing and is often left up to the final editor.


Compound words formed with prepositions are pluralized by forming the plurals of the first nouns in the compounds, as in fathers-in-law.

Letters of the alphabet are pluralized by adding s or ’s: Zs or Z’s. Use the apostrophe only where confusion is possible: A’s, not As.

Foreign words are pluralized, unless Americanized, according to the customs proper to the particular languages. For example, in Hebrew, Kibbutz is pluralized by im: Kibbutzim.

Abbreviations are pluralized by adding s when in the form of acronyms, initialisms, or reverse acronyms without periods: GREs. When periods are used, add an apostrophe: B. K.’s

Proper nouns: Add s to the singular if the addition does not make an extra syllable, as in six King Georges; but add es to the singular form if the addition creates an extra syllable, as in six King Charleses. Nouns–including names of persons–that end in s take addition of es to form the plural: The three Loises are friends with the three Marys.

Everyone at the reunion were Joneses or Martins.

Note that the apostrophe is never used to denote the plural of a personal name.


Follow the standard rules for possessives.

For proper nouns, add ’s to most, even those ending in sibilant sounds, except Jesus’ and Moses’. Example: Charlie’s, Frances’s. For plural possessives, the apostrophe goes at the end: the Smiths’. Collective nouns are exceptions, as in children’s.

PUNCTUATION: Transcript punctuation follows The Chicago Manual of Style, 14th ed. See also DASHES; HYPHENS


1. When a direct expression is spoken by one person (I, he, she), set apart the expression with commas, use opening and closing quotation marks, and capitalize the first letter of the first word quoted. Example: She said, “I am going to graduate in May.”

2. When a direct expression is spoken by more than one person (we, they), do not use quotation marks, but do set apart the expression with commas and do capitalize the first letter of the first word quoted. Example: They said, What are you doing here?

3. When a thought is quoted, do not use quotation marks, but do set the thought apart by commas and capitalize the first letter of the first word quoted. Example: I thought, Where am I?

Enclose in quotation marks when text refers to

1. Titles of articles in periodicals
2. Book chapter titles
3. Book divisions other than chapter titles: sections, paragraphs, charts, and other labeled book parts
4. Dissertation titles
5. Essay titles
6. Newspaper headlines (in all capital letters)
7. Poems (short, not book length)
8. Radio program titles
9. Sermon titles
10. Short musical composition titles when not designated by number
11. Song titles
12. Short story titles
13. Television program titles
14. Theses (unpublished)
15. Lecture titles
16. Titles of formal courses of study
17. Debate topics

Do not enclose in quotation marks

1. Names or words used in conjunction with the words call, called, named, or words with similar meanings. Examples: Call me Adam. We named the dog Bowser.
2. The word yes or the word no other than in a sentence which includes other direct discourse.
Examples: He couldn’t say no, yet he didn’t really want to say yes. She said, “No,” when asked, “Do you care to join us?”
3. Thoughts or paraphrases, as in, I thought to myself, Who does she think she is?

Punctuation with quotation marks:

The period and the comma always stay inside the quotation marks. Example: “I’m ready for lunch,” she said, “but it’s only ten o’clock.”

The semicolon and the colon always stay outside the quotations. Example: With trepidation, she scanned “The Raven”; it was too eerie for her tastes.

The em dash, exclamation mark, and question mark are within the quotation marks when they apply only to the quotation. Examples: She began to say, “In the spring of 1920–” and then remembered it was a year later. She began by saying, “In the spring of 1920,”–I think it was really 1921–“I graduated from Baylor and began teaching school.”
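
The first two placement rules above (period and comma inside, semicolon and colon outside) lend themselves to a simple automated check. The following Python sketch is only an illustration; the function name `quote_punctuation_problems` is hypothetical, and it assumes the transcript uses curly quotation marks so that ” is always a closing mark. The dash, exclamation mark, and question mark depend on meaning and are left to the transcriber's judgment.

```python
import re

# Assumption: curly quotes are used, so ” always closes a quotation.
COMMA_OR_PERIOD_OUTSIDE = re.compile(r"”[.,]")
SEMICOLON_OR_COLON_INSIDE = re.compile(r"[;:]”")


def quote_punctuation_problems(text: str) -> list:
    """Return one message for each spot that breaks the placement rules."""
    problems = []
    for match in COMMA_OR_PERIOD_OUTSIDE.finditer(text):
        problems.append(
            f"position {match.start()}: move the period or comma inside the quotation mark"
        )
    for match in SEMICOLON_OR_COLON_INSIDE.finditer(text):
        problems.append(
            f"position {match.start()}: move the semicolon or colon outside the quotation mark"
        )
    return problems


good = "“I’m ready for lunch,” she said, “but it’s only ten o’clock.”"
bad = "With trepidation, she scanned “The Raven;” it was too eerie for her tastes."
print(quote_punctuation_problems(good))       # []
print(len(quote_punctuation_problems(bad)))   # 1
```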

REFERENCE WORKS: For stylistic purposes, consult the unabridged dictionary and The Chicago Manual of Style; if the two conflict, try to follow Chicago on all matters except hyphenation.

SPELLED-OUT WORDS: When in the course of the interview, one of the participants spells a word, capitalize appropriately and separate letters with hyphens, as in B-a-y-l-o-r. Follow the exact words of the speaker, as in, They called him Screech, spelled capital S-c-r-double e-c-h.
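
For the simple case where a speaker spells a word letter by letter, the hyphenated form can be produced with one line of Python. This is a small sketch with a made-up function name, `spell_out`; when the speaker adds words such as "capital" or "double e," follow the speaker's exact words by hand instead.

```python
def spell_out(word: str) -> str:
    """Separate the letters of a spelled word with hyphens, keeping its capitalization."""
    return "-".join(word)


print(spell_out("Baylor"))  # B-a-y-l-o-r
```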

Always use the computer spell check function before printing and always look up a word if you are not 100 percent sure of its spelling. When the dictionary allows more than one spelling of a word, choose the first one listed.

DO:                                       DON’T:
for a while                               for awhile
a while ago                               awhile ago
all right                                 alright
until                                     till, ’til
toward                                    towards (okay if memoirist says it)
nowadays                                  now-a-days
apiece (They cost six dollars apiece.)    a piece (I ate a piece of pie and gained ten pounds!)
inasmuch as                               in as much as
insofar as                                in so far as
Channel 10                                Channel Ten
a lot                                     alot
et cetera                                 etc.
okay                                      O.K.

Spellings for slang and certain words and expressions pronounced in regional dialect are available in dictionaries or reference works. Informal language, such as yeah and yep, may be transcribed verbatim if they occur in the dictionary. Words commonly pronounced together in spoken English–such as gonna (going to), sorta (sort of), and kinda (kind of)–are in the dictionary and may be used in the first transcript. The interviewee often edits them out.


When speech on a tape is unintelligible, first play it aloud. Next, ask someone else to listen.

If you can make an educated guess, type the closest possible approximation of what you hear, underline the questionable portion, and add two question marks in parentheses.

Example: I went to school in Maryville (??) or Maryfield (??).

If you and those you consult cannot make a guess as to what is said, leave a blank line and two question marks in parentheses.

Example: We’d take our cotton to Mr. _________(??)’s gin in Cameron.

If a speaker lowers his/her voice, turns away from the microphone, or speaks over another person, it may be necessary to declare that portion of tape unintelligible.

If you absolutely can’t make out the words at all, insert [unintelligible] in the transcript in their place. Use [unintelligible] only, NOT [?], [unknown], [can’t hear], [inaudible], or any other convention.
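
A finished transcript can be checked for the rejected markers with a short script. The Python sketch below is an illustration only; the function name `normalize_unintelligible` is made up, and the pattern covers just the four rejected markers named in the paragraph above, in any capitalization.

```python
import re

# The rejected markers listed above: [?], [unknown], [can’t hear], [inaudible].
# Both the curly and the straight apostrophe are accepted in "can't".
REJECTED_MARKERS = re.compile(
    r"\[(?:\?|unknown|can[’']t hear|inaudible)\]", re.IGNORECASE
)


def normalize_unintelligible(text: str) -> str:
    """Replace every rejected marker with the one approved marker."""
    return REJECTED_MARKERS.sub("[unintelligible]", text)


print(normalize_unintelligible("We went to [inaudible] and then [?] came by."))
# We went to [unintelligible] and then [unintelligible] came by.
```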

Example: When he’d say that, we’d–(laughs; [unintelligible])

UNFAMILIAR TERMS: When there is a term you are unable to identify, take your best guess and enclose it in brackets, such as, [hypogammaglobulinemic]. If you have phonetically spelled an unknown term in brackets and you subsequently come across the same term, be sure to use consistent spelling. If you learn the correct spelling of a word or name during the course of transcribing, go back and correct the previous instances of the word.

Unclear Words or Phrases– Please make every effort to hear and understand what is said. Sometimes you can figure out a word from the context of what the speaker is saying. The Internet can also be useful. Type the word you are hearing into a search engine; its “Did you mean…” suggestion will often give you the correct term.

Or, search for something unique about the subject matter and you might find a document that contains the correct word. Company websites will often have a list of employees, which can be useful in the spelling of names.

* Please note: As will be further explained in the guidelines for editing, overuse of dashes only weakens a transcript. One must judge that it is important to the context of the interview for the reader to know that the speaker paused, was in a quandary, and therefore did not speak straightforwardly. Where the pauses are not this significant, simply end the sentence with a period or a question mark.

8. Practice Assignments

We are going to give you a few practice audio files to download and transcribe. Load them into Express Scribe and do the best you can to transcribe the audio.

There are three practice transcriptions. For each one, we will give you the kind of instructions you might receive from a client before you start transcribing.

After you have transcribed the audio, we will give you the correct transcription on a separate page so you can grade yourself.

NOTE: For Test 1 and Test 2, you don’t have to transcribe the full length of the files. Try to transcribe at least five minutes of each audio file. These are for training purposes only. You will be the only one to see the results. On Test 3 you will need to transcribe all 4+ minutes.


Test 1
File Name: Greg Palast Interview
Interviewed by Alastair Thompson on April 20, 2012, in London, England.
Assignment Details: Transcribe the interview in “Near Verbatim” format. Identify the interviewer as “Scoop Magazine’s Alastair Thompson” in bold, and the interviewee by last name in bold. No time coding needed. Please use Times New Roman, font size 12, in your word processor.

Here is the recorded MP3 for transcribing:

DOWNLOAD AUDIO FILE HERE

To download the audio file, click on the link above, then right-click on the audio that pops up. Click “Save As” and then save it to your desktop.

Download the correct transcription to match your results below:



Test 1 Notes




Test 2
File Name: Interview with musician Pete Townshend
Interviewed by KGSR Radio DJ Jody Denberg on October 2, 2006, by phone.
Assignment Details: Transcribe the interview in Near Verbatim format. Identify the interviewer as KGSR and the interviewee by first name. No time coding needed. Please use Verdana, font size 12, in your word processor.

Here is the recorded MP3 for transcribing:

DOWNLOAD AUDIO FILE HERE

To download the audio file, click on the link above, then right-click on the audio that pops up. Click “Save As” and then save it to your desktop.

Download the correct transcription to match your results below:



Test 2 Notes



Test 3
File Name: Interview with musician/songwriter Sean Lennon
Interviewed by KGSR Radio DJ Jody Denberg on January 9, 2007, in New York City.
Assignment Details: Transcribe the interview in Near Verbatim format. Identify the interviewer and interviewee using the Q & A format. Time coding is required on this assignment. Please use Verdana, font size 12, in your word processor. The audio is split across TWO digital .wma files.

Here are the recordings for transcribing:

Audio File #1

Audio File #2


To download an audio file, click on one of the links above, then right-click on the audio that pops up. Click “Save As” and then save it to your desktop.

Download the correct transcription to match your results below: Test_3_Title


Test 3 Notes


9. The Job Resource Categories

There are typically three categories of sources for transcription jobs and assignments:

1. Transcription Outsource Services – You can apply online with the transcription service companies we list, which will outsource assignments directly to you. Generally you would be a contractor for that company.

2. Transcription Job Listings  – We will provide direct links to job listings from thousands of combined sources. You will be able to view the job assignment directly and if you are interested in the assignment you can apply directly to the source provided.

3. Direct Transcription Jobs – The jobs and assignments listed in this section give you direct contact with a business or individual so you can view the transcription details and offer your services. Sometimes the fee is set by the business or individual; other times it is open for negotiation. We list this section by country, state, region, and city, although in most cases it will not matter which you choose because the work is done online.

Independent Virtual Home Business – Start your own online virtual home business. Work as an independent contractor, which will allow you to do assignments for individuals and/or companies wanting to subcontract work directly through your business. This is optional but may be a great option to explore.

If you want to learn everything you need to know about starting your own virtual home business, go to our “Virtual Home Business” program.