Recording Voiceover Audio for eLearning — Part 2

We continue with part 2 of Paul Fegan’s series on recording eLearning material. This time he delves into the main part of a project: the read. Part 1 covered topics such as sitting comfortably, working with a director and even lighting. All of these come into play and can affect the end result and the audio quality.


The read

I recorded an eLearning session with a new voiceover artist recently and I explained to her that voicing a project is somewhat like driving a car. At first, there’s so much to think about: mirror, signal, clutch, handbrake, accelerate, clutch, gear-change, mirror, brake, gear-change, etc., etc. All of this seems like a lot to remember. And it is. But how many experienced drivers out there consciously think about any of these functions now?

When you first audition for voiceover work, or when you’ve already auditioned and you get your first gig, every piece of direction you receive is just another function you have to think about. You might be asked to watch your pace because you’re beginning to read too quickly. But then you forget to watch your inflection and your read has become a tad monotone. So now you try to watch both inflection and pace, but now you’re not thinking at all about what you’re actually saying. This is completely acceptable when you’re starting out in this business, so don’t be too hard on yourself.

At the audition stage, I soon know which VOs show potential and are worth investing in and I try to give them the space to learn. Once that VO has passed the audition stage and their new demo wins them their first gig, I take my time with them in the first couple of paid sessions. I do this because voiceover is more complicated than people might realise at first. Even experienced film and theatre actors who come into the studio are often surprised at how difficult it can be to get into the mindset of what’s required for voiceover work. But as I said, it’s like driving a car. At first, you have so much to concentrate on in order to deliver a good read. After a couple of sessions, though, your subconscious begins to take over these tasks, allowing you to concentrate on performing. And as with driving a car, it’s at this stage that you can embrace the discipline with confidence and really start to enjoy the work.

Before we get into the specifics of the sorts of challenges you might come across in eLearning scripts, I recommend that you listen closely to narration reads. If you’re a voiceover artist, you’re lucky in that examples of the type of work you do are all around you and readily accessible. Turn on the television or the radio and you’ll hear people narrating, be it in documentaries, news reports or instructional programmes. And you don’t even have to trawl TV and radio stations looking for examples; just go onto YouTube or Vimeo and you’ll find lots of examples of professional VO reads. Until you’ve at least recorded an audition or a demo, you might not know what to listen out for, so if you’re in that category, I recommend you take note of the following:

The VO’s pace — is it fast or slow? Can you digest what they’re saying?

The VO’s diction — do they sound neutral, or do they have a strong, regional accent? Are they slurring any vowels or consonants? Are they having to slow down over difficult phrases?

The VO’s inflection — is it rising and falling in an engaging manner? Is it rising or falling too much, or in a manner that doesn’t seem to fit the topic? Are they accentuating the important words in each line?

The VO’s engagement with the piece — do they ‘smile’ at the right places and generally sound like they know what they’re talking about?

So, what is an eLearning read? Well, for the uninitiated, eLearning is ‘electronic learning’, or training that is conducted via electronic media, often over the Internet or across a network. eLearning isn’t exclusive to any particular group of subjects; it’s used by a vast number of corporations and other organisations to educate their staff on matters as diverse as induction (for new hires) and highly technical topics (for perhaps the more experienced among their staff). In my experience, the most common themes for eLearning courses are finance, pharmacy and telecoms, though telecoms probably less so in recent years. Why is it important to know this? Well, it’s good to know that what you’ll be reading will often be highly technical, that you’ll encounter some very long, comma-less sentences, and that you’ll discover phrases you’ve probably never heard before, using words you’ve never spoken before. So it can be quite a mouthful. But you’ll be provided (or you should be) with a pronunciation guide to help you pronounce new terms and you can do some retakes to get it all right. So don’t fret!

So let’s begin. Typically, your audio script will contain two or three columns. One of these columns will contain filenames. This is just for the audio editor so that they know what to call each audio clip as they edit it. Another column might contain notes; these could be pronunciation guides written phonetically so that you know how to pronounce certain words or brand names, or they could be directorial notes explaining certain characters’ moods in a given scene so that you know how to voice them. The third and most important column will be the body text, i.e. the material you have to read.

eLearning scripts often begin with a welcome message, so think about how you want this to sound. Think about how you would like to be welcomed to a hotel, a conference or a lecture. Chances are, this is going to be one of the warmest lines in your session and it can also set the tone for the rest of your read. This is a good time to mention that listeners can hear when you smile, so if you think you’re not sounding warm enough, physically smile at the mic. Go on, I know you might feel a little self-conscious at first, but it will empower your delivery. And try to punctuate the important words in the line.

Welcome to the Paramedics’ Association’s First Aid Primer training course.

In the example shown above, I’ve formatted the course title in bold type. This is to bind all the words of this noun together; these words need to be read together without any pauses between them. I would also experiment with the word ‘welcome’. Read the line in one continuous flow at first. Then read it again, but place a very short pause (what I’d call a ‘beat’) after ‘welcome’ and almost resolve your intonation on the word. Can you hear how it adds gravitas? Also, try a minuscule beat after ‘Primer’ to separate the title from the rest of the sentence. You’ll get a handle on what sounds right and you’ll learn to trust your own judgement (if no one’s directing you during the session). Above all, read it in a way that sounds right to you. Just remember that it’s a welcome message so it needs to sound welcoming. There are exceptions, however. If the course is about, say, bereavement counselling, or the threat of criminal prosecution for breaking compliance laws, pull back on the warmth a little. For these, I think using inflection alone without too much warmth is appropriate.

While we’re at the start of the read, it’s important to make a decision on pace. The first thing to remember is: this is training. You don’t want to blurt out your lines at a rate of knots such that the listener can’t digest the information. Documentaries are good examples of pace because they’re often teaching a specific subject to a wide audience, so there’s no assumption that that audience is in any way an expert on the topic discussed. On the other hand, you don’t want to read as slowly as you’d read to a pre-school child, either. You’ll know the rate when you read it; you should find it a relaxing pace and you shouldn’t be tripping up over the words. This pace will also help you with inflection as it will give you time to see what’s coming up (this is partly to do with sight-reading, which we’ll deal with in the next paragraph). As with many aspects of reading, there are exceptions. In my experience, if the subject is technical, but the course is targeted at experienced engineers, you’re usually required to pick up the pace a little, but just a little. This audience already knows their stuff so they can digest the information at a slightly higher rate than usual. Conversely, companies who commission corporate induction courses for new hires, for example, tend to want a slightly slower read than normal. So be mindful of this. Ideally, speak to your client before recording to establish both the pace and the tone of the read. But don’t worry if this isn’t possible; you can always make a judgement yourself by assessing the subject matter and the audience at which it’s targeted.

Having auditioned and hired many, many voiceover artists over the years, I believe you need to have three primary skills as a VO: a good voice, an ability to tell a story, and good sight-reading abilities. These are all important, but I can’t emphasise the last one enough. You need to be able to sight-read. Why? Well, if you’re an actor, you get the script in advance and you learn your lines for the play or movie. If you do adverts, you might get the script in advance and it’s usually at most a 60-second piece so you become familiar with the text very quickly. But eLearning can be tens (or even hundreds) of pages of very technical content. You won’t be expected to read over it beforehand and in truth, the script might not be signed off until minutes before your recording session begins, so you won’t have the opportunity anyway. Add to that the often difficult terminology that appears in eLearning scripts and you have quite a challenge on your hands. If your sight-reading is poor, you may stumble your way through the script and find that it takes twice as long to read it as even the most generous of production metrics would allow. This might sound harsh, but no matter how good your voice sounds, if your sight-reading is poor enough to cost studio time, chances are, you won’t be called back. I only say this to convey how important it is to work on this skill if you feel it’s an issue for you.

It’s beyond the scope of this article to make a good sight-reader of you and I suggest you search the Internet for tips and techniques on how to improve in this regard. However, if I can impart one valuable tip, it’s this: when reading, don’t just look at each word as you come to it. If you do this, your brain has no time to see what’s coming up and ends up scrambling to pronounce the words correctly and adopt the right tone. So to provide a buffer or cache for your brain, allow your eyes to dart ahead of the words you’re reading. This may in itself sound conducive to misreads and if you’re not used to doing it, it may feel a little awkward at first, but practise reading this way. The idea is that while you’re reading the first word of a sentence, your eye is flitting across to the fourth and fifth words and effectively loading them into your subconscious buffer. This way, your brain is telling you what is coming up, reducing the likelihood of misreads and helping you inflect the entire sentence correctly. Trust me: if you practise this method, your sight-reading will improve and you will render yourself a much more attractive option for clients and producers.

Sometimes your script might contain strange terms without including a pronunciation note. These terms could be brand names, Latin medical terms, or even just model numbers that need to be read a certain way. Whenever I’m unsure of a pronunciation, I usually try to contact the Instructional Designer (ID) who wrote the script or the course’s Project Manager. I try to add them all to my Skype contacts list so that contacting them is quick and simple and doesn’t delay the session too much. Alternatively, you could call them if you have their phone numbers. If all else fails, try to include alt-takes to cover the various possible pronunciations. However, if the term appears regularly throughout the entire script, you can’t be expected to do two or three alts for every instance, so make your best call on what you think it should be and run with that. If the term in question only appears once or twice, maybe provide alts in each case. Above all, be kind to your clients. They don’t deliberately overlook these things and they’re often doing their best to give you all the information you need. If you feel that certain directions are missing on a regular basis, talk to the client after the session and explain what you need. In the long run, it will save them on retakes and they’ll be grateful to you for that. Ultimately, if you’re attentive to your clients’ needs and help them out, they’ll see you as an important member of their team and come back to you time and time again. So include alt-takes, but use your own judgement on whether you think they will affect the duration of the session or not.

Here’s another issue. When I was younger, I remember that the British television channel ITV ran a dating programme called ‘Blind Date’. Even at the tender age of 16, I used to find it strange how the compère would announce the show by making a clear distinction between the ‘d’ in ‘blind’ and the ‘d’ in ‘date’. I understand he was being very careful with his diction, speaking what might be termed ‘proper English’, but it sounded odd, as though he was introducing a show called ‘blyne-did-ate’. For this reason (and it is just my opinion), I never ask VOs to read this way in the studio. If it doesn’t sound natural, why would you want to speak that way anyway? Strangely, this is my segue into talking about diction. Diction is very important, but if two consonants come together like that, consider just hitting one of them if it sounds unnatural to hit both. Yes, clients want good diction, but do they want good diction at the expense of a natural-sounding read? In all my years in eLearning, no client has ever complained when we ran two consonants together like this.

Another issue on this topic is trailing ‘t’s. When I started out recording voiceover, I noticed that US talent would often soften their trailing ‘t’s so much that sometimes, they weren’t audible at all. I used to worry about this and direct VOs to hit their ‘t’s. However, I’ve discovered over the years that different global regions have different demands in this regard. I remember one US client who said that he felt the ‘t’s in his audio were too hard, so I realised that asking US VOs to hit those trailing ‘t’s wasn’t always natural. We here in Ireland have trailing ‘t’s of our own which actually sound almost like an ‘ish’. So instead of saying ‘right’, we tend to say ‘ryshe’. You’ll hear this on radio and television here, so it’s widely and officially accepted as being part of the way we speak, but I know for a fact it annoys some seasoned Irish broadcasters. British ‘t’s are sharper and more definite. The upshot of this is that while I’d forgive a US VO a dropped trailing ‘t’, I’d ask a British VO for a retake. Different geographical regions; different demands.

One other thing I’m not fond of is the recent tendency by British broadcasters to pronounce ‘fifth’ as ‘fith’ and ‘sixth’ as ‘sickth’. I’m not sure where this came from. I wouldn’t have considered these words to have been so difficult to pronounce that they required an easier alternative. In fact, the English language arguably contains thousands of words that are far more difficult to pronounce and which don’t come with alternatives. I read one letter of complaint in a British publication from a television viewer who was mystified by its prevalence. If voiceover is what you do, I don’t think ‘sixth’ and ‘fifth’ are going to overly challenge your skills, so hopefully this won’t be an issue for you. However, this is just a personal pet hate of mine, so feel free to ignore my take on it.

One thing you’re sure to encounter in eLearning is the appearance of lengthy sentences. With any luck, they’ll be well punctuated. However, this isn’t always the case. Sometimes it’s because whole sections were cut-and-pasted from legal documents and the original contained few or no commas. It’s very important that you don’t sound out of breath either at the end of your sentences or at the end of your phrases. If you find that you’re struggling to get through a phrase in one breath, chances are, your listener can hear it and they might even feel uncomfortable listening to you. The first thing I do here is go through the sentence and see where I can add commas. These commas might not work grammatically, but that’s OK; they’re only temporary markers to indicate to the VO how the sentence may be phrased and therefore where breaths can be taken. Sometimes, I’ll even add commas before an ‘and’ because a VO might overlook the fact they can take a breath there and might try to flow it all together. Read the line to yourself and try to gauge how it should be broken up into phrases, then place a comma between each phrase.

Occasionally, you will find phrases that are long, but you can’t find any logical point at which you can take a breath. In this case, you might just have to take a deep breath to get through the section in one go. But be careful. If you take in a huge amount of air, hold it and then release it like a bursting dam, sometimes the listener will hear that cascade of breath rushing out in your voice like a sigh. To counter this, release your breath in a controlled manner. Don’t hold your breath, as this will be evident, too. If you’ve taken a really deep breath, let the first little bit go before you start reading, just to release the pressure, then let your breath go in a steady, controlled manner as you speak the line and you should find that you get to the end of the phrase without running out of air, or sounding like you have. It might take a few takes to get right, but that’s perfectly fine. When you get it just right, it will have been worth the investment of your time.

This concludes part 2 of this article on recording voiceover for eLearning. In part 3, we’ll explore more ways in which you can help to improve your read.

Paul Fegan

Paul Fegan is a sound engineer and the owner of a studio in Dublin, Ireland, specialising in voiceover recording. Paul has been engineering, directing and post-producing voiceover audio for various sectors since the 1990s and supplies audio to a number of customers worldwide.