Phonology involves two studies: the study of the production, transmission and reception of speech sounds, a discipline known as ‘phonetics’, and the study of the sounds and sound patterns of a specific language, a discipline known as ‘phonemics’.


When we speak we produce a stream of sound, which is extremely difficult to examine because it is continuous, rapid and soon gone. The linguist has therefore to find a way to break down the stream of speech so that the units may be studied and described accurately. In studying speech we divide this stream into small pieces that we call segments. The word ‘man’ is pronounced with a first segment [m], a second segment [æ] and a third segment [n]. It is not always easy to decide on the number of segments. To give a simple example, in the word ‘mine’ the first segment is [m] and the last is [n], as in the word ‘man’ discussed above. But should we regard the [ai] in the middle as one segment or two? We will return to this question.

Human beings are capable of producing an infinite number of sounds but no language uses more than a small proportion of this infinite set and no two human languages make use of exactly the same set of sounds. When we speak, there is continuous movement of such organs as the tongue, the velum (soft palate), the lips and the lungs. We put spaces between individual words in the written medium but there are no similar spaces in speech. Words are linked together in speech and are normally perceived by one who does not know the language (or by a machine) as an uninterrupted stream of sound. We shall, metaphorically, slow the process down as we examine the organs of speech and the types of sound that result from using different organs.


Linguists use a phonetic alphabet for the purpose of recording speech sounds in written or printed form. A phonetic alphabet is based on the principle of one letter per sound, so that people know which sound we are referring to when we use a certain letter. Such an alphabet provides a quick and accurate way of writing down the pronunciation of individual words and of showing how sounds are used in connected speech. It must be remembered, however, that a phonetic alphabet does not teach sounds, nor is it necessary to use phonetic transcription in teaching pronunciation.

Many systems of phonetic transcription have been invented. The International Phonetic Association (IPA) transcription used here represents British pronunciation, since the discussion is based on Received Pronunciation (RP), while the Trager-Smith transcription represents American pronunciation.

Figure 1 shows the main organs of speech: the jaw, the lips, the teeth, the teeth ridge (usually called the alveolar ridge), the tongue, the hard palate, the soft palate (the velum), the uvula, the pharynx, the larynx and the vocal cords. The mobile organs are the lower jaw, the lips, the tongue, the velum, the uvula, the pharynx and the vocal cords and although it is possible to learn to move each of these at will, we have most control over the jaw, lips and tongue.


The organs of Speech and English consonants 
The tongue is so important in the production of speech sounds that, for ease of reference, it has been divided into four main areas: the tip, the blade (or lamina), the front and the back, as shown in Fig. 2.

Sounds could not occur without air. The air required for most sounds comes from the lungs and is thus egressive (‘going out’). Certain sounds in languages can, however, be made with air sucked in through the mouth. Such sounds are called ingressive (‘going in’). The sound of disgust in English, a click often written ‘Tch!’, is made on an ingressive air stream. Coming from the lungs, the air stream passes through the larynx, which is popularly referred to as the ‘Adam’s apple’. Inside the larynx are two folds of ligament and tissue which make up the vocal cords.


Sounds can be divided into two main types. A vowel is a sound that needs an open air passage in the mouth. The air passage can be modified in terms of shape, with different mouth and tongue shapes producing different vowels. A consonant is formed when the air stream is restricted or stopped at some point between the vocal cords and the lips. The central sound in the word ‘cat’ is a vowel. The first and the third sounds are consonants. More will be said about vowels and consonants in the course of this chapter but these rough definitions will serve our purpose temporarily.


The sounds of speech can be studied in three different ways. Acoustic phonetics is the study of how speech sounds are transmitted. Auditory phonetics is the study of how speech sounds are heard. Articulatory phonetics is the study of how speech sounds are produced by the human apparatus. This last approach to speech analysis is the one most useful for a language teacher, since he or she needs to know how individual sounds are made in order to help students produce the desired sounds.

In the production of speech sounds, the organs in the upper part of the mouth may be described as places or points of articulation and those in the lower part of the mouth as articulators. When we produce speech sounds, the airflow is interfered with by the articulators in the lower part of the mouth. The resulting opening is called the manner of articulation of the speech sound.


Just as each language uses a unique set of sounds from the total inventory of sounds capable of being made by humans, so too each group of speakers has a preferred pronunciation. In English, the most frequently used consonants are formed on or near the alveolar ridge; in French, the favored consonants are made against the teeth; whereas in India many sounds are made with the tip of the tongue curling towards the hard palate, thus producing the retroflex sounds so characteristic of Indian languages. The most frequently occurring sounds in a language help to determine the position of the jaw, tongue, lips and possibly even body stance when speaking. A speaker will always sound foreign in his or her pronunciation of a language if the articulatory setting of its native speakers has not been adopted.


The answers to the following four questions can tell us how the consonants are produced and also help us to classify or describe them.

  1. Are the vocal cords vibrating? The answer to this question tells us whether the sound is voiced or voiceless.
  2. What point of articulation is approached by the articulator? The answer gives the adjective in naming the consonant. For example, if the upper lip is approached by the lower lip, the sound is bilabial, e.g. [m, b]. If the upper teeth are approached by the lower lip, the sound is labiodental, e.g. [f, v].
  3. What is the manner of articulation? The answer supplies the noun in naming the consonant, e.g. stops, fricatives, affricates.
  4. Is the air issuing through the mouth or nose? The answer tells us whether it is an oral or a nasal sound. This may be taken with (3) as another manner of articulation as it supplies another noun in the naming of consonants, i.e. nasal.
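
The way the four answers combine into a name can be sketched in a few lines of Python. This is only an illustration, not a standard tool: the `ConsonantAnswers` structure and the `name_consonant` function are hypothetical names introduced here, and the labels are plain strings taken from the questions above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConsonantAnswers:
    """Answers to the four questions for one consonant (hypothetical structure)."""
    voiced: bool   # question 1: are the vocal cords vibrating?
    place: str     # question 2: point of articulation, e.g. "bilabial"
    manner: str    # question 3: manner of articulation, e.g. "stop"
    nasal: bool    # question 4: is the air issuing through the nose?


def name_consonant(answers: ConsonantAnswers) -> str:
    """Build a label: voicing, then the adjective (place), then the noun."""
    voicing = "voiced" if answers.voiced else "voiceless"
    # Following question 4, a nasal air stream supplies the noun
    # in place of the manner given under question 3.
    noun = "nasal" if answers.nasal else answers.manner
    return f"{voicing} {answers.place} {noun}"


print(name_consonant(ConsonantAnswers(True, "bilabial", "stop", False)))            # [b]
print(name_consonant(ConsonantAnswers(True, "bilabial", "stop", True)))             # [m]
print(name_consonant(ConsonantAnswers(False, "labiodental", "fricative", False)))   # [f]
```

The three calls label [b] as a voiced bilabial stop, [m] as a voiced bilabial nasal and [f] as a voiceless labiodental fricative.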

We will group the consonant sounds of English according to their manner of articulation in the following discussion.


The ears can judge sounds very precisely, distinguishing the pure resonance of a tuning fork from the buzzing sound of a bee or the sharp report of a gun. More important for speech, perhaps, we can also distinguish between the voiceless sounds like ‘p’ and ‘t’ in ‘pat’ and the voiced sounds like ‘b’ and ‘d’ in ‘bad’ or between the voiceless ‘p’ in ‘pat’ and the nasal ‘m’ in ‘mat’. Speech exploits all these abilities and many more and scholars have devised ways of classifying sounds according to the way they are made.

The first obstacle the air meets is the vocal cords. They may be held apart, in which case the sound will be voiceless, or brought together and made to vibrate, in which case the sound will be voiced. The vocal tract is a resonance chamber and different sounds can be produced by changing the shape of the chamber. If you study the various types of closure below, it will help you to describe the different types of sound you can make.

Plosives: These involve complete closure at some point in the mouth. Pressure builds up behind the closure and when the air is suddenly released a plosive is made. In English, three types of closure occur, resulting in three sets of plosives. The closure can be made by the two lips, producing the bilabial plosives /p/ and /b/; it can be made by the tongue pressing against the alveolar ridge, producing the alveolar plosives /t/ and /d/; and it can be made by the back of the tongue pressing against the soft palate, producing the velar plosives /k/ and /g/.

Fricatives: These sounds are the result of incomplete closure at some point in the mouth. The air escapes through a narrowed channel with audible friction. If you approximate the upper teeth to the lower lip and allow the air to escape you can produce the labio-dental fricatives /f/ and /v/. Again, if you approximate the tip of the tongue to the alveolar ridge, you can produce the alveolar fricatives /s/ and /z/.

Trills: These involve intermittent closure. Sounds can be produced by tapping the tongue repeatedly against a point of contact. If you roll the /r/ at the beginning of a word, saying:

      …  r.r.r.roaming  …

you are tapping the curled front of the tongue against the alveolar ridge, producing a trill which is, for example, characteristic of some Scottish pronunciations of English.

Laterals: These sounds involve partial closure in the mouth. The air stream is blocked by the tip of the tongue but allowed to escape around the sides of the tongue. In English, the initial /l/ sound in ‘light’ is a lateral; so is the final sound in ‘full’.

Nasals: These sounds involve the complete closure of the mouth. The velum is lowered, diverting the air through the nose. In English, the vocal cords vibrate in the production of nasals and so English nasals are voiced. The three nasals in English are /m/ as in ‘mat’, /n/ as in ‘no’ and /ŋ/ as in ‘sing’.

Affricates: Affricates are a combination of sounds. Initially there is complete closure as for a plosive. This is then followed by a slow release with friction, as for a fricative. The sound at the beginning of ‘chop’ is a voiceless affricate represented by the symbol /tʃ/. We make the closure as for /t/ and then release the air slowly. The sound at the beginning and end of ‘judge’ is a voiced affricate, represented by the symbol /ʤ/.

Semi-vowels: The sounds that begin the words ‘you’ and ‘wet’ are made without closure in the mouth. To this extent, they are vowel-like. They normally occur at the beginning of a word or syllable, however, and thus behave functionally like consonants. The semi-vowels are represented by the symbols /j/ and /w/.

All sounds can be subdivided into continuants, that is, sounds which can be continued as long as one has breath: vowels, fricatives, laterals, trills, frictionless continuants; and non-continuants, that is, sounds which one cannot prolong: plosives, affricates and semi-vowels.
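
The division into continuants and non-continuants can be sketched as a simple lookup. The function name `is_continuant` is hypothetical, and the two sets contain only the classes named in the sentence above. That sentence does not place nasals in either group, so the sketch raises an error for them rather than guessing.

```python
# Manner classes listed above as continuants and as non-continuants.
CONTINUANTS = {"vowel", "fricative", "lateral", "trill", "frictionless continuant"}
NON_CONTINUANTS = {"plosive", "affricate", "semi-vowel"}


def is_continuant(manner: str) -> bool:
    """Return True if the sound class is listed above as one that can be prolonged."""
    if manner in CONTINUANTS:
        return True
    if manner in NON_CONTINUANTS:
        return False
    # Classes not mentioned in the list (for example "nasal") are left unclassified.
    raise ValueError(f"manner not classified in the list above: {manner!r}")


print(is_continuant("fricative"))
print(is_continuant("plosive"))
```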


The eight commonest places of articulation are:

Bilabial: Where the lips come together as in the sounds /p/, /b/ and /m/.

Labiodental: Where the lower lip and the upper teeth come together, as for the sounds /f/ and /v/.

Dental: Where the tip or the blade of the tongue comes in contact with the upper teeth as in the pronunciation of the initial sounds in ‘thief’ and ‘then’, represented by the symbols /θ/ and /ð/.

Alveolar: Where the tip or blade of the tongue touches the alveolar ridge which is directly behind the upper teeth. In English, the sounds made in the alveolar region predominate in the language. By this we mean that the most frequently occurring consonants /t, d, s, z, n, l, r/ are all made by approximating the tongue to the alveolar ridge.

Palato-alveolar: As the name suggests, there are two points of contact for these sounds. The tip of the tongue is close to the alveolar ridge while the front of the tongue is concave to the roof of the mouth. In English, there are four palato-alveolar sounds: the affricates /tʃ/ and /ʤ/, and the fricatives /ʃ/ and /ʒ/, the sounds that occur, respectively, at the beginning of the word ‘shut’ and in the middle of the word ‘measure’.

Palatal: For palatal sounds, the front of the tongue approximates to the hard palate. It is possible to have palatal plosives, fricatives, laterals and nasals, but in English the only palatal is the voiced semi-vowel /j/ as in ‘you’.

Velar: For velars, the back of the tongue approximates to the soft palate. As with other points of contact, several types of sound can be made here. In English there are four consonants made in the velar region: the plosives /k, g/, the nasal /ŋ/ and the voiced semi-vowel /w/ as in ‘woo’.

Uvular, pharyngeal and glottal sounds occur frequently in world languages. They are not, however, significant in English and so will not be described in detail.
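
The English consonants listed under each place of articulation above can be gathered into a small lookup table. The sketch below is an illustrative arrangement only: the names `PLACE_INVENTORY`, `PLACE_OF` and `place_of` are hypothetical, and the symbols are written in IPA.

```python
# English consonants grouped by place of articulation, as listed above.
PLACE_INVENTORY = {
    "bilabial": ["p", "b", "m"],
    "labiodental": ["f", "v"],
    "dental": ["θ", "ð"],
    "alveolar": ["t", "d", "s", "z", "n", "l", "r"],
    "palato-alveolar": ["tʃ", "ʤ", "ʃ", "ʒ"],
    "palatal": ["j"],
    "velar": ["k", "g", "ŋ", "w"],
}

# Invert the table so that a symbol can be looked up directly.
PLACE_OF = {
    symbol: place
    for place, symbols in PLACE_INVENTORY.items()
    for symbol in symbols
}


def place_of(symbol: str) -> str:
    """Return the place of articulation recorded for an English consonant."""
    return PLACE_OF[symbol]


print(place_of("ŋ"))
print(place_of("ʃ"))
```

Keeping the table keyed by place mirrors the order of the list above, while the inverted dictionary serves the more common question of where a given sound is made.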


Since vowels are produced with free passage of the air stream, they are less easy to describe and classify than consonants. The two articulatory organs to be considered are the tongue and lips, for these two organs can mould and change the shape of the vocal tract by their movements in the production of vowels. It is the general shape of the vocal tract that gives the distinctive quality of sound of any vowel.


The symbols for the cardinal vowels and their placing on the vowel chart are shown in Figure 3. The vowels [i] and [ɑ] were chosen first as representing the closest front and the openest back vowel respectively. Then [e, ɛ, a] were determined auditorily to occupy positions at equal intervals from each other. The same procedure was used in choosing [ɒ, o, u]. These are eight vowels of fixed quality to which we can compare any new vowel. By listening to the cardinal vowels we can soon tell, for example, whether the new vowel is half-way between [i] and [e] or one-quarter of the way from [ɛ] to [ɒ]. The vowels of English have been plotted on to the diagram in Figure 3 to show their relationship to the cardinal vowels.

Fig. 1.3    The cardinal vowels with the English vowels represented by crosses.

It may prove useful to offer a summary to guide the reader in the techniques to select in describing sounds. In describing a vowel it is important to state:

(1)      the length of the vowel, that is, whether it is long or short.

(2)      whether the vowel is oral or nasal* (*All English vowels are oral)

(3)      the highest point of the tongue

(4)      the degree of closeness

(5)      the shape of the lips

Thus the vowel sound in ‘tree’ would be classified as a long, oral, front, close, unrounded vowel. The vowel in ‘doom’ would be a long, oral, back, close, rounded vowel. It is well to remember that when the front of the tongue is raised towards the hard palate we have a front vowel. When the back of the tongue is raised towards the soft palate, we have a back vowel. If the centre of the tongue is raised towards the juncture between the hard and soft palates, then we have a central vowel. The vowel sound in the word ‘the’ is a central vowel and would be described as short, oral, central, half-open, with neutrally spread lips.
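
The five-point description of a vowel can be sketched as a small record. The `Vowel` class and its field names are hypothetical, chosen here to follow the five points listed above, and the two entries repeat the classifications just given for ‘tree’ and ‘doom’.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vowel:
    """One vowel described by the five points listed above (hypothetical structure)."""
    length: str       # (1) "long" or "short"
    nasality: str     # (2) "oral" or "nasal"
    tongue_part: str  # (3) part of the tongue that is highest: "front", "central", "back"
    closeness: str    # (4) degree of closeness, e.g. "close" or "half-open"
    lips: str         # (5) shape of the lips, e.g. "rounded" or "unrounded"

    def describe(self) -> str:
        """Join the five labels in the order used in the text."""
        return (f"{self.length}, {self.nasality}, {self.tongue_part}, "
                f"{self.closeness}, {self.lips} vowel")


EXAMPLES = {
    "tree": Vowel("long", "oral", "front", "close", "unrounded"),
    "doom": Vowel("long", "oral", "back", "close", "rounded"),
}

for word, vowel in EXAMPLES.items():
    print(f"{word}: {vowel.describe()}")
```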

       In describing consonants, one should state:

(1)      the type  of air stream used (in English all speech sounds are made on an egressive air stream although certain sounds of disgust and annoyance are made on an ingressive air stream)

(2)      the position of the vocal cords (apart for voiceless sounds, approximated and vibrating for voiced sounds)

(3)      the position of the velum (raised for oral sounds, lowered for nasal; that is, we must state whether a consonant is oral or nasal)

(4)      the manner of articulation (for example plosive, fricative, and so on)

(5)      the place of articulation (for example bilabial, alveolar and so on)

Thus, if we were asked to describe the initial sound in ‘buy’ and the final sound in ‘tin’ we would say that /b/ is made on an egressive air stream and is voiced, oral, plosive and bilabial, and that /n/ is also uttered on an egressive air stream and is voiced, nasal and alveolar.


In addition to finding the consonant and vowel segments (the segmental phonemes), the linguist must also identify the suprasegmental phonemes used in a language system. They include things like pitch, stress, intonation and juncture. They are called “suprasegmental” because they can occur only with the segmental phonemes; they are imposed on the segmental phonemes. Basically, the method used is the same as that employed in looking for the segmental phonemes: the linguist asks whether a certain feature contrasts with another and whether the contrast exists in minimal pairs. The analysis of suprasegmental phonemes is more complicated than that of segmental phonemes, and linguists tend to differ in their analyses.

PITCH differences may result in differences of meaning at the word level. In tone languages like Thai and Chinese, the height and/or direction (up or down, in contrast with level) of pitch can distinguish words. In Chinese, for example, there are four tones which can distinguish words. If you say /tʃu/ with a high level pitch it means “pig”, whereas with a falling-rising pitch it means “lord”. A beginner in Chinese may therefore wrongly say “I praise the pig” when he means “I praise the lord”. Pitch is therefore phonemic in Chinese, because it can distinguish between pairs of words like the above.

STRESS is the degree of loudness given to some syllables in relation to others. The significance of stress at the word level can be illustrated by its use in English. Minimal pairs like the following show that stress is phonemic in English.

Phonemic Transcription          Noun            Verb
/insens/                        íncense         incénse
/pəmit/                         pérmit          permít
/insʌlt/                        ínsult          insúlt
/ridʒekt/                       réject          rejéct

In each of the above pairs the noun is stressed on the first syllable and the verb on the second; it is the difference in stress that makes the difference in meaning.
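
The test for a stress-based minimal pair can be sketched as follows. It assumes the usual pattern in these English pairs, in which the noun carries the main stress on the first syllable and the verb on the second. The `Word` record and the function `differ_only_in_stress` are hypothetical names used only for illustration.

```python
from typing import NamedTuple


class Word(NamedTuple):
    segments: str           # phonemic transcription without stress marks
    stressed_syllable: int  # 1-based position of the syllable carrying main stress


def differ_only_in_stress(a: Word, b: Word) -> bool:
    """True when two words share all segments and differ only in stress placement."""
    return a.segments == b.segments and a.stressed_syllable != b.stressed_syllable


# Assumed stress placement: noun on the first syllable, verb on the second.
permit_noun = Word("pəmit", 1)
permit_verb = Word("pəmit", 2)
insult_noun = Word("insʌlt", 1)

print(differ_only_in_stress(permit_noun, permit_verb))  # same segments, different stress
print(differ_only_in_stress(permit_noun, insult_noun))  # segments differ as well
```

The first comparison is a minimal pair for stress; the second is not, because the segments also differ.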

It is usual to describe English as having four degrees of stress, but for the purpose of teaching English as a second language there is a simpler analysis. According to this analysis, there are two kinds of stress: fixed stress and variable stress. Words of more than one syllable have fixed stress while monosyllabic words have variable stress.