Word Division and Clitics

Word is the smallest of the LINGUISTIC UNITS which can occur on its own in speech or writing. It is difficult to apply this criterion consistently. For example, can a FUNCTION WORD like the occur on its own? Is a CONTRACTION like can’t (“can not”) one word or two? Nevertheless, there is evidence that NATIVE SPEAKERs of a language tend to agree on what are the words of their language.

       In writing, word boundaries are usually recognized by spaces between the words. In speech, word boundaries may be recognized by slight pauses. (Longman Dictionary)

Three different senses of the word “word”:


Senses a and b are concerned with words as objects within a grammar. This chapter is concerned with clarifying sense c, that is, with finding word boundaries in specific sentences. We will be asking the following types of questions:

(2)   a. How do we analyze the boundaries between words in a sentence?

      b. How do we decide if a morpheme is a word or an affix? Is it bound or free?

      c. How many terminal nodes do we need in a tree? How do we distribute the phonological material of a sentence among them?

      d. What do we treat as syntax and what as morphology?

      e. Where do we write spaces in a practical orthography?

There are a few guidelines, both syntactic and phonological, to use in answering these questions. In most cases analyzing word boundaries is straightforward, but sometimes the syntactic guidelines suggest different word breaks than the phonological ones. Thus, we must distinguish PHONOLOGICAL WORD BOUNDARIES and SYNTACTIC WORD BOUNDARIES; these are discussed in the next two sections.


The traditional conception of word boundaries is phonological—a word in sense (1c) is a minimal utterance.

Boundaries are divisions between linguistic units.

There are different types of boundaries. For example, boundaries may be

between words, e.g. the##child

between the parts of a word such as STEM and AFFIX, e.g. kind#ness

between SYLLABLES, e.g. /beI + bi/ baby

(3)   Phonological guidelines #1

        No utterance can be shorter than a single word.

If native speakers do not recognize a morpheme out of context or would never pronounce it by itself, then it is not a word. If they do recognize it and can pronounce it by itself, it probably is. For example, the minimal reply to a question is a single word, not part of a word.

(4)   How many children do you have?

          Possible answers:               Impossible answers:

          Nineteen                           *-teen

          Lots                                 *-s

When you learn a word in isolation (which is normally what happens),  this guideline alone can give you a pretty good idea about word boundaries. When you hear the same word in a sentence, a reasonable hypothesis is that there are word boundaries on either side of it (allowing for any affixes, of course).

      A second phonological guideline is that pauses are generally impossible inside words.

(5)   Phonological guidelines #2

        Pauses are only possible at word boundaries.

If there is a pause at a certain point in a sentence or if a pause is possible there, then that point is probably a phonological word boundary. For example, consider how an irritated elementary school teacher might speak to the class, pausing for effect between each word, but not within words.

(6)   a.  [boIz… ænd… gɜ:lz]           ‘boys and girls’

      b.  *[boI… z… ænd… gɜ:l… z]

The problem with (6b) is that the plural suffixes, which are bound morphemes, have been pronounced as separate words.

       Once you begin to apply these guidelines in a language, you may find phonological rules whose behavior is affected by word boundaries. You can use them as additional diagnostic tests.

(7)    Phonological guidelines #3

         Look for phonological rules that provide information about word boundaries, then use them to help define word boundaries in unclear cases.

For example, the rules in English which account for the variation in voicing of the plural suffix –s do not operate across word boundaries. There is good reason to believe that the underlying form of this suffix is /-z/. There is a phonological rule that causes /z/ to devoice after a voiceless segment (as in bats), but a /z/ which begins a separate word (as in the bat zoomed) does not devoice in that environment.

(8)                                        ‘bats’          ‘the bat zoomed’

          Input to phonological rules:     bæt-z           ðə bæt  zumd

          Output of phonological rules:    bæt-s           ðə bæt  zumd

So, if you find a /z/ at the beginning of some morpheme which sometimes becomes [s] because of this rule, then the rule is treating the morpheme as a suffix, not a separate word. That is, the rule provides evidence that the morpheme is a suffix.
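The way the devoicing rule in (8) respects word boundaries can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name apply_devoicing, the representation of a sentence as a list of words (each word a list of morphemes), and the small set of voiceless segments are all hypothetical choices made here, not part of the analysis itself. The sketch applies devoicing only between a stem and a suffix inside the same word, never across words.

```python
# Hypothetical, simplified inventory of voiceless segments (an assumption for illustration).
VOICELESS = set("ptkfθsʃ")

def apply_devoicing(sentence):
    """Return the surface form of each word in the sentence.

    `sentence` is a list of words; each word is a list of morphemes
    (stem first, then suffixes). A suffix-initial /z/ devoices to [s]
    after a voiceless segment, but only inside a single word.
    """
    surface_words = []
    for word in sentence:
        surface = word[0]  # the stem is copied unchanged
        for suffix in word[1:]:
            # The rule sees only material inside the current word.
            if suffix.startswith("z") and surface[-1] in VOICELESS:
                suffix = "s" + suffix[1:]
            surface += suffix
        surface_words.append(surface)
    return surface_words

# 'bats': /z/ is a suffix, so the rule can apply.
print(apply_devoicing([["bæt", "z"]]))
# 'the bat zoomed': /z/ begins a separate word, so the rule cannot apply.
print(apply_devoicing([["ðə"], ["bæt"], ["zumd"]]))
```

Because the rule never looks past the edge of a word, treating a morpheme as a suffix or as a separate word gives different predictions, and that difference is what makes the rule usable as a diagnostic.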


       When we turn to syntactic word boundaries, we are essentially trying to determine what to put on separate terminal nodes in a tree. Our first hypotheses about syntactic word boundaries are based on what we know about phonology.

(11)   Syntactic guidelines #1

         Any phonological word break is generally also a syntactic word break.

So, once we have decided that there is a phonological word boundary before zoomed in the bat zoomed, we are largely committed to recognizing a syntactic word boundary there, too, and putting zoomed on a separate terminal node. (Warning: the opposite is not true, as discussed below; some syntactic word breaks are not phonological ones.)

       But there are purely syntactic reasons for making syntactic word breaks too. One of the strongest is based on the overall structure of the tree.

(12)   Syntactic guidelines #2

         Any major constituent break (e.g., the beginning or end of a phrase) is also a syntactic word break.

       In (13), for example, there is a phrase boundary between almost every pair of morphemes, and hence also a syntactic word boundary.

(13)   [NP the [QP two] [AP [DegP very] little] dogs]

       Other guidelines may also be helpful in deciding whether a morpheme is an affix or a separate syntactic word.

(14)   Syntactic guidelines #3

        Affixes tend to occur next to only a single type of word (their stems) and in a fixed order; words occur more freely in various combinations with each other.

 This means words can often be moved with respect to each other; this is usually not possible for morphemes within a word.

(15)   a. I see the tiny, little people down on the ground.

       b. I see the little, tiny people down there on the ground.

(16)   a. teach-er-s

       b. *teach-s-er

It also means a syntactic word or phrase can usually be inserted between two other morphemes only at syntactic word breaks; trying to insert a word between a stem and an affix usually results in the affix being attached to the wrong type of word. So, for example, to find out if the English plural marker –(e)s is a word or an affix, we could try inserting adjectives (which we already know to be separate syntactic words) and PPs between the noun and the plural marker.

(17)   a.  the dog-s

       b.  *the dog-big-s

       c.  *the dog [in the manger]-s

This doesn’t work, so according to both phonological and syntactic guidelines, –(e)s is bound. Finally, one can often tell the difference between words and affixes by sniffing out the irregular forms in the language.

(18)   Syntactic guidelines #4

         Morphology often shows great irregularities, while combinations of separate syntactic words do not.

For example, there are irregular noun plurals like oxen in place of *oxes, but there are not any irregular combinations of the definite article the with particular nouns. So by this guideline, the plural –(e)s is an affix and the is probably not (although we can’t tell for sure, because it might be a perfectly regular affix).


A clitic is a grammatical form which cannot stand on its own in an utterance. It needs to co-occur with another form which either precedes or follows it.

Some languages have clitic pronoun forms which are attached to the verb. In English, n’t, the contracted form of not in couldn’t, isn’t, and don’t, can be considered a clitic.

(Longman Dictionary)

Word breaks are not always obvious. Sometimes the different guidelines don’t even agree; some guidelines suggest that a morpheme is bound, while others may suggest that it is free. If it is unclear whether a morpheme is a word or an affix, it is generally called a CLITIC. In some ways (especially by phonological guidelines), clitics are like affixes; in other ways (especially by syntactic guidelines), they are like words.

      As examples, consider two homophonous morphemes in English, both of which are spelled ’s:

(20)   Contracted ’s (from is)

       a.  hu-z gɔn                         Who’s gone?
       b.  hwət-s ðæt                       What’s that?
       c.  hwitʃ-iz jɔ:(r)z                 Which’s yours?

       Possessive ’s

       d.  ðə mæn-z əˈpinjən                the man’s opinion
       e.  ði kæt in ðə hæt-s əˈpinjən      the cat in the hat’s opinion
       f.  ðə pitʃ-iz pit                   the peach’s pit

Should we analyze them as affixes or words? Phonological guidelines suggest that both are affixes.

(21)   a. They are never pronounced in isolation.

       b. They cannot be preceded by a pause.

       c. They do not contain vowels (at least in some environments), unlike clear cases of words.

       d. They undergo devoicing, like the plural suffix in (8).

Syntactic guidelines suggest that they are separate words.

(22)   a. Contracted ’s functions as the verb in a clause (as a contraction of is or has).

       b. Possessive ’s always attaches to the end of a noun phrase, not always to the head noun (look again at [20e]).

       c. Both can attach to a variety of word types (see [24] below).

       d. Contracted ’s is completely regular; there is no stem suppletion.

On the other hand, at least one syntactic guideline suggests that possessive ’s is an affix.

(23)  There are irregular possessive forms of pronouns, like my, which are found in place of regular combinations like *I’s.

These morphemes are not clearly words or affixes, but have some characteristics of each. They are PHONOLOGICALLY BOUND like affixes and (at least partially) SYNTACTICALLY FREE like words. So, in order to have some label for them until we figure out exactly what they are, we call them clitics.
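The three-way distinction reached here can be summarized as a small decision sketch in Python. It is a deliberate simplification: the function name classify and the reduction of each set of guidelines to a single yes/no judgment are assumptions made for illustration only. In practice the guidelines can disagree among themselves, as (22) and (23) show for possessive ’s.

```python
def classify(phonologically_free: bool, syntactically_free: bool) -> str:
    """Label a morpheme from two summary judgments (hypothetical simplification)."""
    if phonologically_free and syntactically_free:
        return "word"
    if not phonologically_free and not syntactically_free:
        return "affix"
    if syntactically_free:
        # Phonologically bound but syntactically free.
        return "clitic"
    # Phonologically free but syntactically bound: not discussed in this chapter.
    return "unclear"

# Judgments taken from the discussion above.
print(classify(True, True))    # zoomed in (8)
print(classify(False, False))  # plural -(e)s in (17)
print(classify(False, True))   # contracted 's in (20)
```

The fourth combination is left as "unclear" rather than given a label, since the text above does not say how such a morpheme should be treated.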