Phonotactic constraints

Tom eats nuts      ten
*Tom nuts eats     *tne
*nuts eats Tom     net
nuts Tom eats      *nte
*eats Tom nuts     *etn
*eats nuts Tom     ent

Just as some combinations of words are possible sentences, while others are not, some combinations of sounds are possible words, others are not. Three words or three sounds can combine in six different ways. Possible combinations are shown in green, impossible ones in pink. Here we are only concerned with the column on the right. Native speakers of English know not only that there does not happen to be a word tne, but also that there could not be such a word in English, since plosive+nasal clusters do not occur at the beginning (or end) of any word in this language. This would violate the phonotactic constraints of English.
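
The six orderings in the right-hand column can also be generated mechanically. The following Python sketch is only an illustration: the sound classes are cut down to what this example needs, the names is_possible and orderings are invented here, and the ban on word-initial nasal+plosive clusters is an assumption read off the starred form nte rather than a constraint stated in the text.

```python
from itertools import permutations

# Toy sound classes, one character per sound (an assumption of this sketch).
PLOSIVES = set("ptkbdg")
NASALS = set("mnŋ")

def is_possible(word):
    """Return False if the word violates one of the toy edge constraints."""
    if len(word) < 2:
        return True  # no cluster to check
    first_two, last_two = word[:2], word[-2:]
    # From the text: no plosive+nasal cluster at the beginning or end of a word.
    for a, b in (first_two, last_two):
        if a in PLOSIVES and b in NASALS:
            return False
    # Assumed here (implied by the starred form nte): no nasal+plosive word initially.
    if first_two[0] in NASALS and first_two[1] in PLOSIVES:
        return False
    return True

def orderings(sounds):
    """All orderings of the given sounds, each joined into one string."""
    return ["".join(p) for p in permutations(sounds)]

for form in orderings("ten"):
    print(form if is_possible(form) else "*" + form)
```

The sketch only looks at the two edges of the word, which is all this three-sound example requires.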

Phonotactic constraints define what sound sequences are possible and what other sound sequences are not possible in a given language. These constraints are based on an examination of what sequences occur and what sequences do not occur in that language. This scheme works as long as patterns either never occur at all or, if they do occur, are very common. The problem is that some patterns occur, but are very rare, and it is not immediately clear whether we should take them to be allowed or disallowed by our phonotactic constraints. As an example, think of ʃw in English: this sound sequence occurs in compound words (like dishwasher díʃwoʃə), some names (like Gershwin gə́ːʃwin, Schwarzenegger ʃwóːtsənegə, or Schweppes ʃwéps) and a single one-morpheme common noun, schwa ʃwáː, which is a linguistic technical term. This is significant because neither morphologically complex words nor names are very relevant in identifying phonotactic constraints, so there remains a single “real” representative of this cluster, the realness of which is jeopardized by its being unknown to very many speakers of English.

If we construct phonotactic constraints that allow ʃw, we will be faced with the question why there are so few instances of this cluster in English. The common practice is to not allow ʃw, and list it as an exception. The reason for doing so is our desire to formulate phonotactic constraints in terms of natural classes: we will see that these constraints involve not random sets of sounds, but groups that belong together because of their inherent properties (like place or manner of articulation or voicing).

Because phonotactic constraints are formulated in broad terms, they cannot predict exactly what sound strings are and what sound strings are not available in a language. There will be items sticking out in both directions. The following table shows this.

              existing    nonexistent
possible      brick       blick
impossible    schwa       bnick

The green cells (brick and bnick) contain the obvious cases: words that could and do exist, like brick, and words that could not and do not exist, like bnick. The latter type is referred to as a systematic gap: this item is ruled out by the system which disallows any plosive+nasal cluster at the beginning of a word. The pink cells (blick and schwa) are both deviant: blick is a commonly cited word that could exist, since words can begin with bl and they can end in ik, but this one just happens not to exist. This is an accidental gap. (In fact, blick does exist in dictionaries, but is unknown to most speakers, for whom, then, it does not exist.) And we have already discussed the case of schwa, which is ruled out by phonotactic constraints, and yet is part of the vocabulary of a small subset of native speakers of English. (Again, for those speakers who do not know this word, it does not exist.)
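
The four cells of the table amount to a two-way classification, which can be written down as a small Python function. This is a sketch for illustration only: the function name classify and the label “attested word” are invented here; “accidental gap”, “systematic gap”, and “exception” are the terms used in the text.

```python
def classify(possible, existing):
    """Name the cell of the possible/existing table that a form falls into."""
    if possible and existing:
        return "attested word"    # label invented for this sketch (brick)
    if possible and not existing:
        return "accidental gap"   # blick
    if not possible and not existing:
        return "systematic gap"   # bnick
    return "exception"            # schwa: ruled out, yet in use

# The four examples from the table, as (possible, existing) pairs.
examples = {
    "brick": (True, True),
    "blick": (True, False),
    "schwa": (False, True),
    "bnick": (False, False),
}
for form, (possible, existing) in examples.items():
    print(form, "->", classify(possible, existing))
```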

constraints on phonotactic constraints

We have already touched upon two limitations on phonotactic constraints. One is that, at least in the case of English, phonotactic constraints work only within morphs. We have seen that ʃw is very rare within a morpheme, but there is no reason why this cluster should not occur across a morpheme boundary, where the first morph ends in ʃ and the following one begins with w (like in brushwood, dishwasher, freshwater, meshwork, etc). It is not only compound words that show this “freedom of combination”: even nonsyllabic suffixes create clusters that do not occur within any single morph, eg loves ləvz, smoothed smuwðd, washed woʃt end in such clusters. (The status of ʃt is like that of ʃw: it is practically nonexistent within a morph.)
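
The idea that constraints apply inside morphs but not across their boundaries can be sketched in a few lines of Python. The sketch assumes that morph boundaries are written with #, that each cluster is a plain substring of the transcription, and that ʃw and ʃt are the only clusters checked; the function name violations is invented for the illustration.

```python
# Clusters treated here as disallowed within a single morph (ʃw and ʃt, as discussed in the text).
BANNED_WITHIN_MORPH = ("ʃw", "ʃt")

def violations(word):
    """Return (morph, cluster) pairs where a banned cluster sits inside one morph.

    The word is a transcription in which '#' marks morph boundaries.
    """
    return [
        (morph, cluster)
        for morph in word.split("#")
        for cluster in BANNED_WITHIN_MORPH
        if cluster in morph
    ]

print(violations("diʃ#woʃə"))  # dishwasher: ʃ and w are in different morphs
print(violations("woʃ#t"))     # washed: ʃ and t are in different morphs
print(violations("ʃwaː"))      # schwa: ʃw inside a single morph
```

Because the check runs morph by morph, a cluster that straddles a boundary is never inspected, which is exactly the “freedom of combination” described above.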

We have also seen that phonotactic constraints typically refer not to single segments, but to larger groups that belong together by some phonetic property, ie natural classes. Thus if we find that pw is not a possible initial cluster in English, it does not come as a surprise that bw and fw are also not possible initial clusters, since p, b, and f are all labial consonants. We will see this tendency in the constraints detailed below.

Another important property of phonotactic constraints is that they usually affect neighbouring segments. Segments that are not adjacent rarely affect each other. When they do (like in vowel harmony, for example), the analysis always involves some special machinery. We can also observe that interaction between two neighbouring consonants or two neighbouring vowels is much more common than that between a vowel and the following consonant. Constraints between a consonant and the following vowel are also unusual.

When gauging the possibility of adjacent sounds, we must also take word boundaries into consideration. Some consonants do not occur at the beginning of a word, others do not occur at the end of a word. That is, we will not only have constraints that ban the adjacency of two consonants, but also others that ban a consonant directly before or after a word-boundary symbol.

the beginning of the word

Any vowel may occur word initially, although u, uː, and uw are rare in this position compared to other vowels (umlaut, Uzbek, Ursprache, oodles, ooze, Uber are a few examples, many of them not native words).

Of consonants ŋ does not occur at the beginning of words at all. This fact fits in well with the restricted distribution of the velar nasal: there exist accents of English in which ŋ does not contrast with n at all, but even in accents where it does, it only occurs word finally (sing siŋ), before velar plosives (sink siŋk, finger fiŋgə), and rarely before unstressed ə (gingham giŋəm). (In singing síŋiŋ both ŋ’s are word final.) Word-initial ʒ is rare and ð only occurs in function words (eg this, thus, though).

      s             ʃ             z
p     spot          spiel         -
t     stop          -             -
tʃ    stew          (stew)        -
k     Scot          -             -
f     (sphere)      -             -
m     smock         (schmooze)    -
n     snot          (schnapps)    -
l     slot          (schlep)      (zloty)
r     (Srinagar)    shred         -
w     swat          schwa         (Zwingli)
j     (suit)        -             (Zeus)

Word-initial consonant clusters follow three patterns in English. Many of these clusters begin with s (marginally ʃ or z) with a plosive or a sonorant after them. (It is not at all obvious if the plosive after s is “voiceless” or “voiced”. The more widespread, and probably mistaken, tradition is to take these to be “voiceless”; this is what we will do here.) Accordingly, we have the following clusters: sp (spot), st (stop), stʃ (stew stʃuw in BrE, but stuw in AmE), sk (Scot), sm (smock), sn (snot), sl (slot), sw (swat), marginally sf (sphere), or sr (Srinagar). Some speakers still have sj word initially (eg in suit sjuwt), but most omit the j here (suwt). Of clusters beginning with ʃ only ʃr (shred) is common, others are rare. We have already discussed ʃw; this and others feature mostly in Yiddish loanwords (eg ʃm schmooze, ʃn schnapps, ʃp spiel, ʃl schlep). Interestingly, more and more speakers palatalize s before tʃ, yielding ʃtʃ (stew stʃuw > ʃtʃuw). Likewise zl (zloty), zw (Zwingli), zj (Zeus) are untypical clusters word initially. We can put these aside as “impossible”, but existing.

Another group of two-member word-initial consonant clusters are composed of a nonsibilant obstruent (here symbolized by T) followed by a sonorant (here symbolized by R), except j, which we will discuss below. The set of obstruents occurring in this type of cluster is limited to plosives (excluding the two affricates, tʃ dʒ, which are sibilants), ie p t k b d g and the fricatives f v θ, but not s ʃ z ʒ, which are sibilants, or ð, which does not occur in word-initial clusters, at all. The set of sonorants involved in these clusters excludes the nasals and j h, so we are talking about l r w. The following chart shows all the possibilities, with an example for those that are allowed by phonotactic constraints.

     l         r          w
p    plank     prank      (pueblo)
b    black     brack      (bwana)
f    flank     frank      (phwoar)
v    (vlog)    (vroom)    (Vuitton)
t    -         trap       twain
d    -         drill      dwell
θ    -         thrack     thwack
k    clack     crack      quack
g    glad      grad       (guacamole)

There is a significant tendency that we can detect in these patterns: those clusters where the two members have the same place of articulation do not occur. We have concluded earlier that despite the minor phonetic differences in their place of articulation, the bilabial plosives (p b), the labiodental fricatives (f v), and the labiovelar glide (w) are all broadly labial. That conclusion is now corroborated: p b f w all have the “same” place of articulation, they are homorganic. In the same way, the dental θ and the alveolar t d l are also homorganic, which explains why tl, dl, and θl do not occur. Since r does occur after each of these consonants, it seems that postalveolar/palatal consonants are not homorganic with dentals and alveolars, and, of course, they are not homorganic with labials and velars either. It is surprising then that gw is very rare (guacamole, Guam, Gwen, etc), while kw is a well-attested cluster, and elsewhere we do not see any difference between the combinatorial possibilities of voiceless and voiced plosives. Marginally, other homorganic clusters also occur: pw (pueblo), bw (bwana), etc, but we exclude these clusters, as we have excluded ʃw.
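
The ban on homorganic TR clusters can be stated as a short check. The Python sketch below is a simplification under stated assumptions: the place labels (in particular “dental-alveolar”) and the function name tr_cluster_allowed are invented here, and the check deliberately ignores frequency, so it predicts gw, vl, and vr to be as good as kw, pl, and pr, although the text notes that they are rare or marginal.

```python
# Broad place classes following the text: labials include w, and the dental θ
# counts as homorganic with the alveolars t d l. The labels are this sketch's own.
PLACE = {
    "p": "labial", "b": "labial", "f": "labial", "v": "labial", "w": "labial",
    "t": "dental-alveolar", "d": "dental-alveolar",
    "θ": "dental-alveolar", "l": "dental-alveolar",
    "r": "postalveolar",
    "k": "velar", "g": "velar",
}
OBSTRUENTS_T = {"p", "t", "k", "b", "d", "g", "f", "v", "θ"}  # nonsibilant obstruents
SONORANTS_R = {"l", "r", "w"}                                 # no nasals, no j or h

def tr_cluster_allowed(obstruent, sonorant):
    """True if the pair is a TR cluster whose members differ in place."""
    if obstruent not in OBSTRUENTS_T or sonorant not in SONORANTS_R:
        return False  # not a TR cluster at all (eg sl belongs to the s-initial type)
    return PLACE[obstruent] != PLACE[sonorant]

# Print the grid: one row per obstruent, one column per sonorant.
for t in ["p", "b", "f", "v", "t", "d", "θ", "k", "g"]:
    row = [(t + r) if tr_cluster_allowed(t, r) else "*" + t + r for r in ["l", "r", "w"]]
    print("  ".join(row))
```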

          old        new
pure      pjoː       pjoː
beauty    bjuwtij    bjuwtij
few       fjuw       fjuw
view      vjuw       vjuw
mute      mjuwt      mjuwt
Thule     θjuwl      θuwl
suit      sjuwt      suwt
Zeus      zjuws      zuws
Luke      ljuwk      luwk
new       njuw       nuw
tube      tjuwb      tʃuwb
dune      djuwn      dʒuwn
rule      ruwl       ruwl
chew      tʃuw       tʃuw
juice     dʒuws      dʒuws
chute     ʃuwt       ʃuwt
jupe      ʒuwp       ʒuwp
cute      kjuwt      kjuwt
gue       gjuw       gjuw
huge      hjuwdʒ     hjuwdʒ

Although j is a sonorant, just like l r w, we have excluded clusters with it above, because j behaves differently from the other sonorants. Its occurrence after other consonants is much less restricted. Cj clusters are subject to the ban on homorganicity, but the first member of these clusters may be not only an obstruent, but also a sonorant. There is some variation in which Cj clusters are possible and which are not: British English is currently undergoing a change in this respect, so the chart of Cj clusters gives two possibilities. The variant in the “old” column is becoming less common, the one in the “new” column is becoming more common in British English. We can see that j is stable after labials and velars and that it is stably absent after palatals. Alveolar/dental+j clusters are currently being simplified in British English, either losing the j (Thule, suit, Zeus, Luke, new) or merging it with the preceding plosive into a palatal affricate (tube, dune).

Three-member consonant clusters also occur word initially. The first in these is always s (for some speakers ʃ), while the second and third member of these clusters are also possible as two-member clusters. That is, skr (screw) or smj (smew) are possible word-initial clusters, since kr and mj are also possible, but spw or stl are not, just as pw and tl are not possible at the beginning of a word either. The inference does not work in the other direction: for example, tw (twig) and, for some speakers, nj (new) are possible two-member clusters, but their three-member versions with s, stw and snj are not found at all.

spl split           spr spring           -            spj spew
-                   str/ʃtʃr strap       *stw         stj stew
(skl sclerosis)     skr scratch          skw squad    skj skew
*sfl                (sfr sphragistic)    -            *sfj
-                   -                    -            (smj smew)
-                   -                    -            *snj
-                   -                    -            (slj sluice)
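
The one-way inference about three-member clusters can be illustrated with a short Python sketch. The set TWO_MEMBER is a partial list assembled for this illustration from the charts above (including the “old” Cj clusters), ATTESTED_THREE copies the unstarred entries of the chart of three-member clusters, and the function name is invented here. Each sound is written with one character.

```python
# Partial list of possible word-initial two-member clusters (an assumption of this sketch).
TWO_MEMBER = {
    "pl", "pr", "pj", "tr", "tw", "tj", "kl", "kr", "kw", "kj",
    "fl", "fr", "fj", "mj", "nj", "lj",
}

# Unstarred three-member clusters from the chart, marginal ones included.
ATTESTED_THREE = {
    "spl", "spr", "spj", "str", "stj", "skl", "skr", "skw", "skj",
    "sfr", "smj", "slj",
}

def passes_necessary_condition(cluster):
    """True if the cluster is s plus a possible two-member cluster.

    This is a necessary condition only: passing it does not guarantee
    that the three-member cluster is found.
    """
    return len(cluster) == 3 and cluster[0] == "s" and cluster[1:] in TWO_MEMBER

for cluster in ["skr", "smj", "spw", "stl", "stw", "snj"]:
    print(cluster, passes_necessary_condition(cluster), cluster in ATTESTED_THREE)
```

Clusters like stw and snj pass the check but are not attested, which is the sense in which the inference does not work in the other direction.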

the end of the word

We have already seen that the occurrence of vowels at the end of words is restricted. It is a defining property of checked vowels — short vowels apart from unstressed ə — that they do not occur word finally. We only find unstressed ə, long monophthongs (ie R vowels), and diphthongs (ie free vowels) here.

In nonrhotic accents, like CUBE, r is also absent word finally, as is h in most varieties of English. Whether j and w occur in this position depends on our analysis of diphthongs. If they are taken to be vowel clusters, then these two glides are also excluded from word-final position. An increasing number of CUBE speakers also lack word final l, but they will then have w instead: tell tel > tew. Nasal consonants and obstruents all occur word finally, though ʒ is not common: ram, ran, rang, wrap, rat, ratch, rack, crab, bad, badge, back, caff, math, bass, ash, have, with, has, beige.

The most common word-final two-consonant cluster in English consists of a nasal (N) followed by an obstruent (T). While word-initial TR and Cj clusters could not be homorganic, word-final NT clusters are obligatorily homorganic. The following chart contains the available combinations.

labial         dental      alveolar     palatal           velar
hemp mp        -           bent nt      bench ntʃ         bank ŋk
(corymb mb)    -           bend nd      sponge ndʒ        (langue ŋg)
nymph mf       month nθ    pence ns     (avalanche nʃ)    -
*mv            *nð         bronze nz    (mélange nʒ)      -

There is no obstruent for the grey cells: no dental plosives or velar fricatives in English. The pink boxes show that mb ŋg mv nð are impossible word finally. (Since we are used to seeing the spelled form of words, it may be surprising that jamb and jam are homonyms: dʒam.) Yet, very marginally mb and ŋg do occur word finally in some very rare words. The clusters nʃ and nʒ are also not common: we only have recent loans to exemplify them.

Although nonhomorganic NT clusters do occur word finally, they almost never belong to the same morpheme: trimmed trim#d, songs soŋ#z. (There are some apparently monomorphemic counterexamples, like James dʒejmz or Thames temz.)
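
The homorganicity requirement on morph-internal word-final NT clusters can be sketched as a lookup. In the Python sketch below the three broad place labels, the grouping of dentals, alveolars, and palatals as “coronal”, and the function name final_nt_allowed are choices made for this illustration; the four excluded clusters are the ones the text calls impossible word finally.

```python
NASAL_PLACE = {"m": "labial", "n": "coronal", "ŋ": "velar"}
OBSTRUENT_PLACE = {
    "p": "labial", "b": "labial", "f": "labial", "v": "labial",
    "t": "coronal", "d": "coronal", "θ": "coronal", "ð": "coronal",
    "s": "coronal", "z": "coronal",
    "tʃ": "coronal", "dʒ": "coronal", "ʃ": "coronal", "ʒ": "coronal",
    "k": "velar", "g": "velar",
}
# Homorganic, but still excluded word finally according to the text.
EXCLUDED = {("m", "b"), ("ŋ", "g"), ("m", "v"), ("n", "ð")}

def final_nt_allowed(nasal, obstruent):
    """True if nasal+obstruent is a homorganic, non-excluded word-final cluster."""
    if nasal not in NASAL_PLACE or obstruent not in OBSTRUENT_PLACE:
        return False
    if (nasal, obstruent) in EXCLUDED:
        return False
    return NASAL_PLACE[nasal] == OBSTRUENT_PLACE[obstruent]

for pair in [("m", "p"), ("n", "tʃ"), ("ŋ", "k"), ("m", "t"), ("m", "b"), ("n", "k")]:
    print("".join(pair), final_nt_allowed(*pair))
```

The segments are passed as a pair rather than as one string so that the affricates tʃ and dʒ can be treated as single units.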

The liquid l may also occur in the first position of a word-final consonant cluster. (r may not in a nonrhotic accent, like CUBE.) As opposed to NT clusters, the second member of an lC cluster may be a nasal, not only an obstruent. As already mentioned, there are more and more speakers of British English who replace l by w in this position, thus losing lC clusters. The next chart contains the possible word-final lC clusters.

labial       dental      alveolar      palatal       velar
help lp      -           belt lt       belch ltʃ     bulk lk
(bulb lb)    -           held ld       bulge ldʒ     (Glenelg lg)
shelf lf     filth lθ    else ls       (Welsh lʃ)    -
twelve lv    *lð         Charles lz    *lʒ           -
film lm      -           kiln ln       -             *lŋ

The clusters lb and lʃ provide hardly any examples, hence the parentheses around them. Glenelg (a village in Scotland) is even more marginal; we should probably claim that lg# is not grammatical.

The last type of two-member word-final clusters is composed of two obstruents. These clusters are exceptionlessly voiceless and the fricative in them is typically s, with f in one cluster. The possibilities are shown in the next chart.

lisp sp    list st    risk sk    lapse ps     quartz ts    fix ks
*fp        lift ft    *fk        script pt    *tt          fact kt

We see that either the second member of a TT cluster is the alveolar t or s, or the first member is the alveolar s, or exceptionally t in ts, which is not very common within a morpheme. (It is common across a morpheme boundary: cats kats.) It is also noteworthy that TT clusters are available in both orders: sp and ps, sk and ks, st and ts. This is not found for other clusters.

If we take diphthongs to be vowel+consonant sequences, there are many types of CCC# clusters in English: eg kind kajnd, post pəwst, etc. If, however, we analyse them as vowels, as we did in this course, there aren’t very many other three-member clusters left. The following occur: prompt prompt, instinct instiŋkt, glimpse glimps, lynx liŋks (and even for these alternative analyses are available, but not discussed here), and there are two clusters with one example each: sculpt skəlpt, mulct məlkt.

constraints on sonorants

Just as h does not occur word finally, it also does not occur before a consonant in almost all current accents of English. In fact, even being before a vowel is not “enough” for h to “survive”: it is typically pronounced before a stressed vowel (Manhattan manhátən) and at the beginning of a word (horizon hərájzən). This consonant rarely occurs before an unstressed vowel within a word (maharaja máːhəráːdʒə).

The distribution of r is well-known: in nonrhotic accents it occurs only before vowels (and sonorant consonants: barrel bárəl or bárl̩). For many speakers there is no such restriction on the distribution of l: it occurs freely before vowels and consonants, as well as word finally. (Except for speakers who replace l’s not followed by a vowel by w, which we will return to in topic 12.)

In the analysis we follow in this course, the distribution of j and w is similar to that of r. However, if diphthongs were analysed as vowel+consonant sequences, these two glides would be available in any position in a word: word finally (boy boj, now naw) and before a consonant (voice vojs, crowd krawd).

We saw that the distribution of ŋ is limited: it occurs practically only before k, g, and word finally. The other two nasals also occur before nonhomorganic consonants (damsel dámzəl, convoy kónvoj) and vowels (map map, nap nap).

constraints between vowels and consonants

Earlier we mentioned that phonotactic constraints between a vowel and a consonant are not as common as those between two consonants. We will now look at what there is.

The checked vowel u itself is rather rare, and it happens never to occur before , θ, or ð. Word finally we do not find or either, but these gaps probably are due to the overall rarity of ð.

We find more systematic constraints on long vowels and consonants following them. Three of the R vowels, the broad ones (aː əː oː), occur freely before a word-final consonant (heart haːt, horse hoːs, hurt həːt), but the other three, the smooth ones (iː eː uː), are not common in this position, though they do occur in a few words (weird wiːd, scarce skeːs). The only consonant before which these vowels are common is r (hero hiːrəw, vary veːrij, fury fjuːrij). (But r, recall, does not occur word finally.)

Broad vowels also occur before consonant clusters (past paːst, launch loːntʃ, excerpt eksəːpt), but smooth vowels never occur here. (Except in morphologically complex words, like pierced piːst, which we ignore in setting up phonotactic patterns.)

Diphthongs generally occur freely before consonants with the exception of ŋ and r. We have seen earlier that the diphthongs ej əw ij uw do not occur before r. Diphthongs occur before consonant clusters of the TR type (April ejprəl), and are rather common before NT, lC, and sC clusters provided that both consonants are coronal (kind kajnd, faint fejnt, wound wuwnd, colt kəwlt, hold həwld, waste wejst, most məwst, oust awst, etc), but not before ft, where the first consonant is noncoronal, sp, where the second is, or mp, where both are. The fact that ŋ does not occur after diphthongs makes it behave like a noncoronal consonant cluster, say, ŋg (from which it historically derives). There are two rather unexpected further sets of constraints. The diphthong aw only occurs before coronal consonants (shout ʃawt, loud lawd, couch kawtʃ, gouge gawdʒ, house haws, arouse ərawz, south sawθ, Louth lawð, brown brawn, owl awl), but not before noncoronal consonants, ie there are no words with awp, awm, or awk. (Again, names from other languages may behave exceptionally: Lauper lawpə.)

The distribution of oj is even narrower: it only occurs before alveolar consonants. We have exploit əksplojt, void vojd, voice vojs, noise nojz, coin kojn, coil kojl. Neither noncoronals (p b f v m k g ŋ), nor coronals that are not alveolar but dental (θ ð) or palatal (tʃ dʒ ʃ ʒ) are typically possible after oj.
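
The two constraints on aw and oj can be put side by side in a small Python sketch. It covers only these two diphthongs and only a single following consonant; the set names and the function name allowed_after are invented for the illustration, and r is left out of the sets because, as noted above, diphthongs do not occur freely before it.

```python
ALVEOLAR = {"t", "d", "s", "z", "n", "l"}
DENTAL = {"θ", "ð"}
PALATAL = {"tʃ", "dʒ", "ʃ", "ʒ"}
CORONAL = ALVEOLAR | DENTAL | PALATAL

# Which consonants may follow each diphthong, as described in the text.
ALLOWED_FOLLOWERS = {
    "aw": CORONAL,   # shout, couch, south, brown, owl ...
    "oj": ALVEOLAR,  # exploit, void, voice, noise, coin, coil
}

def allowed_after(diphthong, consonant):
    """True if the consonant may follow the diphthong (aw and oj only)."""
    if diphthong not in ALLOWED_FOLLOWERS:
        raise ValueError("only aw and oj are modelled in this sketch")
    return consonant in ALLOWED_FOLLOWERS[diphthong]

for d, c in [("aw", "t"), ("aw", "tʃ"), ("aw", "p"), ("oj", "n"), ("oj", "tʃ"), ("oj", "θ")]:
    print(d + c if allowed_after(d, c) else "*" + d + c)
```

Like the constraints it encodes, the sketch describes typical vocabulary; names such as Lauper fall outside it.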

There are no systematic constraints between a consonant and the following vowel in English, but we do find two strong tendencies after two types of consonant clusters. Cj clusters are typically followed by uw (cute kjuwt), its pre-R version, uː, oː, or əː (cure kjuː/kjoː/kjəː), u (accurate ákjurət), or ə (ákjərət, million míljən). Cj occurs before other vowels only in loans like pinyin pinjin. Cw, on the other hand, occurs before any vowel (qualm kwaːm, dwell dwel, twig twig, quad kwod, twirl twəːl), but not uw or u. (Words in which we expect Cwuw to occur based on their etymology have lost w: two twuw (cf twice), who hwuw (cf when hwen).) Interestingly, this constraint does not hold for sw: swoon swuwn, etc. Note that without the initial consonant, ji and wu are possible: yid jid, wolf wulf.

language change

One aspect of language change is sound change. We know, for example, that the diphthong aw was earlier ow and even before that uw, so in Old English mouse was muws, which then became mows and later maws. The short u has also changed in most words: OE hunt hunt became hənt. We also know that word-final noncoronal voiced plosives were lost when preceded by a nasal, so OE siŋg became siŋ and dumb became dəm. (As the b was lost, people became uncertain of the spelling, eg they introduced a b in the written form of thumb, which in fact never ended in b.) Such sound changes also affect phonotactic constraints, for example, they resulted in nd being the only word-final nasal+voiced plosive cluster possible, since both mb and ŋg were lost at the end of words.

The distribution of h has also become radically reduced: in Old English this consonant occurred word finally and before consonants (eg night was niht, sigh was sih, rough was ruh). (We can see that the spelling still indicates this by gh.) It also disappeared in word-initial consonant clusters: OE hriŋg is now riŋ (ring), hlaːf is ləwf (loaf), and hwaːl is wejl (whale). It seems like the only cluster beginning with h that’s left in English is hj (eg in huge hjuwdʒ), but in fact this cluster did not exist in Old English: together with all Cj clusters it is a new development.

Besides Cj clusters, Old English also lacked the contrast between voiceless and voiced fricatives. So the homorganic fricative pairs f v, θ ð, and s z were allophonic variants, the latter ones occurring within a word if not adjacent to a voiceless plosive, the former ones elsewhere. (The palatal ʃ and ʒ did not alternate in this way.) This ancient distribution is still reflected in modern English word pairs like five–fifty, bathe–bath, graze–grass, where the word-final -e’s represented a pronounced vowel in OE (preserved in the conservative spelling), so the voiced fricatives were not word final back then. With the loss of these word-final vowels the voiced fricatives were stranded and came to contrast with the voiceless fricatives that were word final earlier too, eg in belief–believe bəlijf–bəlijv, wreath–wreathe rijθ–rijð, or close kləws–kləwz.

loanword adaptation

Loanword adaptation often involves modification of the sound shape of words to fit the phonotactics of the receiving language. For example, English cannot have short vowels word finally (except for unstressed schwa). So any such vowel of donor languages will be repaired, typically by diphthongization. French café kafé was adapted as káfej, Italian spaghetti spagétti as spəgétij, putto pútto as pútəw, Polynesian tabu as təbúw, etc, but short a was lengthened, eg Spanish panamá is pánəmaː. (The geminate (long) tt’s of the two Italian words are also simplified in English, which does not have such constructions.)
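
The repair of word-final short vowels can be sketched as a lookup on the last sound of the donor form. The Python sketch below is a deliberately partial illustration: the mapping is read off the five examples in the text, the function name repair_final_vowel is invented here, input forms are written without stress marks, and the other adjustments seen in the examples (vowel reduction, stress shift, simplification of geminates) are not modelled.

```python
# Final short vowel of the donor form -> its English replacement,
# read off the examples café, spaghetti, putto, tabu, panamá.
FINAL_REPAIR = {"e": "ej", "i": "ij", "o": "əw", "u": "uw", "a": "aː"}

def repair_final_vowel(word):
    """Replace a word-final short vowel; leave every other form unchanged."""
    if word and word[-1] in FINAL_REPAIR:
        return word[:-1] + FINAL_REPAIR[word[-1]]
    return word  # ends in a consonant, in schwa, or in an already long vowel

for donor in ["kafe", "spagetti", "putto", "tabu", "panama", "film"]:
    print(donor, "->", repair_final_vowel(donor))
```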

We have seen above that some voiced fricatives came to occur word finally because the vowel after them was lost. But f v and s z also contrast word initially in English, which is not due to vowel loss, but to French and Greek loanwords, respectively. So the large majority of words beginning with v come from French (eg very, which forms a minimal pair with ferry), and those beginning with z from Greek (eg zeal, which forms a minimal pair with seal). Similarly, most English words beginning with or ending in ʒ are loanwords from French or Russian.

We may conclude that loanwords are adapted to satisfy the phonotactic patterns of the host language, but they also may force changes, introducing new phonotactic patterns.
