Zoltán G. Kiss

The eﬀect of homophony avoidance in voicing*

1. Introduction

It has long been acknowledged that the production and perception of speech are also affected by the presence or absence of higher levels of linguistic information. The recoverability of meaning relies heavily on semantic context (Ganong 1980); similarly, the precision of articulation is inversely proportional to the presence of semantic information (Goldrick et al. 2013; Kitahara et al. 2019). Diachronic phonological processes, for example, are often reported to avoid homophony (see, e.g., Silverman 2012). The question arises whether homophony avoidance is actively present in synchronic language use, too, and how (if at all) it interacts with phonological contrast maintenance or neutralisation. A number of studies demonstrate that laryngeal processes previously considered to be neutralising (e.g., word-final devoicing, voicing assimilation) are not completely neutralising phonetically. An underlyingly voiced obstruent often contains more phonation in devoicing contexts than an underlyingly voiceless obstruent, or, if this is not the case, other phonetic features such as the length of the preceding vowel or the vowel/consonant duration ratio are systematically different, thereby maintaining the underlying laryngeal contrast (Bárkányi and G. Kiss 2019 provide an overview of this in Hungarian). The present study seeks to explore to what extent a particular lexical factor, homophony avoidance, i.e., whether or not a word forms a minimal pair with another word in the lexicon (“minimal pairhood”), affects the realisation of the primary laryngeal feature, the amount of phonation, in the word-final alveolar stops /t/ and /d/ and the fricatives /s/ and /z/ in potentially neutralising and non-neutralising contexts in the speech of Hungarian native speakers. To this end, acoustic experiments were carried out with test words ending in these obstruents in minimal pairs and non-minimal pairs that were placed in various phonetic environments.

* I would like to thank the two anonymous reviewers for their valuable comments and suggestions. This research was supported by grant no. NKFIH K 142498 of the Hungarian National Research Fund (principal investigator: Péter Szigetvári).

The paper is structured as follows. First, I provide a brief overview of the laryngeal opposition and voicing assimilation of Hungarian obstruents in particular (section 2) and of homophony avoidance in general (section 3). In the second half of the paper, I present the details of the acoustic-production experiments (section 4), and their results (section 5), while I discuss the relevant conclusions of the experiments in section 6.

2. Voicing contrast and voicing assimilation in Hungarian

Based on vocal fold vibration and the timing of supraglottal articulatory gestures, distinct phonetic properties arise that languages use differently in their laryngeal oppositions. Most languages display a binary opposition. In Hungarian, for instance, the two values of the feature [±voice] are in contrast. Because the timing of the abduction and adduction of the vocal folds may vary, several types of voiced and voiceless articulation are possible, and therefore various types of laryngeal articulation may be associated with the [±voice] feature. In Lisker and Abramson’s (1964; 1967) classical work, these laryngeal differences in the production of initial stops are grouped into three phonetic categories according to the onset of periodic vocal fold vibration relative to the release of the closure (referred to as Voice Onset Time, VOT): (i) negative VOT, where phonemically voiced stops are realised with voicing lead, as in Hungarian; (ii) short-lag VOT, where phonation starts shortly after closure release, e.g., lenis stops in English; (iii) long-lag VOT, where voicing starts at least 35–60 ms after closure release, e.g., voiceless aspirated stops in English. Languages where the contrast is based on negative VOT vs. zero/short-lag VOT are called voicing or true voice languages, while languages where the contrast is based on short-lag VOT vs. long-lag VOT are called aspirating languages. Hungarian belongs to the former group. According to traditional descriptions (see, for example, the summary in Jansen 2004; 2007), the voicing properties of true voice languages are manifested not only in the VOT of stops, but also in that the [±voice] feature of all obstruents (including affricates and fricatives) participates actively in phonological processes such as voicing assimilation, and that voicing contrast is typically preserved in absolute word-final (prepausal) position as well.
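As a rough illustration of the three-way VOT classification described above, the following Python sketch assigns a measured VOT value to one of the three categories. The function name and the 35 ms cut-off are assumptions made for this example only (the text gives 35–60 ms as the range in which long lag begins); they are not thresholds taken from Lisker and Abramson, and the VOT values in the usage lines are invented.

```python
def classify_vot(vot_ms: float, long_lag_threshold_ms: float = 35.0) -> str:
    """Return a coarse VOT category for a word-initial stop token.

    vot_ms: voice onset time in milliseconds relative to closure release
            (negative values mean that voicing starts before the release).
    long_lag_threshold_ms: hypothetical boundary between short lag and long lag.
    """
    if vot_ms < 0:
        return "voicing lead"   # negative VOT, e.g. voiced stops in true voice languages
    if vot_ms < long_lag_threshold_ms:
        return "short lag"      # voicing starts shortly after the release
    return "long lag"           # aspirated stops


if __name__ == "__main__":
    # Invented VOT values, for illustration only.
    for vot in (-80.0, 10.0, 70.0):
        print(f"{vot:6.1f} ms -> {classify_vot(vot)}")
```

On this view, a true voice language contrasts tokens of the first category with tokens of the second, whereas an aspirating language contrasts the second with the third.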
Phonation, however, cannot be fully maintained in all phonetic contexts. For aerodynamic reasons, voicing contrast is fragile and may (partially or completely) disappear in absolute word-final position and before another obstruent. This is especially true for fricatives as they require high supraglottal pressure to maintain turbulent noise and high subglottal pressure to ensure voicing (Ohala 1983). According to the Hungarian descriptive tradition, adjacent obstruents cannot differ in their voice feature; thus, within words as well as across a morpheme or word boundary, obstruents must agree in voicing (akta
‘file’, labda ‘ball’), unless a pause intervenes. Consequently, the laryngeal contrast of Hungarian obstruents is completely neutralised before another obstruent. Thus, while, for example, /s/ and /z/ are in contrast word-initially (szár /saːr/ ‘stem’ – zár /zaːr/ ‘lock’), in intervocalic position (mészig /meːsiɡ/ ‘lime.terminative’ – mézig /meːziɡ/ ‘honey.terminative’) and word-finally (mész /meːs/ ‘lime’ – méz /meːz/ ‘honey’), the contrast is thought to be completely lost before another obstruent: the /z/ in méztől ‘honey.ablative’ is claimed to be phonetically identical to the /s/ in mésztől ‘lime.ablative’. The same is true for regressive voicing: the /s/ in mészből ‘lime.elative’ is claimed to be phonetically identical to the /z/ in mézből ‘honey.elative’; thus, voicing contrast is neutralised (this traditional view is described in, e.g., Siptár & Törkenczy 2000).

(1) Regressive voicing assimilation: spread of voicing
    /t/+/b/ → [db]: e.g., hát-ba ‘back-ill’; két#barát ‘two friends’
    /ʃ/+/b/ → [ʒb]: e.g., hús-ba ‘meat-ill’; hús#beszerzése ‘supply of meat’

(2) Regressive voicing assimilation: spread of voicelessness
    /b/+/t/ → [pt]: e.g., láb-tól ‘foot-abl’, láb#tisztítása ‘cleaning of foot’
    /z/+/t/ → [st]: e.g., víz-től ‘water-abl’, víz#tárolása ‘storage of water’

According to the traditional descriptions, regressive voicing assimilation also affects consonant clusters (i.e., the rule is “iterative”: it works from right to left, from one segment to the preceding one):

(3) Regressive voicing assimilation in consonant clusters
    /st/+/b/ → [zdb]: e.g., kereszt-ben ‘cross-iness’
    /ɡd/+/p/ → [ktp]: e.g., smaragd#pénzértéke ‘value of emerald’

Sonorants and vowels do not trigger regressive voicing assimilation in standard Hungarian:

(4) /p/+/n/ → [pn] (*[bn]): e.g., kép-nél ‘picture-ades’
    /s/+/n/ → [sn] (*[zn]): e.g., rész-nél ‘part-ades’

Over the past two decades an increasing number of studies have shown that phonological processes believed to be neutralising are often not
(completely) neutralising in speech production. In acoustic studies, several authors have pointed out that regressive voicing assimilation in Hungarian is partially contrast preserving. Jansen (2004) found that /k/–/ɡ/ and /ʃ/–/ʒ/ systematically differ in voicing before voiced obstruents, and vowel length before /ʃ/–/ʒ/ is also different. Gráczi (2010), examining nonsense words, found that in word-final
position the vowel/consonant duration ratio diﬀers according to the underlying voicing of the consonant. Markó et al. (2010) argue that although voicing assimilation seems to be obligatory, it is a gradient rather than a categorical process (unlike the present paper, the authors also examined environments where there was a pause between the target and the triggering consonant). Bárkányi and G. Kiss (2015) found a signiﬁcant diﬀerence in vowel length before voiced and voiceless fricatives in regressive voicing assimilation contexts. Bárkányi and G. Kiss (2020) also found partial contrast preservation in the voicing of three-member consonant clusters. The authors also showed that stops and fricatives display a diﬀerent behaviour in assimilation contexts. However, none of these studies address potential lexical eﬀects – such as the existence of close lexical competitors, minimal pairhood, homophony avoidance, or wordedness – in the production of obstruents in Hungarian. The present study aims to ﬁll this gap by examining the voicing of alveolar obstruents in minimal pairs vs. nonminimal pairs in various phonetic environments.
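The traditional, categorical description of regressive voicing assimilation in (1)–(4) can be sketched as a right-to-left pass over a string of segments. This is only an illustration of the textbook rule that the acoustic studies cited above call into question, not a model of the gradient behaviour they report. The obstruent inventory below is a deliberately small subset chosen to cover the examples, and all names are hypothetical.

```python
# Voiceless -> voiced obstruent pairs (illustrative subset, not the full Hungarian inventory).
TO_VOICED = {"p": "b", "t": "d", "k": "ɡ", "s": "z", "ʃ": "ʒ"}
TO_VOICELESS = {voiced: voiceless for voiceless, voiced in TO_VOICED.items()}


def is_obstruent(segment: str) -> bool:
    return segment in TO_VOICED or segment in TO_VOICELESS


def is_voiced_obstruent(segment: str) -> bool:
    return segment in TO_VOICELESS


def assimilate(segments: list[str]) -> list[str]:
    """Apply categorical regressive voicing assimilation from right to left."""
    output = list(segments)
    # Start with the second-to-last segment so that each obstruent copies the
    # (already assimilated) voicing of the obstruent that follows it.
    for i in range(len(output) - 2, -1, -1):
        left, right = output[i], output[i + 1]
        if not (is_obstruent(left) and is_obstruent(right)):
            continue  # sonorants and vowels neither trigger nor undergo the rule
        if is_voiced_obstruent(right) and not is_voiced_obstruent(left):
            output[i] = TO_VOICED[left]
        elif not is_voiced_obstruent(right) and is_voiced_obstruent(left):
            output[i] = TO_VOICELESS[left]
    return output


if __name__ == "__main__":
    for cluster in (["t", "b"], ["z", "t"], ["s", "t", "b"], ["ɡ", "d", "p"], ["p", "n"]):
        print("".join(cluster), "->", "".join(assimilate(cluster)))
```

Because the loop runs leftwards and reads the already updated neighbour, the iterative cluster cases in (3) fall out without a separate rule.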

3. Homophony avoidance

It has long been observed that speakers maintain a comfortable buffer zone between a certain value of a segment and its immediate systemic neighbours (Martinet 1952), while listeners tolerate deviations of the actual realisation from the intended value as long as they perceive them as unintended coarticulation due to the phonetic context (e.g., Ohala 1981), a kind of compensatory effect. It is difficult to initiate and maintain voicing in obstruents in word-final and preobstruent position; thus, keeping such a buffer zone is not easy, which may lead to homophony (as in the above-mentioned case of méztől–mésztől). It has also long been acknowledged that languages try to avoid homophony resulting from sound change, mostly by morphosyntactic or lexical means (see the classic study of Gilliéron 1910 for French, and Silverman 2021 for Korean, for instance). Wedel et al. (2013) statistically analysed neutralising sound changes in nine languages and concluded that the number of minimal pairs distinguished by a pair of phonemes significantly predicted whether the contrast between them would be neutralised. Although the role of homophony avoidance in sound change remains debated (e.g., Sampson 2013), the question arises whether speakers seek to avoid homophony in synchronic language use as well. In a study of the masculine noun paradigms of Russian, Munteanu (2021) concluded that the language displays a synchronic restriction against homophonous forms within the same paradigm. In nouns where the singular genitive would coincide with the plural nominative and where singular dative and prepositional cases would be identical, stress shift is much more frequent. So, the potentially homophonic forms are distinguished by their prosodic
properties. Yin and White (2018) in an artiﬁcial language learning experiment with native English speakers show that learners are less likely to learn neutralising phonological rules than non-neutralising ones, but only if these create homophony between lexical items that came up during learning. In the artiﬁcial language in their experiment plural was marked by /i/ which palatalised the ﬁnal alveolar fricatives and stops of the singular forms. The process was either neutralising or created allophones. Charles-Luce (1993), investigating regressive voicing assimilation in Catalan, observed that there is more likely to be incomplete neutralisation – as opposed to complete neutralisation – in contexts that would otherwise be semantically ambiguous. The author found that the length of the preceding vowel distinguished voiced and voiceless obstruents signiﬁcantly more often in minimal pairs than in non-minimal pairs. Kharlamov (2014), examining word-ﬁnal devoicing in Russian, claims that lexical competition and lexical density play an important role in partial contrast preservation. Thus, in shorter (monosyllabic) words and minimal pairs the author found greater acoustic diﬀerences between the voiced and voiceless ﬁnal obstruents than in longer words and nonminimal pairs. Baese-Berk and Goldrick (2009) point out that word-initial voiceless stops in English are realised with longer VOT in words belonging to minimal pairs (e.g. cod–god) than in words that do not have such close competitors in the lexicon. Goldrick et al. (2013) reached a similar conclusion regarding word-ﬁnal stops: the vowel is much longer before voiced stops in words like bud (forming a minimal pair with but) than in words with no such lexical neighbour. All these studies suggest that lexical and phonetic-phonological properties closely interact. 
To the best of my knowledge, no similar studies have been conducted for Hungarian, and so in this paper I compare the voicing of final alveolar obstruents in words forming a minimal pair with that of final alveolar obstruents in words that do not belong to a minimal pair. Based on the literature briefly reviewed above, the hypothesis I will test is that in devoicing environments (in absolute word-final position/utterance-finally, and before a voiceless obstruent), the underlyingly voiced obstruents will contain significantly more voicing in a word that forms a minimal pair with another word than in non-minimal pairs (and consequently, the difference in voicing production will not neutralise or will neutralise only partially). Similarly, I hypothesise that in voicing contexts (before a voiced obstruent), the underlyingly voiceless obstruents will contain significantly less voicing in minimal pairs than in non-minimal pairs, and thus the voicing contrast will be more readily maintained in the minimal pair group.

4. Subjects, material, method

The target consonants of the production experiments were word-final /s/–/z/ and /t/–/d/. The segments of both pairs were analysed in two lexical groups. In the first group, the words did not have existing counterparts differing only in the voicing of the final alveolar stop or fricative, i.e., these words did not form minimal pairs: szesz /sɛs/ ‘alcohol’, mez /mɛz/ ‘kit’, net ‘net’, led ‘led lamp’ (i.e., words such as “szez” /sɛz/, “mesz” /mɛs/, “ned” /nɛd/ and “let” /lɛt/ do not exist in Hungarian). I will refer to these words as the “non-minimal pair group” below. The other group consisted of the minimal pairs mész–méz /meːs/–/meːz/ ‘lime’–‘honey’ and vét–véd /veːt/–/veːd/ ‘make an error’–‘protect’. I will refer to this group as the “minimal pair group”. The two groups involved different participants, and therefore the group contrasts discussed in the following sections should be interpreted as comparisons between two different sets of participants. The target words were investigated in the following environments:

(5) a. absolute word-final position (before a pause)
    b. across a word boundary before /p/
    c. across a word boundary before /b/
    d. across a word boundary before the sonorant consonants /m/ and /l/
    e. across a word boundary before the vowel /ɛ/
    f. intervocalically: /ɛ/_/ɛ/ and /eː/_/ɛ/ (here there was no word boundary after the target consonant)

No significant differences were found between the measured acoustic parameters in the presonorant and prevocalic environments, and so they were placed in the same group, which I will refer to simply as the “presonorant” environment; I will present the results of the statistical analysis for this unified group. In the minimal pair experiment the word pairs for the intervocalic position were veszek el–vezekel /vɛsɛkɛl/–/vɛzɛkɛl/ ‘I take away’–‘atone’ and vétek–védek /veːtɛk/–/veːdɛk/ ‘I make an error’–‘I protect’. The sentences with the target words that the experiment participants read out can be found in the Appendix. The non-minimal pair group was analysed in a previous experiment, whose results were partially published (Bárkányi & G. Kiss 2015; Bárkányi & G. Kiss 2019). Here, however, I will compare that group with the minimal pair group, and will focus not only on certain environments but on all of those in (5), aiming to provide a more comprehensive picture. The statistical analysis will also be presented in a unified framework for both lexical groups. In the first experiment, involving the non-minimal pairs, six participants took part, while in the minimal pair experiment there were ten subjects. In both production experiments the participants were university students whose age
ranged between 19 and 30 years (mean: 21±3.2 years). They read out every sentence (including ten fillers) five times. The sound files of the first round were excluded from the final analysis, as first-round readings are usually less natural due to the unusual experimental circumstances, and there is a greater chance of reading errors. Thus, four rounds were used for each participant. Overall, then, 4 rounds of 28 sentences from 6 subjects were analysed in the non-minimal pair group, amounting to 672 data points (for each target sound there were 24 observations in each of the non-presonorant environments, and 72 in the presonorant one). In the minimal pair group, 4 rounds of 28 sentences from 10 subjects were analysed, which amounted to 1120 data points (in this group, then, for each target sound there were 40 observations in each of the non-presonorant environments, and 120 in the presonorant one). The complete data set that was analysed consisted of 672 + 1120 = 1792 observations. The sentences were recorded using SpeechRecorder (Draxler & Jänsch 2004). The sentences were randomised by the program, and it was these randomised sentences that the participants read out from a monitor screen. The amount of time available for each sentence was four seconds, which ensured a relatively uniform speech rate that was neither too rapid nor too slow. An Audix f50 microphone and an Art USB Dual Pre preamplifier were used to make the recordings in a noise-free room at the Department of English Linguistics of Eötvös Loránd University. The sound recordings were processed and analysed in Praat (Boersma & Weenink 2021). The segment boundaries and the voicing intervals were marked manually, using the methods discussed in, for example, G. Kiss (2013). In the case of /t/ and /d/, separate intervals were marked for the closure and (if there was one) the release. For the presence of voicing in the stops, only the closure interval was used, not the release.
The boundaries of the fricatives were placed at the start and the end of the constriction phase (visible as aperiodic noise in the spectrograms and waveforms). It is in this interval that the proportion of voicing was measured. So that the presence of vocal fold vibration could be identified more reliably, the frequencies above 300–500 Hz (depending on the given participant) were filtered out, and the duration of voicing was measured on these filtered waveforms based on the presence of periodic vibrations. The end of the voicing interval was marked where the periodicity was no longer visible. The duration of the intervals was measured automatically with the help of a Praat script, which created the data tables used in the statistical analyses. In this paper, I will focus only on the voicing durations because the length of the pre-target vowels differed significantly in all environments (the underlying vowel was short /ɛ/ in the non-minimal pair group, while it was long /eː/ in the minimal pair group), and thus it was not possible to systematically compare the vowel durations or, based on them, the vowel/consonant duration ratios.
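The outcome variable used in the rest of the paper, the proportion of voicing, can be sketched as the share of the target interval (fricative constriction or stop closure) that is covered by the marked voicing intervals. The function below is a minimal illustration under the assumption that intervals are given as (start, end) pairs in milliseconds and that the voiced intervals do not overlap one another; it is not the Praat script used in the study, and the example values are invented.

```python
def voicing_proportion(target, voiced_intervals):
    """Return the percentage of the target interval that is voiced.

    target: (start, end) of the constriction or closure interval.
    voiced_intervals: iterable of (start, end) voicing intervals, assumed
        not to overlap each other; parts outside the target are ignored.
    """
    start, end = target
    duration = end - start
    if duration <= 0:
        raise ValueError("target interval must have positive duration")
    voiced = 0.0
    for v_start, v_end in voiced_intervals:
        # Only count the part of the voicing interval that lies inside the target.
        overlap = min(end, v_end) - max(start, v_start)
        if overlap > 0:
            voiced += overlap
    return 100.0 * voiced / duration


if __name__ == "__main__":
    # Invented example: a 100 ms fricative whose first 25 ms are voiced.
    print(voicing_proportion((100.0, 200.0), [(80.0, 125.0)]))
```

A value of 0 then corresponds to a fully voiceless token and 100 to a fully voiced one, which matches the range of the values reported in Table 1.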

The statistical analysis (including the generation of the various plots) was carried out in R (version 4.0.2, R Core Development Team 2020) using various tidyverse packages (Wickham et al. 2019), as well as the patchwork package (Pedersen 2020) during the composition of the plots. Linear mixed eﬀects models were used to model the data, using the package lme4 (v. 1.1.27.1, Bates et al. 2015). To specify the p-values, the degrees of freedom were calculated using the Satterthwaite approximation available in the lmerTest package (v. 3.1.3, Kuznetsova et al. 2017). The ﬁxed eﬀects of the models were the underlying voicing of the obstruents (voiceless vs. voiced) as well as the minimal pairhood (non-minimal pair vs. minimal pair). The random eﬀect structure contained the subjects. Random intercepts and random slopes were ﬁtted for the proportion of voicing varying across participants. If the slopes for subjects did not improve model ﬁt relative to intercepts only, they were removed from the ﬁnal model, and only random intercepts were retained. If a model did not converge with the default Nelder-Mead optimizer, “BOBYQA” optimizing (Bound Optimization by Quadratic Approximation) was employed. If a model failed to converge even with this setting or if there was no signiﬁcant variability between subjects for the given acoustic parameter in a given group (“singularity” issue), then the random slope for subjects was taken out of the model, in which case the models always converged. Due to the relatively low number of participants, to avoid further convergence issues, simpler models were ﬁtted, i.e., instead of including three factors (underlying voicing, minimal pairhood, environment), and their interaction, plus including them as by-subject random eﬀects, the models investigated the eﬀect of voicing of the target sound (/t/ vs. /d/; /s/ vs. /z/) and their minimal pairhood in the four environments separately. 
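One way to write down the kind of per-environment model described above is given below. This is a sketch under assumptions rather than the exact specification fitted in the study: $y_{ij}$ stands for the voicing proportion of token $i$ produced by participant $j$, $x_{ij}$ is a single dummy-coded predictor (either underlying voicing or minimal pairhood, depending on the contrast), and the by-participant slope $u_{1j}$ is present only in those models where it was retained.

```latex
\[
y_{ij} = \beta_0 + \beta_1 x_{ij} + u_{0j} + u_{1j} x_{ij} + \varepsilon_{ij},
\qquad
\begin{pmatrix} u_{0j} \\ u_{1j} \end{pmatrix} \sim \mathcal{N}(\mathbf{0}, \Sigma),
\qquad
\varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2_{\varepsilon})
\]
```

With dummy coding, $\beta_0$ is the estimated mean of the reference level (for example, the voiceless member of a pair) and $\beta_1$ is the estimated difference between the two levels. When the random slope is dropped, the model reduces to $y_{ij} = \beta_0 + \beta_1 x_{ij} + u_{0j} + \varepsilon_{ij}$.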
I will refer to the marginal and conditional R-squared eﬀect sizes below as “R2m” and “R2c” respectively. These values were calculated using the MuMIn package (Bartoń 2020). So that the components of the ﬁnal models can be presented in a tabular format, the broom.mixed package (Bolker & Robinson 2021) was used which extracted the model parameters. The tables summarising the linear mixed-eﬀects models in the following sections contain the terminology for the intercept as given by the lme4 output (“(Intercept)”), the names of the slope coefﬁcients (the bs) are “sound” (the sounds compared) and “minpair” (the lexical/minimal pair groups compared).
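For orientation, the two effect sizes are commonly defined as follows (this is the formulation usually associated with Nakagawa and Schielzeth and implemented in MuMIn's r.squaredGLMM; the symbols are introduced here for illustration). $\sigma^2_f$ is the variance of the fixed-effect predictions, $\sigma^2_u$ the variance attributable to the random effects, and $\sigma^2_{\varepsilon}$ the residual variance.

```latex
\[
R^2_m = \frac{\sigma^2_f}{\sigma^2_f + \sigma^2_u + \sigma^2_{\varepsilon}},
\qquad
R^2_c = \frac{\sigma^2_f + \sigma^2_u}{\sigma^2_f + \sigma^2_u + \sigma^2_{\varepsilon}}
\]
```

Read this way, a model with R2m = 0.38 and R2c = 0.81 attributes about 38% of the variance to the fixed effect and a further 43% (0.81 − 0.38) to the by-participant random effects.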

5. Results

Figure 1 shows the proportions of voicing in the two obstruent pairs in five environments, separately for non-minimal pairs and minimal pairs. The same data have been rearranged in Figure 2 so that the underlyingly voiceless sounds and their voiced counterparts appear in separate rows, while the proportions of voicing in a given sound in non-minimal pairs and in minimal pairs are shown side by side, which makes it easier to compare the sounds visually with respect to their lexical group membership. The corresponding descriptive statistics can be found in Table 1. The detailed results for the different environments will be presented in the following sections: absolute word-final position (section 5.1), before /p/ (5.2), before /b/ (5.3), before the sonorants (5.4), and between two vowels (5.5).

Figure 1: Proportion of voicing in /s/, /z/, /t/ and /d/ (the rectangles in the boxplots represent the means)

Figure 2: Proportion of voicing in /s/, /z/, /t/ and /d/ in non-minimal pairs and in minimal pairs. Abbreviations: “s-n, z-n, t-n, d-n” = word-final and intervocalic /s z t d/ in words that do not form a minimal pair with another word (szesz, mez, net, led); “s-m, z-m, t-m, d-m” = word-final and intervocalic /s z t d/ in words that form a minimal pair with another word (mész, méz, vét, véd, veszek el, vezekel); the rectangles in the boxplots represent the means.

Table 1: Proportion of voicing (%) in /s/, /z/, /t/, /d/ – descriptive statistics (“non-mp” = word not belonging to a minimal pair; “mp” = word belonging to a minimal pair)

Lexical group Environment      Sound    Mean     SD  Median     Min     Max    SE
non-mp        abs. word-final  /s/     10.95   9.83    7.69    0.00   31.50  2.01
                               /z/     17.23  10.65   17.09    0.00   46.34  2.17
                               /t/     10.99   6.10   10.36    4.08   33.72  1.25
                               /d/     66.49  32.67   71.84    0.00  100.00  6.67
              before /p/       /s/     15.13   6.65   17.23    3.45   25.82  1.36
                               /z/     25.07  13.72   24.87    0.00   55.51  2.80
                               /t/     17.09  12.78   17.50    0.00   47.06  2.61
                               /d/     23.81  26.11   13.86    0.00   78.12  5.33
              before /b/       /s/     65.39  38.06   83.50    9.98  100.00  7.77
                               /z/     93.22  18.65  100.00   32.59  100.00  3.81
                               /t/     98.38   7.94  100.00   61.11  100.00  1.62
                               /d/     98.78   5.95  100.00   70.83  100.00  1.21
              presonorant      /s/     11.89   6.57   11.26    0.00   32.57  0.77
                               /z/     70.47  33.09  100.00   16.27  100.00  3.90
                               /t/     18.23  18.87   15.91    0.00  100.00  2.22
                               /d/     97.24  10.24  100.00   56.67  100.00  1.21
              intervocalic     /s/     18.28   5.35   17.12    8.32   28.46  1.09
                               /z/     80.87  30.73  100.00   23.71  100.00  6.27
                               /t/     20.64  16.87   16.66    0.00   68.63  3.44
                               /d/    100.00   0.00  100.00  100.00  100.00  0.00
mp            abs. word-final  /s/      6.71   8.06    3.92    0.00   29.50  1.27
                               /z/     34.71  24.42   28.51    0.00  100.00  3.86
                               /t/      9.76  14.36    0.00    0.00   46.84  2.27
                               /d/     76.00  27.53   83.75   12.31  100.00  4.35
              before /p/       /s/     20.42  13.33   17.81    0.00   56.86  2.11
                               /z/     36.57  22.14   29.16   12.12  100.00  3.50
                               /t/     21.00  23.83   16.55    0.00  100.00  3.77
                               /d/     40.24  21.20   35.00    0.00  100.00  3.35
              before /b/       /s/     39.43  26.28   33.41    0.00  100.00  4.16
                               /z/     94.71  14.33  100.00   48.57  100.00  2.27
                               /t/     63.96  32.05   63.68    8.93  100.00  5.07
                               /d/     95.61  10.89  100.00   59.52  100.00  1.72
              presonorant      /s/     16.70  14.08   14.07    0.00  100.00  1.29
                               /z/     77.97  25.01  100.00   14.06  100.00  2.28
                               /t/     22.18  21.21   15.99    0.00  100.00  1.94
                               /d/     93.42  14.43  100.00   22.41  100.00  1.32
              intervocalic     /s/     17.27  12.96   14.16    0.00   47.83  2.05
                               /z/     88.25  21.02  100.00   24.53  100.00  3.32
                               /t/     16.62  12.29   12.71    0.00   53.06  1.94
                               /d/     90.70  13.81  100.00   54.76  100.00  2.18
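The SE column of Table 1 appears to be the standard deviation divided by the square root of the number of observations per cell, with the cell sizes given in section 4 (participants × four analysed rounds, and three pooled contexts in the presonorant environment). The following Python sketch checks this reading on a few rows; the relationship is an inference from the reported numbers, and the constant and function names are hypothetical.

```python
import math

PARTICIPANTS = {"non-mp": 6, "mp": 10}   # participants per lexical group (section 4)
ROUNDS_ANALYSED = 4                       # five readings, first round excluded
POOLED_CONTEXTS = {"presonorant": 3}      # /m/, /l/ and /ɛ/ contexts pooled


def n_observations(group: str, environment: str) -> int:
    """Observations per target sound in one lexical group and environment."""
    return PARTICIPANTS[group] * ROUNDS_ANALYSED * POOLED_CONTEXTS.get(environment, 1)


def standard_error(sd: float, n: int) -> float:
    """Standard error of the mean for a cell with standard deviation sd and size n."""
    return sd / math.sqrt(n)


# (group, environment, sound, SD, SE) copied from Table 1.
ROWS = [
    ("non-mp", "abs. word-final", "/s/", 9.83, 2.01),
    ("non-mp", "presonorant", "/s/", 6.57, 0.77),
    ("mp", "abs. word-final", "/z/", 24.42, 3.86),
    ("mp", "presonorant", "/z/", 25.01, 2.28),
]

if __name__ == "__main__":
    for group, environment, sound, sd, reported_se in ROWS:
        n = n_observations(group, environment)
        print(group, environment, sound, n, round(standard_error(sd, n), 2), reported_se)
```

The same cell sizes (24 and 72 in the non-minimal pair group, 40 and 120 in the minimal pair group) reproduce the totals of 672 and 1120 data points given in section 4.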

5.1. Absolute word-final position

As Figure 1 and Table 1 show, the mean proportion of voicing in /z/ and /d/ in absolute word-final position is higher than that of their voiceless counterparts. The difference is considerably greater between /t/ and /d/ than between /s/ and /z/ in both lexical groups, and the difference between /s/ and /z/ is greater in the case of minimal pairs, i.e., while in the case of the non-minimal pair (szesz–mez) the difference between the fricatives is small (which is indicative of neutralisation, at least as far as this acoustic parameter is concerned), in the case of the minimal pair (mész–méz) the difference is larger (only 6.71% voicing on average with a standard deviation of 8.06% in /s/, as against 34.71% with a standard deviation of 24.42% in /z/). This larger difference was confirmed by the linear mixed effects models, too (see Table 2). In the case of the non-minimal pair the difference between the voicing proportions of /s/–/z/ was non-significant, whereas it was significant in the case of the minimal pair. The marginal effect size R2m for the minimal pair was 0.38 and the conditional effect size R2c was 0.81; that is, the fixed effect (the underlying voicing
of the fricative) largely explains the data, while the random effects (random intercept and slope by subject) considerably improve the model’s total explanatory power. The difference between the mean voicing proportions of /t/–/d/ was significant in both lexical groups and the effect sizes were also relatively high (Table 2), i.e., both the fixed and the random effects explain a considerable part of the variance. The /d/ tokens in absolute word-final position had similar voicing proportions in both groups (led, véd) (Figure 2); their difference was not significant according to the fitted model (Table 2). This, however, was not the case for /z/ in this environment: the mean voicing proportion in the minimal pair /z/ was significantly larger than in the non-minimal pair /z/ (only 17.23±10.65% voicing in mez, but 34.71±24.42% in méz). That is, minimal pair /z/ is significantly different not only from its voiceless counterpart but also from its non-minimal pair counterpart (it contains more voicing than either). R2c was high (0.72), while R2m was lower (0.15), which suggests that the random intercept and slope by subject contribute largely to the explanatory power of the model; that is, minimal pairhood (the fixed effect) contributes to /z/ having more voicing in absolute final position, but other factors contribute to it as well. Overall, then, we can say that in absolute word-final position the difference in the voicing proportions of the obstruent pairs examined is maintained, except between /s/ and /z/ if they are parts of words that do not form a minimal pair. We can thus observe a strong minimal pairhood/homophony avoidance effect in this environment.
Table 2: Summary of the mixed effects models, outcome variable: voicing proportion, environment: absolute word-final (“non-mp” = word not belonging to a minimal pair; “mp” = word belonging to a minimal pair)

Contrast          Coefficient        b     SE      t     df        p   R2m   R2c
/s/–/z/, non-mp   (Intercept)    10.95   3.04   3.60   6.00   0.0113  0.09  0.46
                  sound-z         6.28   4.30   1.46   6.00   0.1949
/t/–/d/, non-mp   (Intercept)    10.99   4.03   2.73  27.94    0.011   0.6  0.72
                  sound-d        55.51   8.43   6.59   7.03   0.0003
/s/–/z/, mp       (Intercept)     6.71   2.05   3.27  10.00   0.0084  0.38  0.81
                  sound-z        28.00   6.32   4.43  10.00   0.0013
/t/–/d/, mp       (Intercept)     9.76   3.56   2.74  10.00   0.0209   0.7  0.88
                  sound-d        66.24   5.84  11.34  10.00   <.0001
non-mp–mp, /z/    (Intercept)    17.23   3.05   5.65   6.00   0.0013  0.15  0.72
                  minpair-mp     17.49   7.38   2.37  13.58   0.0332
non-mp–mp, /d/    (Intercept)    66.49   8.16   8.15   6.00   0.0002  0.02  0.45
                  minpair-mp      9.51  10.98   0.87  14.10   0.4011
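As a reading aid for Table 2, the sketch below illustrates two relationships that hold for the reported numbers to within rounding: each t value is the coefficient divided by its standard error, and each slope matches the difference between the corresponding means in Table 1 (as expected with a dummy-coded predictor, although this need not hold exactly in a mixed model). The tolerance and all names are assumptions made for the example; the figures are copied from Tables 1 and 2.

```python
TOLERANCE = 0.02  # allowance for rounding in the reported two-decimal values

# (contrast, slope b, SE, reported t, Table 1 mean of reference level, mean of compared level)
TABLE2_SLOPES = [
    ("/s/–/z/, non-mp", 6.28, 4.30, 1.46, 10.95, 17.23),
    ("/t/–/d/, non-mp", 55.51, 8.43, 6.59, 10.99, 66.49),
    ("/s/–/z/, mp", 28.00, 6.32, 4.43, 6.71, 34.71),
    ("/t/–/d/, mp", 66.24, 5.84, 11.34, 9.76, 76.00),
    ("non-mp–mp, /z/", 17.49, 7.38, 2.37, 17.23, 34.71),
    ("non-mp–mp, /d/", 9.51, 10.98, 0.87, 66.49, 76.00),
]


def t_value(b: float, se: float) -> float:
    """t statistic of a coefficient: estimate divided by its standard error."""
    return b / se


def slope_from_means(reference_mean: float, compared_mean: float) -> float:
    """Difference between the compared level and the reference level."""
    return compared_mean - reference_mean


if __name__ == "__main__":
    for contrast, b, se, reported_t, ref_mean, cmp_mean in TABLE2_SLOPES:
        t_ok = abs(t_value(b, se) - reported_t) < TOLERANCE
        b_ok = abs(slope_from_means(ref_mean, cmp_mean) - b) < TOLERANCE
        print(f"{contrast:<16} t matches: {t_ok}  slope matches mean difference: {b_ok}")
```

The p values in the table additionally depend on the Satterthwaite degrees of freedom, which cannot be recomputed from the table alone.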

5.2. Before /p/

Before /p/ (where voicing neutralisation is expected in the direction of devoicing) the mean voicing proportion of the underlyingly voiced obstruents was always larger than that of their voiceless counterparts, similarly to the absolute word-final position (see Figure 1 and Table 1). This difference between the means turned out to be significant for the fricatives in both lexical groups (see Table 3); however, the size of the effect was relatively small: marginal R-squared was low in both groups (R2m values: 0.18 and 0.17), and conditional R-squared was smaller in the non-minimal pair group (R2c = 0.41) than in the minimal pair group (R2c = 0.53). This suggests that the size of the difference is smaller in the case of the non-minimal pair (szesz–mez). The mean voicing proportion measured in /z/ in the minimal pair group was greater than in the non-minimal pair group. In non-minimal pair mez, the voicing proportion ranged between 0% and 55.51% (mean: 25.07%, SD: 13.72%), while in minimal pair méz there were no values below 12.12%, and there were data points well over 55%; for example, fully voiced /z/ before /p/ also occurred in the data (mean: 36.57±22.14%). However, the difference between the two groups did not turn out to be significant based on the fitted mixed effects model (Table 3). These results suggest that the difference between /s/ and /z/ is significant within both groups, but in the minimal pair group (mész–méz) the magnitude of the difference is even larger, i.e., we can observe the effect of homophony avoidance before /p/ just as in the absolute word-final position. As far as the stops before /p/ are concerned, the voicing proportion difference of /t/ and /d/ in the non-minimal pair group (net–led) did not turn out to be significant (Table 3). However, in the minimal pair group (vét–véd), the difference was significant. Thus, in the environment before /p/, a potentially voicing-neutralising environment, the mean proportion of voicing of /d/ in véd was significantly greater than that of /t/ in vét (/t/: 21.00±23.83%, /d/: 40.24±21.20%). The magnitude of the fixed effect was relatively low (R2m = 0.16), but this factor clearly contributes to the observed differences, as do the random effects (R2c = 0.46). The voicing proportions measured in /d/ before /p/ in the two lexical groups were different: the mean was higher in the minimal pair group, just as in the case of /z/ (non-minimal pairs: 23.81±26.11%, minimal pairs: 40.24±21.20%); however, this difference was not significant (Table 3), just as it was not significant for pre-/p/ /z/.
Similarly to /z/ then, in the minimal pair group, /d/ contained more voicing before /p/ than in the non-minimal pair group, and this additional amount of voicing was enough to bring about a signiﬁcant diﬀerence within the minimal pair group. Thus, we can deﬁnitely observe the eﬀect of minimal pairhood in the case of the stops, too.

Table 3: Summary of the mixed effects models, outcome variable: voicing proportion, environment: before /p/ (“non-mp” = word not belonging to a minimal pair; “mp” = word belonging to a minimal pair)

Contrast          Coefficient   b        SE      t       df      p        R2m    R2c
/s/–/z/, non-mp   (Intercept)   15.13    2.92    5.18    9.21    0.0005   0.18   0.41
                  sound-z       9.94     2.59    3.84    42.00   0.0004
/t/–/d/, non-mp   (Intercept)   17.09    3.25    5.26    6.00    0.0019   0.03   0.75
                  sound-d       6.72     10.14   0.66    6.00    0.5319
/s/–/z/, mp       (Intercept)   20.42    3.08    6.63    10.00   0.0001   0.17   0.53
                  sound-z       16.15    4.40    3.67    10.00   0.0043
/t/–/d/, mp       (Intercept)   21.00    5.47    3.84    10.00   0.0033   0.16   0.46
                  sound-d       19.24    5.16    3.73    10.00   0.0039
non-mp–mp, /z/    (Intercept)   25.07    6.03    4.16    16.00   0.0007   0.08   0.50
                  minpair-mp    11.50    7.63    1.51    16.00   0.1513
non-mp–mp, /d/    (Intercept)   23.81    7.69    3.10    16.00   0.0069   0.11   0.63
                  minpair-mp    16.43    9.73    1.69    16.00   0.1106
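The coefficients in Table 3 can be related back to the group means quoted in the text. The following Python sketch is only an illustration and not the original R analysis. It assumes treatment coding, under which the intercept is the estimated mean of the reference level and the second coefficient is the difference from it; this assumption is consistent with the means reported above. The function names level_means and t_ratio are hypothetical.

```python
def level_means(intercept, slope):
    """Return (reference-level mean, comparison-level mean) under treatment coding."""
    return intercept, intercept + slope


def t_ratio(estimate, std_error):
    """t statistic of a coefficient: estimate divided by its standard error."""
    return estimate / std_error


# /t/–/d/ contrast, minimal pair group, before /p/ (rounded values from Table 3)
t_mean, d_mean = level_means(21.00, 19.24)
print(f"{t_mean:.2f} {d_mean:.2f}")       # 21.00 40.24
print(f"{t_ratio(19.24, 5.16):.2f}")      # 3.73
```

The recovered value for /d/ agrees with the 40.24% mean given for véd, and the ratio agrees with the t column of the table.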

5.3. Before /b/

Before /b/, a potentially voicing environment, underlyingly voiceless /s/ contained less voicing on average than its voiced counterpart in both lexical groups. Figure 1 shows how wide a range the values covered in the case of the non-minimal pair (szesz–mez): they fell between 9.98% and 100%, with a mean of 65.39±38.06%. /s/ contained even less voicing before voiced /b/ in the minimal pair group (mész–méz); here, the lower quarter even includes completely voiceless values (mean: 39.43±26.28%). Underlyingly voiced /z/, as expected, was largely voiced in this position (non-minimal pairs: 93.22±18.65%, minimal pairs: 94.71±14.33%). Based on this, it is not surprising that the mean voicing proportions of /s/ vs. /z/ were significantly different in both lexical groups (see Table 4). The effect size was larger in the minimal pair group based on the R-squared values. This indicates that contrast preservation in the minimal pair group is more likely than in the non-minimal pair group. If we compare /s/ in the non-minimal group (szesz) with /s/ in the minimal group (mész), we find that the latter was produced by the participants with less voicing on average before /b/, that is, minimal pairhood seems to decrease the proportion of voicing (non-minimal pairs: 65.39±38.06%, minimal pairs: 39.43±26.28%). Based on the fitted mixed effects models, the difference was close to being significant (p = 0.0575). We note that if the Wald approximation was used instead of the Satterthwaite approximation to calculate the degrees of freedom (using the parameters function of the parameters R package, see Makowski et al. 2021), the value of p was 0.045. Based on the R-squared values, we can say that in addition to the fixed effect (minimal pairhood) other factors also affect the variability of the voicing of /s/, but this variable also contributes to it;
on the other hand, the model’s total explanatory power is relatively large. Overall then, the voicing proportion in /s/ before /b/ did not only differ significantly within the lexical groups (it contained much less voicing than /z/) but also differed between the lexical groups, where minimal pair /s/ contained much less voicing (marginally so with the Satterthwaite approximation, p = 0.0575, and significantly with the Wald approximation, p = 0.045). That is, the minimal pair effect can be observed doubly.

The results were interesting for the pre-/b/ stops. While in the non-minimal pair group (net–led) both sounds were almost always produced 100% voiced, in the minimal pair group (vét–véd) /t/ contained much less voicing, as the boxplot in Figure 1 shows. In this group the values ranged between 9.93% and 100%, with a mean of 63.96±32.05%. The fitted models confirmed these observations (Table 4). The difference between /t/ and /d/ was not significant in the non-minimal pair group, but it was significant in the minimal pair group. The amount of voicing in /t/ before /b/ clearly differed in the two lexical groups (see Figure 2): the /t/ in vét (which forms a minimal pair with véd) contained much less voicing than the /t/ in net, which does not form a minimal pair with another word (means: net: 98.38±7.94%, vét: 63.96±32.05%). This difference turned out to be significant, too, with relatively large effect sizes. Overall then, we can say that there is a strong homophony-avoidance effect for both the fricatives and the stops before /b/; in essence, the effect is responsible for maintaining the underlying laryngeal contrast.

Table 4: Summary of the mixed effects models, outcome variable: voicing proportion, environment: before /b/ (“non-mp” = word not belonging to a minimal pair; “mp” = word belonging to a minimal pair)

Contrast          Coefficient   b        SE      t       df      p        R2m    R2c
/s/–/z/, non-mp   (Intercept)   65.39    7.23    9.04    11.63   <.0001   0.19   0.31
                  sound-z       27.83    7.79    3.57    42.00   0.0009
/t/–/d/, non-mp   (Intercept)   98.38    1.40    70.16   48.00   <.0001   0.00
                  sound-d       0.40     1.98    0.20    48.00   0.839
/s/–/z/, mp       (Intercept)   39.43    7.13    5.53    10.00   0.0003   0.64   0.83
                  sound-z       55.28    7.13    7.75    10.00   <.0001
/t/–/d/, mp       (Intercept)   63.96    8.44    7.58    10.00   <.0001   0.31   0.74
                  sound-d       31.65    7.69    4.12    10.00   0.0021
non-mp–mp, /s/    (Intercept)   65.39    10.03   6.52    16.00   <.0001   0.15   0.59
                  minpair-mp    −25.96   12.68   −2.05   16.00   0.0575
non-mp–mp, /t/    (Intercept)   98.38    8.66    11.36   16.00   <.0001   0.30   0.72
                  minpair-mp    −34.42   10.96   −3.14   16.00   0.0063
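The sensitivity of the p-value to the degrees-of-freedom approximation can be illustrated with the non-mp–mp contrast for /s/ in Table 4. The sketch below, in Python with only the standard library, compares a t distribution with 16 degrees of freedom (the value in the df column) with a standard normal reference, which is one common form of a Wald-type approximation. That the normal reference corresponds to what the parameters package computed is an assumption; the text reports p = 0.045 for the Wald approximation, and the sketch is not meant to reproduce that figure, only the direction of the change. The t statistic is recomputed from the rounded b and SE, so the last digits may differ slightly from the table. All function names are hypothetical.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p_t(t, df, steps=2000):
    """Two-sided p-value from Student's t, via Simpson integration of the density."""
    upper = abs(t)
    h = upper / steps                      # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_half = total * h / 3           # integral of the density from 0 to |t|
    return 1.0 - 2.0 * central_half


def two_sided_p_normal(z):
    """Two-sided p-value under a standard normal reference distribution."""
    return math.erfc(abs(z) / math.sqrt(2))


t_value = -25.96 / 12.68                   # minpair-mp row for /s/, Table 4
p_t16 = two_sided_p_t(t_value, 16)
p_norm = two_sided_p_normal(t_value)
print(f"t = {t_value:.2f}")
print(f"t distribution, df = 16: p = {p_t16:.4f}")
print(f"standard normal:         p = {p_norm:.4f}")
```

With 16 degrees of freedom the p-value stays just above 0.05, whereas the normal reference pushes it below 0.05, which is why this contrast is reported as only close to significant.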

5.4. Before sonorants

In this environment (which, as we said above, also contained the prevocalic position) voicing neutralisation was not expected, and this expectation was confirmed by the results (see Figure 1 and Table 1). /s/ and /z/ significantly differed
with respect to the voicing ratio within both lexical groups, and the effect size was also substantial (Table 5). It is interesting to note that the voicing of underlyingly voiced /z/ displayed relatively large variation in this phonetically optimal environment for voicing contrast maintenance: there were subjects who nearly always produced /z/ as almost voiceless here. On the whole, however, the voicing proportions of /s/ and /z/ were saliently different (non-minimal pair: 11.89±6.57% vs. 70.47±33.09%; minimal pair: 16.70±14.08% vs. 77.97±25.01%). No minimal pairhood effect was observed in this environment, i.e., minimal pair membership did not significantly affect the proportion of voicing: /s/ and /t/ were similarly voiceless in both groups, and /z/ and /d/ similarly voiced (see Figure 2 and Table 5).

Table 5: Summary of the mixed effects models, outcome variable: voicing proportion, environment: presonorant (“non-mp” = word not belonging to a minimal pair; “mp” = word belonging to a minimal pair)

Contrast          Coefficient   b        SE      t       df       p        R2m    R2c
/s/–/z/, non-mp   (Intercept)   11.89    3.63    3.28    11.42    0.007    0.61   0.63
                  sound-z       58.58    3.82    15.32   138.00   <.0001
/t/–/d/, non-mp   (Intercept)   18.23    2.49    7.32    10.13    <.0001   0.87   0.88
                  sound-d       79.01    2.40    32.92   138.00   <.0001
/s/–/z/, mp       (Intercept)   16.70    2.12    7.86    10.00    <.0001   0.70   0.76
                  sound-z       61.27    4.13    14.85   10.00    <.0001
/t/–/d/, mp       (Intercept)   22.18    4.44    4.99    10.00    0.0005   0.80   0.87
                  sound-d       71.24    4.04    17.64   10.00    <.0001
non-mp–mp, /s/    (Intercept)   11.89    2.21    5.39    16.00    0.0001   0.04   0.17
                  minpair-mp    4.80     2.79    1.72    16.00    0.1047
non-mp–mp, /t/    (Intercept)   18.23    5.07    3.59    16.00    0.0024   0.01   0.33
                  minpair-mp    3.95     6.42    0.62    16.00    0.5472
non-mp–mp, /z/    (Intercept)   70.47    6.43    10.95   6.00     <.0001   0.02   0.21
                  minpair-mp    7.49     7.74    0.97    11.22    0.3533
non-mp–mp, /d/    (Intercept)   97.24    2.93    33.19   16.00    <.0001   0.02   0.26
                  minpair-mp    −3.82    3.71    −1.03   16.00    0.3176
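One way to read the two R-squared columns together is as a partition of the variance in voicing proportion. The sketch below assumes the usual definitions, under which R2m is the share attributed to the fixed effect and R2c the share attributed to the fixed and random effects jointly; the difference R2c − R2m is then the share of the random effects, and 1 − R2c is residual. The helper name variance_shares is hypothetical, and the inputs are the rounded values from Table 5.

```python
def variance_shares(r2m, r2c):
    """Split total variance into fixed, random and residual shares.

    Assumes R2m = fixed / total and R2c = (fixed + random) / total.
    """
    if not 0.0 <= r2m <= r2c <= 1.0:
        raise ValueError("expected 0 <= R2m <= R2c <= 1")
    return {"fixed": r2m, "random": r2c - r2m, "residual": 1.0 - r2c}


# Two presonorant models from Table 5: (R2m, R2c)
rows = {
    "/s/-/z/, non-mp": (0.61, 0.63),
    "non-mp-mp, /s/": (0.04, 0.17),
}
for label, (r2m, r2c) in rows.items():
    shares = variance_shares(r2m, r2c)
    print(label, {name: round(value, 2) for name, value in shares.items()})
```

Read this way, the underlying voicing category accounts for most of the variance in the within-group model, whereas minimal pairhood accounts for very little in the between-group model, in line with the absence of a minimal pairhood effect before sonorants.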

5.5. Intervocalic position

Similarly to the presonorant position, no voicing neutralisation was expected in the word-internal intervocalic position, and this was confirmed by the results (see Figure 1). The voicing difference between /s/–/z/ and /t/–/d/ was significant with a large effect size in this environment (Table 6). Just like before sonorants, the voicing proportion in /z/ displayed relatively large variation, especially in the non-minimal pair group, but despite this, the average was around 80% and so the values showed a clear separation from those of /s/ (non-minimal pairs: 18.28±5.35% vs. 80.87±30.73%; minimal pairs: 17.27±12.96% vs. 88.25±21.02%).

Only /d/ displayed a minimal pairhood effect: it was significantly less voiced in the minimal pair group than in the non-minimal pair group. However, this result is not surprising considering that only 100% voiced /d/ tokens were found in the non-minimal pair group, and so even a slight deviation from this proportion can result in a significant difference. And indeed, despite the statistically significant difference, the effect sizes were relatively small (see the R-squared values in Table 6). The mean proportion of voicing of /d/ in the minimal pair group was also rather large (90.70±13.81%); therefore, the difference between the /d/’s in the two groups is in fact small.

Table 6: Summary of the mixed effects models, outcome variable: voicing proportion, environment: intervocalic (“non-mp” = word not belonging to a minimal pair; “mp” = word belonging to a minimal pair)

Contrast          Coefficient   b        SE      t       df      p        R2m    R2c
/s/–/z/, non-mp   (Intercept)   18.28    5.67    3.23    10.15   0.0089   0.68   0.75
                  sound-z       62.59    5.51    11.36   42.00   <.0001
/t/–/d/, non-mp   (Intercept)   20.64    2.58    8.01    15.86   <.0001   0.92   0.93
                  sound-d       79.36    3.28    24.23   42.00   <.0001
/s/–/z/, mp       (Intercept)   17.27    2.58    6.70    10.00   0.0001   0.81   0.88
                  sound-z       70.97    4.75    14.95   10.00   <.0001
/t/–/d/, mp       (Intercept)   16.62    3.03    5.49    10.00   0.0003   0.89   0.93
                  sound-d       74.08    3.34    22.21   10.00   <.0001
non-mp–mp, /s/    (Intercept)   18.28    2.76    6.63    16.00   <.0001   0.00   0.21
                  minpair-mp    −1.01    3.49    −0.29   16.00   0.7756
non-mp–mp, /t/    (Intercept)   20.64    3.93    5.25    16.00   0.0001   0.02   0.32
                  minpair-mp    −4.02    4.98    −0.81   16.00   0.4306
non-mp–mp, /z/    (Intercept)   80.87    9.11    8.88    6.00    0.0001   0.02   0.44
                  minpair-mp    7.38     10.42   0.71    9.72    0.4956
non-mp–mp, /d/    (Intercept)   100.00   2.94    34.01   16.00   <.0001   0.15   0.37
                  minpair-mp    −9.30    3.72    −2.50   16.00   0.0236
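The between-group comparisons are spread over Tables 3 to 6. The sketch below collects the minpair-mp rows and lists those whose p-value falls below 0.05. The threshold, the tuple layout and the variable names are assumptions made for illustration; the estimates and p-values are copied from the tables as reported there.

```python
ALPHA = 0.05  # conventional significance threshold, assumed here

# (environment, sound, b, p) for each minpair-mp coefficient in Tables 3-6
between_group = [
    ("before /p/", "/z/", 11.50, 0.1513),
    ("before /p/", "/d/", 16.43, 0.1106),
    ("before /b/", "/s/", -25.96, 0.0575),
    ("before /b/", "/t/", -34.42, 0.0063),
    ("presonorant", "/s/", 4.80, 0.1047),
    ("presonorant", "/t/", 3.95, 0.5472),
    ("presonorant", "/z/", 7.49, 0.3533),
    ("presonorant", "/d/", -3.82, 0.3176),
    ("intervocalic", "/s/", -1.01, 0.7756),
    ("intervocalic", "/t/", -4.02, 0.4306),
    ("intervocalic", "/z/", 7.38, 0.4956),
    ("intervocalic", "/d/", -9.30, 0.0236),
]

# Keep only the comparisons whose p-value is below the threshold
significant = [(env, sound) for env, sound, b, p in between_group if p < ALPHA]
for env, sound in significant:
    print(f"{sound} {env}")
```

With the values as tabulated, the list contains /t/ before /b/ and intervocalic /d/; /s/ before /b/ (p = 0.0575) falls just above the threshold. The within-group contrasts, on which much of the interpretation in the text rests, are not part of this tabulation.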

6. Discussion

This paper hypothesised that in potentially devoicing environments (in absolute word-final position and before voiceless obstruents), the underlyingly voiced obstruents will contain more voicing in the case of minimal pairs than in the case of non-minimal pairs, and consequently, minimal pairs are less likely to completely neutralise in speech production. Similarly, the underlyingly voiceless obstruents in minimal pairs were expected to be less voiced before voiced obstruents than those in non-minimal pairs, so that the former group is more likely to preserve the voicing contrast. This hypothesis was indeed supported by the production experiments: the amount of voicing in /s/–/z/ and /t/–/d/ was systematically different between the minimal pair and non-minimal pair group word-finally and in regressive voicing assimilatory contexts. Minimal pairhood clearly acted against voicing neutralisation in the following ways.

In utterance-final position, the fricatives in the non-minimal pairs did not differ in voicing, while in the minimal pairs they did. Initiating and maintaining voicing in fricatives requires active articulatory effort since simultaneous turbulent noise and vocal fold vibration are aerodynamically difficult (see, e.g., Stevens 1998). According to Myers (2012), if word-final devoicing appears in a language, it generally starts with fricatives in word-/utterance-final position and propagates over time to other obstruents and other domain-final environments (utterance-final > word-final > syllable-final position). Hungarian fricatives in non-minimal pairs might have taken the first step towards word-final obstruent devoicing, as both /s/ and /z/ were produced with little voicing. Minimal pairs, however, seem to defy this, as the voiced–voiceless categories were clearly kept apart in the production experiments. This suggests that lexical factors, such as homophony avoidance, can override the aerodynamically based phonetic effect of devoicing in final position.

While the utterance-final position triggers devoicing in phonetic terms, the position before voiceless obstruents – in this study across a word boundary before /p/ – triggers devoicing also in phonological terms and is expected to create homophony. The acoustic analysis showed that the voicing contrast between /s/–/z/ was attested in both lexical groups, but it was more pronounced in minimal pairs than in non-minimal pairs. As far as /t/ and /d/ are concerned, the difference between them was clearly maintained only in words forming a minimal pair. This indicates that homophony avoidance counteracted phonetic/aerodynamic effects (i.e., the fact that maintaining voicing is relatively difficult before another obstruent) and phonological rule application in this environment, too.
In the third potentially neutralising environment – across a word boundary before /b/ – both lexical and phonetic effects could be observed, and both counteracted complete neutralisation. Underlyingly voiceless /s/ in the non-minimal group did not become fully voiced even in this environment that favours phonetic voicing. This effect, which can again be explained by aerodynamic factors, was further enhanced by the lexical effect since in minimal pairs /s/ was even less voiced. The most salient homophony-avoidance effect was observed in the case of /t/–/d/: while in non-minimal pairs the difference was neutralised (both were voiced to a similar degree), the difference was upheld in the minimal pairs in spite of the fact that maintaining voicelessness before a voiced obstruent is relatively difficult phonetically.

The voicing-contrast maintenance in the minimal pair group in the three potentially neutralising environments is in accordance with the H&H theory (Lindblom 1990), according to which speakers take into account the purpose of speech production and the specifics of the communication situation (in this case the risk of ambiguity caused by the presence of minimal pairs), and adjust their
speech accordingly, which may lead to hyper-articulation. The speaker’s aim to maintain contrast of course does not necessarily mean that listeners will always perceive the intended differences (see, e.g., Costa and Mattingly’s 1981 description of an Eastern New England dialect of English whose speakers make a systematic distinction in vowel duration for the words cod and card despite the fact that they are unable to discriminate between the two tokens in perception experiments).

The fact that the experiments presented in this paper included one-syllable words may also have contributed to the observed differences. Kharlamov (2014) demonstrates that word reading, the presence of minimal pairs, and short, monosyllabic words are more likely to induce the partial contrast preservation of laryngeal features.

Naturally, the question arises whether or not the measured acoustic differences are mirrored in perception, and if they are, to what degree. The role of perception in phonological contrast and its neutralisation is well known. This paper has presented evidence that voicing differences in speech production/acoustics can remain in neutralising environments due to lexical reasons such as homophony avoidance; however, this does not necessarily mean that these acoustic differences will translate into perceptual – and consequently phonological – differences. There has been some perceptual research involving minimal pairs (e.g., Bárkányi and G. Kiss 2019; 2021), but future research must look into their systematic comparison with non-minimal pairs.
The ﬁndings in this paper provide further evidence for phonetically-based, functional phonological models according to which there is a direct link between phonetics (speech production/perception), phonology, and grammar, unlike in representational models which exclude such an active interface, and which therefore cannot adequately explain the inﬂuence of extra-grammatical factors such as homophony avoidance or the aerodynamics of voicing production on phonological processes (such as partial voicing neutralisation). Finally, these results highlight the importance of lexical factors in experimental design, too: choosing the type of lexical item can greatly inﬂuence the phonetic implementation of the sounds it contains. Ignoring such lexical factors can lead to misleading results.

Appendix

The test sentences used in the production experiments were the following (the test words are marked with bold):

Minimal pair group

1. Addigra a mész már régen elfogyott.
2. Gyógyításra a mész lehet a legjobb.
3. A mész elrablásával foglalkozott az egész sajtó.
4. A mész pénzértéke ezután csökkenni kezdett.
5. Sajnos a mész belefolyt a szemébe.
6. Sokféle felhasználásra alkalmas a mész.
7. Addigra a méz már régen elfogyott.
8. Gyógyításra a méz lehet a legjobb.
9. A méz elrablásával foglalkozott az egész sajtó.
10. A méz pénzértéke ezután csökkenni kezdett.
11. Sajnos a méz belefolyt a szemébe.
12. Sokféle felhasználásra alkalmas a méz.
13. Mindig vezekel az összes regényhős.
14. Egy inget veszek el a közös szekrényből.
15. Azóta vét minden idegen szabály ellen.
16. Gyakran vét legalább két intézkedés ellen.
17. Biztos nem vét ellenük ilyen módon.
18. Ez a műszer mindig vét párás időben.
19. A berendezés vét borús időben is.
20. A jól felszerelt készülék is vét.
21. Azóta véd minden idegen szabály ellen.
22. Gyakran véd legalább két intézkedés ellen.
23. Biztos nem véd ellenük ilyen módon.
24. Ez a műszer mindig véd párás időben.
25. A berendezés véd borús időben is.
26. A jól felszerelt készülék is véd.
27. Ezt ma vétek lenne elszalasztanunk.
28. Ma én védek a barátságos meccsen.

Non-minimal pair group

1. Azóta net már van a lakásban.
2. A net lehetett az oka a megakadásnak.
3. A net este mindig sokkal lassabb.
4. Egy net probléma lépett fel.
5. A net beállításokon múlik az egész.
6. A végére meg elromlott a net.
7. A netet tartják az évezred találmányának.
8. A vörös led már megint nem ég.
9. Egy kis led lámpát szereltek az oldalára.
10. A világító led erre nagyon jó.
11. Vészhelyzetben a led pirosan ég.
12. A led biztosítja a sötétben a világítást.
13. Semmi más nem világított csak a led.
14. A ledet kiszorítják a fénycsöves lámpák.
15. A bulin a szesz már éjfélre elfogyott.
16. Fertőtlenítésre a szesz lehet a legjobb.
17. Ilyenkor a szesz eredete a kérdés.
18. A szesz pirosra színezte a főzetet.
19. Sajnos a szesz belefolyt a szemébe.
20. Ipari felhasználásra is alkalmas a szesz.
21. A szeszes italok körében jól ismert.
22. Az új mez már ott várta a játékosokat.
23. Futás közben a mez lecsúszott a válláról.
24. A mez előtt hevert a stoplis cipő.
25. A futball mez párosával van csomagolva.
26. A foci mez belseje mikroszálas.
27. Szerencsét hozott a válogatott mez.
28. A játékos meze kétszer annyiért kelt el.

References

Baese-Berk, Melissa and Matthew Goldrick. 2009. Mechanisms of interaction in speech production. Language and Cognitive Processes 24(4): 527–54. https://doi.org/10.1080/01690960802299378
Bárkányi, Zsuzsanna and Zoltán G. Kiss. 2015. Why do sonorants not voice in Hungarian? And why do they voice in Slovak? In: Katalin É. Kiss, Balázs Surányi and Éva Dékány (eds.). Approaches to Hungarian 14: Papers from the 2013 Piliscsaba conference. Amsterdam & Philadelphia: John Benjamins. 65–94. https://doi.org/10.1075/atoh.14.03bar
Bárkányi, Zsuzsanna and Zoltán G. Kiss. 2019. A fonetikai korrelátumok szerepe a zöngekontraszt fenntartásában. Beszédprodukciós és észleléses eredmények [The role of phonetic correlates in voicing contrast. Results from speech production and perception]. Általános Nyelvészeti Tanulmányok 31: 57–102.
Bárkányi, Zsuzsanna and Zoltán G. Kiss. 2020. Neutralisation and contrast preservation: Voicing assimilation in Hungarian three-consonant clusters. Linguistic Variation 20: 56–83. https://doi.org/10.1075/lv.16010.bar
Bárkányi, Zsuzsanna and Zoltán G. Kiss. 2021. The perception of voicing contrast in assimilation contexts in minimal pairs: Evidence from Hungarian. Acta Linguistica Academica 68: 207–229. https://doi.org/10.1556/2062.2021.00473
Bartoń, Kamil. 2020. MuMIn: Multi-model inference. https://CRAN.R-project.org/package=MuMIn. R package version 1.43.17.
Bates, Douglas, Martin Maechler, Ben Bolker and Steve Walker. 2015. Fitting linear mixed-effects models using lme4. Journal of Statistical Software 67: 1–48. https://doi.org/10.18637/jss.v067.i01
Boersma, Paul and David Weenink. 2021. Praat: Doing phonetics by computer. Computer programme. Version 6.2.01.
Bolker, Ben and David Robinson. 2021. Broom.mixed: Tidying methods for mixed models. https://CRAN.R-project.org/package=broom.mixed. R package version 0.2.7.
Charles-Luce, Jan. 1993. The effects of semantic context on voicing neutralisation. Phonetica 50: 28–43. https://doi.org/10.1159/000261924
Costa, Paul and Ignatius Mattingly. 1981. Production and perception of phonetic contrast during phonetic change. Haskins Laboratories Status Report on Speech Research SR 67/68. 191–196. https://doi.org/10.1121/1.386167
Draxler, Christoph and Klaus Jänsch. 2004. SpeechRecorder – A universal platform independent multi-channel audio recording software. Proceedings of the 4th International Conference On Language Resources And Evaluation, Lisbon. 559–562.
G. Kiss, Zoltán. 2013. Measuring acoustic correlates of voicing in stops and fricatives. In: Péter Szigetvári (ed.). VLLXX: Papers presented to László Varga on his 70th birthday. Budapest: Department of English Linguistics, Eötvös Loránd University & Tinta Könyvkiadó/Tinta Publishing House. 289–311.
Ganong, William F. 1980. Phonetic categorization in auditory word perception. Journal of Experimental Psychology: Human Perception and Performance 6(1): 110–125. https://doi.org/10.1037/0096-1523.6.1.110
Gilliéron, Jules. 1910. Étude de géographie linguistique XII – mots en collision. Le coq et le chat. Revue de philologie française 4: 278–288.
Goldrick, Matthew, Charlotte Vaughn and Amanda Murphy. 2013. The effects of lexical neighbors on stop consonant articulation. The Journal of the Acoustical Society of America 134(2): 172–177. https://doi.org/10.1121/1.4812821
Gráczi, Tekla Etelka. 2010. A spiránsok zöngésségi oppozíciójának néhány jellemzője [Some properties of the voicing opposition of spirants]. Beszédkutatás 2010: 42–56.
Jansen, Wouter. 2004. Laryngeal contrast and phonetic voicing: A laboratory phonology approach to English, Hungarian, and Dutch. Doctoral dissertation. Rijksuniversiteit Groningen.
Jansen, Wouter. 2007. Phonological ‘voicing,’ phonetic voicing and assimilation in English. Language Sciences 29: 270–293. https://doi.org/10.1016/j.langsci.2006.12.021
Kharlamov, Viktor. 2014. Incomplete neutralisation of the voicing contrast in word-final obstruents in Russian: Phonological, lexical, and methodological influences. Journal of Phonetics 43: 47–56. https://doi.org/10.1016/j.wocn.2014.02.002
Kiss, Zoltán. 2007. The phonetics–phonology interface: Allophony, assimilation and phonotactics. Doctoral dissertation. Eötvös Loránd University (ELTE).
Kiss, Zoltán and Zsuzsanna Bárkányi. 2006. A phonetically-based approach to the phonology of [v] in Hungarian. Acta Linguistica Hungarica 53: 175–226. https://doi.org/10.1556/ALing.53.2006.2-3.4
Kitahara, Mafuyu, Keiichi Tajima and Kiyoko Yoneyama. 2019. The effect of lexical competition on realization of phonetic contrasts: A corpus study of the voicing contrast in Japanese. In: Sasha Calhoun, Paola Escudero, Marija Tabain and Paul Warren (eds.). Proceedings of the 19th International Congress of Phonetic Sciences, Melbourne, Australia 2019. Canberra: Australasian Speech Science and Technology Association. 2749–2752.
Kuznetsova, Alexandra, Per B. Brockhoff and Rune H. B. Christensen. 2017. lmerTest package: Tests in linear mixed effects models. Journal of Statistical Software 82: 1–26. https://doi.org/10.18637/jss.v082.i13
Lindblom, Björn. 1990. Explaining phonetic variation: A sketch of the H&H theory. In: William J. Hardcastle and Alain Marchal (eds.). Speech production and speech modelling. Dordrecht: Kluwer. 403–440. https://doi.org/10.1007/978-94-009-2037-8_16
Lisker, Leigh and Arthur Abramson. 1964. A cross-language study of voicing in initial stops: Acoustical measurements. Word 20: 384–422. https://doi.org/10.1080/00437956.1964.11659830
Lisker, Leigh and Arthur Abramson. 1967. Some effects of context on voice onset time in English stops. Language and Speech 10: 1–28. https://doi.org/10.1177/002383096701000101
Makowski, Dominique, Mattan S. Ben-Shachar, Indrajeet Patil and Daniel Lüdecke. 2021. Automated results reporting as a practical tool to improve reproducibility and methodological best practices adoption, v. 0.5.0. CRAN. https://github.com/easystats/report
Markó, Alexandra, Tekla Etelka Gráczi and Judit Bóna. 2010. The realization of voicing assimilation rules in Hungarian spontaneous and read speech: Case studies. Acta Linguistica Hungarica 57: 210–238. https://doi.org/10.1556/aling.57.2010.2-3.3
Martinet, André. 1952. Function, structure, and sound change. Word 8: 1–32. https://doi.org/10.1080/00437956.1952.11659416
Munteanu, Andrei. 2021. Homophony avoidance in the grammar: Russian nominal allomorphy. Phonology 38: 401–435. https://doi.org/10.1017/S0952675721000257
Myers, Scott. 2012. Final devoicing: Production and perception studies. In: Tony Borowsky, Shigeto Kawahara and Mariko Sugahara (eds.). Prosody matters: Essays in honor of Elisabeth Selkirk. London: Equinox Press. 148–180.
Ohala, John J. 1981. The listener as a source of sound change. In: Carrie S. Masek, Roberta A. Hendrik and Mary Frances Miller (eds.). Papers from the Parasession on Language and Behaviour conference (CLS 17). Chicago: Chicago Linguistics Society. 178–203.
Ohala, John J. 1983. The origin of sound patterns in vocal tract constraints. In: Peter F. MacNeilage (ed.). The production of speech. New York: Springer-Verlag. 189–216. https://doi.org/10.1007/978-1-4613-8202-7_9
Pedersen, Thomas Lin. 2020. Patchwork: The composer of plots. https://CRAN.R-project.org/package=patchwork. R package version 1.1.0.
R Development Core Team. 2020. R: A language and environment for statistical computing; 4.0.2. Vienna, Austria: R Foundation for Statistical Computing.
Sampson, Geoffrey. 2013. A Counterexample to homophony avoidance. Diachronica 30: 579–91. https://doi.org/10.1075/dia.30.4.05sam
Silverman, Daniel. 2012. Neutralisation. Cambridge, MA: Cambridge University Press.
Siptár, Péter and Miklós Törkenczy. 2000. The phonology of Hungarian. Oxford: Oxford University Press.
Wedel, Andrew, Abby Kaplan and Scott Jackson. 2013. High functional load inhibits phonological contrast loss: A corpus study. Cognition 128: 179–86. https://doi.org/10.1016/j.cognition.2013.03.002
Wickham, Hadley, Mara Averick, Jennifer Bryan, Winston Chang et al. 2019. Welcome to the tidyverse. Journal of Open Source Software 4(43): 1686. https://doi.org/10.21105/joss.01686
Yin, Sora Heng and James White. 2018. Neutralisation and homophony avoidance in phonological learning. Cognition 179: 89–101. https://doi.org/10.1016/j.cognition.2018.05.023

Zoltán G. Kiss Eötvös Loránd University, School of English and American Studies, English Linguistics Department ORCID 0000-0001-5224-3563 gkiss.zoltan@btk.elte.hu