Employing the OED on CD: Practical problems  
  
Date: 2025-01-18, 08:45 AM
Author : Ingo Plag
Book or Source : Morphological Productivity
Page and Part : P100-C5



Having established the versatility of the OED for productivity studies, we may now turn to the investigation itself. The relevant data have been extracted from the OED on CD with the help of the query language that comes with it, which enables the user to carry out complex search routines across the different sections of the dictionary (e.g. headwords, part of speech, etymology, definitions, quotation date, author of quotation, etc.).1 For the purposes of this study the software was programmed to search for the relevant strings of letters in combination with the first dates of attestation.2 The twentieth century (i.e. 1900 to 1985, where the OED coverage ends) was chosen as the relevant period because it is large enough to yield a sufficient number of neologisms if a rule is to some extent productive, and small enough to exclude major diachronic developments within that period. The resulting files of raw data had to be thoroughly scrutinized for forms that did not belong to the derivational category in question. The remaining words, which are given in appendix 1, formed the basis of the productivity counts and of the structural analysis. For the measure of productivity presented below, the affixes were not distinguished by the category of the base word. Overall, there is a clear dominance of denominal formations over deadjectival ones with practically all processes.

 

Let me illustrate the sampling procedure with the suffix -ize. The compact disc version of the OED was searched for all forms ending in the strings <ize> or <ise> with the first citation dating from 1900 to 1985. The resulting list of words contained a considerable number of forms that, by all accounts, did not feature the suffix -ize (such as compounds involving the second element -wise, or borrowings like decatise), and which needed to be excluded.
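The search just described can be approximated in a few lines of Python. This is only a sketch under stated assumptions: the `Entry` record, its field names, and the sample records are hypothetical, the years are invented placeholders rather than OED attestations, and the actual OED query language is not reproduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """Hypothetical stand-in for one dictionary record (not the OED format)."""
    headword: str
    first_cited: int  # year of first citation


def ize_candidates(entries, start=1900, end=1985):
    """Keep forms ending in <ize> or <ise> first cited within [start, end]."""
    return [
        e for e in entries
        if e.headword.endswith(("ize", "ise")) and start <= e.first_cited <= end
    ]


def drop_wise_compounds(entries):
    """Remove forms whose final element is -wise; other noise needs manual review."""
    return [e for e in entries if not e.headword.endswith("wise")]


# Placeholder records: the years are invented for illustration only.
sample = [
    Entry("repolarize", 1950),
    Entry("decatise", 1950),
    Entry("pricewise", 1950),
    Entry("modernize", 1750),
]

raw = ize_candidates(sample)
kept = drop_wise_compounds(raw)
print([e.headword for e in raw])   # ['repolarize', 'decatise', 'pricewise']
print([e.headword for e in kept])  # ['repolarize', 'decatise']
```

A string filter of this kind cannot recognize borrowings such as decatise, which is why the raw list still had to be scrutinized by hand.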

 

Furthermore, all forms were removed from the lists of raw data that were derived by the prefixation of already existing words, and all forms derived by parasynthesis. A form was classified as prefixed if the stem was an existing complex verb of the relevant category, and if the derivative was also semantically and phonologically transparent as a prefixed complex verb (e.g. repolarize). A form was classified as parasynthetic if the stem was attested earlier than the derivative, the derivative was semantically transparent and was not a prefixed verbal derivative. An example of such a parasynthetic form is decaffeinate.
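The two exclusion criteria can be read as a small decision procedure. The sketch below assumes that the analyst supplies each judgement by hand; the `Judgement` record, its field names, and the label `retain` are hypothetical, and the flag values for the two examples are assumptions chosen to match the classifications given in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Judgement:
    """Analyst's hand-coded answers for one form (hypothetical field names)."""
    form: str
    stem_is_existing_complex_verb: bool
    stem_attested_earlier: bool
    semantically_transparent: bool
    phonologically_transparent: bool


def classify(j):
    """Return 'prefixed', 'parasynthetic', or 'retain' following the stated criteria."""
    if (j.stem_is_existing_complex_verb
            and j.semantically_transparent
            and j.phonologically_transparent):
        return "prefixed"
    # Reaching this point means the form was not classified as a prefixed verb.
    if j.stem_attested_earlier and j.semantically_transparent:
        return "parasynthetic"
    return "retain"


# Flag values are assumptions chosen to match the classifications in the text.
repolarize = Judgement("repolarize", True, True, True, True)
decaffeinate = Judgement("decaffeinate", False, True, True, True)

print(classify(repolarize))    # prefixed
print(classify(decaffeinate))  # parasynthetic
```

Forms labelled `prefixed` or `parasynthetic` would be removed from the raw data; only those labelled `retain` stay in the list.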

 

Under the strict application of the criteria just mentioned, the determination of relevant forms proved fairly unproblematic with -ize, -ify, eN-, be- and -en, but the processes of -ate suffixation and conversion turned out to be more difficult.

 

As I will show below, the processes by which -ate verbs are derived are especially diverse and include many cases of back formation, conversion (with or without reassigning stress), and local analogies. In fact, suffixation is not even the most frequent of the processes which give rise to -ate verbs. In the count given below, the derivatives were counted by analyzing only the derivative (i.e. the output), disregarding its particular derivational history. Thus, if a form contained, for example, the morph -ate and a discernible stem, it was included in the count. The resulting figure is therefore much larger than the number of forms that were presumably coined by the suffixation of -ate to a given stem. This effect is much less significant with the other overt suffixes.

 

Equally problematic were the zero-affixed verbs, which were extracted from the OED by searching for all verbs with their first attestation between 1900 and 1985 that do not contain any of our verbal affixes. The resulting file of raw data contained more than one thousand verbs, of which 488 made it into the final list. I excluded all forms derived by the truncation of a suffix or other material (e.g. to bibliograph < bibliography, to bolsh < Bolshevik), by prefixation (e.g. defocus) or compounding (e.g. pistol-whip), and those that could be determined as loan words. All unclear cases were included.

 

A few remarks are in order concerning the accuracy of the search procedure. In principle, we can expect the software to find all neologisms of the pertinent kind that are listed in the OED, if the query language is used properly. However, the use of the OED for these purposes revealed that, due to occasional inconsistencies in the programming of the entries, this is not the case (see also Plag 1996). There are sometimes words that do not make it into the appropriate result file, in spite of the fact that they correctly match the search string.3 Their number seems to be negligible, though. In the case of -ize, for example, I have detected only one form, scenarioize, that should have been retrieved by the query software, but was not.4
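A cross-check of the kind described in note 4 amounts to a set difference between the original result file and the results of later, more specific searches. The following is a minimal sketch: the function name and the placeholder list items are hypothetical, and only the status of scenarioize is taken from the text.

```python
def missed_forms(original_hits, recheck_hits):
    """Forms found by the more specific searches but absent from the original file."""
    return sorted(set(recheck_hits) - set(original_hits))


# Placeholder lists; only the status of scenarioize is taken from the text.
original_hits = ["form_a", "form_b"]
recheck_hits = ["form_a", "form_b", "scenarioize"]

missed = missed_forms(original_hits, recheck_hits)
print(missed)  # ['scenarioize']
```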

 

Another problem emerging from the OED as a data base is the fact that some of the derivatives involving overt suffixes are only listed as participles, i.e. with the inflectional endings -ing and -ed, which are used as adjectives or nouns in the quotations given in the OED. Whatever the reason for this peculiarity may be, such forms were not included in the list of derived verbs. The reason for this decision was that this is a study of derived verbs, and not of deverbal nouns and adjectives, even if the latter might have a lot in common with their base verbs. However, for the present purpose of determining the productivity of the verb-forming affixes, the numbers of new participles are included in the frequency tables given below. This decision had to be taken in order to ensure that the dictionary-based figures are comparable to the text-based ones.

 

1 The possibility of combining different search criteria makes the OED a powerful electronic data base for all kinds of lexicological studies. See, for example, Jucker (1994) for an overview.

2 The dates of first attestation given in the OED are of course only the first attestations in written documents as they were detected by the lexicographers or other people studying the phenomena. Thus, a word may have been in usage before the first documentation, or, conversely, it may not have gained any currency after its first documentation. It is unlikely that these unavoidable problems cause any serious defects in our investigation since all processes under discussion should be affected to the same degree.

3 These bugs in the program are officially acknowledged by the OED staff, and will possibly be eliminated in the next edition.

4 For the investigation of the phonological properties of -ize derivatives I carried out all kinds of searches with more specified search strings and compared the results with my original files. Of all the neologisms I found, only scenarioize had escaped the original search.