المرجع الالكتروني للمعلوماتية

corpus, plural corpora (n.)  
  
Date: 2023-07-28
Author: David Crystal
Book or Source: A Dictionary of Linguistics and Phonetics
Page and Part: 117-3



A collection of LINGUISTIC DATA, either written texts or a TRANSCRIPTION of recorded speech, which can be used as a starting-point of linguistic description or as a means of verifying hypotheses about a LANGUAGE (corpus linguistics). Linguistic DESCRIPTIONS which are ‘corpus-restricted’ have been the subject of criticism, especially by GENERATIVE GRAMMARIANS, who point to the limitations of corpora (e.g. that they are samples of PERFORMANCE only, and that one still needs a means of PROJECTING beyond the corpus to the language as a whole). In fieldwork on a new language, or in HISTORICAL study, it may be very difficult to get beyond one’s corpus (i.e. it is a ‘closed’ as opposed to an ‘extendable’ corpus), but in languages where linguists have regular access to NATIVE-SPEAKERS (and may be native-speakers themselves) their approach will invariably be ‘corpus-based’, rather than corpus-restricted. Corpora provide the basis for one kind of COMPUTATIONAL LINGUISTICS. A computer corpus is a large body of machine-readable texts. Increasingly large corpora (especially of English) have been compiled since the 1980s, and are used both in the development of natural language processing software and in such applications as lexicography, speech recognition, and machine translation.
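
As a rough illustration of what "machine-readable" means in practice, the sketch below treats a corpus as a small set of texts and checks a simple hypothesis against it. The three toy texts and the helper names (`tokenize`, `word_frequencies`, `attested_after`) are assumptions made for this example and do not come from the dictionary entry. A real computer corpus would be far larger and would usually carry annotation.

```python
import re
from collections import Counter

# Hypothetical toy corpus: three short texts standing in for a large collection.
corpus = {
    "text_a": "The data are stored in a corpus. The corpus is large.",
    "text_b": "A corpus of speech is transcribed before analysis.",
    "text_c": "Linguists test a hypothesis against the corpus.",
}

def tokenize(text):
    """Lowercase the text and return its alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def word_frequencies(texts):
    """Count how often each word form occurs across all texts."""
    counts = Counter()
    for text in texts.values():
        counts.update(tokenize(text))
    return counts

def attested_after(texts, first, second):
    """Return True if `second` directly follows `first` somewhere in the corpus.

    Punctuation is discarded by tokenize, so sentence boundaries are ignored.
    """
    for text in texts.values():
        tokens = tokenize(text)
        if any(a == first and b == second for a, b in zip(tokens, tokens[1:])):
            return True
    return False

freqs = word_frequencies(corpus)
print(freqs["corpus"])                                  # frequency of one word form
print(attested_after(corpus, "a", "corpus"))            # sequence found in the sample
print(attested_after(corpus, "corpus", "linguistics"))  # sequence not found in the sample
```

The last call returns a negative result only because the sequence is missing from this particular sample. That is the limitation the entry attributes to corpus-restricted description: absence from a corpus says nothing definite about the language as a whole.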