Guidelines for the Transcription of Arabic Dialects (EARS)
Tim Buckwalter, Mohamed Maamouri
Arabic Treebank Project
LDC, University of Pennsylvania
Oct. 8, 2004
GUIDELINES FOR TRANSCRIBING LEVANTINE ARABIC:
MSA-BASED TRANSCRIPTION
(1) General spelling
The Guidelines can be summarized in the following general statement:
Adhere as closely as possible to unvocalized MSA spelling and word segmentation in all cases.
The following example illustrates the application of this principle: the Levantine utterance /?ultil:ak/
("I told you") will be transcribed as MSA قلت لك: unvocalized, spelled as two words,
and using accepted MSA orthography.
There are three notable exceptions to these rules:
-
If the word is listed as a high-frequency colloquial word
(see below, Appendix A: High-Frequency Dialectal Words)
then it should be spelled as indicated in the list and no attempt
should be made to render it in MSA.
-
If the word is a colloquial verb whose morphology
deviates substantially from that of its MSA equivalent,
then it should be written as indicated in the conjugation paradigms
of colloquial verbs
(see below, Perfect Verb Paradigms and Imperfect Verb Paradigms).
-
Nunation will be transcribed when it is recorded in speech
(see below, Short vowels, diacritics, and Nunation).
(2) MSA and LA phonological differences
The following regular phonological differences between MSA and LA
(and many other dialects as well) do not justify departing from MSA orthography when
transcribing the colloquial form:
- MSA interdental fricative /θ/ → LA dental stop /t/. Examples: مثل (MSA /miθla/, LA /mitl/), أكثر (MSA /?akθar/, LA /?aktar/)
- MSA interdental fricative /θ/ → LA sibilant /s/. Examples: مثلا (MSA /maθalan/, LA /masalan/)
- MSA uvular stop /q/ → LA glottal stop /?/ or velar stop /k/. Examples: قصة (MSA /qiSSa/, LA /?iSSa/ or /kiSSa/)
- MSA velar stop /k/ → LA palato-alveolar affricate /č/. Examples: كلب (MSA /kalb/, LA (in some village dialects) /čalb/)
- MSA interdental fricative /ð/ → LA dental stop /d/. Examples: تأخذي (MSA /ta?xuði/, LA تاخذي /ta:xudi/), خذ (MSA /xuð/, LA /xud/), هذا (MSA /ha:ða/, LA /ha:da/)
- MSA interdental fricative /ð/ → LA voiced alveolar fricative /z/. Examples: كذاب (MSA /kað:a:b/, LA /kaz:a:b/)
- MSA velarized voiced alveolar stop /D/ → LA velarized voiced alveolar fricative /Z/. Examples: مضبوط (MSA /maDbu:T/, LA /maZbu:T/), بالضبط (MSA /bi-D-DabT/, LA /bi-Z-ZabT/)
- MSA voiced palatal affricate /ğ/ → LA voiced palatal spirant /ž/. Examples: مجنون (MSA /mağnu:n/, LA /mažnu:n/)
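As a rough illustration, the correspondences listed in this section can be held in a small lookup table, so that a transcriber's tool can suggest which MSA consonant may underlie a consonant heard in LA. This is only a sketch: the names `MSA_TO_LA` and `msa_sources` are hypothetical, and the table covers only the pairs listed in this section. An LA consonant may of course also continue the identical MSA consonant (e.g., LA /t/ from MSA /t/).

```python
# (MSA sound, LA sound) pairs copied from the list in this section.
MSA_TO_LA = [
    ("θ", "t"), ("θ", "s"),
    ("q", "?"), ("q", "k"),
    ("k", "č"),
    ("ð", "d"), ("ð", "z"),
    ("D", "Z"),
    ("ğ", "ž"),
]

def msa_sources(la_sound: str) -> list[str]:
    """Return the MSA sounds that this section lists as sources of an LA sound."""
    return [msa for msa, la in MSA_TO_LA if la == la_sound]

# MSA sounds that may surface as LA /t/ according to the list.
print(msa_sources("t"))
```

A sound with no entry returns an empty list, which simply means that this section lists no regular MSA source for it.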
(3) Ta marbuta
The ta marbuta should always be written
with the two dots (ـة), whether it is pronounced /a/ or /t/.
(4) Short vowels, diacritics, and Nunation
Do not transcribe the short vowels /a/, /u/, /i/, or
the diacritics indicating zero-vowel ("sukun") and gemination ("shadda").
When Nunation ("tanwin") is recorded in speech it should be transcribed.
The most common examples all involve use of "fatHatan"
(e.g., /?ahlan wa-sahlan/ أهلاً وسهلاً), although
some uses of "kasratan" are recorded in educated or elevated speech
(e.g., /?ila Had:in ma:/ إلى حدٍ ما).
Note that the "fatHatan" may occur without an alif chair
(e.g., /xa:S:atan/ خاصةً).
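The rule in this section can be sketched as a simple character filter. The snippet below is a minimal illustration, not part of the Guidelines: the names `DROPPED_MARKS` and `strip_unwritten_marks` are hypothetical, and the code assumes the input uses the standard Unicode Arabic diacritics. It removes the short vowels, shadda, and sukun, and leaves the tanwin marks (fatHatan, Dammatan, kasratan) in place.

```python
# Marks that the Guidelines say should NOT be transcribed.
DROPPED_MARKS = {
    "\u064E",  # fatha  (short /a/)
    "\u064F",  # damma  (short /u/)
    "\u0650",  # kasra  (short /i/)
    "\u0651",  # shadda (gemination)
    "\u0652",  # sukun  (zero vowel)
}

def strip_unwritten_marks(text: str) -> str:
    """Drop short vowels, shadda and sukun; leave tanwin and letters untouched."""
    return "".join(ch for ch in text if ch not in DROPPED_MARKS)

# Fully vocalized /?ahlan/: alif-hamza, fatha, ha, sukun, lam, alif, fathatan.
vocalized = "\u0623\u064E\u0647\u0652\u0644\u0627\u064B"
print(strip_unwritten_marks(vocalized))  # the fathatan (U+064B) is kept
```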
(5) Glottal stop (hamza)
All glottal stops should be written when and where they occur.
For example:
/ğara:?id/ جرائد (contrast with: /ğara:yid/ جرايد)
/fa:?iz/ فائز (contrast with: /fa:yiz/ فايز)
/ra?s/ رأس (contrast with: /ra:s/ راس)
/bi?r/ بئر (contrast with: /bi:r/ بير)
(6) Verbal prefixes and suffixes
Colloquial verbs follow MSA paradigms, but
with modifications to the underlying tri-consonantal radicals and
with the addition of purely colloquial clitics.
The
colloquial Perfect Verb follows MSA suffixation
orthography except for the following:
- the
MSA 2nd pers. fem. sg. marker /-ti/ is written ـتي
in colloquial Arabic (note: this parallels the colloquial
orthography of the pronoun /?inti/ إنتي).
Example: شو عملتي يا ماما؟
- the
MSA 2nd pers. masc. pl. subject marker /-tum/ is shortened to /-tu/
in colloquial and written ـتوا (note:
this parallels the orthography of the Imperfect بيكتبوا).
Example: وين كنتوا إمبارح يا شباب؟
- the
MSA 2nd pers. masc. pl. direct object marker /-kum/ is shortened to /-ku/
in colloquial and written ـكوا (note:
this parallels the orthography of the Pronoun إنتوا as well as that
of the Perfect كتبتوا and Imperfect بيكتبوا).
Example: ما شفناكوا من زمان
- the
colloquial negative particle /-š, -ši/ ـش
is appended to the perfect verb subject or object marker
(note: the concatenation sequence is as follows: perfect
verb stem + subject marker + object marker + negative
particle). Examples: /ma ruHtiš/ ما رحتش,
/ma šufthumš/ ما شفتهمش.
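The concatenation sequence for the Perfect Verb (stem + subject marker + object marker + negative particle) can be sketched as plain string concatenation. This is only an illustration: the function name `build_perfect` and its parameters are hypothetical, and the sketch does not handle the spelling adjustments visible in the paradigms (for example كتبوا versus negated كتبوش).

```python
NEGATIVE_SUFFIX = "ش"

def build_perfect(stem: str, subject: str = "", obj: str = "",
                  negated: bool = False) -> str:
    """Concatenate stem + subject marker + object marker (+ negative particle)."""
    form = stem + subject + obj
    if negated:
        form += NEGATIVE_SUFFIX
    return form

# The two examples from the text; the particle ما is written as a separate word.
print("ما " + build_perfect("رح", "ت", negated=True))
print("ما " + build_perfect("شف", "ت", "هم", negated=True))
```

Keeping ما outside the function reflects the fact that it is written as a separate word, while ـش is attached to the verb.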
The
colloquial Perfect Verb follows MSA stem
orthography except for the following:
- MSA
doubled verbs are treated as finally-weak verbs in
colloquial. Examples: /HaT:e:t/ حطيت,
/Zal:e:t/ ظليت
- MSA
verbs with hamza as third radical are treated as finally-weak
verbs in colloquial. Example: /?are:t/ قريت
(MSA /qara?tu/ قرأت) Note: this rule
is identical to the earlier-cited rule of MSA glottal
stop → /y/ or vocalic length in various environments
- Hollow
MSA verbs with hamza as third radical are resyllabified
in LA in the conjugation of the perfect. Example: /?iğa/ إجى
(MSA /ğa:?a/ جاء), /?iğu:/ إجوا
(MSA /ğa:?u:/ جاؤوا)
PERFECT VERB PARADIGMS
| "come" | "read" | "see" | "write" | Subject |
| جاش | قراش | شافش | كتبش | هو ما |
| إجى | قرى | شاف | كتب | هو |
| جوش | قروش | شافوش | كتبوش | هم ما |
| إجوا | قروا | شافوا | كتبوا | هم |
| جتش | قرتش | شافتش | كتبتش | هي ما |
| إجت | قرت | شافت | كتبت | هي |
| جيتش | قريتش | شفتش | كتبتش | إنت ما |
| جيت | قريت | شفت | كتبت | إنت |
| جيتيش | قريتيش | شفتيش | كتبتيش | إنتي ما |
| جيتي | قريتي | شفتي | كتبتي | إنتي |
| جيتوش | قريتوش | شفتوش | كتبتوش | إنتوا ما |
| جيتوا | قريتوا | شفتوا | كتبتوا | إنتوا |
| جيتش | قريتش | شفتش | كتبتش | أنا ما |
| جيت | قريت | شفت | كتبت | أنا |
| جيناش | قريناش | شفناش | كتبناش | إحنا ما |
| جينا | قرينا | شفنا | كتبنا | إحنا |
The
colloquial Imperfect Verb follows MSA
orthography except for the following:
- the
colloquial verb particle /b-/ بـ
is prefixed to the imperfect verb subject marker prefixes y-, t-, and n- (note:
the concatenation sequence is as follows: colloquial verb
particle /b-/ + MSA subject marker /y-,t-,n-/
+ imperfect verb stem). Examples: بيعرف,
بتعرف, بنعرف
- the
colloquial verb particle /b-/ بـ
is prefixed directly to the imperfect verb stem when the subject is 1st person singular.
(Note: The alif of the 1st pers. sg. form is only a chair for the glottal stop,
and because the glottal stop is elided in Levantine Arabic, the alif chair is not written.
The concatenation sequence is as follows: colloquial verb
particle /b-/ + imperfect verb stem). Examples: أنا بعرف,
أنا بحب, أنا بروح
- the
colloquial verb particle /m-/ مـ
is prefixed to the imperfect verb 1st pers. pl. subject
marker prefix (note: the concatenation sequence is as
follows: colloquial verb particle /m-/ + MSA
subject marker /n-/ + imperfect verb stem).
Examples: منعرف, منشوف,
منعمل. (Note: in some varieties of Levantine Arabic both prefixation forms are used;
for example: إحنا بنشوف and إحنا منشوف)
- the
colloquial negative particle /-š, -ši/ ـش
is appended to the imperfect verb subject or object
marker (note: the concatenation sequence is as follows:
imperfect verb stem + subject number marker + object
marker + negative particle). Examples: ما
بيعرفش, ما بيعرفهاش
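The prefixation rules for the Imperfect Verb can be sketched in the same way. The snippet below is an illustration under stated assumptions: the name `build_imperfect` and its parameters are hypothetical, the empty string stands for the 1st person singular (where /b-/ attaches directly to the stem), and subject number suffixes are left out for brevity.

```python
SUBJECT_PREFIXES = {"ي", "ت", "ن", ""}  # "" stands for 1st pers. sg.
NEGATIVE_SUFFIX = "ش"

def build_imperfect(stem: str, subject_prefix: str = "", particle: str = "ب",
                    obj: str = "", negated: bool = False) -> str:
    """Concatenate particle + subject marker + stem (+ object marker, + negative particle)."""
    if subject_prefix not in SUBJECT_PREFIXES:
        raise ValueError("unknown subject prefix")
    if particle == "م" and subject_prefix != "ن":
        # Per the text, /m-/ occurs only before the 1st pers. pl. marker /n-/.
        raise ValueError("particle /m-/ is used only with /n-/")
    form = particle + subject_prefix + stem + obj
    if negated:
        form += NEGATIVE_SUFFIX
    return form

print(build_imperfect("عرف", "ي"))                          # 3rd pers. masc. sg.
print(build_imperfect("عرف"))                               # 1st pers. sg.
print(build_imperfect("عرف", "ن", particle="م"))            # 1st pers. pl. with /m-/
print("ما " + build_imperfect("عرف", "ي", obj="ها", negated=True))
```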
IMPERFECT VERB PARADIGMS
| "come" | "read" | "see" | "write" | Subject |
| بيجيش | بيقراش | بيشوفش | بيكتبش | هو ما |
| بيجي | بيقرى | بيشوف | بيكتب | هو |
| بيجوش | بيقروش | بيشوفوش | بيكتبوش | هم ما |
| بيجوا | بيقروا | بيشوفوا | بيكتبوا | هم |
| بتيجيش | بتقراش | بتشوفش | بتكتبش | هي ما |
| بتيجي | بتقرى | بتشوف | بتكتب | هي |
| بتيجيش | بتقراش | بتشوفش | بتكتبش | إنت ما |
| بتيجي | بتقرى | بتشوف | بتكتب | إنت |
| بتيجيش | بتقريش | بتشوفيش | بتكتبيش | إنتي ما |
| بتيجي | بتقري | بتشوفي | بتكتبي | إنتي |
| بتيجوش | بتقروش | بتشوفوش | بتكتبوش | إنتوا ما |
| بتيجوا | بتقروا | بتشوفوا | بتكتبوا | إنتوا |
| بجيش | بقراش | بشوفش | بكتبش | أنا ما |
| بجي | بقرى | بشوف | بكتب | أنا |
| منيجيش | منقراش | منشوفش | منكتبش | إحنا ما |
| منيجي | منقرى | منشوف | منكتب | إحنا |
(7) Pronominal suffixes and verbal objects (prepositional phrases and direct objects)
MSA orthography should be followed in all cases.
Note, especially, the following examples:
-
The 3rd pers. masc. sg. possessive pronoun should be written ـه,
and not ـو. For example, /kita:bo/ should be written
كتابه, and not كتابو.
-
Verbal prepositional phrase objects should be written as in MSA
(i.e., not cliticized). For example, /?ultil:ak/ should be written
قلت لك, and not قلتللك or قلتلك.
There are three notable exceptions to the preceding MSA-based rule:
-
Although the 3rd pers. masc. sg. direct object clitic should be written
as in MSA (e.g., أنا شفته and أنا بسمعه), when the negative
particle ـش is appended, the concatenation is pronounced either
/-u:/ or /-hu:/ and should be written accordingly (e.g.,
أنا ما شفتوش or أنا ما شفتهوش, and أنا ما بسمعوش or أنا ما بسمعهوش).
Note this variation in the frequent prepositional phrases مالوش and مافيهوش.
-
The 2nd pers. fem. sg. direct object clitic /-ki/ should be written ـكي.
Example: /šufna:ki/ شفناكي.
This applies as well to the pronoun suffix /-ki/:
/ma`a:ki ?inti/ معاكي إنتي "with you".
-
The 2nd pers. pl. direct object clitic /-ku/ should be written ـكوا.
Example: /šufna:ku/ شفناكوا.
This applies as well to the pronoun suffix /-ku/:
/Ha:ritku ?intu/ حارتكوا إنتوا "your neighborhood".
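The first exception in this list (the 3rd pers. masc. sg. object clitic before the negative particle) can be sketched as a small rewrite step. The function name `negate_3ms_object` and its `pronounced_with_h` flag are hypothetical; the sketch assumes the input verb already carries the MSA-style clitic ـه and that the transcriber has decided which of the two pronunciations was heard.

```python
def negate_3ms_object(verb: str, pronounced_with_h: bool) -> str:
    """Rewrite a verb ending in the clitic ـه for use with the negative particle ـش.

    The ending is written ـهوش when /-hu:/ is heard, and ـوش when /-u:/ is heard.
    """
    if not verb.endswith("ه"):
        raise ValueError("expected a verb ending in the clitic ـه")
    base = verb[:-1]
    ending = "هوش" if pronounced_with_h else "وش"
    return base + ending

print(negate_3ms_object("شفته", pronounced_with_h=False))
print(negate_3ms_object("شفته", pronounced_with_h=True))
```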
(8) Pronominal suffixes and active participles
Unlike MSA, LA active participles may take pronominal suffixes.
Note the following examples:
- /muš šayfi:nak/ مش شايفينك
- /?ana `arfo min zama:n/ انا عارفه من زمان
(9) Numerals
Except for the numerals 11-19 (see below), follow MSA orthography (or parallel MSA forms) in all cases.
Note, especially, the following examples:
- /?itne:n dunam/ إثنين دنم
- /?is-sa:`a tinte:n/ الساعة اثنتين
- /tala:t-irTa:l/ ثلاثة ارطال
- /xamst-iy:a:m/ خمسة ايام
- /mi:t di:na:r/ مية دينار
- /mi:te:n do:la:r/ ميتين دولار
- /tala:t mi:t li:ra/ ثلاث مية ليرة
- /xamst-a:la:f riya:l/ خمسة آلاف ريال
The Levantine Arabic numerals 11-19 show considerable deviation
from MSA pronunciation and word segmentation. The orthography of these
colloquial words varies considerably, as observed in informal writing
on the Web. The following recommended orthography is based on a
survey of Google frequency counts of all known orthographic variants.
Note that the word-final /r/ in all these forms is to be transcribed
even in cases where it is not pronounced.
11 إحدعشر
12 إثنعشر
13 ثلاثطعشر
14 أربعطعشر
15 خمسطعشر
16 سطعشر
17 سبعطعشر
18 ثمانطعشر
19 تسعطعشر
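Because the recommended spellings for 11-19 are a closed list, a transcription tool can simply look them up. The sketch below is illustrative only; the names `TEENS` and `teen_spelling` are hypothetical, and the spellings are copied from the list in this section.

```python
# Recommended orthography for the Levantine numerals 11-19 (from this section).
TEENS = {
    11: "إحدعشر",
    12: "إثنعشر",
    13: "ثلاثطعشر",
    14: "أربعطعشر",
    15: "خمسطعشر",
    16: "سطعشر",
    17: "سبعطعشر",
    18: "ثمانطعشر",
    19: "تسعطعشر",
}

def teen_spelling(n: int) -> str:
    """Return the recommended spelling for a numeral between 11 and 19."""
    if n not in TEENS:
        raise ValueError("only the numerals 11-19 have a dialectal spelling here")
    return TEENS[n]

print(teen_spelling(16))
```

Every entry ends in ـعشر, which matches the instruction to transcribe the word-final /r/ even when it is not pronounced.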
(10) Days of the week
Sunday يوم الأحد
Monday يوم الإثنين
Tuesday يوم الثلاثة
Wednesday يوم الأربعة
Thursday يوم الخميس
Friday يوم الجمعة
Saturday يوم السبت
(11) Months of the year
Eastern:
January شهر كانون الثاني / شهر واحد / شهر يناير
February شهر شباط / شهر إثنين / شهر فبراير
March شهر آذار / شهر ثلاثة / شهر مارس
April شهر نيسان / شهر أربعة / شهر إبريل
May شهر أيار / شهر خمسة / شهر مايو
June شهر حزيران / شهر ستة / شهر يونيو
July شهر تموز / شهر سبعة / شهر يوليو
August شهر آب / شهر ثمانية / شهر أغسطس
September شهر أيلول / شهر تسعة / شهر سبتمبر
October شهر تشرين الأول / شهر عشرة / شهر أكتوبر
November شهر تشرين الثاني / شهر إحدعشر / شهر نوفمبر
December شهر كانون الأول / شهر إثنعشر / شهر ديسمبر
Islamic:
Muharram محرم
Safar صفر
Rabie I ربيع الأول
Rabie II ربيع الثاني
Jumada I جمادى الأول
Jumada II جمادى الثاني
Rajab رجب
Shaaban شعبان
Ramadan رمضان
Shawwal شوال
Dhul Qada ذو القعدة
Dhul Hijja ذو الحجة
(12) Foreign words and placenames
Most foreign words and placenames already
have established MSA spellings (e.g., Washington واشنطن,
Los Angeles لوس انجلوس). In cases where
the MSA spelling has regional variants, follow the Levantine
spelling. Note the following examples:
- "garage"
(Levantine: كراج -- contrast with Egyptian: جراج)
- "congress"
(Levantine: كونغرس -- contrast with Egyptian: كونجرس)
Words that are not attested in MSA should be transcribed as expected in
MSA, but according to Levantine orthography. Note that although
many computers are able to display "extended" Arabic
characters, such as the Persian letters /p/ پ,
/č/ چ, /ž/ ژ, and /g/ گ,
few systems provide the user with an easy way to actually type
these characters on the keyboard. So, although these letters are
potentially available for representing foreign sounds, the
convention in MSA orthography is to substitute the corresponding
and easily-available Arabic letters instead.
Therefore, according to Levantine practice,
- for
/p/ use ب (e.g., "Pam" بام),
- for
/č/ use تش (e.g., "Chet" تشيت),
- for
/ž/ use ج (e.g., "Mirage" ميراج),
- for
/g/ use غ (e.g., "Gilbert" غيلبرت).
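If a source text already contains the extended letters, the four substitutions in this section can be applied mechanically. The following is a minimal sketch; the names `EXTENDED_TO_ARABIC` and `replace_extended_letters` are hypothetical, and the mapping contains only the four letters named in this section.

```python
# Extended (Persian) letters mapped to the Levantine-practice substitutes.
EXTENDED_TO_ARABIC = str.maketrans({
    "\u067E": "\u0628",        # peh   /p/ -> beh
    "\u0686": "\u062A\u0634",  # tcheh /č/ -> teh + sheen
    "\u0698": "\u062C",        # jeh   /ž/ -> jeem
    "\u06AF": "\u063A",        # gaf   /g/ -> ghain
})

def replace_extended_letters(word: str) -> str:
    """Replace extended letters with the easily typed Arabic substitutes."""
    return word.translate(EXTENDED_TO_ARABIC)

# "Pam" written with peh becomes the spelling given in this section.
print(replace_extended_letters("\u067E\u0627\u0645"))
```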
APPENDIX A: HIGH-FREQUENCY DIALECTAL WORDS
In order to maintain lexical consistency
the transcriber should regularly consult the following list of
high-frequency dialectal words and their prescribed orthographic forms.
أتاري /?ata:ri/ "it turned out to me" (also pronounced /?aθa:ri/--perhaps
as a "classicisation"--among non-urban speakers)
إحنا /?iHna/ "we"; cf. نحنا /niHna/
إشي /?iši:/ "something"; بدك إشي؟ "you want something?";
الإشي الجديد هذا "this new thing" (cf. شي /ši:/)
إلـ /?il-/ used primarily in possessive constructions; it attaches directly to a following clitic
(e.g. أنا إلي /?ana ?ili:/ "I have", كم أخ إلك /kam ?ax ?ilak/ "How many brothers do you have?").
Note also uses with preposition: لإله /la?ilo/ "for him", لإلها /la?ilha/ "for her"
اللي /?il:i:/ "who; which" (with prepositions: باللي /bil:i:/, للي /lil:i:/)
أكمن /?akam:an, ?akam:in/ "how many"; هالأكمن /ha:l?akam:an, ha:l?akam:in/ "this many"
ألو /?alu:, ?alo:/ "hello" (in telephone conversations only)
إمبارح /?imba:riH/ "yesterday"
إمبيرح /?imbe:riH/ "yesterday"
إمراة /?imra:/ "woman; wife" (إمراتك /?imra:tak/ "your wife") see also مرة /mara/
إنتي /?inti:/ "you" (fem.sg.)
إنتوا /?intu:/ "you" (masc.pl.)
أنو /?anu:/ "which" من أنو جامعة؟ "from which university?"
أني /?ani:/ (1) "I" أني معاك "I'm with you"
(2) "which" (fem. variant of أنو, see above) حضرتك من أني عائلة؟ "which family do you come from?"
أوضة /?o:Da/ "room"
أوكي /?o:ke:/ "O.K."
إيد /?i:d/ "hand"
إيش /?e:š/ "what"; cf. ليش /le:š/ "why"
إيمتى /?e:mta/ "when"; also إيمتىً /?e:mtan/
إيه /?e:h, ?e:/ "what?"; cf. ليه /le:h, le:/ "why?"
أيواً /?ay:uwan/ "yes"
أيوه /?ay:uwah/, also /?ay:uwa/ "yes"
بالله /bal:a/ "I swear; really?" (cf. والله)
بد /bidd-/ "want" (بدي /biddi/ "I want", بده
/biddo/ "he wants", بدها /bidda:, biddha/ "she wants")
برا /bar:a/ "outside" (cf. جوا /juw:a/ "inside")
برضو /barDo/ "also"
بركي /barki/ "maybe" (variant of بلكي)
بس /bas:/ "only; just"
بعديـ /ba`di:-/ "after" بعديكوا (cf. قبليـ)
بعدين /ba`de:n/ "afterwards"
بكرة /bukra/ "tomorrow"
بلكي /balki/ "maybe" (variant of بركي)
بيـ /bi:-/ "with" (with following clitic: بيك /bi:k/ "to/with you (masc.sg.)",
بيكي /bi:ki:/ "to/with you (fem.sg.)"):
أهلاً بيكوا /?ahlan bi:ku/ "welcome!"
بيناتـ /be:na:t-/ "among; between" (with following clitic: بيناتهم /be:na:thum/ "among them")
تبع /taba`/ possessive function word, usually with following clitic:
الملك تبعكم "your king"; تبع مين هاي السيارة؟ "whose car is this?"
تع /ta`/ "come!" fem. تعي, pl. تعوا (short forms of تعال, تعالي, and تعالوا)
تم /tum:/ "mouth": /sak:ir tum:ak/ "shut up!"
جاج /ža:ž/ "chicken"
جوا /juw:a/ "inside" (cf. برا /bar:a/ "outside")
حد /Hadd/ (in the phrase ما حدش /ma: Had:iš/ "nobody")
حدا /Hada/ (esp. in neg phrases such as ما حدا بيعرف /ma: Hada biya`ref/ "nobody knows")
دغري /dughri/ "straight; forward"
زاكي /za:ki:/ "delicious"
زلمة /zalamah, zalameh, zalami/ "guy; buddy; dude": زلمتنا /zilmitna/ "our friend; our colleague"
زي /zey:/ "like;as"
ساعيات (used with هالـ) /has:a:`iya:t/ هالساعيات "nowadays"
ش /-š/ neg. part. used with verbs and function words: ما عنديش, ما فيش
شو /šu:/ "what"
شوي /šway:/ "a little bit" (fem. شوية /šway:a, šway:it/)
شي /ši:/ "something"; cf. إشي /?iši:/
طب /Tab/ "okay"
عـ /`a/ a shortened form of على; it attaches directly to a following particle (e.g. عكل حال /`a-kull
Ha:l/ "in any case"), or to a noun stem (e.g. علبنان /`a-lubna:n/ "to Lebanon") or to the definite article
and following noun (e.g. عالسفارة /`as-safa:ra/ "to the embassy")
عبين /`abe:n/ "until" (typically with ما):
استنى شوية عبين ما حدا يحكي معاك "wait a second till someone (comes to the phone) to speak with you";
استنيته عبين ما رجع "I waited for him till he returned"
عشان /`a:a:n/ "because; 'cause"
علشان /`alša:n/ "because; 'cause" (note: /`alaša:n/ should be transcribed as على شان)
عم /`am/ (imperfect verb particle: شو عم تحكي؟ "what are you saying?")
عمالـ /`am:a:l-/: (عماله /`am:a:lo/, عمالنا /`am:a:lna/) modal to express habitual action:
عمالنا نستنى فيه "we've been waiting a while for him"
فا /fa:/ "so, therefore" (written frequently as a separate word:
فا اللي يقدر يروح "so, whoever is able to go")
فيه /fi:, fi:h/ "there is/are" (note: transcribe /fiy:o/ as فيو; cf. هيو)
فيش /fi:š/ "there isn't/aren't/ain't"; also فيشي /fi:ši:/
فين /fe:n/ "where" (cf. وين)
قبليـ /?abli:-/ "before" قبليكوا (cf. بعديـ)
قديش /?ad:e:š/ "how much"
قديه /?ad:e:h, ?ad:e:/ "how much"
كلياتـ /kul:iy:a:t-/ "all of" (with following clitic: كلياتهم /kul:iy:a:thum/ "all of them")
كمان /kama:n/ "also"
كويس /kway:is/ "good"
لأ /la?/ "no; nope" (follow the hamza transcription rule here: if you hear it, transcribe it)
لسة /lis:a/ "still, yet" (with following clitic لساتـ /lis:a:t/:
لساتك هون؟ /lis:a:tak ho:n/ "are you still here?" لساتني طفل صغير /lis:a:tni Tifl ?izgi:r/ "I'm still a small child")
له /lah/ "no; no way"
ليش /le:š/ "why"; cf. إيش /?e:š/ "what"
ماي /may/ "water"
ماية /may:a/ "water"
مبلى /mbala/ "yes, certainly"
مرة /mara/ "woman; wife" (مرتك /martak/ "your wife") see also إمراة /?imra:/
مش /miš, muš/ "not"
مشان /miša:n/ "in order to; for the sake of"
مظلش /maZal:iš/ (contraction of ما ظلش) "not to remain": مظلش ولا واحد "nobody remained; not a single person remained"
معا /ma`a:/ "with" (with following clitic:
معاك /ma`a:k/ "with you (sg.)", معاكوا /ma`a:ku:/ "with you (pl.)", معاي /ma`a:y(a)/ "with me")
معليش /ma`le:š/ "never mind, it's okay"
معليه /ma`le:h, ma`le:/ "never mind, it's okay"
معناة /ma`na:t-/ "meaning" (with following clitic:
معناته /ma`na:to/ "its meaning")
مليح /mli:H/ "good"
منو /manu:, minu:/ "who" (in both direct and indirect questions):
منو هي؟ /minu: hiy:e/ "who is she?"
منو معي؟ /minu: ma`i/ lit. "who is with me?" (on the phone line), i.e., "who am I speaking with?"
مني /mini:/ "who" (rare fem. variant of منو /manu:, minu:/; see above)
منيح /mni:H/ "good"
مية /miy:e/ "one hundred" (/mi:t/ in const. case)
مين /mi:n/ "who"
نحنا /niHna/ "we"; cf. إحنا /?iHna/
نص /nuSS/ "half"
نيالـ /niya:l-/ "how good for" (نيالك /niya:lak, niya:lik/, نياله /niya:lo/, نيالكوا /niya:lku/)
هـ /ha/ (with following definite article: هالـ /ha:l/) "this"
(هالكتاب /hal-kta:b/ "this book"); with preposition: /li-ha-d-daraže/ لهالدرجة
هاذ /ha:ð/ "this (masc.sg.)"
هالو /ha:lu:, ha:lo:/ "hello" (in telephone conversations only)
هانا /ha:na/ (also هان /ha:n/) "here" (Bedouin speech)
هاي /ha:y/ "this (fem.sg.)"
هذاك /haða:k/ "that"
هذول /haðo:l, hado:l/ "those;these"
هذيك /haði:k/ "that" (rare fem.: هذيكة /haði:kt/ "that" )
هسع /his:a`/ "still, yet; now, right now"
هلق /halla?/ "now"
هنيك /hune:k, hne:k/ "there, over there"
هون /ho:n/ "here"
هيدي /haydi/ "this" (fem.sg.)
هيـ /hay:-/ "there!" (with following pronoun suffix:
هيو /hay:o/ "there he/it is!",
هيها /hay:ha/ "there she/it is!",
هيهم /hay:hum/ "there they are!")
هيك /he:k/ "like this/that" (also هيكة /he:ka/ and هيكي /he:ke/)
وإلا /wa?il:a/ "or" (also والا /wil:a/). Note the difference
between ولا /wala/ ("neither...nor", as in ولا بيهش ولا بينش)
and والا /wil:a/ ("otherwise", as in اسكت والا بضربك).
Follow the hamza transcription rule here: if you hear /wa?il:a/ write وإلا,
and if you hear /wil:a/ write والا.
والله /wal:a/ "I swear; really?" (cf. بالله)
وين /we:n/ "where" (cf. فين)
يالله /yal:a/ "c'mon! let's...!" (not the same as the two-word phrase
يا الله /ya: ?al:a:h/ "oh my goodness!")
يللي /yal:i:/ "who; which" cf. اللي /?il:i:/
PRACTICAL TIPS ON HOW TO APPLY THE GUIDELINES
In practical terms, for each item the
transcriber will ask the following question: is the item on the
list of frequent colloquial items? If Yes, follow the orthography
used in the list; if No, follow the orthography of the MSA
parallel form, with any necessary modifications (e.g., to reflect
the morphology of colloquial verbs). The transcriber should be
able to justify each transcription decision by showing that the
transcribed item either parallels the MSA form or is among the
high-frequency dialectal items listed in Appendix A. Any new dialectal item that
is identified will be added to this list. The following examples
illustrate a rigorous application of the Guidelines.
Utterance: /muš Hakeitlak ?inno fi: na:s ?ikti:r rayHi:n `as:inama/
Transcription: مش حكيت لك إنه فيه ناس كثير رايحين عالسينما؟
/muš/:
dialectal item listed in Appendix A (مش)
/Hakeitlak/:
the parallel MSA form is /Hakayt laka/ حكيت لك
/?inno/:
the parallel MSA form is /?innahu/ إنه
/fi:/:
dialectal item listed in Appendix A (فيه)
/na:s/:
the parallel MSA form is ناس
/?ikti:r/:
the parallel MSA form is كثير
/rayHi:n/:
the parallel MSA form is رايحين
/`as:inama/:
the stem /sinama/ is transcribed as سينما
because this is the parallel MSA form; the clitic /`a-/ عـ
is a dialectal item listed in Appendix A
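The decision procedure described in this section (Appendix A first, otherwise the MSA parallel form) can be sketched as a two-step lookup. This is an illustration only: the function `choose_spelling` and both dictionaries are hypothetical stand-ins, holding a handful of entries taken from Appendix A and from the worked example, and a real tool would need the full list plus an MSA lexicon. Items found in neither table are returned unresolved, as candidates for review and possible addition to Appendix A.

```python
from typing import Optional, Tuple

# A few entries copied from Appendix A (pronunciation -> prescribed spelling).
APPENDIX_A = {"fi:": "فيه", "ho:n": "هون", "kama:n": "كمان", "we:n": "وين"}

# A few MSA parallel forms taken from the worked example.
MSA_PARALLEL = {"na:s": "ناس", "?ikti:r": "كثير", "?inno": "إنه"}

def choose_spelling(item: str) -> Tuple[Optional[str], str]:
    """Return (spelling, justification) following the order given in the Guidelines."""
    if item in APPENDIX_A:
        return APPENDIX_A[item], "Appendix A"
    if item in MSA_PARALLEL:
        return MSA_PARALLEL[item], "MSA parallel form"
    return None, "unresolved"

for heard in ["fi:", "na:s", "xyz"]:
    print(heard, choose_spelling(heard))
```

Returning the justification alongside the spelling mirrors the requirement that every transcription decision be justifiable by one of the two sources.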