Middle Eastern Language Issues

  • Afghanistan’s Language and Culture: A Challenge for Security  The situation in Afghanistan is at the forefront of our national security initiatives. The increased turmoil has led many Afghans to migrate to neighboring countries while others are joining forces to help stabilize the country. These changes have had a significant impact on the languages used within Afghanistan and the security implications of those languages. This talk for intelligence analysts explores the regional influences of Farsi and Urdu as well as the orthographic influences of Arabic and the importance of these languages for text mining and analysis. This presentation delves into linguistic details of these languages, explains how analyzing this data presents new challenges to intelligence gathering, and shows you the latest technology for text analysis of Afghan languages.

    Presentation by Steve Kearns and Zina Saadi at Basis Technology’s Government Users Conference in Chantilly, VA on June 8-9, 2010

  • You say “Jamāl”; he writes “Djamel”: Influences on Western Transliteration of Arabic Names  The proliferation of transliteration styles for Arabic names into Western languages is well known, but what are the factors that shape how names are represented across the Arab world? This talk looks at examples of names influenced by the formal and spoken languages of the region as well as how these languages influence the orthography of the names in the Latin alphabet.

    Presentation by Zina Saadi at Basis Technology’s Government Users Conference in Chantilly, VA on June 8-9, 2010

  • The Names of Afghanistan: Understanding Pashto and Dari Names  This talk introduces naming practices in Afghanistan, following a primer on Pashto and Dari, the two major languages spoken in Afghanistan. The talk explores the linguistic attributes of Pashto and Dari names, such as the influence of Arabic names, spelling variations, and morphology.

    Presentation by Bushra Zawaydeh at Basis Technology’s Government Users Conference in Chantilly, VA on June 8-9, 2010

  • The World of Arabic Nicknames  In Arab culture, the number of nicknames for a person may seem endless. You often see them in chat, emails, or in oral communication. Dealing with multiple nicknames is a tricky problem for fields such as compliance, intelligence gathering, and name resolution, since they could be used as aliases. This presentation describes different types of Arabic nicknames and how they are used.

    Presentation by Bushra Zawaydeh at Basis Technology’s Government Users Conference on June 9, 2009.

  • Decoding Arabic Chat  KiLLeH Mn O5OoYuH e93’eeR!! :-) Cat walking on a keyboard, or Romanized Arabic chat? While transliterated Arabic poses its own issues of multiple standards and inconsistent use, asking linguistic software to make sense of Arabic chat is another matter entirely. How are words, parts of words, and sentence boundaries detected? What about non-linguistic expressions using mixed case letters, dialectal differences, and emoticons? This talk decodes the representation of Arabic sounds in the Romanized shorthand commonly used in chatrooms and blogs by presenting findings from field analyses of Egyptian, Gulf, Iraqi, and Levantine online dialects.

    Presentation by Bushra Zawaydeh at Basis Technology’s Government Users Conference in Chantilly, VA on June 8-9, 2010

  • One Language, Many Dialects: An Analysis of Arabic Dialects  This presentation discusses the similarities of many linguistic structures that define an Arabic dialect as well as the differences that draw non-geographical boundaries, and then shows how this affects Arabic search.

    Presentation by Zina Saadi at Basis Technology’s Government Users Conference on June 9, 2009.

  • The Names of Afghanistan – Understanding Pashto and Dari Names  This presentation introduces naming practices in Afghanistan, following a primer on Pashto and Dari, the two major languages spoken in Afghanistan. It explores the linguistic attributes of Pashto and Dari names, such as the influence of Arabic names, spelling variations, and morphology.

    Presentation by Bushra Zawaydeh at Basis Technology’s Government Users Conference on June 9, 2009.

  • You say “Jamāl”; he writes “Djamel”: Influences on Western Transliteration of Arabic Names  This presentation reviews examples of names influenced by the formal and spoken languages of the region as well as how these languages influence the orthography of the names in the Latin alphabet.

    Presentation by Zina Saadi at Basis Technology’s Government Users Conference on June 8, 2009.

  • Next Generation of Arabic Search: Linguistically Intelligent Retrieval  This presentation demonstrates how a search engine with knowledge of the linguistic components of Arabic – the roots, lemmas and stems – can greatly boost the relevancy of search results.

    Presentation by Zina Saadi at Basis Technology’s Government Users Conference in College Park, MD on May 20, 2008.

  • الأجيــال القادمة لتقنيات البــحث العربي (The Next Generations of Arabic Search Technologies)  The rapid growth of Arabic content on the Internet has created the need for a new generation of text search with advanced techniques for handling the complexities of the Arabic language. This presentation shows how a search engine can use the linguistic components of Arabic -- the roots, stems, and lemmas -- to greatly boost the relevancy of search results.

  • A Linguistic Profile of the Persian Language and Dialects  This presentation is a brief history of the Persian language, its speakers, and its dialects. It compares Persian to other Arabic script languages such as Arabic, Pashto, and Urdu. It also delves into linguistic aspects of the language that are important to natural language processing and analysis applications, such as orthography, typography rules, phonology, and spelling variants.

    Presentation by Bushra Zawaydeh at Basis Technology’s Government Users Conference in College Park, MD on May 20, 2008.

  • A Profile of Arabic Script Languages  This presentation explores the history of the script in various Arabic script languages, the structure and characteristics of the Arabic alphabet, the alphabet each language uses, its phonological structure, its borrowings, and the differences between Arabic and these languages.

    Presentation by Bushra Zawaydeh at Basis Technology’s Government Users Conference in Washington, D.C. on June 7, 2007.

  • Arabic, Farsi and Urdu Text Normalization for Natural Language Processing  This presentation suggests a multi-level normalization for handling various Arabic script orthographic variations that appear in current news corpora.

    Presentation by Zina Saadi at Basis Technology’s Government Users Conference in Washington, D.C. on June 7, 2007.

  • Decoding Arabic Chat  This presentation decodes the representation of Arabic sounds in the Romanized shorthand commonly used in chatrooms and blogs by presenting findings from field analyses of Egyptian, Gulf, Iraqi, and Levantine online dialects.

    Presentation by Bushra Zawaydeh at Basis Technology’s Government Users Conference in Washington, D.C. on June 7, 2007.

  • What’s in a Persian Name?  This presentation begins with the basics of Persian phonology and name morphology, and delves into the rich influences of other languages; cultural naming preferences (such as the decline of Arabic-based names after the fall of the Shah in Iran); historical roots; and regional customs.

    Presentation by Zina Saadi at Basis Technology’s Government Users Conference in Washington, D.C. on June 7, 2007.

  • Orthographic Variations in Arabic Corpora  This presentation discusses the different kinds of Arabic orthographic issues that Basis Technology’s Arabic linguists have encountered and handled while building various software solutions for Arabic text analysis.

    Presentation by Bushra Zawaydeh at Basis Technology’s Government Users Conference in Washington, D.C. on June 14, 2006.

  • Behind the Name: Etymology of Arabic Names  This presentation gives some samples of various linguistic rules that contributed to the evolution of certain famous Arabic names. It samples different types of names as well as the influence of various foreign languages; regional and social impacts; and language evolution.

    Presentation by Zina Saadi at Basis Technology’s Government Users Conference in Washington, D.C. on June 14, 2006.

  • Tailoring UAX #29 Word Breaking for Arabic Text  

    Presentation by Thomas Emerson at the 28th Internationalization & Unicode Conference in Orlando, FL on Sept. 8, 2005.