Supported Platforms

Windows, Linux, Solaris, AIX, HP-UX, and MacOS

Supported Languages & Encodings

Languages and Encodings Recognized by Rosette Language Identifier

  • Software Version: RLI 7.3
  • Profiles: 188 Language/Encoding pairs
  • Legacy Encodings: 44 (UTF-8 is also supported for all languages)
  • Languages: 55

Rosette Language Identifier recognizes the language and encoding of a document. If a document contains text in several languages, the tool identifies each language and the sections of the document written in it.

Language                        Encodings
Albanian                        Windows-1252, ISO-8859-1
Arabic                          Windows-1256, ISO-8859-6, Windows-720
Transliterated Arabic           Windows-1252, Windows-1256, ISO-8859-1
Bengali                         ISCII-Bengali
Bulgarian                       Windows-1251, ISO-8859-5, KOI8-R
Catalan                         Windows-1252, ISO-8859-1
Simplified Chinese              GB-18030, GB-2312, HZ-GB-2312, ISO-2022-CN
Traditional Chinese             Big5
Croatian                        Windows-1250, ISO-8859-2
Czech                           Windows-1250, ISO-8859-2
Danish                          Windows-1252, ISO-8859-1
Dutch                           Windows-1252, ISO-8859-1
English                         Windows-1252, ISO-8859-1
Estonian                        Windows-1257, ISO-8859-13
Finnish                         Windows-1252, ISO-8859-1
French                          Windows-1252, ISO-8859-1
German                          Windows-1252, ISO-8859-1
Greek                           Windows-1253, ISO-8859-7
Gujarati                        ISCII-Gujarati
Hebrew                          Windows-1255, ISO-8859-8
Hindi                           ISCII-Devanagari
Hungarian                       Windows-1250, ISO-8859-2
Icelandic                       Windows-1252, ISO-8859-1
Indonesian                      Windows-1252, ISO-8859-1
Italian                         Windows-1252, ISO-8859-1
Japanese                        EUC-JP, ISO-2022-JP, Shift-JIS, SJIS-2004
Kannada                         ISCII-Kannada
Korean                          EUC-KR, ISO-2022-KR
Latvian                         Windows-1257, ISO-8859-13
Lithuanian                      Windows-1257, ISO-8859-13
Macedonian                      Windows-1251, ISO-8859-5
Malay                           Windows-1252, ISO-8859-1
Malayalam                       ISCII-Malayalam
Norwegian                       Windows-1252, ISO-8859-1
Pashto                          Windows-1256
Transliterated Pashto           Windows-1252, Windows-1256, ISO-8859-1
Farsi (Persian)                 Windows-1256
Transliterated Farsi (Persian)  Windows-1252, Windows-1256, ISO-8859-1
Polish                          Windows-1250, ISO-8859-2
Portuguese                      Windows-1252, ISO-8859-1
Romanian                        Windows-1250, ISO-8859-2
Russian                         Windows-1251, ISO-8859-5, IBM-866, KOI8-R, Macintosh-Cyrillic
Serbian                         Windows-1251, ISO-8859-5
Transliterated Serbian          Windows-1250, ISO-8859-2
Slovak                          Windows-1250, ISO-8859-2
Slovenian                       Windows-1250, ISO-8859-2
Somali                          Windows-1252, ISO-8859-1
Spanish                         Windows-1252, ISO-8859-1
Swedish                         Windows-1252, ISO-8859-1
Tagalog                         Windows-1252, ISO-8859-1
Tamil                           ISCII-Tamil
Telugu                          ISCII-Telugu
Thai                            Windows-874
Turkish                         Windows-1254, ISO-8859-9
Ukrainian                       Windows-1251, KOI8-U, ISO-8859-5
Urdu                            Windows-1256
Transliterated Urdu             Windows-1252, Windows-1256, ISO-8859-1
Uzbek                           Windows-1251, KOI8-R, ISO-8859-5
Transliterated Uzbek            Windows-1251
Vietnamese                      VISCII, VPS, VIQR, TCVN, VNI
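
Once a language/encoding pair has been identified, downstream code typically needs to convert the raw bytes to text. The following is a minimal Python sketch of that step, assuming a three-language subset of the table above. It does not use or represent the Rosette Language Identifier API. The names LEGACY_ENCODINGS, PYTHON_CODECS, and decode_as are hypothetical, and the mapping from the encoding names listed here to Python standard-library codec names is this sketch's own assumption.

```python
# Subset of the language/encoding table, keyed by language name.
# Illustration only: this is not RLI data or an RLI interface.
LEGACY_ENCODINGS = {
    "Russian": ["Windows-1251", "ISO-8859-5", "IBM-866", "KOI8-R",
                "Macintosh-Cyrillic"],
    "Greek": ["Windows-1253", "ISO-8859-7"],
    "Turkish": ["Windows-1254", "ISO-8859-9"],
}

# Assumed mapping from the encoding names above to Python codec names.
PYTHON_CODECS = {
    "Windows-1251": "cp1251",
    "ISO-8859-5": "iso8859_5",
    "IBM-866": "cp866",
    "KOI8-R": "koi8_r",
    "Macintosh-Cyrillic": "mac_cyrillic",
    "Windows-1253": "cp1253",
    "ISO-8859-7": "iso8859_7",
    "Windows-1254": "cp1254",
    "ISO-8859-9": "iso8859_9",
}


def decode_as(data: bytes, language: str, encoding: str) -> str:
    """Decode bytes for an already identified (language, encoding) pair.

    UTF-8 is accepted for every language; any other encoding must be
    listed for that language in LEGACY_ENCODINGS.
    """
    if encoding == "UTF-8":
        return data.decode("utf-8")
    if encoding not in LEGACY_ENCODINGS.get(language, []):
        raise ValueError(f"{encoding} is not listed for {language}")
    return data.decode(PYTHON_CODECS[encoding])


if __name__ == "__main__":
    raw = "Привет".encode("koi8_r")  # sample Russian text in KOI8-R
    print(decode_as(raw, "Russian", "KOI8-R"))
```

Rejecting pairs that the table does not list keeps a misreported encoding from silently producing garbled text. Identifying the pair in the first place is the job of the language identifier and is not attempted here.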