Rosette Language Identifier recognizes the language and encoding of a document. If a document contains text in several languages, the tool identifies each language and the sections of the document written in it.

| Language | Encodings |
|---|---|
| Albanian | Windows-1252, ISO-8859-1 |
| Arabic | Windows-1256, ISO-8859-6, Windows-720 |
| Transliterated Arabic | Windows-1252, Windows-1256, ISO-8859-1 |
| Bengali | ISCII-Bengali |
| Bulgarian | Windows-1251, ISO-8859-5, KOI8-R |
| Catalan | Windows-1252, ISO-8859-1 |
| Simplified Chinese | GB-18030, GB-2312, HZ-GB-2312, ISO-2022-CN |
| Traditional Chinese | Big5 |
| Croatian | Windows-1250, ISO-8859-2 |
| Czech | Windows-1250, ISO-8859-2 |
| Danish | Windows-1252, ISO-8859-1 |
| Dutch | Windows-1252, ISO-8859-1 |
| English | Windows-1252, ISO-8859-1 |
| Estonian | Windows-1257, ISO-8859-13 |
| Finnish | Windows-1252, ISO-8859-1 |
| French | Windows-1252, ISO-8859-1 |
| German | Windows-1252, ISO-8859-1 |
| Greek | Windows-1253, ISO-8859-7 |
| Gujarati | ISCII-Gujarati |
| Hebrew | Windows-1255, ISO-8859-8 |
| Hindi | ISCII-Devanagari |
| Hungarian | Windows-1250, ISO-8859-2 |
| Icelandic | Windows-1252, ISO-8859-1 |
| Indonesian | Windows-1252, ISO-8859-1 |
| Italian | Windows-1252, ISO-8859-1 |
| Japanese | EUC-JP, ISO-2022-JP, Shift-JIS, SJIS-2004 |
| Kannada | ISCII-Kannada |
| Korean | EUC-KR, ISO-2022-KR |
| Latvian | Windows-1257, ISO-8859-13 |
| Lithuanian | Windows-1257, ISO-8859-13 |
| Macedonian | Windows-1251, ISO-8859-5 |
| Malay | Windows-1252, ISO-8859-1 |
| Malayalam | ISCII-Malayalam |
| Norwegian | Windows-1252, ISO-8859-1 |
| Pashto | Windows-1256 |
| Transliterated Pashto | Windows-1252, Windows-1256, ISO-8859-1 |
| Farsi (Persian) | Windows-1256 |
| Transliterated Farsi (Persian) | Windows-1252, Windows-1256, ISO-8859-1 |
| Polish | Windows-1250, ISO-8859-2 |
| Portuguese | Windows-1252, ISO-8859-1 |
| Romanian | Windows-1250, ISO-8859-2 |
| Russian | Windows-1251, ISO-8859-5, IBM-866, KOI8-R, Macintosh-Cyrillic |
| Serbian | Windows-1251, ISO-8859-5 |
| Transliterated Serbian | Windows-1250, ISO-8859-2 |
| Slovak | Windows-1250, ISO-8859-2 |
| Slovenian | Windows-1250, ISO-8859-2 |
| Somali | Windows-1252, ISO-8859-1 |
| Spanish | Windows-1252, ISO-8859-1 |
| Swedish | Windows-1252, ISO-8859-1 |
| Tagalog | Windows-1252, ISO-8859-1 |
| Tamil | ISCII-Tamil |
| Telugu | ISCII-Telugu |
| Thai | Windows-874 |
| Turkish | Windows-1254, ISO-8859-9 |
| Ukrainian | Windows-1251, KOI8-U, ISO-8859-5 |
| Urdu | Windows-1256 |
| Transliterated Urdu | Windows-1252, Windows-1256, ISO-8859-1 |
| Uzbek | Windows-1251, KOI8-R, ISO-8859-5 |
| Transliterated Uzbek | Windows-1251 |
| Vietnamese | VISCII, VPS, VIQR, TCVN, VNI |
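
As a rough illustration of how a table like this can be used, the sketch below narrows the candidate encodings for a byte string once its language is already known. It is not the Rosette API: `CANDIDATE_CODECS` and `decodable_with` are hypothetical names, the dictionary holds only a small subset of the table, and the values are Python standard-library codec names rather than the identifiers shown above.

```python
# Hypothetical subset of the table above, keyed by language name.
# Values are Python codec names (an assumption), not Rosette identifiers.
CANDIDATE_CODECS = {
    "Russian": ["cp1251", "iso8859_5", "cp866", "koi8_r", "mac_cyrillic"],
    "Greek": ["cp1253", "iso8859_7"],
    "Thai": ["cp874"],
}


def decodable_with(data: bytes, language: str) -> list[str]:
    """Return the candidate codecs for `language` that decode `data` without error."""
    matches = []
    for codec in CANDIDATE_CODECS.get(language, []):
        try:
            data.decode(codec)
        except UnicodeDecodeError:
            # This codec has no mapping for at least one byte; rule it out.
            continue
        matches.append(codec)
    return matches


if __name__ == "__main__":
    sample = "Привет".encode("koi8_r")
    print(decodable_with(sample, "Russian"))
```

A successful decode only shows that an encoding is possible, not that it is correct. Single-byte encodings accept nearly any byte sequence, so several candidates usually survive this filter, which is why a dedicated identifier is needed to pick among them.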