Languages/Scripts supported in different versions of Tesseract
Languages
LangCode | Language | 3.02 | 3.04 | 4.00 (Nov. 2016) | 4.0.0 tessdata | 4.0.0 tessdata_best | 4.0.0 tessdata_fast |
---|---|---|---|---|---|---|---|
afr | Afrikaans | x | x | x | x | x | x |
amh | Amharic | x | x | x | x | x | |
ara | Arabic | x | x | x | x | x | x |
asm | Assamese | x | x | x | x | x | |
aze | Azerbaijani | x | x | x | x | x | |
aze_cyrl | Azerbaijani - Cyrillic | x | x | x | x | x | x |
bel | Belarusian | x | x | x | x | x | x |
ben | Bengali | x | x | x | x | x | x |
bod | Tibetan | x | x | x | x | x | |
bos | Bosnian | x | x | x | x | x | |
bre | Breton | x | x | x | x | ||
bul | Bulgarian | x | x | x | x | x | x |
cat | Catalan; Valencian | x | x | x | x | x | x |
ceb | Cebuano | x | x | x | x | x | |
ces | Czech | x | x | x | x | x | x |
chi_sim | Chinese - Simplified | x | x | x | x | x | x |
chi_tra | Chinese - Traditional | x | x | x | x | x | x |
chr | Cherokee | x | x | x | x | x | x |
cos | Corsican | x | x | x | |||
cym | Welsh | x | x | x | x | x | |
dan | Danish | x | x | x | x | x | x |
dan_frak | Danish - Fraktur (contrib) | x | x | ||||
deu | German | x | x | x | x | x | x |
deu_frak | German - Fraktur (contrib) | x | x | ||||
deu_latf | German (Fraktur Latin) | x | x | x | x | ||
dzo | Dzongkha | x | x | x | x | x | |
ell | Greek, Modern (1453-) | x | x | x | x | x | x |
eng | English | x | x | x | x | x | x |
enm | English, Middle (1100-1500) | x | x | x | x | x | x |
epo | Esperanto | x | x | x | x | x | x |
equ | Math / equation detection module | x | x | x | x | x | |
est | Estonian | x | x | x | x | x | x |
eus | Basque | x | x | x | x | x | x |
fao | Faroese | x | x | x | |||
fas | Persian | x | x | x | x | x | |
fil | Filipino (old - Tagalog) | x | x | x | |||
fin | Finnish | x | x | x | x | x | x |
fra | French | x | x | x | x | x | x |
frk | German - Fraktur (now deu_latf) | x | x | x | x | x | x |
frm | French, Middle (ca.1400-1600) | x | x | x | x | x | x |
fry | Western Frisian | x | x | x | |||
gla | Scottish Gaelic | x | x | x | |||
gle | Irish | x | x | x | x | x | |
glg | Galician | x | x | x | x | x | x |
grc | Greek, Ancient (to 1453) (contrib) | x | x | x | x | x | x |
guj | Gujarati | x | x | x | x | x | |
hat | Haitian; Haitian Creole | x | x | x | x | x | |
heb | Hebrew | x | x | x | x | x | x |
hin | Hindi | x | x | x | x | x | x |
hrv | Croatian | x | x | x | x | x | x |
hun | Hungarian | x | x | x | x | x | x |
hye | Armenian | x | x | x | |||
iku | Inuktitut | x | x | x | x | x | |
ind | Indonesian | x | x | x | x | x | x |
isl | Icelandic | x | x | x | x | x | x |
ita | Italian | x | x | x | x | x | x |
ita_old | Italian - Old | x | x | x | x | x | x |
jav | Javanese | x | x | x | x | x | |
jpn | Japanese | x | x | x | x | x | x |
kan | Kannada | x | x | x | x | x | x |
kat | Georgian | x | x | x | x | x | |
kat_old | Georgian - Old | x | x | x | x | x | |
kaz | Kazakh | x | x | x | x | x | |
khm | Central Khmer | x | x | x | x | x | |
kir | Kirghiz; Kyrgyz | x | x | x | x | x | |
kmr | Kurmanji (Kurdish - Latin Script) | x | x | x | x | ||
kor | Korean | x | x | x | x | x | x |
kor_vert | Korean (vertical) | x | x | x | x | ||
kur | Kurdish (Arabic Script) | x | |||||
lao | Lao | x | x | x | x | x | |
lat | Latin | x | x | x | x | x | |
lav | Latvian | x | x | x | x | x | x |
lit | Lithuanian | x | x | x | x | x | x |
ltz | Luxembourgish | x | x | x | x | ||
mal | Malayalam | x | x | x | x | x | x |
mar | Marathi | x | x | x | x | x | |
mkd | Macedonian | x | x | x | x | x | x |
mlt | Maltese | x | x | x | x | x | x |
mon | Mongolian | x | x | x | x | ||
mri | Maori | x | x | x | x | ||
msa | Malay | x | x | x | x | x | x |
mya | Burmese | x | x | x | x | x | |
nep | Nepali | x | x | x | x | x | |
nld | Dutch; Flemish | x | x | x | x | x | x |
nor | Norwegian | x | x | x | x | x | |
oci | Occitan (post 1500) | x | x | x | x | x | |
ori | Oriya | x | x | x | x | x | |
osd | Orientation and script detection module | x | x | x | x | x | x |
pan | Panjabi; Punjabi | x | x | x | x | x | |
pol | Polish | x | x | x | x | x | x |
por | Portuguese | x | x | x | x | x | x |
pus | Pushto; Pashto | x | x | x | x | x | |
que | Quechua | x | x | x | x | ||
ron | Romanian; Moldavian; Moldovan | x | x | x | x | x | x |
rus | Russian | x | x | x | x | x | x |
san | Sanskrit | x | x | x | x | x | |
sin | Sinhala; Sinhalese | x | x | x | x | x | |
slk | Slovak | x | x | x | x | x | x |
slk_frak | Slovak - Fraktur (contrib) | x | x | ||||
slv | Slovenian | x | x | x | x | x | x |
snd | Sindhi | x | x | x | x | ||
spa | Spanish; Castilian | x | x | x | x | x | x |
spa_old | Spanish; Castilian - Old | x | x | x | x | x | x |
sqi | Albanian | x | x | x | x | x | x |
srp | Serbian | x | x | x | x | x | x |
srp_latn | Serbian - Latin | x | x | x | x | x | |
sun | Sundanese | x | x | x | x | ||
swa | Swahili | x | x | x | x | x | x |
swe | Swedish | x | x | x | x | x | x |
syr | Syriac | x | x | x | x | x | |
tam | Tamil | x | x | x | x | x | x |
tat | Tatar | x | x | x | x | ||
tel | Telugu | x | x | x | x | x | x |
tgk | Tajik | x | x | x | x | x | |
tgl | Tagalog (new - Filipino) | x | x | x | |||
tha | Thai | x | x | x | x | x | x |
tir | Tigrinya | x | x | x | x | x | |
ton | Tonga | x | x | x | x | ||
tur | Turkish | x | x | x | x | x | x |
uig | Uighur; Uyghur | x | x | x | x | x | |
ukr | Ukrainian | x | x | x | x | x | x |
urd | Urdu | x | x | x | x | x | |
uzb | Uzbek | x | x | x | x | x | |
uzb_cyrl | Uzbek - Cyrillic | x | x | x | x | x | |
vie | Vietnamese | x | x | x | x | x | x |
yid | Yiddish | x | x | x | x | x | |
yor | Yoruba | x | x | x | x |
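
The values in the LangCode column are what the `tesseract` command-line tool expects after its `-l` option, with several codes joined by `+`. The following is a minimal sketch of assembling such a command in Python. The function name `build_tesseract_command` and the file name `page.png` are hypothetical, and actually running the command assumes that Tesseract is installed and that the matching `.traineddata` files are present.

```python
import shlex


def build_tesseract_command(image_path, lang_codes, tessdata_dir=None):
    """Return the argument list for a tesseract run (hypothetical helper).

    lang_codes is a list of codes from the LangCode column, e.g. ["eng", "deu"].
    """
    if not lang_codes:
        raise ValueError("at least one language code is required")
    # "stdout" as the output base makes tesseract print the recognized text.
    cmd = ["tesseract", image_path, "stdout", "-l", "+".join(lang_codes)]
    if tessdata_dir is not None:
        # Point tesseract at a specific tessdata directory, for example a
        # checkout of tessdata_best or tessdata_fast.
        cmd += ["--tessdata-dir", tessdata_dir]
    return cmd


if __name__ == "__main__":
    # Show the command line that would be executed for English plus German.
    print(shlex.join(build_tesseract_command("page.png", ["eng", "deu"])))
```

The list can be passed to `subprocess.run` unchanged. Building it separately keeps the language selection easy to inspect before anything is executed.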
Scripts
| | Script | 3.02 | 3.04 | 4.00 (Nov 2016) | 4.0.0 tessdata | 4.0.0 tessdata_best | 4.0.0 tessdata_fast |
|---|---|---|---|---|---|---|---|
| arab | Arabic | | | | x | x | x |
| armn | Armenian | | | | x | x | x |
| beng | Bengali | | | | x | x | x |
| cans | Canadian_Aboriginal | | | | x | x | x |
| cher | Cherokee | | | | x | x | x |
| cyrl | Cyrillic | | | | x | x | x |
| deva | Devanagari | | | | x | x | x |
| ethi | Ethiopic | | | | x | x | x |
| frak | Fraktur | | | | x | x | x |
| geor | Georgian | | | | x | x | x |
| grek | Greek | | | | x | x | x |
| gujr | Gujarati | | | | x | x | x |
| guru | Gurmukhi | | | | x | x | x |
| hans | HanS (Han simplified) | | | | x | x | x |
| hans-vert | HanS_vert (Han simplified vertical) | | | | x | x | x |
| hant | HanT (Han traditional) | | | | x | x | x |
| hant-vert | HanT_vert (Han traditional vertical) | | | | x | x | x |
| hang | Hangul | | | | x | x | x |
| hang-vert | Hangul_vert (Hangul vertical) | | | | x | x | x |
| hebr | Hebrew | | | | x | x | x |
| jpan | Japanese | | | | x | x | x |
| jpan-vert | Japanese_vert (Japanese vertical) | | | | x | x | x |
| knda | Kannada | | | | x | x | x |
| khmr | Khmer | | | | x | x | x |
| laoo | Lao | | | | x | x | x |
| latn | Latin | | | | x | x | x |
| mlym | Malayalam | | | | x | x | x |
| mymr | Myanmar | | | | x | x | x |
| orya | Oriya (Odia) | | | | x | x | x |
| sinh | Sinhala | | | | x | x | x |
| syrc | Syriac | | | | x | x | x |
| taml | Tamil | | | | x | x | x |
| telu | Telugu | | | | x | x | x |
| thaa | Thaana | | | | x | x | x |
| thai | Thai | | | | x | x | x |
| tibt | Tibetan | | | | x | x | x |
| viet | Vietnamese | | | | x | x | x |

For details about the languages that each Script.traineddata file supports, see the files whose names end with langs.txt (e.g. Latin.langs.txt).
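
To check which of the language and script models listed above are available locally, one option is to scan a tessdata directory for `.traineddata` files. The sketch below assumes the layout used by the tessdata repositories, with language models at the top level (for example `eng.traineddata`) and script models in a `script/` subdirectory (for example `script/Latin.traineddata`). The helper name `installed_models` and the example path are hypothetical, and other installations may arrange the files differently.

```python
from pathlib import Path


def installed_models(tessdata_dir):
    """Return (languages, scripts) found under tessdata_dir (hypothetical helper)."""
    root = Path(tessdata_dir)
    # Top-level files: eng.traineddata -> "eng".
    languages = sorted(p.stem for p in root.glob("*.traineddata"))
    scripts = []
    script_dir = root / "script"
    if script_dir.is_dir():
        # Assumed layout: script/Latin.traineddata -> "script/Latin".
        scripts = sorted(f"script/{p.stem}" for p in script_dir.glob("*.traineddata"))
    return languages, scripts


if __name__ == "__main__":
    # Example path only; adjust it to the local installation.
    langs, scripts = installed_models("/usr/share/tesseract-ocr/4.00/tessdata")
    print("languages:", langs)
    print("scripts:", scripts)
```

If Tesseract itself is installed, `tesseract --list-langs` reports the models it can find and is the more direct check. The scan above is mainly useful for inspecting a downloaded tessdata directory before pointing Tesseract at it.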