dbo:abstract |
The Bijankhan corpus (Persian: پیکرهٔ بیجنخان) is a tagged corpus suitable for natural language processing (NLP) research on the Persian language. The collection was gathered from daily news and common texts. All documents are categorized by subject, such as political or cultural, into about 4300 subject categories. The corpus contains about 2.6 million manually tagged words, with a tag set of 550 Persian part-of-speech tags. The Bijankhan corpus was created by the Database Research Group at the University of Tehran. The corpus is non-free in that it is not free for commercial use, although these restrictions vary by country. The corpus is named after Mahmood Bijankhan, professor of linguistics at the University of Tehran, for his contributions in this area. (en) |
dbo:thumbnail |
wiki-commons:Special:FilePath/Bijankhan_Corpus_Logo.gif?width=300 |
dbo:wikiPageExternalLink |
http://dbrg.ut.ac.ir/Bijankhan |
dbo:wikiPageID |
14570613 (xsd:integer) |
dbo:wikiPageLength |
1430 (xsd:nonNegativeInteger) |
dbo:wikiPageRevisionID |
1028184756 (xsd:integer) |
dbo:wikiPageWikiLink |
dbr:Natural_language_processing dbr:Persian_language dbr:University_of_Tehran dbr:Hamshahri_Corpus dbr:Mahmood_Bijankhan dbr:Text_corpus dbr:Part-of-speech_tagging dbc:Applied_linguistics dbc:Linguistic_research dbc:Persian_corpora dbr:Free_content dbr:Iran_and_copyright_issues dbr:Database_Research_Group dbr:File:Bijankhan_Corpus_Logo.gif dbr:Persian_Today_Corpus |
dbp:wikiPageUsesTemplate |
dbt:Ie-lang-stub dbt:Reflist dbt:Corpora-stub dbt:Corpus_linguistics |
dcterms:subject |
dbc:Applied_linguistics dbc:Linguistic_research dbc:Persian_corpora |
gold:hypernym |
dbr:Corpus |
rdf:type |
dbo:Work yago:WikicatCorpora yago:Abstraction100002137 yago:Assets113329641 yago:Capital113353607 yago:Possession100032613 yago:Principal113355868 yago:Relation100031921 |
rdfs:comment |
The Bijankhan corpus (Persian: پیکرهٔ بیجنخان) is a tagged corpus suitable for natural language processing (NLP) research on the Persian language. The collection was gathered from daily news and common texts. All documents are categorized by subject, such as political or cultural, into about 4300 subject categories. The corpus contains about 2.6 million manually tagged words, with a tag set of 550 Persian part-of-speech tags. (en) |
rdfs:label |
Bijankhan Corpus (en) |
owl:sameAs |
freebase:Bijankhan Corpus yago-res:Bijankhan Corpus wikidata:Bijankhan Corpus dbpedia-fa:Bijankhan Corpus https://global.dbpedia.org/id/4Ydkc |
prov:wasDerivedFrom |
wikipedia-en:Bijankhan_Corpus?oldid=1028184756&ns=0 |
foaf:depiction |
wiki-commons:Special:FilePath/Bijankhan_Corpus_Logo.gif |
foaf:isPrimaryTopicOf |
wikipedia-en:Bijankhan_Corpus |
is dbo:wikiPageWikiLink of |
dbr:Hamshahri_Corpus dbr:Mahmood_Bijankhan dbr:Tehran_Monolingual_Corpus dbr:List_of_text_corpora |
is foaf:primaryTopic of |
wikipedia-en:Bijankhan_Corpus |