Home
Welcome to Wikokit - the open-source Wiktionary parser.
This wiki is the main source of documentation for developers working with (or contributing to) the Wikokit project.
Quick navigation
Setup
★Getting started Wiktionary parser - how to convert the Wiktionary database into a machine-readable format (parsed Wiktionary).
MySQL import - how to import the Wiktionary database into a local MySQL database.
★File wikt_parsed_empty_sql - how to create, edit and load an empty parsed Wiktionary database into MySQL (./wikt_parser/doc).
Setup NetBeans for parsing - how to set up the NetBeans environment for parsing and run the parser.
Image.py postprocessing - get URLs of Wiktionary scaled images (Wikimedia thumbs) and write them to the local MySQL database.
Advanced setup
MySQL Workbench - how to create the empty SQL file for the parsed Wiktionary database.
SQLite - how to convert the parsed Wiktionary database (MySQL) into an SQLite file.
Database
★Encoding - how to set up the database encoding correctly, with notes on character encoding.
Index wordlist, index_native - index wordlist for each language (tables index_native, index_de, index_fr, etc.)
Queries
★SQL examples - SQL query examples for extracting information from the parsed Wiktionary database.
MRDQuote - the quote table (and tables related to quotations) in the machine-readable dictionary. SPARQL and SQL queries for working with quotes.
d2rqMappingSPARQL - how to map the parsed Wiktionary database (MySQL) to an RDF database with D2RQ.
Developer
★One more Wiktionary - how to parse one more Wiktionary language edition.
JUnit - unit test requirements.
Todo list - list of improvements and modifications to be done.
Done
Context labels workplan - coordination of work on extracting context labels from the English Wiktionary and the Russian Wiktionary (in Russian).
Links
New Wiktionary parsed databases from this page.
magnetowordik
File mean_semrel_empty_sql - how to create, edit and load the empty wikt_sem_rel database (a parsed Wiktionary database with meanings and semantic relations) into MySQL (wikt_parser/doc/parsed/mean_semrel/).