The EDICT Dictionary File
[This page is quite out-of-date. More up-to-date documentation is available at
https://www.edrdg.org/wiki/index.php/JMdict-EDICT_Dictionary_Project.]
Welcome to the Home Page of the EDICT file within the JMdict/EDICT Project. This page has been written by Jim Breen (hereafter "I" or "me") and is intended as an overview of the file, with links to more detail elsewhere.
Background
Way back in 1991 I began to experiment with handling Japanese text in computer files, and decided to try writing a dictionary search program in Turbo C under DOS, which used a simple dictionary file contained in the MOKE (Mark's Own Kanji Editor) package. To make this program more useful, I began to expand the file itself. One thing led to another, until I ended up running a fairly major project which has taken over a large portion of my life. I must acknowledge that the EDICT project has depended on many people who have provided material and editorial assistance. A significant proportion of the compilation process has been carried out using electronic mail and file transfers, and indeed the project would never have occurred without the services provided by the Internet.
What is EDICT?
EDICT is a Japanese-English Dictionary file. For the full details, see the full documentation, or the old documentation.
It is a plain text document in EUC-JP coding, with its own format (which has become known as "EDICT-format"). Originally it was compiled and edited in this format, but since 1999 it has been generated as a legacy file from an expanded database, along with the related JMdict (Japanese-Multilingual Dictionary) project. JMdict is an expanded file, containing French, German, Russian, etc. translations, and is in XML format and UTF-8 coding. In 2010 the maintenance was moved to an online database system.
There are now two EDICT versions:
- the plain EDICT file. This is the original format, where there is only one kanji form and one reading per entry. I regard this as a legacy format, and only provide it for older applications. PLEASE do not use this format for new applications, as I would like to withdraw it one day.
- the enhanced "EDICT2" format. This can have multiple kanji forms and readings in an entry, has other information such as cross-references, restrictions, etc., and uses kanji from the extensions in the JIS X 0212 standard. It has almost all the information in the full JMdict format. This form should be used for all new applications.
The EDICT2 file currently has about 175,000 entries, and the legacy EDICT format has nearly 200,000 entries (many of which are duplicates, as all the permutations of kanji and readings generate distinct entries).
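As a rough illustration of the "one kanji form and one reading per entry" layout of the legacy file, the following Python sketch splits a single line into headword, reading and glosses. It assumes the commonly seen line shape `KANJI [KANA] /gloss/gloss/` (or `KANA /gloss/` for kana-only entries), which is not spelled out on this page, so the full documentation remains the authority. The names `EdictEntry` and `parse_edict_line` are made up for this example, and the richer EDICT2 lines are not handled.

```python
import re
from typing import List, NamedTuple, Optional


class EdictEntry(NamedTuple):
    """One legacy-format entry (hypothetical container for this sketch)."""
    headword: str
    reading: Optional[str]
    glosses: List[str]


# Assumed line shape: HEADWORD [READING] /gloss/gloss/.../
# The bracketed reading is optional (kana-only entries have none).
_LINE_RE = re.compile(r"^(\S+)\s+(?:\[([^\]]+)\]\s+)?/(.*)/\s*$")


def parse_edict_line(line: str) -> Optional[EdictEntry]:
    """Return the parsed entry, or None if the line does not fit the assumed shape."""
    match = _LINE_RE.match(line)
    if match is None:
        return None
    headword, reading, body = match.groups()
    # Glosses are separated by slashes; drop any empty pieces.
    glosses = [g for g in body.split("/") if g]
    return EdictEntry(headword, reading, glosses)


if __name__ == "__main__":
    print(parse_edict_line("漢字 [かんじ] /(n) Chinese characters/kanji/"))
```

A gloss that itself contains a slash would be split incorrectly by this sketch, which is one more reason to prefer the JMdict XML data for new applications.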
A short overview of the EDICT project as parallel English/Japanese text is available.
Download
You can download the files in various formats: edict2.gz, edict.gz, edict.zip. These are all on the Monash ftp site.
You can also use EDICT2 online via my WWWJDIC server.
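Because the downloads are gzipped text in EUC-JP coding, a program can decompress and decode them in one step. The sketch below uses only the Python standard library. It assumes a copy of `edict2.gz` has already been saved locally; the function name `iter_edict_lines` is hypothetical.

```python
import gzip
from typing import Iterator


def iter_edict_lines(path: str) -> Iterator[str]:
    """Yield decoded lines from a gzipped EUC-JP dictionary file.

    Python's "euc_jp" codec covers JIS X 0208 and the JIS X 0212
    extensions, so the same call should serve both EDICT and EDICT2.
    """
    with gzip.open(path, mode="rt", encoding="euc_jp") as handle:
        for line in handle:
            yield line.rstrip("\n")


# Example use, assuming edict2.gz is in the current directory:
# for line in iter_edict_lines("edict2.gz"):
#     print(line)
```

Reading line by line keeps memory use low, which matters for a file with well over 100,000 entries.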
Is it Public Domain?
EDICT can be freely used provided satisfactory acknowledgement is made in any software product, server, etc. that uses it. There are a few other conditions relating to distributing copies of EDICT with or without modification. Copyright is vested in the EDRDG (Electronic Dictionary Research and Development Group), with the file available under a Creative Commons Attribution-ShareAlike Licence (V3.0). You can see the specific licence statement at the Group's site.
Other Dictionary Files
A number of other dictionary files have been compiled by me and others as adjuncts or spin-offs from the EDICT file. I will list the major ones below. Another summary can be found in the documentation of my WWWJDIC server.
- the KANJIDIC kanji information file. (overview) (download) (documentation) This file has an entry for each of the 6,355 kanji in the JIS X 0208-1990 standard. (The KANJIDIC file was cited as a reference in the New Nelson character dictionary published in 1997.)
- A second file, KANJD212, which covers the 5,801 kanji in the JIS X 0212-1990 standard, has been assembled, and was released early in 1996. (documentation) (download)
- the ENAMDICT/JMnedict files of proper names. These now have over 720,000 names. Downloads: (enamdict.gz) (JMnedict.xml.gz) or see the documentation.
- the COMPDIC file of computing and (tele)communications terminology. Has over 12,000 entries. (documentation) (download)
- the EDICLSD3 (Japanese-English Life Science dictionary), which is the EDICT-format version of a major file produced at Kyoto University by a project group coordinated by Professor Shuji Kaneko.
Software for using the EDICT files
- WWW
There are several WWW options:
- my own WWWJDIC server, which has a number of mirrors in Canada, Japan, the US, etc. (Please note that the WWWJDIC program is not available for download. There is no PC version.)
- the excellent Jisho.org server
- Jeffrey Friedl's server at sites in Canada (site 1), (site 2), the USA, etc.
There are many other WWW-based methods, and a larger list can be found on my online dictionaries page.
A very useful site is Rikai, which massages WWW pages, placing popup translations from EDICT behind the Japanese text. As well, there is a Rikai-based Mozilla Plugin that achieves the same without going to the server. It needs Firefox 0.8.
- Windows (This section is quite out of date.)
While I do not have a lot of direct experience (I don't use Windows much), the following appear to be the options:
- use the JquickTrans program, also available from the Monash ftp site. Despite its name, it is a dictionary client.
- use the old WinJDic program, also available from the Monash ftp site. It has the limitation of not being able to handle more than one dictionary file.
- use the JWPce freeware wordprocessor, also available from the Monash ftp site. It has a good built-in dictionary function. The author, Glenn Rosenthal, has promised a stand-alone dictionary version soon. The older JWP wordprocessor, written by Stephen Chung, is also popular.
- another WP which uses the EDICT file is NJSTAR. NJSTAR comes with an early copy of EDICT. If you want to use a more recent copy, you'll need to create special index files. I think the utilities for this are in the DOS archive of NJSTAR, but I cannot confirm this.
- the Roboword program from Technocraft.
- use the DOS JDIC mentioned below.
All of the above work with just "English" versions of Windows.
- WindowsCE/Windows Mobile
- JWPce also has versions for handhelds.
- A new package is Sven Groot's Pocket Dictionary. It is accompanied by soft input panel software.
- Unix/Linux (X-Windows)
- My own xjdic (V2.4) which is available from the Monash ftp site. It needs to run in a kterm window, and has been used successfully on virtually every type of Unix & Linux system.
- the Gjiten package, which is very nice, and has its own flexible GUI. (Gnome)
- the new gWaei, which aims to be a "dictionary program for the Gnome desktop with support for regular expressions."
- the fast and lightweight Kiten (KDE).
- for Emacs/XEmacs users there is edict.el. I don't know much about it, but I think it is included in the XEmacs rpm. On the Monash ftp site I have some information and the latest release.
- Smartphones
For Android phones there are two main options:
- the AEdict app. This excellent app uses copies of the dictionary files downloaded to the phone, so it works well offline.
- the WWWJDIC for Android app. This uses the WWWJDIC server via its API, and needs network access. It has the advantage of always being up-to-date as the dictionary is expanded and corrected.
For Apple iPhones two options are:
- the Imawa app (formerly Kotoba), which is very highly regarded. It is similar to AEdict, but uses the JMdict data and hence provides all the information from the dictionary.
- the newer EDICT with Grammar app. This uses the common words subset in the old EDICT format.
WWWJDIC itself has a simple mobile phone interface, which I developed for Japanese keitai many years ago.
- Macintosh (This section is quite out of date.)
Mac users have a number of options if they have Japanese support with their OS (I think the support is standard for later versions):
- the old MacJDic V1.3.4, available from the Monash ftp site. It is freeware, but can only handle one dictionary file. MacJDic was written by Dan Crevier, based on xjdic. (I have a sample of some MacJDic screens available.)
- the commercial UniDict package, also developed by Dan Crevier.
- the highly regarded JEDict package developed by Sergey Kurkin.
- the OriDict package.
- the Tensai Japanese language dictionary and study aid package.
- DOS
The two main programs for DOS are:
- Others
There are also programs for Amiga, BeOS, Palm Pilots, etc. Most can be obtained from the Monash ftp site.
There is a Jabber bot that does local EDICT lookups too.
Romaji?
None of the files in the EDICT project use romanized Japanese. I get many requests for a romaji version of EDICT; however, as I do not like romaji and do not want to encourage its use, I will not be producing romaji versions. There is a romaji version dating from 1997 on the Monash ftp site. This was prepared for a blind person who was using a non-Japanese Braille interface. I (foolishly) placed it on my ftp site, and I have had a lot of problems since it was not in step with the main file. That file is now withdrawn, and I am asking all sites carrying copies to withdraw it.
Publications
If you like, you can collect some papers I have written about the project:
- an early technical report from 1993; (postscript)
- an overview paper from 1995; (html) (postscript)
- a 1999 conference paper about WWWJDIC; (postscript) (pdf) (html).
Other useful links can be found on my Japanese Page.
Jim Breen
December 2013
July 2017