Chemistry International
Vol. 23, No. 3
May 2001
New Projects
IUPAC Chemical Identifier (IChI)
IUPAC has approved a project to establish a unique label, the IUPAC Chemical Identifier (IChI), as a non-proprietary identifier for chemical substances that could be used in printed and electronic data sources, thus enabling easier linking of diverse data and information compilations.
IChI will not require the establishment of a registry system. Unlike the CAS Registry System, it will not depend on a database of unique substance records to establish the next number for any new chemical substance being assigned an IChI. Instead, it will use a yet-to-be-defined set of IUPAC structure conventions, together with rules for normalization and canonicalization of the structure representation, to establish the unique label. It will thereby enable automatic conversion of a graphical representation of a chemical substance into the unique IChI label. This conversion can be performed anywhere in the world and could be built into desktop chemical structure drawing packages (such as ChemDraw, ISIS/Draw, etc.) and online chemical structure drawing applets (such as ACD/Draw).
IUPAC would define the process flow leading from input of structural information to the creation of the Identifier in three steps: definition of chemical structure input requirements, algorithms for generating a unique set of atom labels (canonicalization), and algorithms for conversion of these labels into the Identifier (serialization). Structure input and conversion to the structural format required by the IChI generator would be carried out with vendor-developed software.
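To make the canonicalization and serialization steps concrete, the following is a minimal sketch and not the IChI algorithm, which the project had yet to define. It assumes a toy input format (a list of element symbols plus a list of `(atom, atom, bond order)` tuples for heavy atoms only) and a hypothetical label syntax invented for this illustration. Canonicalization is done by brute force: every atom renumbering is tried and the lexicographically smallest label wins, which is only practical for very small molecules.

```python
from itertools import permutations


def serialize(atoms, bonds):
    """Serialize an ordered atom list and bond list into a text label.

    The 'symbols|i-j:order' syntax is a hypothetical format for this sketch.
    """
    atom_part = ",".join(atoms)
    bond_part = ",".join(f"{i}-{j}:{order}" for i, j, order in sorted(bonds))
    return f"{atom_part}|{bond_part}"


def canonical_label(atoms, bonds):
    """Return the smallest label over all atom renumberings (brute force)."""
    n = len(atoms)
    best = None
    for perm in permutations(range(n)):
        # perm[old_index] gives the new index of each atom.
        new_atoms = [None] * n
        for old, new in enumerate(perm):
            new_atoms[new] = atoms[old]
        # Renumber bond endpoints and store each bond with the lower index first.
        new_bonds = [
            (min(perm[i], perm[j]), max(perm[i], perm[j]), order)
            for i, j, order in bonds
        ]
        label = serialize(new_atoms, new_bonds)
        if best is None or label < best:
            best = label
    return best


# Ethanol (heavy atoms only), entered with two different atom orderings.
ethanol_a = (["C", "C", "O"], [(0, 1, 1), (1, 2, 1)])
ethanol_b = (["O", "C", "C"], [(1, 0, 1), (2, 1, 1)])
# Dimethyl ether has the same atoms but different connectivity.
dimethyl_ether = (["C", "O", "C"], [(0, 1, 1), (1, 2, 1)])

print(canonical_label(*ethanol_a) == canonical_label(*ethanol_b))
print(canonical_label(*ethanol_a) == canonical_label(*dimethyl_ether))
```

Because the label depends only on the connectivity and not on the order in which the atoms were drawn, both ethanol inputs yield the same string, while the isomer dimethyl ether yields a different one. No registry lookup is involved at any point.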
The process would be reversible, so that the Identifier output could be used to regenerate structural input information. The Identifier would thus serve as the computer equivalent of the IUPAC name for a molecule.
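Reversibility can be illustrated with a short round-trip sketch. The label syntax used here (comma-separated element symbols, a `|`, then bonds written as `i-j:order`) is a hypothetical toy format assumed for the example, not the IChI syntax, and `parse_label` and `write_label` are illustrative names.

```python
def parse_label(label):
    """Recover the atom list and bond list from a toy label string."""
    atom_part, bond_part = label.split("|")
    atoms = atom_part.split(",")
    bonds = []
    for item in bond_part.split(","):
        if not item:
            continue  # a molecule with no bonds has an empty bond part
        pair, order = item.split(":")
        i, j = pair.split("-")
        bonds.append((int(i), int(j), int(order)))
    return atoms, bonds


def write_label(atoms, bonds):
    """Write the atom list and bond list back out in the same toy format."""
    bond_part = ",".join(f"{i}-{j}:{order}" for i, j, order in bonds)
    return ",".join(atoms) + "|" + bond_part


# Hypothetical label for ethanol (heavy atoms only).
label = "C,C,O|0-1:1,0-2:1"
atoms, bonds = parse_label(label)
print(atoms, bonds)
print(write_label(atoms, bonds) == label)
```

The point of the sketch is that the label carries the full connectivity, so the structural input can be rebuilt from the label alone.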
This arrangement would facilitate searching the Internet and labeling information in electronic documents with the name of the chemical substance in question. A prototype algorithm with limited applicability is expected to be available for testing toward the end of 2001.
Comments from the chemistry community are welcome and should be addressed to the project coordinator, Dr. Alan McNaught, General Manager, Production Division, RSC Publishing, Royal Society of Chemistry, Thomas Graham House, Science Park, Milton Road, Cambridge CB4 0WF, England, UK, Tel.: +44 1223 432119, Fax: +44 1223 420247; E-mail: [email protected]. See http://www.iupac.org/projects/2000/2000-025-1-050.html for project description and update.