Resource Description and Classification

Being a collection of references on matters of Subject Classification, Taxonomies, Ontologies, Indexing, Metadata, Metadata Registries, Controlled Vocabularies, Terminology, Thesauri, and Business Semantics.


A collection of references and a survey, based upon links and cribbings from various resources on the Internet. This is an unfinished and non-authoritative reference document. The references cited here are only incidentally related to XML; the survey was conducted in connection with work on the OASIS Registry and Repository Technical Committee (Fall 1999/Spring 2000).

Contents


Descriptive Cataloging in Libraries

Library of Congress Classification and Subject Headings (LCC/LCSH)

Dewey Decimal Classification (DDC)

Universal Decimal Classification (UDC)

IFLA Section on Classification and Indexing


Standard Industry Taxonomies and Ontologies (Industries, Market Sectors, Products, Services, Functions)

North American Industry Classification System (NAICS)

UNSPSC - United Nations Standard Product and Services Classification

Standard Occupational Classification (SOC)

The Standard Occupational Classification (SOC) system from the U.S. Department of Labor, Bureau of Labor Statistics "will be used by all Federal statistical agencies to classify workers into occupational categories for the purpose of collecting, calculating, or disseminating data. All workers are classified into one of over 820 occupations according to their occupational definition. To facilitate classification, occupations are combined to form 23 major groups, 96 minor groups, and 449 broad occupations. Each broad occupation includes detailed occupation(s) requiring similar job duties, skills, education, or experience. [2003-02 statement]"
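The four nested levels named in the statement above (major group, minor group, broad occupation, detailed occupation) can be illustrated with a small roll-up routine. This is only a sketch: it assumes codes of the form NN-NNNN in which each higher level is obtained by zero-filling trailing digits. That convention is not stated in the passage quoted here, and published SOC editions contain exceptions to it. The function name soc_rollup and the sample code string are hypothetical.

```python
def soc_rollup(detailed_code: str) -> dict[str, str]:
    """Return the enclosing group codes for a detailed SOC-style code.

    ASSUMPTION: codes look like 'NN-NNNN' and roll up by zero-filling
    trailing digits; real SOC editions contain exceptions to this pattern.
    """
    major, _, rest = detailed_code.partition("-")
    if len(major) != 2 or len(rest) != 4 or not (major + rest).isdigit():
        raise ValueError(f"not an NN-NNNN code: {detailed_code!r}")
    return {
        "major_group": f"{major}-0000",            # first two digits
        "minor_group": f"{major}-{rest[0]}000",    # plus the third digit
        "broad_occupation": f"{major}-{rest[:3]}0",  # plus digits four and five
        "detailed_occupation": detailed_code,      # all six digits
    }


# Illustrative code string only; no occupation title is implied.
levels = soc_rollup("29-1022")
for level, code in levels.items():
    print(f"{level:20} {code}")
```

A statistical agency would more likely consult the published structure tables than derive parents from the digits, precisely because of the exceptions noted above.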

Central Product Classification (CPC)

STEPml product identification and classification

The STEPml specification [Revision 1.0 February 7, 2001] addresses the requirements to identify and classify or categorize products, components, assemblies (ignoring their structure) and/or parts. Identification and classification are concepts assigned to a product by a particular organization. This specification describes the core identification capability upon which additional capabilities, such as product structure, are based. Those capabilities are described in other STEPml specifications and their use is dependent upon use of this specification...

GCL (Government Category List) from UK GovTalk e-Government Metadata Framework

Making it easier to find information is a key aim of Information Age Government, and is addressed in part by the e-Government Metadata Framework (e-GMF). The Government Category List (GCL) is a list of headings for use with the Subject element of the e-Government Metadata Standard (e-GMS). It will be seen in applications such as UK Online. Subject metatags drawn from the GCL will make it straightforward for website managers to present their resources in a directory structure using the GCL headings... The GCL is a living document which must evolve if it is to continue to serve the public in a world of changing technology and changing needs. Suggestions for improving it will be welcomed throughout its lifetime. During 2002, updates will be issued at four-monthly intervals.
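The idea that Subject metatags drawn from a controlled list let a website manager present resources as a directory can be sketched as follows. The resource titles and headings below are placeholders invented for illustration, not actual GCL terms, and build_directory is a hypothetical helper rather than part of any e-GMS tooling.

```python
from collections import defaultdict

# Hypothetical resources; the headings are placeholders, not real GCL terms.
resources = [
    {"title": "Page A", "subjects": ["Heading One", "Heading Two"]},
    {"title": "Page B", "subjects": ["Heading Two"]},
    {"title": "Page C", "subjects": ["Heading One"]},
]


def build_directory(items):
    """Group resource titles under each controlled Subject heading."""
    directory = defaultdict(list)
    for item in items:
        for heading in item["subjects"]:
            directory[heading].append(item["title"])
    # Sort headings and titles so the directory listing is stable.
    return {h: sorted(t) for h, t in sorted(directory.items())}


directory = build_directory(resources)
for heading, titles in directory.items():
    print(heading)
    for title in titles:
        print("  -", title)
```

Because every resource draws its Subject values from the same list, the grouping step needs no string matching or synonym handling; that is the practical benefit of a controlled vocabulary.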

NISO Electronic Thesaurus initiative

International Standard Industrial Classification of All Economic Activities (ISIC)

Standard Industrial Classifications (SIC)

PRODCOM

ISO BSR - Basic Semantic Repository

Topic Maps Published Subjects

"A published subject is any subject for which a subject indicator has been made available for public use and is accessible online via a URI. [A subject indicator is a resource that is intended by the topic map author to provide a positive, unambiguous indication of the identity of a subject.]... The general intention behind published subjects is that topic maps interoperability needs non-ambiguous definition of subjects (reified by topics), that should be provided by trustable publishers, in resources available through stable URIs. Those addressable resources, called 'subject definition resources' will provide human-understandable and non-ambiguous definition of subjects, whereas their URIs will provide stable identifiers fit for computer processing, topic maps interoperability and merging, and many other foreseeable semantic applications... Since subject identity forms the basis for merging topic maps and interchanging semantics, authors are encouraged to always indicate the subject identity of their topics in the most robust manner possible, in particular through the use of standardized ontologies expressed as published subject indicators... [approximation, from the specs 2001/2002]"
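The role of subject identity in merging can be illustrated with a minimal sketch: two topics are treated as the same subject when they share at least one subject indicator URI, and their names and indicators are then combined. The data layout (dicts holding sets), the function name merge_topics, and the example.org URIs are all assumptions made for illustration; the Topic Maps specifications define merging in considerably more detail.

```python
def merge_topics(topics):
    """Merge topics that share at least one subject indicator URI.

    Each topic is a dict with 'names' and 'indicators' (sets of strings).
    Merged groups are kept pairwise disjoint in their indicators, so one
    pass over the existing groups per incoming topic is sufficient.
    """
    merged = []
    for topic in topics:
        current = {"names": set(topic["names"]),
                   "indicators": set(topic["indicators"])}
        remaining = []
        for other in merged:
            if current["indicators"] & other["indicators"]:
                # Same subject: absorb the earlier group into this one.
                current["names"] |= other["names"]
                current["indicators"] |= other["indicators"]
            else:
                remaining.append(other)
        remaining.append(current)
        merged = remaining
    return merged


# Hypothetical topics and placeholder URIs.
topics = [
    {"names": {"XML"},
     "indicators": {"http://example.org/psi/xml"}},
    {"names": {"Extensible Markup Language"},
     "indicators": {"http://example.org/psi/xml",
                    "http://example.org/psi/xml-alt"}},
    {"names": {"SGML"},
     "indicators": {"http://example.org/psi/sgml"}},
]

result = merge_topics(topics)
for topic in result:
    print(sorted(topic["names"]), sorted(topic["indicators"]))
```

The sketch also shows why stable URIs from trusted publishers matter: if two authors point at different indicators for the same subject, nothing here will merge their topics.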

References

BizCodes

Universal Data Element Framework (UDEF)

XBRL/AICPA Taxonomy for Commercial and Industrial Companies

DAML-ONT Ontology

The "DARPA Agent Markup Language (DAML)" is a new effort to "help bring the 'semantic web' into being, focusing on the eventual creation of a web logic language. DAML is being designed as an XML-based semantic language that ties the information on a page to machine-readable semantics (ontology). DAML represents joint work between DoD, industry and academia in both the US and the European Community and we hope it will lead to the eventual web standard in this area." The W3C mailing list 'www-rdf-logic@w3.org' hosts a very active discussion on the developing DAML Ontology Language Specification, released in October 2000. Several new resources are available from the project web sites. The DAML Ontology Library provides a summary of submitted ontologies, sortable by URI, Submission Date, Keyword, Open Directory Category, Class, Property, Funding Source, and Submitting Organization.

IEEE Standard Upper Ontology (SUO)

Scope: "This standard will specify the semantics of a general-purpose upper level ontology. An ontology is a set of terms and formal definitions. This will be limited to the upper level, which provides definition for general-purpose terms and provides a structure for compliant lower level domain ontologies. It is estimated to contain between 1000 and 2500 terms plus roughly ten definitional statements for each term. Is intended to provide the foundation for ontologies of much larger size and more specific scope.

(1) The standard will be suitable for automated logical inference to support knowledge-based reasoning applications.

(2) This standard will enable the development of a large (20,000+) general-purpose standard ontology of common concepts to be developed, which will provide the basis for middle-level domain ontologies and lower-level application ontologies.

(3) The ontology will be suitable for 'compilation' to more restricted forms such as XML or database schema. This will enable database developers to define new data elements in terms of a common ontology, and thereby gain some degree of interoperability with other compliant systems.

(4) Owners of existing systems will be able to map existing data elements just once to a common ontology, and thereby gain some degree of interoperability with other representations which are compliant with the SUO.

(5) Domain-specific ontologies which are compliant with the SUO will be able to interoperate (to some degree) by virtue of the shared common terms and definitions.

(6) Applications of the ontology will include:

(a) E-commerce applications from different domains which need to interoperate at both the data and semantic levels.

(b) Educational applications in which students learn concepts and relationships directly from, or expressed in terms of, a common ontology. This will also enable a standard record of learning to be kept.

(c) Natural language understanding tasks in which a knowledge based reasoning system uses the ontology to disambiguate among likely interpretations of natural language statements."
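The claim that mapping existing data elements "just once" to a common ontology yields some degree of interoperability can be made concrete with a small sketch. Each system keeps a single table from its local field names to shared terms, and a record is translated between systems by going through those terms. All field names, shared term names, and the translate function are hypothetical; nothing here is drawn from the SUO drafts themselves.

```python
# Hypothetical mappings from each system's local field names to shared terms.
SYSTEM_A_TO_COMMON = {"cust_nm": "PersonName", "dob": "BirthDate"}
SYSTEM_B_TO_COMMON = {
    "fullName": "PersonName",
    "birthday": "BirthDate",
    "tel": "PhoneNumber",
}


def translate(record, source_map, target_map):
    """Re-key a record from one system's fields to another's via shared terms."""
    # Invert the target mapping: shared term -> target field name.
    common_to_target = {term: field for field, term in target_map.items()}
    out = {}
    for field, value in record.items():
        term = source_map.get(field)              # None if the field is unmapped
        target_field = common_to_target.get(term)
        if target_field is not None:
            out[target_field] = value
    return out


# Placeholder record; 'internal_id' has no shared term and is dropped.
record_a = {"cust_nm": "Example Person", "dob": "1970-01-01", "internal_id": 7}
record_b = translate(record_a, SYSTEM_A_TO_COMMON, SYSTEM_B_TO_COMMON)
print(record_b)
```

With N systems this needs N mappings rather than one per pair of systems, and the dropped field shows why the quoted scope statement promises only "some degree" of interoperability.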

Upper Cyc Ontology

"Cycorp welcomes you to its first major public release: approximately 3,000 terms capturing the most general concepts of human consensus reality. We refer to this as the 'upper Cyc ontology.' The full Cyc knowledge base (KB) includes a vast structure of more specific concepts descending below this upper level. Over the past dozen years, we have also entered into Cyc literally millions of logical axioms -- rules and other assertions -- which specify constraints on the individual objects and classes found in the real world. Further specializations have been developed for our customers, especially in recent years, driven by their application needs..."

News Industry and Metadata Initiatives

IPTC [International Press Telecommunications Council] Subject Reference System

Resource Organisation And Discovery in Subject-based services (ROADS)

Development of a European Service for Information on Research and Education (DESIRE)

Social Science Information Gateway (SOSIG)

Dublin Core Metadata Project

The TEI Header

ACM Computing Classification System (CCS)

Engineering Information Classification Codes (Ei)

AGRICOLA Subject Category Codes


Internet Portals and their Search Interfaces

The subject taxonomies/hierarchies now used in the Yahoo, AltaVista, and Open Directory Project indexes (among others) appear to have been created ad hoc and to change frequently over time. Presumably a large staff is needed to evolve the classification schemes as new categories become relevant to users.


General/Miscellaneous References and Authority Lists

I have not had time to properly organize these references, though some of them link to excellent resources.


Notes

[1] This reference document has been created by someone with minimal experience (and no formal training) in the science of ontology, classification, etc. For this reason, among several, it should not be trusted. There are undoubtedly large gaps in coverage, misunderstandings of concepts, etc. Use it with appropriate caution.
