
StringStore is really a VisibleString. It is used to define very long strings which may need to be stored by the receiving program in special structures, such as a ByteStore, but it is just a hint. AsnTool stores StringStores in ByteStore structures. OCTET STRINGs are also stored in ByteStores by AsnTool.

    typedef struct bsunit {            /* for building multiline strings */
        Nlm_Handle str;                /* the string piece */
        Nlm_Int2 len_avail,
                 len;
        struct bsunit PNTR next;       /* the next one */
    } Nlm_BSUnit, PNTR Nlm_BSUnitPtr;

    typedef struct bytestore {
        Nlm_Int4 seekptr,              /* current position */
                 totlen,               /* total stored data length in bytes */
                 chain_offset;         /* offset in ByteStore of first byte in curchain */
        Nlm_BSUnitPtr chain,           /* chain of elements */
                      curchain;        /* the BSUnit containing seekptr */
    } Nlm_ByteStore, PNTR Nlm_ByteStorePtr;

AsnTool incorporates this as a primitive type, so the definition is here just for completeness:

    StringStore ::= [APPLICATION 1] IMPLICIT OCTET STRING

BigInt is really an INTEGER. It is used to warn the receiving code to expect a value bigger than Int4 (actually Int8). It will be stored in DataVal.bigintvalue. Like StringStore, AsnTool incorporates it as a primitive. The definition would be:

    BigInt ::= [APPLICATION 2] IMPLICIT INTEGER

Date is used to replace the (overly complex) UTCTime and GeneralizedTime of ASN.1. It stores only a date.
  for those unparsed dates
  use this if you can
NOTE: this is NOT a unix tm struct
  full year (including 1900)
  month (1-12)
  day of month (1-31)
  for "spring", "may-june", etc
  hour of day (0-23)
  minute of hour (0-59)
  second of minute (0-59)

Dbtag is generalized for tagging, eg. { "Social Security", str "023-79-8841" } or { "member", id 8882224 }
  name of database or system
  appropriate tag

Object-id can tag or name anything

Person-id is to define a std element for people
  any defined database tag
  structured name
  MEDLINE name (semi-structured) eg. "Jones RM"
  unstructured name
  consortium name

Structured names
  full name eg. "J.
John Smith, Esq" first + middle initials Jr, Sr, III Dr., Sister, etc **** Int-fuzz ********************************************** * * uncertainties in integer values plus or minus fixed amount max to min % plus or minus (x10) 0-1000 some limit value unk - unknown gt - greater than lt - less than tr - space to right of position tl - space to left of position circle - artificial break at origin of circle other - something else **** User-object ********************************************** * * a general object for a user defined structured data item * used by Seq-feat and Seq-descr endeavor which designed this object type of object within class field label required for strs, ints, reals, oss field contents for using other definitions Article Ids can be many ids for an article see types below generic catch all Id from the PubMed database at NCBI Id from MEDLINE Document Object Identifier Controlled Publisher Identifier PubMed Central Id Publisher Id supplied to PubMed Central Publisher Id supplied to PubMed Status Dates points of publication received - date manuscript received for review accepted - accepted for publication epublish - published electronically by publisher ppublish - published in print by publisher revised - article revised by publisher/author pmc - article first appeared in PubMed Central pmcr - article revision in PubMed Central pubmed - article citation first appeared in PubMed pubmedr - article citation revision in PubMed aheadofprint - epublish, but will be followed by print premedline - date into PreMedline status medline - date made a MEDLINE record done as a structure so fields can be added time may be added later Citation Types article in journal or book title of paper (ANSI requires) authors (ANSI requires) journal or book lots of ids Journal citation title of journal Book citation Title of book part of a collection authors Meeting proceedings citation to meeting time and location of meeting Patent number and date-issue were made optional in 
1997 to support patent applications being issued from the USPTO Semantically a Cit-pat must have either a patent number or an application number (or both) to be valid patent citation author/inventor Patent Document Country Patent Document Type Patent Document Number Patent Issue/Pub Date Patent Doc Appl Number Patent Appl File Date Applicants Assignees abstract of patent Patent country code number assigned in that country date of application just to identify a patent Patent Document Country Patent Document Number Patent Doc Appl Number Patent Doc Type letter, thesis, or manuscript same fields as a book Manuscript identifier NOTE: this is just to cite a direct data submission, see NCBI-Submit for the form of a sequence submission citation for a direct submission not necessarily authors of the paper this only used to get date.. will go medium of submission replaces imp, will become required description of changes for public view NOT from ANSI, this is a catchall anything, not parsable medline uid for GenBank style references eg. 
cit="unpublished",title="title" PubMed Id Authorship Group author affiliation Author, Primary or Secondary Author Role Indicator TRUE if corresponding author unparsed string std representation Author Affiliation, Name Author Affiliation, Division Author Affiliation, City Author Affiliation, County Sub Author Affiliation, Country street address, not ANSI Title, Anal,Coll,Mono AJB Title, Subordinate A B Title, Translated AJB Title, Abbreviated J specifically ISO jta J specifically MEDLINE jta J a coden J ISSN J Title, Abbreviated B ISBN B Imprint group date of publication publisher, required for book copyright date, " " " part/sup of volume put here for simplicity for prepublication citations submitted - submitted, not accepted in-press - accepted, not published part/sup on issue retraction info current status of this publication dates for this record retraction of an entry retracted - this citation retracted notice - this citation is a retraction notice in-error - an erratum was published about this erratum - this is a published erratum citation and/or explanation a MEDLINE or PubMed entry regular medline record MEDLINE UID, sometimes not yet available if from PubMed Entry Month article citation MEDLINE records may include the PubMedId publisher - record as supplied by publisher premedline - premedline record TRUE if main point (*) the MeSH term TRUE if main point the subheading medline substance records type of record cas - CAS number ec - EC number CAS or EC number if present name (always present) medline cross reference records type of xref ddbj - DNA Data Bank of Japan carbbank - Carbohydrate Structure Database embl - EMBL Data Library hdb - Hybridoma Data Bank genbank - GenBank hgml - Human Gene Map Library mim - Mendelian Inheritance in Man msd - Microbial Strains Database pdb - Protein Data Bank (Brookhaven) pir - Protein Identification Resource prfseqdb - Protein Research Foundation (Japan) psd - Protein Sequence Database (Japan) swissprot - SwissProt gdb - 
Genome Data Base the citation/accession number Keyed type other - look in line code comment - comment line erratum - retracted, corrected, etc the text reference to a document general or generic unparsed submission medline uid proceedings of a meeting identify a patent manuscript, thesis, or letter to cite a variety of ways PubMedId *** Org-ref *********************************************** * * Reference to an organism * defines only the organism.. lower levels of detail for biological * molecules are provided by the Source object * preferred formal name common name genus/species type name virus names are different hybrid between organisms some hybrids have genus x species name when genus not known attribution of name lineage with semicolon separators genetic code (see CdRegion) mitochondrial genetic code GenBank division code dosage - chromosome dosage of hybrid nat-host - natural host of this specimen gb-acronym - used by taxonomy database gb-anamorph - used by taxonomy database gb-synonym - used by taxonomy database other - ASN5: old-name (254) will be added to next spec attribution/source of name required species required if subspecies used other - level must be set in string ******************************************************************** BioSource gives the source of the biological material for sequences ******************************************************************** biological context natural - normal biological entity natmut - naturally occurring mutant mut - artificially mutagenized artificial - artificially engineered synthetic - purely synthetic to distinguish biological focus lat-lon - +/- decimal degrees collection-date - DD-MMM-YYYY format collected-by - name of person who collected the sample identified-by - name of person who identified the sample fwd-primer-seq - sequence (possibly more than one; semicolon-separated) rev-primer-seq - sequence (possibly more than one; semicolon-separated) attribution/source of this name Root Record for 
Chemical Substance Definition Internal Tracking Information Substance ID/Version [Either valid ID or a "0" dummy value, if "source" is to be used] Note: Version is for internal use (only?) Note: A valid ID is greater than "0" Data Source for this Submission Structure Description Original Deposited Structure Information ID and Version Description Information Unique "Global" ID Note: Must be greater than "0" or, if invalid, "0" Incremented when Depositor updates record Note: For Internal Use (only?) Describes Substance Source, if from another database Individual Submission External DB Submission MMDB Submission (deprecated) External DB Tracking Information Unique Name of External Database Primary Unique ID used by External DB External Database Release Date External Database Release Code/Description Data Submission to same DB by original Author MMDB Source Record detailing specific location or part of an MMDB Record MMDB Record ID Note: Must be greater than "0" or, if invalid, "0" MMDB Molecule ID Note: Must be greater than "0" or, if invalid, "0" Residue ID Note: Must be greater than "0" or, if invalid, "0" Residue Name Atom ID Note: Must be greater than "0" or, if invalid, "0" Atom Name Depositor Provided X-Ref and LinkOut data for Entrez External Database Registry ID Registry Number (e.g., EC Number, CAS Number) MESH Index Term PubMed ID Note: Must be greater than "0" or, if invalid, "0" GenBank General ID (DEPRECATED) Note: Please use protein-gi or nucleotide-gi, if possible Note: Must be greater than "0" or, if invalid, "0" MMDB ID Note: Must be greater than "0" or, if invalid, "0" PubChem Substance ID Note: Must be greater than "0" or, if invalid, "0" PubChem Compound ID Note: Must be greater than "0" or, if invalid, "0" Depositor Source Database Homepage Depositor Homepage for a Substance Depositor Homepage for an Assay GenBank General ID for a Protein (DEPRECATED) Note: Must be greater than "0" or, if invalid, "0" GenBank General ID for a Nucleotide 
(DEPRECATED) Note: Must be greater than "0" or, if invalid, "0" Taxonomy ID for an Organism Note: Must be greater than "0" or, if invalid, "0" PubChem BioAssay ID Note: Must be greater than "0" or, if invalid, "0" MIM, Mendelian Inheritance in Man, Number Note: Must be greater than "0" or, if invalid, "0" Entrez Gene ID Note: Must be greater than "0" or, if invalid, "0" Probe ID Note: Must be greater than "0" or, if invalid, "0" BioSystem ID (DEPRECATED) Note: Must be greater than "0" or, if invalid, "0" Gene Expression Omnibus Series Accession (GEO GSE) ID Note: Must be greater than "0" or, if invalid, "0" Gene Expression Omnibus Sample Accession (GEO GSM) ID Note: Must be greater than "0" or, if invalid, "0" Patent Identifier (e.g., USPTO, EPO, WPO, JPO, CPO) GenBank Accession for a Protein GenBank Accession for a Nucleotide digital object identifier (DOI) citation when PMID or DOI are not available PubChem Pathway accession Compound Record Tracking Information Compound Qualifier (Type/ID) AtomID/Type Information BondID/Type/Atom Information Provided Total Formal Charge (Signed Integer) Counts of various properties Alternate Valence-Bond Forms Qualification used to describe the type of Compound deposited, standardized, or derived. Please note that mixtures/cocktails may be specified using previously deposited substances. 
Compound Qualifier or Type For Compound Depositions deposited - Original Deposited Compound For Standardized Compounds standardized - Standardized Form of a Deposited Compound component - Component of a Standardized Compound neutralized - Neutralized Form of a Standardized Compound For Mixture/Cocktail Depositions mixture - Substance that is a component of a mixture For Theoretical Compounds tautomer - Predicted Tautomer Form pka-state - Predicted Ionized pKa Form unknown - Unknown Compound Type Compound Namespace and ID (absent for "deposited" type compounds) Standardized Compound PubChem Substance (for "mixture" type compounds) PubChem Theoretical Compound Superatom group (e.g. from MOL Sgroup) These enumerated values are adapted from the ctfile format specification Type of group (e.g. from MOL field STY) sup - Superatom mul - Multiple group sru - Structure repeat unit (polymer) mon - Monomer mer - Mer type cop - Copolymer cro - Crosslink mod - Modification gra - Graft com - Component mix - Mixture for - Formulation dat - Data Sgroup any - Any polymer gen - Generic Subtype (e.g. from MOL field SST) alt - Alternating ran - Random blo - Block Connectivity (e.g. from MOL field SCN) hh - Head-to-head ht - Head-to-tail eu - Either unknown Label (e.g. from MOL field SLB) Subscript (e.g. from MOL field SMT) Repeat count (e.g. for polymers) Special bonds in this group (typically capping/crossing bonds, e.g. from MOL field SBL) If present, from and to must be parallel lists of aid from PC-Bonds Bracket display (e.g. from MOL field SDI) Display coordinates for a bracket (e.g. 
from MOL field SDI) Counts of various properties of a Compound Total count of non-Hydrogen (Heavy) Atoms StereoChemistry Counts Total count of (SP3) Chiral Atoms Total count of Defined (SP3) Chiral Atoms Total count of Undefined (SP3) Chiral Atoms Total count of (SP2) Chiral Bonds Total count of (SP2) Defined Chiral Bonds Total count of (SP2) Undefined Chiral Bonds Isotopic Counts Total count of Atoms with Isotopic Information Discrete Structure Counts Total count of covalently-bonded units in the record Number of possible tautomers (Max. 999) List of atom identifiers which are in a common stereochemistry group. All atoms in this group possess the characteristic of the type specified. The convention adopted is intended to be compatible with MDL's Enhanced Stereochemical Representation white paper. An atom can only be member of a single stereo group, and all atoms in a stereo group must have a stereo descriptor. Stereogroups only apply to stereocenters that can have parity. absolute - Absolute configuration is known or - Relative configuration is known (absolute configuration is unknown) and - Mixture of stereoisomers unknown - Unknown configuration type Compound Description/Descriptor Data Universal Resource Name [for Value Qualification] Data Value Boolean or Binary Integer (signed or unsigned) Float or Double String Date Binary Data Bit List (specialized version of Boolean vector) Universal Resource Name Provides explicit source information on derived or calculated data Generic Name or Label for Display [e.g., "Log P"] Qualified Name [e.g., "XlogP"] Specific Data Type of Value [e.g., binary] Implementation Parameter [e.g., "metal=0"] Implementation Name [e.g., "E_XlogP"] Implementation Version [e.g., "3.317"] Implementation Software [e.g., "Cactvs"] Implementation Organization [e.g., "xemistry.com"] NCBI Implementation Release [e.g., "10.25.2005"] URN Data Type Provides the ability to use more specific data types than that directly provided by ASN.1. 
Provides for more specific validation of specified data.
  Basic Data Types
    string - String [maps to a VisibleString]
    stringlist - List of Strings [maps to VisibleString list]
    int - 32-Bit Signed Integer [maps to an INTEGER]
    intvec - Vector of 32-Bit Signed Integer [maps to INTEGER vector]
    uint - 32-Bit Unsigned Integer [maps to an INTEGER]
    uintvec - Vector of 32-Bit Unsigned Integer [maps to INTEGER vector]
    double - 64-Bit Float [maps to a REAL]
    doublevec - Vector of Double [maps to REAL vector]
    bool - Boolean or Binary value [maps to a BOOLEAN]
    boolvec - Boolean Vector [maps to BOOLEAN vector]
  Specialized Data Types
    uint64 - 64-Bit Unsigned Integer (Hex form) [maps to a VisibleString]
    binary - Binary Data Blob [maps to an OCTET STRING]
    url - URL [maps to a VisibleString]
    unicode - Unicode String [maps to a VisibleString]
    date - ISO8601 Date [maps to a Date]
    fingerprint - Binary Fingerprint (Gzip'ped bit list w/ 4-Byte prefix denoting bit list length) [maps to an OCTET STRING]
    unknown - Unknown Data Type [maps to a set of VisibleString]

Coordinates for the Compound of a given type

Drawing/Conformer Definition (in Parallel Arrays, synchronized to aid integer list)
  3D coordinates are specified in a right-handed coordinate system. For 2D plots, Y axis leads upwards.
  Structure Annotations

Coordinate Set Type Distinctions
  twod - 2D Coordinates
  threed - 3D Coordinates (should also indicate units, below)
  submitted - Depositor Provided Coordinates
  experimental - Experimentally Determined Coordinates
  computed - Computed Coordinates
  standardized - Standardized Coordinates
  augmented - Hybrid Original with Computed Coordinates (e.g., explicit H)
  aligned - Template used to align drawing
  compact - Drawing uses shorthand forms (e.g., COOH, OCH3, Et, etc.)
  units-angstroms - (3D) Coordinate units are Angstroms
  units-nanometers - (3D) Coordinate units are nanometers
  units-pixel - (2D) Coordinate units are pixels
  units-points - (2D) Coordinate units are points
  units-stdbonds - (2D) Coordinate units are standard bond lengths (1.0)
  units-unknown - Coordinate units are unknown or unspecified

Drawing Annotations (in Parallel Arrays) [Note: A pair of atoms can have multiple annotations]

Atom-Atom Annotation Information
  crossed - Double Bond that can be both Cis/Trans
  dashed - Hydrogen-Bond (3D Only?)
  wavy - Unknown Stereochemistry
  dotted - Complex/Fractional
  wedge-up - Above-Plane
  wedge-down - Below-Plane
  arrow - Dative
  aromatic - Aromatic
  resonance - Resonance
  bold - Fat Bond (Non-Specific User Interpreted Information)
  fischer - Interpret Bond Stereo using Fischer Conventions
  closeContact - Identification of Atom-Atom Close Contacts (3D Only)
  unknown - Unspecified or Unknown Atom-Atom Annotation

Atom Information (in Parallel Arrays)

Specification of an Association between an Atom Identifier and Source
  Atom Identifier for the R-Group Source. Note: Atom ID's must be greater than "0"
  Atom Specific MMDB Record

Specification of an Association between an Atom Identifier and an Integer Value
  Atom Identifier for the Value. Note: Atom ID's must be greater than "0"
  Value Associated to the ID

Specification of an Association between an Atom Identifier and a String Value
  Atom Identifier for the Value. Note: Atom ID's must be greater than "0"
  Value Associated to the ID

Rudimentary Atom Electronic Configuration Designation
  Atom Identifier for the Value. Note: Atom ID's must be greater than "0"
  Type of Atom Radical
    singlet - Open-Shell Singlet
    doublet - Open-Shell Doublet
    triplet - Open-Shell Triplet
    quartet - Open-Shell Quartet
    quintet - Open-Shell Quintet
    hextet - Open-Shell Hextet
    heptet - Open-Shell Heptet
    octet - Open-Shell Octet
    none - Closed-Shell Singlet

Element Information [which may contain "illegal" element values]
  Illegal Atom Numbers that may be Interpreted to be something else
    a - Unspecified Atom (Asterisk)
    d - Dummy Atom
    r - Rgroup Label
    lp - Lone Pair
  Elements
    h

Bond Description Information (in Parallel Arrays)

Bond Type Information
  single - Single Bond
  double - Double Bond
  triple - Triple Bond
  quadruple - Quadruple Bond
  dative - Dative Bond
  complex - Complex Bond
  ionic - Ionic Bond
  unknown - Unknown/Unspecified Connectivity

Allowed Stereogenic Center Types [Using IUPAC Stereogenic Center recommendations and terminology]
  Tetrahedral (SP3) StereoCenter
  Planar (SP2) StereoCenter
  Square Planar (SP4) StereoCenter
  Octahedral (OC-6) / Square Pyramid (SPY-5) StereoCenters
  Trigonal BiPyramid (TBPY-4 and TBPY-5) StereoCenters
  T-Shaped (TS-3) StereoCenters
  Pentagonal BiPyramid (PBPY-7) StereoCenters

SP3 Tetrahedral StereoCenter, Trigonal Pyramid Stereogenic Center, Cumulenic StereoCenter (Linear systems of an even number of double bonds), or Hindered biaryl stereocenter (all biaryls have rotation hindered to some extent, because the ortho-hydrogens prevent coplanarity) [Using IUPAC Stereogenic Center recommendations and terminology] [Note: "-1" can be used for the Atom Identifier to represent a lone-pair or implicit hydrogen]
  Atom Identifier of Atom Center. Note: Atom ID's must be greater than "0"
  Atom Identifier of Atom Above the Plane. Note: Atom ID's must be greater than "0"
  Atom Identifier of Atom In-Plane and at the Top. Note: Atom ID's must be greater than "0"
  Atom Identifier of Atom In-Plane and at the Bottom. Note: Atom ID's must be greater than "0"
  Atom Identifier of Atom Below the Plane. Note: Atom ID's must be greater than "0"
  StereoCenter Designation
  Type of StereoCenter, Tetrahedral, if not specified
    tetrahedral - Tetrahedral StereoCenter
    cumulenic - Cumulenic StereoCenter
    biaryl - Biaryl StereoCenter

SP2 Planar Stereogenic Center, Cumulenic StereoCenter (Linear systems of an odd number of double bonds present planar stereochemistry) [Using IUPAC Stereogenic Center recommendations and
terminology] [Note: "-1" can be used for the Atom Identifier to represent a lone-pair or implicit hydrogen] Atom ID of Left Double Bond Atom Note: Atom ID's must be greater than "0" Atom ID of Top Atom attached to the Left Double Bond Atom Note: Atom ID's must be greater than "0" Atom ID of Bottom Atom attached to the Left Double Bond Atom Note: Atom ID's must be greater than "0" Atom ID of Right Double Bond Atom Note: Atom ID's must be greater than "0" Atom ID of Top Atom attached to the Right Double Bond Atom Note: Atom ID's must be greater than "0" Atom ID of Bottom Atom attached to the Right Double Bond Atom Note: Atom ID's must be greater than "0" StereoCenter Designation Type of StereoCenter, SP2 Planar, if not specified planar - SP2 Planar StereoCenter cumulenic - Cumulenic StereoCenter Square Planar (SP4) StereoCenters [Using IUPAC Stereogenic Center recommendations and terminology] [Note: "-1" can be used for the Atom Identifier to represent a lone-pair or implicit hydrogen] Atom ID of Atom Center Note: Atom ID's must be greater than "0" Atom ID of Left Below Plane Atom Note: Atom ID's must be greater than "0" Atom ID of Right Below Plane Atom Note: Atom ID's must be greater than "0" Atom ID of Left Above Plane Atom Note: Atom ID's must be greater than "0" Atom ID of Right Above Plane Atom Note: Atom ID's must be greater than "0" StereoCenter Type u-shape - U shaped isomer (labove-lbelow-rbelow-rabove) z-shape - Z shaped isomer (labove-rabove-lbelow-rbelow) x-shape - X shaped isomer (labove-rbelow-rabove-lbelow) any - Nonspecific mixture of isomers Octahedral (OC-6) and Square Pyramid (SPY-5) StereoCenters [Using IUPAC Stereogenic Center recommendations and terminology] [Note: "-1" can be used for the Atom Identifier to represent a lone-pair or implicit hydrogen] Atom ID of Atom Center Note: Atom ID's must be greater than "0" Atom ID of Atom In-Plane and at the Top Note: Atom ID's must be greater than "0" Atom ID of Atom In-Plane and at the Bottom Note: 
Atom ID's must be greater than "0" Atom ID of Atom Above the Plane on the Left Note: Atom ID's must be greater than "0" Atom ID of Atom Below the Plane on the Left Note: Atom ID's must be greater than "0" Atom ID of Atom Above the Plane on the Right Note: Atom ID's must be greater than "0" Atom ID of Atom Below the Plane on the Right Note: Atom ID's must be greater than "0" Trigonal BiPyramid (TBPY-4 and TBPY-5) StereoCenters [Using IUPAC Stereogenic Center recommendations and terminology] [Note: "-1" can be used for the Atom Identifier to represent a lone-pair or implicit hydrogen] Atom ID of Atom Center Note: Atom ID's must be greater than "0" Atom ID of Atom Above the Plane Note: Atom ID's must be greater than "0" Atom ID of Atom Below the Plane Note: Atom ID's must be greater than "0" Atom ID of Atom In-Plane and at the Top Note: Atom ID's must be greater than "0" Atom ID of Atom In-Plane and at the Bottom Note: Atom ID's must be greater than "0" Atom ID of Atom In-Plane and to the Right Note: Atom ID's must be greater than "0" T-Shaped (TS-3) StereoCenters [Using IUPAC Stereogenic Center recommendations and terminology] [Note: "-1" can be used for the Atom Identifier to represent a lone-pair or implicit hydrogen] Atom ID of Atom Center Note: Atom ID's must be greater than "0" Atom ID of Atom In-Plane and at the Top Note: Atom ID's must be greater than "0" Atom ID of Atom In-Plane and at the Bottom Note: Atom ID's must be greater than "0" Atom ID of Atom Above the Plane Note: Atom ID's must be greater than "0" Pentagonal BiPyramid (PBPY-7) StereoCenters [Using IUPAC Stereogenic Center recommendations and terminology] [Note: "-1" can be used for the Atom Identifier to represent a lone-pair or implicit hydrogen] Atom ID of Atom Center Note: Atom ID's must be greater than "0" Atom ID of Atom In-Plane and at the Top Note: Atom ID's must be greater than "0" Atom ID of Atom In-Plane and at the Bottom Note: Atom ID's must be greater than "0" Atom ID of Atom In-Plane 
and at the Left Note: Atom ID's must be greater than "0" Atom ID of Atom Above the Plane on the Left Note: Atom ID's must be greater than "0" Atom ID of Atom Below the Plane on the Left Note: Atom ID's must be greater than "0" Atom ID of Atom Above the Plane on the Right Note: Atom ID's must be greater than "0" Atom ID of Atom Below the Plane on the Right Note: Atom ID's must be greater than "0" Container for Data Depositions and Assay Definitions Assay Description or pre-existing Identifier Assay Identifier External Assay Identifier Assay Description (new or updated) Assay Identifier/Version (for internal use) Container for multiple Assay Result Sets PC-AssayResultsSet ::= SEQUENCE OF PC-AssayResults Assay Results provided for a given Substance tested, with respect to the results types defined in the referenced Assay Description Internal/External Tracking Information Tested Substance ID/Version [Either valid ID or, if "sid-source" is used, this is a "0" value] Note: A valid ID is greater than "0" External Identifier for this Substance Note: May be used in-lieu of "sid" Note: This is non-optional if "sid" is "0" Version identifier for this AID-SID Result Note: Incoming data should set this to be "0" Data Annotation/Qualifier and URL to further Depositor Information Annotation or qualifier for this Result Assay Result Data for this Sample Note: Users need populate only those "tid"s, for which there is data, in any order. 
Assay Outcome
  inactive - Substance is considered Inactive
  active - Substance is considered Active
  inconclusive - Substance is Inconclusive
  unspecified - Substance Outcome is Unspecified
  probe - Substance Outcome is Unspecified
Rank of Assay Outcome (for result ordering). Note: Larger numbers are more active
Depositor provided URL for this Result
PubChem Release Date

Assay Readouts/Results for a Tested Substance
  Assay Result Field Type ID (TID). Note: Result Field ID's must be greater than "0"
  Assay Result, must be the same type as defined for TID

Assay Description provided by an Organization that describes the assay/protocol performed and defines the measured end-points and parameters to be stored. An Assay Description is not a database table. You can define as many Result Definitions as needed and they need not be used by all Substances tested. Both the description text and the Result Definitions of an Assay Description can be modified after initial submission as desired, and such updates will be tracked in PubChem.

Internal/External Tracking Information
  Assay Description ID/Version [Either valid ID or, if "aid-source" is used, a "0" dummy value]
  Note: Version is for internal use (only?)
Note: A valid ID is greater than "0" External Identifier for this Assay Description Note: May be used in-lieu of "aid" Note: This is non-optional if "aid" ID is "0" Assay Description Information Short Assay Name (for display purposes) Additional Information pub SEQUENCE OF Pub OPTIONAL, Depositor provided publications for this assay (never used) Revision identifier for textual description Assay Outcome Qualifier other - All Other Type screening - Primary Screen Assay confirmatory - Confirmatory Assay summary - Probe Summary Assay to distinguish projects funded through MLSCN, MLPCN or other mlscn - assay depositions from MLSCN screen center mlpcn - assay depositions from MLPCN screen center mlscn-ap - assay depositions from MLSCN assay provider mlpcn-ap - assay depositions from MLPCN assay provider journal-article - to be deprecated and replaced by option 7, 8 & 9 assay-vendor - assay depositions from assay vendors literature-extracted - data from literature, extracted by curators literature-author - data from literature, submitted by author of articles literature-publisher - data from literature, submitted by journals/publishers rnaigi - RNAi screenings from RNAi Global Initiative Definition for Categorized description/comment This field is added to provide flexibility for depositors to present textual description/comments in a desirable way and to facilitate information validation by the depositor and data exchange with PubChem title for the description/comment Assay Dose-response attribute information used to define a set of readouts as being part of a dose-response curve (for curve plotting/analysis) Unique dose-response test set identifier Note: A valid ID is greater than "0" Dose-Response Curve Description (used as curve title) Dose Axis Description (used as axis name) Response Axis Description (used as axis name) experimental - dose-response data points measured directly by experiment calculated - dose-response data points derived from fitted curve Molecular 
target information provided by the organization describes the functionality of the target and facilitates linking between PubChem bioassays, and between the target molecule and other NCBI resources
  Molecular name of target
  database and identifier of the target molecule
  target is a NCBI Gene ID
  target is a NCBI Protein Accession
  target is a NCBI Nucleotide Accession
  target is beyond supported type (format = TYPE::RESOURCE::IDENTIFIER)
  target is a NCBI Taxonomy ID
  Target Organism
  Target Description (e.g., cellular functionality and location)

Annotated Cross-Reference (XRef) Information to allow the XRef to be qualified, as to its meaning or context
  Cross-Reference Information
  Annotation qualifier describing Cross-Reference meaning
    pcit - primary PMID/citation directly associated with the current assay data
    pgene - gene encoding the protein assay target

Definition of Allowed Result Types for a given Assay
  Tracking or Description Information
  Assay Result Field Type ID (TID)
  Result Field Name (short name for display)
  Result Data Type and Validation Information
  Result Data Type
  Allowed Values, used for validating incoming data
  If type is "float"
  Allowed values (x) must be [ fmin <= x ]
  Allowed values (x) must be [ x <= fmax ]
  Minimum/Maximum Range [ min <= x <= max ]
  Allowed values (x) must be [ imin <= x ]
  Allowed values (x) must be [ x <= imax ]
  Minimum/Maximum Range [ min <= x <= max ]

Unit information provides the units for the values reported for this TID. For example, if the values reported for this TID are a concentration, e.g., micro-molar, setting the unit "um" allows PubChem to know that the value, e.g., "1.3", is actually "1.3 uM". This also allows PubChem to properly report the units when displaying the reported values for this TID. If the enumerated units provided below are insufficient, you may represent the units as a string in the optional "sunit" field (see below).
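
As a rough illustration of the unit paragraph above, the following Python sketch shows how a consumer might attach a unit label to a reported value. It is not PubChem code: the function name `format_value`, the `UNIT_LABELS` table, and every label other than "um" -> "uM" (the example given in the text) are assumptions made for this sketch.

```python
# Hypothetical display labels keyed by enumerated unit code.
# Only "um" -> "uM" comes from the text; the other entries are assumed.
UNIT_LABELS = {"um": "uM", "nm": "nM", "mm": "mM"}


def format_value(value, unit=None, sunit=None):
    """Return a display string for a value reported under one TID.

    `unit` is an enumerated unit code; `sunit` is the free-text fallback
    string used when no enumerated unit fits.
    """
    if unit is not None and unit in UNIT_LABELS:
        return f"{value} {UNIT_LABELS[unit]}"
    if sunit:
        return f"{value} {sunit}"
    return str(value)


print(format_value(1.3, unit="um"))
```

Checking the enumerated unit first and only then falling back to the free-text string mirrors the order of preference the paragraph describes.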
Units for Value
ppt - Parts per Thousand
ppm - Parts per Million
ppb - Parts per Billion
mm - milliM
um - microM
nm - nanoM
pm - picoM
fm - femtoM
mgml - milligrams per mL
ugml - micrograms per mL
ngml - nanograms per mL
pgml - picograms per mL
fgml - femtograms per mL
m - Molar
percent - Percent
ratio - Ratio
sec - Seconds
rsec - Reciprocal Seconds
min - Minutes
rmin - Reciprocal Minutes
day - Days
rday - Reciprocal Days
ml-min-kg - milliliter / minute / kilogram
l-kg - liter / kilogram
hr-ng-ml - hour * nanogram / milliliter
cm-sec - centimeter / second
mg-kg - milligram / kilogram

ATTENTION: sunit field is DEPRECATED. It is no longer supported and remains for legacy data only.
Unit Type (as a String)

Value Transform information qualifies the values reported for this TID. For example, if the values reported for this TID are "-Log10 GI50", you may want to consider setting the "nlog" value below. In doing so, PubChem would know that the value, e.g., "5.0" is actually "1.0e-5". If the transformation applied is not listed, you may represent this transformation as a string in the "stransform" field (see below) for eventual inclusion in the enumerated transform list below.

ATTENTION: transform field is DEPRECATED. It is no longer supported and remains for legacy data only.
Value Type Details
linear - Linear Scale (x)
ln - Natural Log Scale (ln x)
log - Log Base 10 Scale (log10 x)
reciprocal - Reciprocal Scale (1/x)
negative - Negative Linear Scale (-x)
nlog - Negative Log Base 10 Scale (-log10 x)
nln - Negative Natural Log Scale (-ln x)

ATTENTION: stransform field is DEPRECATED. It is no longer supported and remains for legacy data only.
stransform VisibleString OPTIONAL, Value Transform Type as a string (never used)

Tested concentration attribute
if true, indicates that this TID field provides an active concentration summary by reporting the concentration which produces 50% of the maximum possible biological response such as IC50, EC50, AC50, GI50 etc.
or by reporting constant parameters such as Ki, on the basis of which the activity outcome in this assay is called
endpoint qualifier (e.g. <, <=, =, >, >=) associated with the ac field above

pmid - PubMed ID
mmdb - MMDB ID
url - indicates that TID data is a URL that provides supplementary information
protein-gi (4), GenBank General ID (GI) for a Protein
nucleotide-gi (5), GenBank General ID (GI) for a Nucleotide
taxonomy - Taxonomy ID for an Organism
mim - MIM, Mendelian Inheritance in Man, ID
gene - Entrez Gene ID
probe - Entrez Probe ID
aid - PubChem BioAssay ID, may be used in 'Summary' assay
sid - PubChem Substance ID, may be used in 'Summary' assay
cid - PubChem Compound ID
protein-target-gi (13), GenBank General ID (GI) for a Protein target
biosystems-target-id (14), NCBI BioSystems ID
target-name - target name
target-descr - brief target description
target-tax-id - NCBI Taxonomy ID for target molecule
gene-target-id - NCBI Gene ID for a gene target
dna-nucleotide-target-gi (19), GenBank General ID (GI) for a DNA Nucleotide target
rna-nucleotide-target-gi (20), GenBank General ID (GI) for a RNA Nucleotide target
protein-target-accession - GenBank Accession for a Protein target
nucleotide-target-accession - GenBank Accession for a DNA/RNA Nucleotide target
other - for identifier types not currently supported

The concentration attribute indicates that the readout under this test result field is biological concentration-response data; the attribute provides the value and unit of the tested concentration
Units for Concentration
um - microM
Dose-Response Attribution ID (if applicable)

Minimum and Maximum Constraints on an Integer Value (used for validating incoming data)
Minimum Value Allowed
Maximum Value Allowed

Minimum and Maximum Constraints on a Real Value (used for validating incoming data)
Minimum Value Allowed
Maximum Value Allowed
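
The minimum and maximum constraints above amount to an inclusive range check on incoming values. The following is a minimal Python sketch of that check, not PubChem's actual validation code. The Range class and its field names min_value and max_value are hypothetical, chosen to echo "Minimum Value Allowed" and "Maximum Value Allowed". Treating a missing bound as unbounded is an assumption that lets one class also cover the one-sided forms [ min <= x ] and [ x <= max ].

```python
from dataclasses import dataclass
from typing import Optional, Union

Number = Union[int, float]


@dataclass
class Range:
    """Hypothetical inclusive bounds; None means that side is unbounded."""

    min_value: Optional[Number] = None
    max_value: Optional[Number] = None

    def allows(self, x: Number) -> bool:
        # Reject values below the lower bound, if one is set.
        if self.min_value is not None and x < self.min_value:
            return False
        # Reject values above the upper bound, if one is set.
        if self.max_value is not None and x > self.max_value:
            return False
        return True


percent_range = Range(min_value=0, max_value=100)
print(percent_range.allows(42))      # True
print(percent_range.allows(100.5))   # False
print(Range(min_value=0).allows(7))  # True
```

Both bounds are inclusive, matching the "<=" comparisons in the constraint notation, so a value equal to either bound is accepted.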