NCBI’s LocusLink and RefSeq (original) (raw)
Journal Article
,
National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA
Search for other works by this author on:
,
National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA
Search for other works by this author on:
,
National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA
Search for other works by this author on:
National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA
Search for other works by this author on:
Published:
01 January 2000
Cite
Donna R. Maglott, Kenneth S. Katz, Hugues Sicotte, Kim D. Pruitt, NCBI’s LocusLink and RefSeq, Nucleic Acids Research, Volume 28, Issue 1, 1 January 2000, Pages 126–128, https://doi.org/10.1093/nar/28.1.126
Close
Navbar Search Filter Mobile Enter search term Search
Abstract
The NCBI has introduced two new web resources—LocusLink and RefSeq—that facilitate retrieval of gene-based information and provide reference sequence standards. These resources are designed to provide a non-redundant view of current knowledge about human genes, transcripts and proteins. Additional information about these resources is available on the LocusLink web site at http://www.ncbi.nlm.nih.gov/LocusLink/
Received September 2, 1999; Revised and Accepted October 4, 1999.
BACKGROUND
The LocusLink and RefSeq databases were initiated to address data-access problems resulting from significant increases in both sequence data and the number of web sites relating information about genes. For example, it is increasingly difficult to identify unambiguously which sequence—of the many publicly available—is an appropriate, complete representative of a given mRNA or protein. Inversely, given an mRNA or protein sequence, it can also be a challenge to determine the official name or symbol for the gene from which the sequence was derived. And once a gene symbol or name is known, identifying other web resources that include information about that gene of interest may be very time-consuming. In its role as a web directory, LocusLink provides a single point-of-access to a variety of gene-specific information sources including web resources and RefSeq. RefSeq provides a non-redundant data set of reference sequences representing transcripts and proteins of known genes. RefSeq records include links to LocusLink, thereby facilitating making connections among sequence data, gene names and related biological information. The LocusLink and RefSeq resources establish reference sequences and stable database identifiers (LocusID) that can be used in variation, mutation and expression analyses.
SCOPE
LocusLink
LocusLink offers a simple query interface to retrieve information about human genes and some non-gene loci. It supports text-based queries by using official nomenclature provided through collaboration with the Human Gene Nomenclature Committee (HGNC; http://www.gene.ucl.ac.uk/nomenclature/ ) (1), as well as cytogenetic locations, aliases and historical names for both a gene and its products. LocusLink provides direct connections to related information available from several resources at NCBI (Table 1) as well as to external web sites including the Genome Database (GDB; http://gdbwww.gdb.org/ ), the Human Gene Mutation Database (HGMD; http://www.uwcm.ac.uk/uwcm/mg/hgmd0.html ) (2), GeneCard (http://bioinfo.weizmann.ac.il/cards/ ), GeneClinics (http://www.geneclinics.org/ ), and locus- or gene family-specific web sites. Some of the links to NCBI resources listed in Table 1 are represented by icons that, when displayed, give an immediate indication that additional information is indeed available. The goal of the PubMed and GenBank/GenPept (3) links is not to be comprehensive, but to establish sufficient connections to facilitate information retrieval via NCBI’s ENTREZ (4) ‘related sequences’ or ‘related publications’ links or through BLAST (5). LocusLink also provides a unique stable identifier for each locus (LocusID).
RefSeq
Although the goal of RefSeq in general is to provide reference sequences representing chromosomes, transcripts and proteins, discussion here is restricted to the subset of human mRNAs and proteins. A RefSeq record is made for an mRNA if the function of the gene product has been studied, and if the sequence of the complete coding region is known. Separate RefSeq records are made for experimentally supported alternate transcripts and their products. The sequence presented in a RefSeq record is usually derived from available GenBank records, although additional information is at times added from the literature or from communications with the research community. RefSeq records are provided in one of two states, either provisional or reviewed. Records initially released as provisional include much of the annotation from the GenBank record used as the source, but incorporate gene and protein names, PubMed links, summary text, and map and chromosome data from LocusLink when available (Table 2). Provisional records are subjected to a manual curation and review process, with the reviewed record being the end product. The reviewed record might differ from the original provisional record by including: (i) more extensive 5′ and 3′ untranslated regions derived from other GenBank records or the literature, (ii) additional mRNA and/or protein features, (iii) more publications and (iv) a summary text describing the gene. Table 2 lists additional annotation that may be added to provisional and reviewed RefSeq records. RefSeq records can be distinguished from GenBank records by the inclusion of a REFSEQ statement in a COMMENT field, and by the unique format of the accession number. The first three characters of the RefSeq mRNA and protein accession numbers are NM_ and NP_, respectively, followed by six numerals (e.g. NM_000280, NP_000337).
ACCESS
RefSeq records can be retrieved by text word queries (gene or protein names or symbols, accession numbers, etc.) or by sequence homology. LocusLink (see Table 3 for URLs) and ENTREZ both support accessing RefSeq records by text. BLAST-based sequence queries must be done against the nucleotide or protein nr databases. The RefSeq records in a BLAST query result can be readily identified by the ‘ref’ prefix and the distinct accession number format described above. More query details and examples are provided in the LocusLink and RefSeq help and FAQ pages available from the LocusLink home page.
LocusLink and RefSeq records are also freely available on the NCBI FTP site (see Table 3). Note that RefSeq records are not in GenBank and must be downloaded separately.
SEARCHING
Comprehensive descriptions of query strategies and navigation from LocusLink and RefSeq are provided from the LocusLink home page. Please note there are multiple sites within NCBI that include links to LocusLink and RefSeq by specific identifiers. These include Online Mendelian Inheritance in Man (OMIM; http://www.ncbi.nlm.nih.gov/omim/ ), UniGene (http://www.ncbi.nlm.nih.gov/UniGene/ ), GeneMap’99 (http://www.ncbi.nlm.nih.gov/genemap/ ) and dbSNP (http://www.ncbi.nlm.nih.gov/SNP/ ) (6).
MAINTENANCE
LocusLink and RefSeq records are created and maintained by an ongoing process as described by Pruitt et al. (7) and on the LocusLink web site. The LocusLink web pages are currently refreshed weekly. RefSeq records may be modified at any time based either on text changes (nomenclature), or by replacing a provisional record with a reviewed one (maintaining the same accession number, but changing the version number and sequence ID numbers if the sequence data has changed).
CONTACT
Questions, comments and suggestions can be emailed to info@ncbi.nlm.nih.gov . We welcome collaborations with and contributions from the research community.
*
To whom correspondence should be addressed. Tel: +1 301 435 5898; Fax: +1 301 480 9241; Email: pruitt@ncbi.nlm.nih.gov
Table 1.
LocusLink connections to resources at NCBI
Resource | Icona | Information provided |
---|---|---|
dbSNP | V | Variation |
dbSTS | Markers and matching sequences and maps | |
GenBank | G | Sequenced |
OMIMb | O | Descriptions of disorders and genes |
PDB Neighbors | Structures of related proteins | |
PROWc | Description of a protein | |
PubMed | P | Literature citationsd |
RefSeq | R | Reference sequence transcripts and proteins |
UniGene | U | Sequence, related proteins from model organisms, expression (CGAP) and radiation hybrid maps |
Resource | Icona | Information provided |
---|---|---|
dbSNP | V | Variation |
dbSTS | Markers and matching sequences and maps | |
GenBank | G | Sequenced |
OMIMb | O | Descriptions of disorders and genes |
PDB Neighbors | Structures of related proteins | |
PROWc | Description of a protein | |
PubMed | P | Literature citationsd |
RefSeq | R | Reference sequence transcripts and proteins |
UniGene | U | Sequence, related proteins from model organisms, expression (CGAP) and radiation hybrid maps |
aIcons displayed on the LocusLink query result and alphabetic listing pages.
bOnline Mendelian Inheritance in Man.
cProtein Reviews on the Web.
dNot comprehensive.
Table 1.
LocusLink connections to resources at NCBI
Resource | Icona | Information provided |
---|---|---|
dbSNP | V | Variation |
dbSTS | Markers and matching sequences and maps | |
GenBank | G | Sequenced |
OMIMb | O | Descriptions of disorders and genes |
PDB Neighbors | Structures of related proteins | |
PROWc | Description of a protein | |
PubMed | P | Literature citationsd |
RefSeq | R | Reference sequence transcripts and proteins |
UniGene | U | Sequence, related proteins from model organisms, expression (CGAP) and radiation hybrid maps |
Resource | Icona | Information provided |
---|---|---|
dbSNP | V | Variation |
dbSTS | Markers and matching sequences and maps | |
GenBank | G | Sequenced |
OMIMb | O | Descriptions of disorders and genes |
PDB Neighbors | Structures of related proteins | |
PROWc | Description of a protein | |
PubMed | P | Literature citationsd |
RefSeq | R | Reference sequence transcripts and proteins |
UniGene | U | Sequence, related proteins from model organisms, expression (CGAP) and radiation hybrid maps |
aIcons displayed on the LocusLink query result and alphabetic listing pages.
bOnline Mendelian Inheritance in Man.
cProtein Reviews on the Web.
dNot comprehensive.
Table 2.
Enhanced annotation in RefSeq nucleotide records
Sequence record feature | Data source | Content |
---|---|---|
CDS – /product= | reviewer | Preferred protein name |
COMMENT – COMPLETENESS | reviewer | Complete; complete 5′, complete 3′ |
COMMENT – REFSEQ | calculated | Identifies GenBank source sequence(s) |
COMMENT – Summary: | reviewer | Summary of gene and this product |
COMMENT – Transcript Variant | reviewer | Description distinguishing alternate transcripts |
DEFINITION | HGNC | Official gene name and symbola |
gene – /db_xref=LocusID | LocusLink | Unique, stable ID; linked to LocusLink |
gene – /db_xref=MIM: | OMIM | Unique, stable ID; linked to OMIM record |
gene – /gene= | HGNC | Official symbola |
LOCUS | HGNC | Official symbola |
polyA_signal; polyA_site | reviewer | mRNA terminus indicated when sufficient data available |
REFERENCE | HGNC; reviewer | Publication citations of relevance to the gene |
source – /chromosome; /map | LocusLink | Integrated from GDB, OMIM |
Sequence record feature | Data source | Content |
---|---|---|
CDS – /product= | reviewer | Preferred protein name |
COMMENT – COMPLETENESS | reviewer | Complete; complete 5′, complete 3′ |
COMMENT – REFSEQ | calculated | Identifies GenBank source sequence(s) |
COMMENT – Summary: | reviewer | Summary of gene and this product |
COMMENT – Transcript Variant | reviewer | Description distinguishing alternate transcripts |
DEFINITION | HGNC | Official gene name and symbola |
gene – /db_xref=LocusID | LocusLink | Unique, stable ID; linked to LocusLink |
gene – /db_xref=MIM: | OMIM | Unique, stable ID; linked to OMIM record |
gene – /gene= | HGNC | Official symbola |
LOCUS | HGNC | Official symbola |
polyA_signal; polyA_site | reviewer | mRNA terminus indicated when sufficient data available |
REFERENCE | HGNC; reviewer | Publication citations of relevance to the gene |
source – /chromosome; /map | LocusLink | Integrated from GDB, OMIM |
aAn interim gene symbol and name are used if an official symbol/name is not yet available.
Table 2.
Enhanced annotation in RefSeq nucleotide records
Sequence record feature | Data source | Content |
---|---|---|
CDS – /product= | reviewer | Preferred protein name |
COMMENT – COMPLETENESS | reviewer | Complete; complete 5′, complete 3′ |
COMMENT – REFSEQ | calculated | Identifies GenBank source sequence(s) |
COMMENT – Summary: | reviewer | Summary of gene and this product |
COMMENT – Transcript Variant | reviewer | Description distinguishing alternate transcripts |
DEFINITION | HGNC | Official gene name and symbola |
gene – /db_xref=LocusID | LocusLink | Unique, stable ID; linked to LocusLink |
gene – /db_xref=MIM: | OMIM | Unique, stable ID; linked to OMIM record |
gene – /gene= | HGNC | Official symbola |
LOCUS | HGNC | Official symbola |
polyA_signal; polyA_site | reviewer | mRNA terminus indicated when sufficient data available |
REFERENCE | HGNC; reviewer | Publication citations of relevance to the gene |
source – /chromosome; /map | LocusLink | Integrated from GDB, OMIM |
Sequence record feature | Data source | Content |
---|---|---|
CDS – /product= | reviewer | Preferred protein name |
COMMENT – COMPLETENESS | reviewer | Complete; complete 5′, complete 3′ |
COMMENT – REFSEQ | calculated | Identifies GenBank source sequence(s) |
COMMENT – Summary: | reviewer | Summary of gene and this product |
COMMENT – Transcript Variant | reviewer | Description distinguishing alternate transcripts |
DEFINITION | HGNC | Official gene name and symbola |
gene – /db_xref=LocusID | LocusLink | Unique, stable ID; linked to LocusLink |
gene – /db_xref=MIM: | OMIM | Unique, stable ID; linked to OMIM record |
gene – /gene= | HGNC | Official symbola |
LOCUS | HGNC | Official symbola |
polyA_signal; polyA_site | reviewer | mRNA terminus indicated when sufficient data available |
REFERENCE | HGNC; reviewer | Publication citations of relevance to the gene |
source – /chromosome; /map | LocusLink | Integrated from GDB, OMIM |
aAn interim gene symbol and name are used if an official symbol/name is not yet available.
Table 3.
LocusLink and RefSeq URLs
Web page | URL |
---|---|
LocusLink Home | http://www.ncbi.nlm.nih.gov/LocusLink/ |
LocusLink Help Documentationa | http://www.ncbi.nlm.nih.gov/LocusLink/help.html |
LocusLink FAQb | http://www.ncbi.nlm.nih.gov/LocusLink/LLfaq.html |
LocusLink Statistics | http://www.ncbi.nlm.nih.gov/LocusLink/statistics.html |
About RefSeq | http://www.ncbi.nlm.nih.gov/LocusLink/refseq.html |
RefSeq FAQb | http://www.ncbi.nlm.nih.gov/LocusLink/RSfaq.html |
RefSeq Statistics | http://www.ncbi.nlm.nih.gov/LocusLink/RSstatistics |
LocusLink ftp site | ftp://ncbi.nlm.nih.gov/refseq/LocusLink/ |
RefSeq ftp site | ftp://ncbi.nlm.nih.gov/refseq/ |
Web page | URL |
---|---|
LocusLink Home | http://www.ncbi.nlm.nih.gov/LocusLink/ |
LocusLink Help Documentationa | http://www.ncbi.nlm.nih.gov/LocusLink/help.html |
LocusLink FAQb | http://www.ncbi.nlm.nih.gov/LocusLink/LLfaq.html |
LocusLink Statistics | http://www.ncbi.nlm.nih.gov/LocusLink/statistics.html |
About RefSeq | http://www.ncbi.nlm.nih.gov/LocusLink/refseq.html |
RefSeq FAQb | http://www.ncbi.nlm.nih.gov/LocusLink/RSfaq.html |
RefSeq Statistics | http://www.ncbi.nlm.nih.gov/LocusLink/RSstatistics |
LocusLink ftp site | ftp://ncbi.nlm.nih.gov/refseq/LocusLink/ |
RefSeq ftp site | ftp://ncbi.nlm.nih.gov/refseq/ |
aDefinitions of terms and content, query help.
bFAQ: frequently asked questions.
Table 3.
LocusLink and RefSeq URLs
Web page | URL |
---|---|
LocusLink Home | http://www.ncbi.nlm.nih.gov/LocusLink/ |
LocusLink Help Documentationa | http://www.ncbi.nlm.nih.gov/LocusLink/help.html |
LocusLink FAQb | http://www.ncbi.nlm.nih.gov/LocusLink/LLfaq.html |
LocusLink Statistics | http://www.ncbi.nlm.nih.gov/LocusLink/statistics.html |
About RefSeq | http://www.ncbi.nlm.nih.gov/LocusLink/refseq.html |
RefSeq FAQb | http://www.ncbi.nlm.nih.gov/LocusLink/RSfaq.html |
RefSeq Statistics | http://www.ncbi.nlm.nih.gov/LocusLink/RSstatistics |
LocusLink ftp site | ftp://ncbi.nlm.nih.gov/refseq/LocusLink/ |
RefSeq ftp site | ftp://ncbi.nlm.nih.gov/refseq/ |
Web page | URL |
---|---|
LocusLink Home | http://www.ncbi.nlm.nih.gov/LocusLink/ |
LocusLink Help Documentationa | http://www.ncbi.nlm.nih.gov/LocusLink/help.html |
LocusLink FAQb | http://www.ncbi.nlm.nih.gov/LocusLink/LLfaq.html |
LocusLink Statistics | http://www.ncbi.nlm.nih.gov/LocusLink/statistics.html |
About RefSeq | http://www.ncbi.nlm.nih.gov/LocusLink/refseq.html |
RefSeq FAQb | http://www.ncbi.nlm.nih.gov/LocusLink/RSfaq.html |
RefSeq Statistics | http://www.ncbi.nlm.nih.gov/LocusLink/RSstatistics |
LocusLink ftp site | ftp://ncbi.nlm.nih.gov/refseq/LocusLink/ |
RefSeq ftp site | ftp://ncbi.nlm.nih.gov/refseq/ |
aDefinitions of terms and content, query help.
bFAQ: frequently asked questions.
References
1 White,J.A., McAlpine,P.J., Antonarakis,S., Cann,H., Eppig,J.T., Frazer,K., Frezal,J., Lancet,D., Nahmias,J., Pearson,P., Peters,J., Scott,A., Scott,H., Spurr,N., Talbot,C.,Jr and Povey,S. (
1978
)
Genomics
,
45
,
468
–471.
2Cooper,D.N., Ball,E.V. and Krawczak,M. (
1998
)
Nucleic Acids Res.
,
26
,
285
–287.
3Benson,D.A. (
1999
)
Nucleic Acids Res.
,
27
,
12
–17. Updated article in this issue:
Nucleic Acids Res
. (
2000
),
28
,
15
–18.
4 Schuler,G.D., Epstein,J.A., Ohkawa,H. and Kans,J.A. (
1996
)
Methods Enzymol.
,
266
,
141
–162.
5 Altschul,S.F., Madden,T.L., Schaffer,A.A., Zhang,J., Zhang,Z., Miller,W. and Lipman,D.J. (
1997
)
Nucleic Acids Res.
,
25
,
3389
–3402.
6 Sherry,S.T. (
2000
)
Nucleic Acids Res.
,
28
,
352
–355.
7 Pruitt,K.D., Katz,K.S., Sicotte,H. and Maglott,D.R. (
2000
)
Trends Genet.
, in press.
I agree to the terms and conditions. You must accept the terms and conditions.
Submit a comment
Name
Affiliations
Comment title
Comment
You have entered an invalid code
Thank you for submitting a comment on this article. Your comment will be reviewed and published at the journal's discretion. Please check for further notifications by email.
Citations
Views
Altmetric
Metrics
Total Views 1,978
1,542 Pageviews
436 PDF Downloads
Since 12/1/2016
Month: | Total Views: |
---|---|
December 2016 | 2 |
January 2017 | 1 |
February 2017 | 2 |
April 2017 | 1 |
May 2017 | 1 |
June 2017 | 4 |
July 2017 | 1 |
August 2017 | 2 |
September 2017 | 7 |
October 2017 | 1 |
November 2017 | 3 |
December 2017 | 19 |
January 2018 | 8 |
February 2018 | 10 |
March 2018 | 33 |
April 2018 | 19 |
May 2018 | 8 |
June 2018 | 6 |
July 2018 | 23 |
August 2018 | 79 |
September 2018 | 12 |
October 2018 | 32 |
November 2018 | 13 |
December 2018 | 15 |
January 2019 | 11 |
February 2019 | 19 |
March 2019 | 31 |
April 2019 | 27 |
May 2019 | 26 |
June 2019 | 29 |
July 2019 | 34 |
August 2019 | 27 |
September 2019 | 33 |
October 2019 | 25 |
November 2019 | 32 |
December 2019 | 21 |
January 2020 | 44 |
February 2020 | 28 |
March 2020 | 26 |
April 2020 | 25 |
May 2020 | 21 |
June 2020 | 24 |
July 2020 | 24 |
August 2020 | 12 |
September 2020 | 31 |
October 2020 | 30 |
November 2020 | 38 |
December 2020 | 29 |
January 2021 | 33 |
February 2021 | 29 |
March 2021 | 28 |
April 2021 | 38 |
May 2021 | 18 |
June 2021 | 16 |
July 2021 | 14 |
August 2021 | 10 |
September 2021 | 16 |
October 2021 | 22 |
November 2021 | 28 |
December 2021 | 26 |
January 2022 | 15 |
February 2022 | 23 |
March 2022 | 30 |
April 2022 | 15 |
May 2022 | 17 |
June 2022 | 22 |
July 2022 | 33 |
August 2022 | 14 |
September 2022 | 23 |
October 2022 | 22 |
November 2022 | 29 |
December 2022 | 28 |
January 2023 | 16 |
February 2023 | 14 |
March 2023 | 12 |
April 2023 | 19 |
May 2023 | 14 |
June 2023 | 18 |
July 2023 | 15 |
August 2023 | 12 |
September 2023 | 15 |
October 2023 | 18 |
November 2023 | 7 |
December 2023 | 19 |
January 2024 | 31 |
February 2024 | 29 |
March 2024 | 74 |
April 2024 | 83 |
May 2024 | 12 |
June 2024 | 6 |
July 2024 | 22 |
August 2024 | 18 |
September 2024 | 22 |
October 2024 | 4 |
Citations
128 Web of Science
×
Email alerts
Citing articles via
More from Oxford Academic