UniProt
Examples
1. Select all taxa from the UniProt taxonomy
    PREFIX up: <http://purl.uniprot.org/core/>
    SELECT ?taxon
    FROM <http://sparql.uniprot.org/taxonomy>
    WHERE
    {
        ?taxon a up:Taxon .
    }
2. Select all bacterial taxa and their scientific name from the UniProt taxonomy
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    PREFIX taxon: <http://purl.uniprot.org/taxonomy/>
    PREFIX up: <http://purl.uniprot.org/core/>
    SELECT ?taxon ?name
    WHERE
    {
        ?taxon a up:Taxon .
        ?taxon up:scientificName ?name .
        # Taxon subclasses are materialized, do not use rdfs:subClassOf+
        ?taxon rdfs:subClassOf taxon:2 .
    }
3. Select all UniProtKB entries, and their organism and amino acid sequences (including isoforms), for E. coli K12 and all its strains
    PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    PREFIX taxon: <http://purl.uniprot.org/taxonomy/>
    PREFIX up: <http://purl.uniprot.org/core/>
    SELECT ?protein ?organism ?isoform ?sequence
    WHERE
    {
        ?protein a up:Protein .
        ?protein up:organism ?organism .
        # Taxon subclasses are materialized, do not use rdfs:subClassOf+
        ?organism rdfs:subClassOf taxon:83333 .
        ?protein up:sequence ?isoform .
        ?isoform rdf:value ?sequence .
    }
4. Select the UniProtKB entry with the mnemonic 'A4_HUMAN'
    PREFIX up: <http://purl.uniprot.org/core/>
    SELECT ?protein
    WHERE
    {
        ?protein a up:Protein .
        ?protein up:mnemonic 'A4_HUMAN'
    }
5. Select a mapping of UniProtKB to PDB entries using the UniProtKB cross-references to the PDB database
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    PREFIX up: <http://purl.uniprot.org/core/>
    SELECT ?protein ?db
    WHERE
    {
        ?protein a up:Protein .
        ?protein rdfs:seeAlso ?db .
        ?db up:database <http://purl.uniprot.org/database/PDB>
    }
6. Select all cross-references to external databases of the category '3D structure databases' of UniProtKB entries that are classified with the keyword 'Acetoin biosynthesis (KW-0005)'
    PREFIX keywords: <http://purl.uniprot.org/keywords/>
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    PREFIX up: <http://purl.uniprot.org/core/>
    SELECT DISTINCT ?link
    WHERE
    {
        ?protein a up:Protein .
        ?protein up:classifiedWith keywords:5 .
        ?protein rdfs:seeAlso ?link .
        ?link up:database ?db .
        ?db up:category '3D structure databases'
    }
7. Select reviewed UniProtKB entries (Swiss-Prot), and their recommended protein name, that have a preferred gene name that contains the text 'DNA'
    PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
    PREFIX up: <http://purl.uniprot.org/core/>
    SELECT ?protein ?name
    WHERE
    {
        ?protein a up:Protein .
        ?protein up:reviewed true .
        ?protein up:recommendedName ?recommended .
        ?recommended up:fullName ?name .
        ?protein up:encodedBy ?gene .
        ?gene skos:prefLabel ?text .
        FILTER CONTAINS(?text, 'DNA')
    }
8. Select the preferred gene name and disease annotation of all human UniProtKB entries that are known to be involved in a disease
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
    PREFIX taxon: <http://purl.uniprot.org/taxonomy/>
    PREFIX up: <http://purl.uniprot.org/core/>
    SELECT ?name ?text
    WHERE
    {
        ?protein a up:Protein .
        ?protein up:organism taxon:9606 .
        ?protein up:encodedBy ?gene .
        ?gene skos:prefLabel ?name .
        ?protein up:annotation ?annotation .
        ?annotation a up:Disease_Annotation .
        ?annotation rdfs:comment ?text
    }
9. Select all human UniProtKB entries with a sequence variant that leads to a 'loss of function'
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    PREFIX taxon: <http://purl.uniprot.org/taxonomy/>
    PREFIX up: <http://purl.uniprot.org/core/>
    SELECT ?protein ?text
    WHERE
    {
        ?protein a up:Protein .
        ?protein up:organism taxon:9606 .
        ?protein up:annotation ?annotation .
        ?annotation a up:Natural_Variant_Annotation .
        ?annotation rdfs:comment ?text .
        FILTER (CONTAINS(?text, 'loss of function'))
    }
10. Select all human UniProtKB entries with a sequence variant that leads to a tyrosine to phenylalanine substitution
    PREFIX faldo: <http://biohackathon.org/resource/faldo#>
    PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    PREFIX taxon: <http://purl.uniprot.org/taxonomy/>
    PREFIX up: <http://purl.uniprot.org/core/>
    SELECT ?protein ?annotation ?begin ?text
    WHERE
    {
        ?protein a up:Protein ;
            up:organism taxon:9606 ;
            up:annotation ?annotation .
        ?annotation a up:Natural_Variant_Annotation ;
            rdfs:comment ?text ;
            up:substitution ?substitution ;
            up:range/faldo:begin
                [ faldo:position ?begin ;
                  faldo:reference ?sequence ] .
        ?sequence rdf:value ?value .
        BIND (substr(?value, ?begin, 1) as ?original) .
        FILTER(?original = 'Y' && ?substitution = 'F') .
    }
11. Select all UniProtKB entries with annotated transmembrane regions and the regions' begin and end coordinates on the canonical sequence
    PREFIX faldo: <http://biohackathon.org/resource/faldo#>
    PREFIX up: <http://purl.uniprot.org/core/>
    SELECT ?protein ?begin ?end
    WHERE
    {
        ?protein a up:Protein .
        ?protein up:annotation ?annotation .
        ?annotation a up:Transmembrane_Annotation .
        ?annotation up:range ?range .
        ?range faldo:begin/faldo:position ?begin .
        ?range faldo:end/faldo:position ?end
    }
12. Select all UniProtKB entries that were integrated on the 30th of November 2010
    PREFIX up: <http://purl.uniprot.org/core/>
    PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
    SELECT ?protein
    WHERE
    {
        ?protein a up:Protein .
        ?protein up:created '2010-11-30'^^xsd:date
    }
13. Was any UniProtKB entry integrated on the 9th of January 2013?
    PREFIX up: <http://purl.uniprot.org/core/>
    PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
    ASK
    WHERE
    {
        ?protein a up:Protein .
        ?protein up:created '2013-01-09'^^xsd:date
    }
14. Construct new triples of the type 'HumanProtein' from all human UniProtKB entries
    PREFIX taxon: <http://purl.uniprot.org/taxonomy/>
    PREFIX up: <http://purl.uniprot.org/core/>
    CONSTRUCT
    {
        ?protein a up:HumanProtein .
    }
    WHERE
    {
        ?protein a up:Protein .
        ?protein up:organism taxon:9606
    }
15. Select the average number of cross-references to the PDB database of UniProtKB entries that have at least one cross-reference to the PDB database
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    PREFIX up: <http://purl.uniprot.org/core/>
    SELECT (AVG(?linksToPdbPerEntry) AS ?avgLinksToPdbPerEntry)
    WHERE
    {
        SELECT ?protein (COUNT(DISTINCT ?db) AS ?linksToPdbPerEntry)
        WHERE
        {
            ?protein a up:Protein .
            ?protein rdfs:seeAlso ?db .
            ?db up:database <http://purl.uniprot.org/database/PDB> .
        }
        GROUP BY ?protein ORDER BY DESC(?linksToPdbPerEntry)
    }
16. More examples
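Example 10 in the list above depends on SPARQL's SUBSTR counting positions from 1, which matches the 1-based sequence coordinates reported by faldo:position. The Python sketch below mirrors that BIND/FILTER step on a plain string, only to make the indexing explicit. The function names and the toy sequence are hypothetical and are not part of the UniProt data or API.

```python
def sparql_substr(value: str, start: int, length: int) -> str:
    """Mimic SPARQL SUBSTR for in-range arguments: `start` counts from 1."""
    return value[start - 1 : start - 1 + length]


def is_tyr_to_phe(sequence: str, begin: int, substitution: str) -> bool:
    """True when the residue at 1-based position `begin` is Y and it is replaced by F."""
    return sparql_substr(sequence, begin, 1) == "Y" and substitution == "F"


# Hypothetical toy sequence, not a UniProt record.
toy = "MKYLA"
print(is_tyr_to_phe(toy, 3, "F"))  # True: position 3 is Y
print(is_tyr_to_phe(toy, 2, "F"))  # False: position 2 is K
```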
About
This SPARQL endpoint contains all UniProt data. It is free to access and supports the SPARQL 1.1 Standard.
There are 225,657,523,955 triples in this release (2025_02). The query timeout is 45 minutes. All triples are available in the default graph. There are 19 named graphs.
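Because the endpoint follows SPARQL 1.1, a query can in principle be submitted with a plain HTTP GET, as the SPARQL 1.1 Protocol describes. The following is a minimal Python sketch using only the standard library. The endpoint address in `ENDPOINT` is an assumption and should be checked against the UniProt site, and `build_request` and `bindings` are hypothetical helper names introduced here for illustration.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumed endpoint address; verify it before relying on it.
ENDPOINT = "https://sparql.uniprot.org/sparql"

# Example 4 from the list above.
QUERY = """\
PREFIX up: <http://purl.uniprot.org/core/>
SELECT ?protein
WHERE
{
  ?protein a up:Protein .
  ?protein up:mnemonic 'A4_HUMAN'
}"""


def build_request(query: str, endpoint: str = ENDPOINT) -> Request:
    """Build a SPARQL 1.1 Protocol GET request that asks for JSON results."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


def bindings(result: dict, variable: str) -> list[str]:
    """Collect the values bound to one variable in a SPARQL JSON result document."""
    rows = result["results"]["bindings"]
    return [row[variable]["value"] for row in rows if variable in row]


request = build_request(QUERY)
```

To run the query, the request could be passed to `urllib.request.urlopen`, the response decoded with `json.load`, and the decoded document handed to `bindings(result, "protein")`. Given the 45 minute timeout mentioned above, long-running queries may warrant a generous client-side timeout as well.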