UK10K - Data & Methods

Data & Methods

Datasets

See here for a table listing the datasets associated with the flagship UK10K paper that are available in the EGA. These datasets correspond to the data analysed for the main UK10K paper.

See here for a table listing all UK10K datasets available in the EGA. Over the course of the UK10K project, data were released periodically. Releases were generally cumulative, in that samples were added between releases; however, some samples were dropped between releases when stricter QC measures were applied. There were also some follow-up studies that were not included in the main analysis. These include high-coverage WGS for 20 RARE samples and a large exome sequencing follow-up for UK10K_RARE_FIND with 1151 samples. Additionally, a handful of exome samples came in after the final freeze. Data for these are included in their own datasets.

Sites and allele frequencies

A VCF and a tab-delimited file are both available on the Sanger FTP site with sites and allele frequencies for the final UK10K COHORT datasets. Allele frequencies for the UK10K exome studies are only available by obtaining access to the individual exome studies. See the data access page for information about requesting access.
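As a rough illustration of working with a sites-only VCF like this one, the sketch below parses a single VCF data line into its fixed columns and INFO key/value pairs using only the Python standard library. The example record is made up, and the `AC` and `AN` keys are the generic allele count and allele number keys reserved by the VCF specification; whether the UK10K file uses exactly these keys is an assumption. The helper names `parse_info` and `parse_site` are hypothetical.

```python
def parse_info(info: str) -> dict:
    """Split a VCF INFO column into a key -> value dict (flags map to True)."""
    fields = {}
    if info == ".":
        return fields
    for item in info.split(";"):
        if not item:
            continue
        key, sep, value = item.partition("=")
        fields[key] = value if sep else True
    return fields


def parse_site(line: str) -> dict:
    """Parse the eight fixed columns of one tab-separated VCF data line."""
    chrom, pos, vid, ref, alt, _qual, _filt, info = line.rstrip("\n").split("\t")[:8]
    return {
        "chrom": chrom,
        "pos": int(pos),
        "rsid": None if vid == "." else vid,  # "." means no identifier
        "ref": ref,
        "alt": alt.split(","),                # multi-allelic sites list several ALTs
        "info": parse_info(info),
    }


# Made-up example record, NOT taken from the UK10K release.
example = "1\t12345\trs0000001\tA\tG\t.\tPASS\tAC=12;AN=7562;DB"
site = parse_site(example)

# Assumption: AC/AN are present and the site is bi-allelic.
ac = int(site["info"]["AC"])
an = int(site["info"]["AN"])
allele_frequency = ac / an
```

For a real file, the same functions could be applied to every line that does not start with `#`; for large files a dedicated VCF library is likely a better choice than hand-rolled parsing.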

The VCF is annotated with rsIDs from dbSNP138 and the following INFO fields:

FAQ