
CATGO - Cambridge Translational Genomics


Sample submission guidelines

Clinical Next Generation Sequencing Technologies for the NIHR BioResource

The NIHR BioResource Sequencing and Informatics Committee

The below document can also be downloaded from the files section of the website.


The NIHR BioResource - Rare Diseases is an integral part of the NIHR BioResource and has been awarded NIHR funding for the clinical application of next generation sequencing techniques (NGST). The application of NGST will be mainly used for the following categories of activities:

  1. to reduce the delay in ascertaining a genetic diagnosis for inherited and acquired genetic disorders (including rare cancers), where the genotype causing phenotype is known, by developing NGST-based diagnostic tests covering NHS diagnostically-important genes; such projects can include translational projects on e.g. a subset of diagnostic genes,
  2. to determine the genetic basis of Inherited Rare Diseases, including rare cancers for which the causative locus has hitherto not been identified, but which have potential wider relevance for the common diseases that are the focus of Biomedical Research Centres/Units (BRC/BRU)-funded translational and experimental medicine research, and
  3. to characterise induced pluripotent stem cell (iPSC) clones, particularly those established from patients with Rare Diseases for which novel causative genes have been discovered.

    For the purpose of this document, Rare Diseases are defined as conditions with an incidence of less than 5 in 10,000 individuals of the UK population (see the Rare Diseases website for further information).

    Sequencing of exomes, genomes and other sequencing applications

    The NGST capacity for Rare Diseases has been mapped to (i) exome sequencing (Exome-seq) and (ii) whole genome sequencing to clinical standard (30x coverage), but access to other platforms, e.g. RNA-seq, can also be requested. The NIHR BioResource NGST funding cannot be used for basic cell biology research (e.g. ChIP-seq) in the field of Rare Diseases. With the ongoing reduction in sequencing costs, Exome-seq is expected to be increasingly replaced by sequencing of the entire genome, and this switch-over is expected to happen in part in 2014.

    How to get access to centrally funded NGST capacity

    Requests for access to centrally funded NIHR BioResource NGST capacity for the above three categories of activities can be made by any group of BRC/BRU researchers who have agreed to work across at least three BRCs/BRUs participating in the NIHR BioResource. The NIHR BioResource Sequencing and Informatics Committee (SIC) can offer advice to ensure the maximum research, clinical and BRC/BRU benefits. Sequencing of samples funded by the NIHR BioResource can ONLY be performed through the Cambridge Translational GenOmics laboratory (CATGO). To apply, email the completed "NGST-project" form to CATGO for submission to the SIC (the form can be downloaded here).

    Requests for access to NIHR BioResource funded NGST capacity will be reviewed by the SIC. Proposals for NGST capacity should be ambitious and central to the experimental medicine strategy of the BRCs/BRUs, and are expected to result in publications in high impact journals. Proposals can be based on existing national and, under certain circumstances, international collections of DNA samples, or on new collections still to be established or expanded via the NIHR BioResource - Rare Diseases. If a proposed NGST project falls within the aims and objectives of the NIHR BioResource - Rare Diseases (as endorsed by the NIHR BioResource Steering Committee) then the enrolment office of the NIHR BioResource - Rare Diseases (BRODO) will provide support in establishing new collections or expanding existing ones. Where possible, DNA samples to be sequenced with NIHR BioResource funds should have been consented under the NIHR BioResource Stage 1 consent for Rare Diseases, but exemptions may be considered on a study-by-study basis.

    Criteria used to assess applications

    The following criteria will be used by the NIHR BioResource SIC when considering a Category (b) or (c) application (the second and third categories of activities listed above):

    • The study is ambitious, internationally competitive and aligned with the disease categories of interest to the NIHR BioResource,
    • The disorder is rare and there is an existing collection or a credible proposal to establish one,
    • The application is of a collaborative nature and enrolment will be across at least three BRCs/BRUs with a clear plan to reach out across the NIHR Translational Research Collaboration for Rare Diseases and the wider NHS,
    • The postulated genetic architecture of the disorder(s) is amenable to discovery of causative variants by Exome-seq or other NGS techniques,
    • The requesting team is able to provide feedback of Pertinent Findings and offer segregation analysis and pedigree counselling (feedback of Pertinent Findings to the Clinical Care Team is current policy; Incidental Findings will NOT be returned to the Clinical Care Team).

    • Existing collections will be assessed against the following criteria:
      • Has the collection been established across at least three BRCs/BRUs or does it concern a national collection,
      • Are the DNA samples of adequate quality and quantity and curated to NHS or similar (in case of samples from overseas) standards,
      • Samples have been obtained with consent to sequence the genome seeking genetic changes relevant to the submitted phenotype,
      • The study team agrees that cases can be re-contacted to seek their consent for enrolment into the NIHR BioResource - Rare Diseases,
      • Information about clinical phenotype and laboratory parameters is available and applicants agree to transfer this information in its entirety to the NIHR BioResource Phenotype database for Rare Diseases,
      • The requesting team agrees to provide clinical manpower to transform the phenotypic information into Human Phenotype Ontology (HPO) terms where applicable. The clinical study team is expected to engage in HPO term definition.

    • New collections
      • New collections have to be established via the NIHR BioResource for Rare Diseases using the universal Participant Information Leaflet and Consent Form (a modified form will be generated for patients with psychiatric disorders).

    For inherited Rare Diseases the power to discover the causative locus underlying autosomal recessive disorders is far greater than for autosomal dominant ones. Informative and large pedigrees remain attractive but are not essential. The initial approach to gene discovery is to sequence unrelated index cases from different pedigrees. Additional power of discovery can be achieved by Exome-seq of two cases per pedigree, but there is only minimal gain of power from sequencing more than two affected cases per pedigree (this holds for all inheritance patterns, although the two cases must be chosen wisely). All samples of a pedigree can be enrolled into the NIHR BioResource, and samples can be selected for sequencing afterwards using tools developed on the BRIDGE project website. For example, if a "de-novo" germline mutation is assumed to be causative, then sequencing the DNA samples from the propositus as well as the parents can be highly informative.

    Category (a) proposals

    Category (a) applications will be assessed against an additional set of criteria. Ideally, but not necessarily, Category (a) applications are aligned with approved and successful Category (b) projects. The main purpose of Category (a) projects is to modernize the genetic-based diagnosis of a group of related Rare Diseases. There are an estimated 7000 Rare Diseases of man and the genetic basis of about half has been resolved. It is hoped that the sequence variants for the remaining 3500 diseases will be identified by 2020. For a large number of Rare Diseases there is a considerable diagnostic delay because candidate gene-based diagnostic approaches are generally sequential and the costs of Sanger sequencing are prohibitively high, preventing routine uptake by the wider NHS. For certain Category (a) projects the chosen NGST approach may have to be complemented, for the time being, by a high-density SNP array to detect the gene-to-exon sized causative structural variants that cannot yet be revealed by Exome-seq (although new algorithms are under development).

    More affordable and rapid diagnosis can be achieved by the application of NGST together with SNP arrays for a group of candidate genes (100-300) that are associated with a certain clinical category of rare disorders (e.g. X-linked learning disability, retinitis pigmentosa, cardiac malformation and cardiomyopathies, bleeding and platelet disorders). Costs per sample tested can be controlled by pooling DNA samples from different patients. Category (a) projects may only be successful if developed in partnership with commercial companies (e.g. Agilent, Roche/Nimblegen), and projects should lead to the transfer of validated tests into routine service delivery for the NHS.

    The following criteria will be used for the assessment of Category (a) applications:

    • Current knowledge about causative mutations/variants and number of candidate genes,
    • Aggregated incidence of the group of Rare Diseases and expected diagnostic case load,
    • Will DNA-based diagnosis inform/improve clinical management,
    • Can the project be completed before the scientific assessment during the site visit for the NIHR BRC funding 2012-2017 period,
    • Will clinical manpower be provided to drive the project forward, e.g. in partnership with the NIHR BioResource team for Rare Diseases,
    • Is phenotype information available and do applicants agree to transfer the phenotype information to the NIHR BioResource Phenotype database,
    • The requesting team agrees to provide clinical manpower to transform the phenotypic information into Human Phenotype Ontology (HPO) terms,
    • Is the case sample collection large enough to have adequate power to determine sensitivity and specificity of the NGST platform of choice,
    • Is the clinical team able to make resources available to curate the candidate genes and append published information about causative variants. This requires bioinformatics manpower with a review/approval process and sign-off by the clinical-genetics team.

    Overseas samples

    Samples from overseas cases can be included but the aggregated number is not to exceed 30% of the total number of samples to be sequenced. The SIC may grant permission for a higher percentage on a case-by-case basis (depending on capacity availability and rarity of disorder). If the committee agrees on providing additional capacity then marginal costs will be charged per additional sample to the research team that brings in these samples.
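    The 30% ceiling can be checked with simple arithmetic. The sketch below is illustrative only: the function names are hypothetical, and it assumes the percentage is taken over the project total (UK plus overseas samples), which is one reading of the rule above.

```python
def overseas_within_cap(uk_samples: int, overseas_samples: int) -> bool:
    """Return True if overseas samples make up at most 30% of the project total.

    Assumption: the cap applies to the total of UK plus overseas samples.
    """
    total = uk_samples + overseas_samples
    if total == 0:
        return True
    # Integer comparison avoids floating-point rounding at the boundary:
    # overseas / total <= 30/100  is equivalent to  100 * overseas <= 30 * total
    return 100 * overseas_samples <= 30 * total


def max_overseas_for(uk_samples: int) -> int:
    """Largest overseas count that keeps the overseas share at or below 30%.

    From overseas <= 0.3 * (uk + overseas) it follows that overseas <= 3/7 * uk.
    """
    return (3 * uk_samples) // 7
```

    Under this reading, a project with 70 UK samples could include up to 30 overseas samples without needing special permission from the SIC.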

    Analysis of NGST results, data deposition and archiving

    For approved projects, standard bioinformatical analysis (e.g. mapping of sequence reads to the reference genome, variant calling, assigning consequences according to PolyPhen and SIFT, archiving of sequence BAM files) will be provided at no cost by the Clinical Bioinformatics team of the NIHR BioResource - Rare Diseases. Requesters need to indicate on the request form whether they have bioinformatical/statistical genetics capacity within their Team of Applicants (names need to be provided on the request form), whether they would seek access to central analytical NIHR BioResource - Rare Diseases capacity, or whether they propose to work with another bioinformatical/statistical genetics team in BRCs/BRUs or elsewhere.

    Exome-seq (or whole genome sequencing) results will be deposited at the European Genome-phenome Archive (EGA) at the European Bioinformatics Institute (EBI) and be linked to diagnostic categories and HPO terms from the NIHR BioResource clinical phenotype database. The sequence data from EGA can be released together with “clinical diagnosis and related HPO terms” to genuine requesters under a “click-it” agreement. Oversight of data release will be by the NIHR BioResource Steering Committee and will be aligned with permissions granted in the informed consent, after discussion with the involved clinicians where appropriate. The research team will have access to the variant-called sequence data 12 months before the same data are released into EGA.
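    The 12-month period of prior access amounts to a simple date calculation. The following is a minimal sketch, not part of any NIHR BioResource tooling: the function name is hypothetical, and it assumes that "12 months" means the same calendar day one year later.

```python
from datetime import date


def earliest_ega_release(team_access_date: date) -> date:
    """Date 12 months after the research team first gets the variant-called data.

    Assumption: "12 months" is read as the same calendar day one year on.
    """
    try:
        return team_access_date.replace(year=team_access_date.year + 1)
    except ValueError:
        # 29 February has no counterpart in a non-leap year; use 28 February.
        return team_access_date.replace(year=team_access_date.year + 1, day=28)
```

    For example, data made available to the research team on 1 September 2013 would, under this reading, be due for release into EGA on 1 September 2014.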

    Clinical and laboratory data deposition in the NIHR BioResource

    A sample that has been deposited into the NIHR BioResource and is earmarked for sequencing will not be sequenced until the clinical and laboratory data have been deposited in the NIHR BioResource study database for Rare Diseases. The clinical phenotype has to be coded against HPO terms.

    Approval process

    The NIHR BioResource SIC meets every two months, or less frequently depending on the number of applications. Applications received before noon 12 days before a meeting will be distributed and discussed at that meeting. Meeting dates will initially be publicised via the normal communication channels of the NIHR BioResource and at a later date on the NIHR BioResource website (under construction). Teams of Applicants or their representative (e.g. the Principal Investigator(s) responsible for the proposed project) may be invited to the SIC meeting to review and discuss their application.
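    The submission cut-off follows directly from the meeting date. The sketch below shows one way to compute it; the function names and the example meeting date are hypothetical, and times are treated as naive local times.

```python
from datetime import date, datetime, time, timedelta

LEAD_DAYS = 12  # applications must arrive before noon, 12 days ahead of the meeting


def submission_deadline(meeting_day: date) -> datetime:
    """Noon on the day that falls 12 days before the SIC meeting."""
    return datetime.combine(meeting_day - timedelta(days=LEAD_DAYS), time(12, 0))


def received_in_time(received: datetime, meeting_day: date) -> bool:
    """True if the application arrived strictly before the deadline."""
    return received < submission_deadline(meeting_day)
```

    For a meeting on 15 July 2013 (an invented date), the cut-off would be noon on 3 July 2013; an application arriving at 12:00 exactly would be held over to the next meeting under this strict reading of "before noon".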

    What happens if a project is approved?

    If an application is approved then a coordinator of the BRODO will liaise with the lead investigator of the requesting team. A project plan with timelines for SSI application, case enrolment, clinical and laboratory phenotype data capture, biological sample delivery, sample QA, sequencing and analysis will be agreed. An inability to meet agreed timelines for DNA sample delivery may mean that the allocated capacity will be void, and the applicants may have to re-apply for capacity at a later date. The new application may result in a different allocated capacity than on the original application.

    Batch-wise production

    The NGST pipeline is expected to be at capacity by July 2013 and will be highly automated (maximum throughput by December 2013: 3 x 96 samples/week). For the purpose of data consistency and downstream analysis, samples will be batched where possible. Ideally all samples from a single study are processed as a single batch. If a collection is still being expanded but a credible number of samples is already available, it may be agreed to split the sequencing of study samples over a number of batches.
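    The stated maximum throughput gives a rough way to estimate how long a study would occupy the pipeline. This is a back-of-the-envelope sketch under stated assumptions: it treats each group of 96 samples as one plate, assumes the whole pipeline is dedicated to a single study, and uses hypothetical function names.

```python
PLATE_SIZE = 96       # samples per plate (assumed reading of "3 x 96")
PLATES_PER_WEEK = 3   # maximum throughput quoted for December 2013


def weekly_capacity() -> int:
    """Maximum number of samples sequenced per week."""
    return PLATE_SIZE * PLATES_PER_WEEK


def plates_needed(n_samples: int) -> int:
    """Number of 96-sample plates required, rounding up partial plates."""
    return -(-n_samples // PLATE_SIZE)  # integer ceiling division


def weeks_needed(n_samples: int) -> int:
    """Weeks of full-capacity sequencing required for a study."""
    return -(-plates_needed(n_samples) // PLATES_PER_WEEK)
```

    At full capacity the pipeline handles 288 samples per week, so a study of 500 samples would fill six plates and take two weeks on these assumptions.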


    The interpretation of sequencing results will be greatly supported by the UK10K project, which will be completed by 2013 by the Wellcome Trust Sanger Institute and their clinical research partners, and by the UK100K project (announced in December 2012). Substantial collaborative connections with the EBI and the Wellcome Trust Sanger Institute are already in place and will be expanded over the 2012-2017 period to reap the benefits of the genomics revolution for the NHS and beyond.

    NIHR BioResource Sequencing and Informatics Committee (SIC)


    • Cambridge BRC: Prof Willem H Ouwehand (chair)
    • Cambridge BRC: Dr Lucy Raymond
    • KCL-SLaM BRC: Dr Gerome Breen
    • KCL-SLaM BRC: Dr Richard Simpson
    • KCL BRC: Dr Michael Simpson
    • Leicester Cardiovascular BRU: Prof Nilesh Samani
    • IC BRC: Dr John Chambers
    • Oxford BRC: Prof Mark McCarthy
    • UCL BRC: Prof Nicholas Wood
    • Wellcome Trust Sanger Institute: Dr Matthew Hurles
    • Patient representative: to be appointed


    • NIHR BioResource: Dr Nathalie Kingston (assistant director)
    • NIHR BioResource Rare Diseases Office (BRODO): Sofie Ashford
    • Cambridge Translational GenOmics laboratory (CATGO): Dr Ilenia Simeoni