Our integrity guidelines were set to ensure that the data we produce is of the highest quality, but there
are times when it is impossible to obtain more or cleaner material. We will run whatever you ask us to, but
we cannot be responsible for the final quality of samples which fail QC.
5. What types of analysis can be provided by the GSL?
The deliverable for all sequencing services should be assumed to be fastq files, which contain the raw reads as they come off the sequencer along with per-base quality scores. Analysis of data may be provided depending on the application and the customer's request. Alignment of WGS or RNA-seq data can be performed for a flat fee per sample. For additional analysis, such as comparison of RNA-seq data across groups or preparation of materials for slides or manuscripts, a collaboration will be set up between the customer and a HudsonAlpha scientist. Payment for analysis is still required, and appropriate recognition of the GSL staff is expected.
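To make the fastq deliverable concrete, here is a minimal sketch of reading such a file in Python. It assumes the usual four-line record layout and Phred+33 quality encoding (standard for recent Illumina output, but confirm this for your own data). The helper names `open_fastq` and `read_fastq` are hypothetical and not part of any GSL tool.

```python
import gzip
import io


def open_fastq(path):
    """Open a gzip-compressed fastq file in text mode."""
    return gzip.open(path, "rt")


def read_fastq(handle):
    """Yield (read_id, sequence, qualities) tuples from a text-mode handle.

    Assumes four lines per record and Phred+33 quality encoding.
    """
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return  # end of file
        sequence = handle.readline().rstrip("\n")
        handle.readline()  # '+' separator line, ignored
        quality = handle.readline().rstrip("\n")
        if not header.startswith("@") or len(sequence) != len(quality):
            raise ValueError(f"malformed fastq record near {header!r}")
        # Phred+33: subtract 33 from each character code to get the score.
        yield header[1:], sequence, [ord(c) - 33 for c in quality]


# Small in-memory example; a real file would be opened with open_fastq(path).
example = io.StringIO("@read1\nACGT\n+\nIIII\n")
records = list(read_fastq(example))
```

Each yielded tuple pairs a read with one integer quality score per base, which is the information the per-base quality line encodes.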
If samples are from a "common" organism (i.e., human, mouse, rat): Reads from DNA or RNA samples can
be aligned to a reference genome. If applicable, sequence capture efficiencies and coverages will be
calculated. BAM and VCF files can be provided for human whole genome samples run on the HiSeq X through our DRAGEN platform for a flat fee per sample. If desired, GATK analysis may also be performed for an additional fee, but this requires additional time.
Analysis fees are calculated on a per-project basis and cover staff time, software licenses, and
If samples are from an "unusual" organism: The same basic analyses are available, but require
an extra fee to set up reference data. This assumes, of course, that suitable data
(finished genome, gene models, etc.) exist in the public domain. For specific protocols, such as metagenomics (16S) projects, analysis is provided entirely through Illumina's BaseSpace, which will categorize reads down to the genus level.
6. How long does it take to run the sequencers?
These values are how long it takes to sequence the samples; they do not include
or analysis time:
|HiSeq X PE-150bp
|HiSeq 2500 PE-50bp
|HiSeq 2500 PE-100bp
|HiSeq 2500 SE-50bp
|HiSeq 2500 rapid run
||8 hours - 3 days
7. What is a BAM file?
A BAM file is the binary (compressed) variant of a Sequence Alignment/Map file, a standard format for
storing large sets of alignment data. The BAM files that we deliver are already sorted. The standard genome used for human WGS alignment with the DRAGEN platform is hg19, but GRCh37 may be used at the customer's request if included within the Special Instructions at the time of submission.
8. How are fastq files named?
Fastq files sequenced at the GSL are named with a series of identifiers to allow the precise assignment of a particular barcoded library to a lane of a specific flowcell. As such, no two fastq files will ever be given the same name. For example, the fastq file name HNJWNCCXX_s8_1_GSLv3_05_SL146310.fastq.gz contains five pieces of information: the flowcell name (HNJWNCCXX), the lane (s8), the read number (1 = forward read, 2 = reverse read), the barcode set used and the barcode number (GSLv3 barcode set, barcode 05), and the unique Sequencing Library ID (SL146310).
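The naming scheme above can be split apart programmatically. The following is a minimal sketch in Python based only on the single example name given here. The regular expression, the `FastqName` type, and the `parse_fastq_name` function are hypothetical illustrations, and other GSL file names may contain variations this pattern does not cover.

```python
import re
from typing import NamedTuple


class FastqName(NamedTuple):
    flowcell: str
    lane: int
    read: int          # 1 = forward read, 2 = reverse read
    barcode_set: str
    barcode: str       # kept as text to preserve leading zeros
    library_id: str


# Pattern inferred from the one example name; an assumption, not a specification.
_NAME_PATTERN = re.compile(
    r"^(?P<flowcell>[A-Za-z0-9]+)"
    r"_s(?P<lane>\d+)"
    r"_(?P<read>[12])"
    r"_(?P<barcode_set>[A-Za-z0-9]+)"
    r"_(?P<barcode>\d+)"
    r"_(?P<library_id>SL\d+)"
    r"\.fastq\.gz$",
    re.IGNORECASE,
)


def parse_fastq_name(filename):
    """Split a fastq file name into its five pieces of information."""
    match = _NAME_PATTERN.match(filename)
    if match is None:
        raise ValueError(f"unrecognized fastq file name: {filename!r}")
    return FastqName(
        flowcell=match["flowcell"],
        lane=int(match["lane"]),
        read=int(match["read"]),
        barcode_set=match["barcode_set"],
        barcode=match["barcode"],
        library_id=match["library_id"].upper(),
    )


info = parse_fastq_name("HNJWNCCXX_s8_1_GSLv3_05_SL146310.fastq.gz")
```

Grouping files by `flowcell`, `lane`, and `library_id` and then pairing `read` 1 with `read` 2 is one way to match forward and reverse reads for paired-end runs.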
9. How can I visualize my alignments?
10. What tools do you recommend for performing my own analysis?
For DNA sequencing, we recommend the open source tools BWA for alignment and the Genome Analysis Toolkit (GATK) for most other analysis, including variant finding.
For RNA-Seq, we use TopHat for spliced alignments and Cufflinks for isoform assembly and quantification.