Metagenomics sequencing can be used to identify the variety of organisms found in a pool of samples. Two general approaches may be used. The first is highly targeted: specific primers are used to amplify regions of ribosomal DNA sequence (the 16S rRNA gene for bacteria or the 18S rRNA gene for eukaryotes). The sequences obtained are matched against databases to report the diversity of the samples down to the genus level. This approach is usually less expensive, and its analysis more straightforward, than shotgun sequencing. The second approach is shotgun sequencing, in which whole genomic DNA is isolated from a source (such as a soil sample, pond, or tissue), so the isolate contains DNA from any organism represented in the source. The DNA is processed through the whole genome sequencing protocol and sequenced. The output is raw reads, which must then be mapped and categorized using specialized software.
For identification of bacteria in a sample, PCR is performed to amplify the V3-V4 region of the 16S ribosomal DNA sequence, and sequencing libraries are then created from these amplicons. The 16S primer sequences used by the GSL are those described in Illumina's 16S Metagenomic Sequencing Library Preparation protocol, which selects its V3-V4 primers from Klindworth et al. 2013. For identification of eukaryotic organisms, the 18S primer sequences used by the GSL are those reported in Hugerth et al. 2014.
To keep costs low, individual libraries do not each receive a complete library QC (PicoGreen, bioanalysis, and KAPA qPCR). Instead, the concentration of each library is determined by PicoGreen, and the libraries are pooled using that concentration. The final pool of libraries undergoes a full library QC and is sequenced on a 300 bp paired-end MiSeq run. Therefore a standard quote will list these options:
Metagenomics samples for shotgun sequencing are processed using a standard WGS library preparation. The libraries are usually sequenced on the HiSeq 2500 at the longest read length available. Standard high output read lengths are 100PE and 125PE, and each lane generates about 250M paired-end reads. A HiSeq 2500 Rapid Run may be used to sequence up to 250PE, with 300M PE reads per run, but the insert size of the library will need to be adjusted to make sure that reads do not overlap or sequence into adapter. A NextSeq run may also be used for shotgun samples. Metagenomics sequencing does not qualify for HiSeq X sequencing. Depending on the anticipated diversity, anywhere from a few to 96 shotgun samples may be sequenced per lane of HiSeq 2500.
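As a rough planning aid, the arithmetic behind the figures above can be sketched in a few lines of Python. This is only an illustration, not a GSL tool: the function names are hypothetical, the per-lane yield is the approximate 250M figure quoted above, and the sketch assumes reads are split evenly across samples and that "insert size" means the length of the DNA fragment between the adapters.

```python
# Illustrative sketch only; function names are hypothetical, not a GSL tool.

def reads_per_sample(reads_per_lane: int, samples_per_lane: int) -> float:
    """Approximate paired-end reads per sample, assuming an even split."""
    if samples_per_lane < 1:
        raise ValueError("samples_per_lane must be at least 1")
    return reads_per_lane / samples_per_lane


def reads_overlap(insert_size_bp: int, read_length_bp: int) -> bool:
    """Paired reads overlap when the insert is shorter than two read lengths."""
    return insert_size_bp < 2 * read_length_bp


def reads_into_adapter(insert_size_bp: int, read_length_bp: int) -> bool:
    """A read runs into adapter when the insert is shorter than one read."""
    return insert_size_bp < read_length_bp


if __name__ == "__main__":
    lane_yield = 250_000_000  # approximate high output yield quoted above
    for n in (4, 24, 96):
        print(n, "samples:", round(reads_per_sample(lane_yield, n)), "reads each")
    # A 250PE Rapid Run with a 400 bp insert: reads overlap, no adapter read-through.
    print(reads_overlap(400, 250), reads_into_adapter(400, 250))
```

Under these assumptions, a 250PE run needs inserts of at least 500 bp for the two reads of a pair not to overlap, which is why a modified insert size may need to be requested for the longer read lengths.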
To submit samples for shotgun metagenomics processing, choose “DNA--Genomic DNA--Standard Library Sequencing (Any Sequencer)”. You may then request the sequencing run type and the amount of sequence per DNA sample. Please include a description of the samples in the Special Instructions, and request a modified insert size if needed. Raw sequencing data (fastq files) will be posted through the GSL LIMS. Several open source software packages are available to assemble the data into consensus sequences and to perform comparative metagenomics.
Upon arrival at the GSL, all DNA samples are evaluated for concentration using Qubit or PicoGreen. DNA samples intended for the 16S or 18S metagenomics platform are not evaluated for sample integrity. The PCR reactions are normally robust, and the standard input is 10 ng. However, performance does vary by project, and this may be linked to the purity of the samples. It is important that the submitted DNA be as clean as possible, even if the concentration is very low. Samples for shotgun sequencing are evaluated for integrity using a 0.8% agarose gel with EtBr. This sample type is often degraded given the source of the DNA, and customer approval is requested before proceeding with library preparation.
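To relate a measured concentration to the 10 ng standard input mentioned above, the required volume is simply the target mass divided by the concentration. A minimal sketch, assuming the concentration is reported in ng/µL; the function name is hypothetical and is not part of any GSL software:

```python
# Illustrative sketch only; the function name is hypothetical.

def input_volume_ul(concentration_ng_per_ul: float, target_ng: float = 10.0) -> float:
    """Volume (in µL) of sample needed to supply target_ng of DNA."""
    if concentration_ng_per_ul <= 0:
        raise ValueError("concentration must be positive")
    return target_ng / concentration_ng_per_ul


if __name__ == "__main__":
    for conc in (0.5, 2.5, 20.0):
        print(f"{conc} ng/µL -> {input_volume_ul(conc):.1f} µL for 10 ng")
```

A sample at 0.5 ng/µL would therefore need 20 µL to reach 10 ng, which illustrates why very dilute samples may call for a larger submitted volume.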
The GSL may at times isolate DNA from metagenomic samples for customers, priced on a case-by-case basis. The DNA isolation queue is generally very long, so results will be obtained more rapidly if customer-isolated DNA is submitted.