
Workshop: A roadmap to de-novo assembly of higher eukaryote genomes
REGISTRATION IS FREE AND CAPPED AT 40 PARTICIPANTS. LUNCH IS NOT PROVIDED, BUT PARTICIPANTS ARE ENCOURAGED TO TAKE ADVANTAGE OF THE ALUMNI CAFE (IN THE ALUMNI CENTER) AND ARBUCKLE CAFETERIA, BOTH CLOSE BY.
-
Basics and A Priori Knowledge of the Genome to be Sequenced: To begin, the facilitator, Stefan Prost, will cover some basics and then discuss different genome characteristics that strongly influence whether a genome will be easy or difficult to sequence and assemble successfully, and where to find information on genome characteristics for different taxa.
-
Sequencing Platforms: An outline of first-, second- and third-generation sequencing technologies. The platforms Prost will cover in this section include Illumina (MiSeq, HiSeq and NovaSeq), Ion Torrent and Ion Proton, ABI SOLiD, PacBio, Nanopore and Helicos.
-
Library Setup: Next, Prost will discuss the differences between Illumina library preparation methods, along with their pros and cons, covering paired-end (PE), mate-pair (MP), Dovetail Genomics’s Chicago and Hi-C libraries. He will also outline other strategies, such as BAC- or fosmid-based sequencing.
-
Raw Data Processing: A discussion of tools used to assess and improve read quality.
-
Assembly vs Mapping: This section will cover the differences between de-novo genome assembly and reference-based mapping, and when each approach is preferable to the other.
-
De-Novo Assemblers: To make the workshop more useful, Prost will outline popular tools for assembling large genomes and briefly discuss their underlying algorithms. In doing so, he will also explain terms commonly used in genome assembly, such as "k-mer."
-
Assembly Quality Assessment: A critical step after assembling a genome is assessing the quality of the resulting assembly. When different assemblers or different k-mer sizes are used, tools are needed to decide which of the assemblies is best.
-
Assembly Improvement: Different tools can be used to improve a genome sequence after the initial assembly, either by filling gap regions or by finding and resolving misassembled regions. Furthermore, genome assemblies can be merged to improve quality.
-
Draft vs. Finished Assembly: A crucial decision in genomics is whether a genome assembly is good enough to address the desired research questions. Here, Prost will explain the differences between finished and draft genome assemblies, and give some guidance on deciding whether further sequencing is needed.
-
Downstream Analyses: To conclude the workshop, Prost will briefly outline subsequent downstream processing and analysis steps, such as repeat and gene annotation, and how to turn a haploid genome sequence into a diploid genome mapping.
Stefan Prost is currently a Postdoctoral Fellow in Dmitri Petrov's lab at Stanford University. His research focuses on evolutionary genomics, genome architecture changes and genome assembly. More precisely, he studies how genomes change in response to adaptation to new environments and living conditions in a variety of taxa. He started his research at the Max Planck Institute for Evolutionary Anthropology in Leipzig, Germany, working on ancient DNA analyses. He graduated with a Master’s in Microbiology and Genetics from the University of Vienna, Austria, before starting his PhD at the University of Otago, Dunedin, New Zealand. After receiving his degree in New Zealand, he relocated to Sweden for a short-term postdoc on evolutionary genomics at the Swedish Natural History Museum in Stockholm, before joining Rasmus Nielsen's lab at the University of California, Berkeley, for two years as a postdoc. He is currently setting up a large-scale comparative genomics project on Drosophila flies with Dmitri Petrov and other colleagues. Besides evolutionary genomics, he is also interested in genome assembly methods and in working with second- and third-generation sequencing technologies.