
Introduction to genome variation analysis using NGS


This course provides an introduction to the analysis of human genome sequence variation using next-generation sequencing (NGS) data, including:

  • an introduction to genetic variation as well as data formats and analysis workflows commonly used in NGS data analysis;
  • an overview of available analytical tools and discussion of their limitations; and
  • hands-on experience with common computational workflows for analysing genome sequence variation using bioinformatics and computational genomics approaches.



Marta Bleda, University of Cambridge

Matthias Haimel, University of Cambridge

Stefan Graf, University of Cambridge


Audience and Prerequisites

  • Basic experience with the UNIX command line
  • Sufficient UNIX experience might be obtained from one of the many UNIX tutorials available online.
  • Graduate students, Postdocs and Staff members from the University of Cambridge, Affiliated Institutions and other external Institutions or individuals.


Syllabus, Tools and Resources

During this course you will learn about:

  • Introduction to genome variation
  • Experimental design
  • Genome sequence alignment
  • Data quality assessment
  • Variant calling
  • Variant filtering, effect prediction and prioritisation
  • Genome-wide association


Learning Objectives

After this course you should be able to:

  • Recognize the challenges and pitfalls of the analysis of human genome sequence variation
  • Recognize issues with the data
  • Visualize alignments and variants using a genome viewer
  • Evaluate the quality of your alignments and your variants
  • Understand the standard file formats for representing variant data
  • Apply filters to your list of variants
  • Functionally annotate variants






Time Title Type Content
Day 1
09:30 Introduction Lecture Genetics, Experimental design (case/control, families, inheritance model, PED), NGS, Illumina and other platforms, nanopore, data formats, ethics
10:30 Introduction Hands-on Introduction to iPython and Bash
10:50 Coffee break
11:00 Fastq QC Lecture Quality Control: fastq format, quality, trimming
11:15 Fastq QC Hands-on Quality control
12:00 Alignment Lecture Mapping: reference genomes, assembly, GRCh37 vs GRCh38, tools
12:20 Alignment Hands-on Align reads over lunch
12:30 Lunch
13:30 Alignment Hands-on Command line options, output format (BAM), RG, CIGAR, mapping quality, orientation
14:00 Alignment towards seq. variation Lecture Summary and look ahead (allele hom vs het, multiallelic rehearsal)
15:00 Coffee break
15:10 Alignment/Variation Hands-on Identify sequence variation (SNVs, indels) by hand
16:30 End Day 1
Day 2
09:30 Variant calling Lecture Overview: Methods, Mark duplicates, Indel realignment, BQSR, VCF format; Somatic calling problems
10:15 Variant calling Hands-on Pre-processing and calling in practice
10:50 Coffee break
11:00 Annotation and prioritization Lecture Importance, databases, consequence types, deleteriousness and conservation scores, rare and common variants
11:30 Annotation and filtering Hands-on Variant annotation and effect prediction
12:30 Lunch
13:30 Association Lecture GWAS, Case/control, TDT
14:00 Association Hands-on Examples
15:00 Coffee break
15:30 Own data analysis and troubleshooting
16:30 End Day 2