
Introduction to RNA-seq and ChIP-seq data analysis


The aim of this course is to familiarise the participants with the primary analysis of datasets generated through two popular next-generation sequencing (NGS) assays: ChIP-seq and RNA-seq.

The course starts with a brief introduction to the transition from capillary to high-throughput sequencing and discusses quality control issues that are common to all NGS datasets.

Next, we will present the alignment step and how it differs between the two analysis workflows.

Finally, we focus on dataset-specific downstream analysis: peak calling and motif analysis for ChIP-seq, and quantification of expression, transcriptome assembly and differential expression analysis for RNA-seq.



Hannah Meyer, EMBL-EBI

Konrad Rudolph, University of Cambridge

Luigi Grassi, University of Cambridge

Mukarram Hossain, University of Cambridge

Roman Kreuzhuber, University of Cambridge

Romina Petersen, University of Cambridge

Sandra Cortijo, University of Cambridge

Tomás Di Domenico, University of Cambridge


Audience and Prerequisites

  • This course is aimed at PhD students and post-doctoral researchers who are considering generating, or are already generating, NGS datasets but have limited experience in data analysis.
  • Participants are expected to have some UNIX experience, such as a basic understanding of the command-line operations cd and ls (for changing and listing directories, respectively) and of the difference between absolute and relative paths.
  • Participants are expected to have basic knowledge of the R syntax.
  • The course is open to graduate students, postdocs and staff members from the University of Cambridge, affiliated institutions and other external institutions, as well as to individuals.


Syllabus, Tools and Resources

During this course you will learn about:

  • High-throughput sequencing technology
  • Quality control of raw reads: FastQC and FASTX-Toolkit
  • Considerations on experiment design for ChIP-seq and RNA-seq
  • Read alignment to a reference genome: Bowtie and TopHat
  • File format conversion and processing: UCSC tools and samtools
  • Peak calling: MACS
  • Motif analysis: MEME
  • Quantification of expression and guided transcriptome assembly: Cufflinks
  • Differential expression analysis: Cuffdiff


Learning Objectives

After this course you should be able to: 

  • Understand the advantages and limitations of the high-throughput assays presented
  • Assess the quality of your datasets
  • Understand the difference between splice-aware and splice-unaware aligners
  • Perform alignment and peak calling of ChIP-seq datasets
  • Perform alignment, quantification of expression and guided transcriptome assembly of RNA-seq datasets






Day 1
09:30 - 10:00  Lecture: Next generation sequencing overview 
10:00 - 10:15  Tea/coffee break 
10:15 - 12:30  Lecture/Practical: Data retrieval
12:30 - 13:30  Lunch 
13:30 - 14:30  Lecture/Practical: NGS quality control
14:30 - 15:15  Lecture: Introduction to ChIP-seq 
15:15 - 15:30  Tea/coffee break 
15:30 - 17:30  Practical: ChIP-seq analysis 
Day 2
09:30 - 10:00  Lecture: Introduction to RNA-seq 
10:00 - 10:15  Tea/coffee break 
10:15 - 12:00  Practical: RNA-seq analysis - alignment 
12:00 - 13:00  Lunch 
13:00 - 15:00  Practical: RNA-seq analysis - transcriptome assembly
15:00 - 15:15  Tea/coffee break 
15:15 - 17:30  Practical: RNA-seq analysis - differential expression analysis

