RNA-Seq with Kallisto and Sleuth¶
Goal¶
Analyze RNA-Seq data for differential expression. Kallisto manual is a quick, highly-efficient software for quantifying transcript abundances in an RNA-Seq experiment. Even on a typical laptop, Kallisto can quantify 30 million reads in less than 3 minutes. Integrated into CyVerse, you can take advantage of CyVerse data management tools to process your reads, do the Kallisto quantification, and analyze your reads with the Kallisto companion software Sleuth in an R-Studio environment.
Manual Maintainer(s)¶
Who to contact if this manual needs fixing. You can also email Tutorials@CyVerse.org
Maintainer | Institution | Contact |
---|---|---|
Jason Williams | CyVerse / Cold Spring Harbor Laboratory | Williams@cshl.edu |
Prerequisites¶
Downloads, access, and services¶
In order to complete this tutorial you will need access to the following services/software
Prerequisite | Preparation/Notes | Link/Download |
---|---|---|
CyVerse account | You will need a CyVerse account to complete this exercise | CyVerse User Portal |
Platform(s)¶
We will use the following CyVerse platform(s):
Platform | Interface | Link | Platform Documentation | Quick Start |
---|---|---|---|---|
Data Store | GUI/Command line | Data Store | Data Store Manual | Data Store Guide |
Discovery Environment | Web/Point-and-click | Discovery Environment | DE Manual | Discovery Environment Guide |
Application(s) used¶
Discovery Environment App(s):
App name | Version | Description | App link | Notes/other links |
---|---|---|---|---|
Kallisto-v.0.43.1 | 0.43.1 | Kallisto v.0.43.1 | Kallisto manual | |
RStudio Sleuth | 0.30.0 | RStudio with Sleuth (v.0.30.0) and dependencies | Sleuth |
Input and example data¶
In order to complete this tutorial you will need to have the following inputs prepared
Input File(s) | Format | Preparation/Notes | Example Data |
---|---|---|---|
RNA-Seq reads | FastQ (may also be compressed, e.g. fastq.gz) | These reads should have been cleaned by upstream tools such as Trimmomatic | Example fastq files |
Reference transcriptome | fasta | Transcriptome for your organism of interest | Example transcriptome |
Sample Data and Working with Your Own Data¶
Sample data
About the Sample Dataset In this tutorial, we are using publicly available data from the SRA. This tutorial will start with cleaned and processed reads. The SRA experiment used data from bioproject PRJNA272719. The abstract from that project is reprinted here:
‘To survey transcriptome changes by the mutations of a DNA demethylase ROS1 responding to a phytohormone abscisic acid, we performed the Next-gen sequencing (NGS) associated RNA-seq analysis. Two ROS1 knockout lines (ros1-3, ros1-4; Penterman et al. 2007 [PMID: 17409185]) with the wild-type Col line (wt) were subjected. Overall design: Three samples (ros1-3, ros1-4 and wt), biological triplicates, ABA or mock treatment, using Illumina HiSeq 2500 system’ citation.
Tip
Working with your own data
If you have your own FASTQ files upload them to CyVerse using instructions in the CyVerse Data Store Guide (e.g. iCommands/Cyberduck).
Fix or improve this documentation
- Search for an answer: CyVerse Learning Center
- Ask us for help: click on the lower right-hand side of the page
- Report an issue or submit a change: Github Repo Link
- Send feedback: Tutorials@CyVerse.org