RSEM (RNA-Seq by Expectation-Maximization)


Updates

Jun 16, 2014   RSEM v1.2.15 is online now. Allowed a subset of reference sequences to be declared in an input SAM/BAM file. For any transcript not declared in the SAM/BAM file, its PME estimates and credibility intervals are set to zero. Added advanced options for customizing the behavior of the Gibbs sampler and the credibility interval calculation. Split the options in 'rsem-calculate-expression' into basic and advanced options.

Jun 8, 2014   RSEM v1.2.14 is online now. Changed RSEM's behavior for building Bowtie/Bowtie 2 indices. In 'rsem-prepare-reference', the '--no-bowtie' and '--no-ntog' options are removed. By default, RSEM builds neither Bowtie nor Bowtie 2 indices. Instead, it generates two index Multi-FASTA files, 'reference_name.idx.fa' and 'reference_name.n2g.idx.fa'. Compared to the former file, the latter additionally converts all 'N's into 'G's. These two files can be used to build aligner indices for customized aligners. In addition, 'reference_name.transcripts.fa' does not have poly(A) tails added. To make RSEM build Bowtie/Bowtie 2 indices, '--bowtie' or '--bowtie2' must be set explicitly. The most significant benefit of this change is that Bowtie and Bowtie 2 indices can now be built simultaneously by turning on both '--bowtie' and '--bowtie2'. Type 'rsem-prepare-reference --help' for more information. If transcript coordinate files are visualized using IGV, 'reference_name.idx.fa' should be imported as a genome (instead of 'reference_name.transcripts.fa'). For more information, see the third subsection of Visualization in 'README.md'. Modified the RSEM perl scripts so that the RSEM directory is added at the beginning of the PATH variable. This also means RSEM will try to use its own samtools first. Added the '--seed' option to set random number generator seeds in 'rsem-calculate-expression'. Added the posterior standard deviation of counts as output if either '--calc-pme' or '--calc-ci' is set. Updated boost to v1.55.0. Renamed makefile to Makefile. If '--output-genome-bam' is set, each alignment's 'MD' field in the genome BAM file is adjusted to match the CIGAR string. The 'XS:A:value' field is required by Cufflinks for spliced alignments. If '--output-genome-bam' is set, each alignment's 'XS' field in the genome BAM file is first deleted. Then, if the alignment is a spliced alignment, an 'XS:A:value' field is added accordingly.
Added instructions for users who want to put all RSEM executables into a bin directory (see the Compilation & Installation section of 'README.md').
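
The relationship between 'reference_name.idx.fa' and 'reference_name.n2g.idx.fa' described above can be illustrated with a minimal Python sketch. This is not RSEM's own code: the function name n_to_g and the example record are hypothetical, and the sketch assumes that only uppercase 'N' characters in sequence lines are converted while header lines are left unchanged.

```python
def n_to_g(fasta_text: str) -> str:
    """Return FASTA text with every 'N' in sequence lines replaced by 'G'.

    Illustrative only: n_to_g is a hypothetical helper, not part of RSEM.
    Assumes uppercase 'N' only; header lines (starting with '>') are kept as-is.
    """
    out = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            out.append(line)  # header line: do not touch the sequence name
        else:
            out.append(line.replace("N", "G"))  # sequence line: N -> G
    return "\n".join(out)


# A made-up two-line record standing in for an entry of an index FASTA file.
example = ">tx1 N-rich\nACGTNNAC\nNNNN\n"
converted = n_to_g(example)
print(converted)
```

An aligner index built from the converted text never sees ambiguous bases, which is the practical difference between the two files that the release note points out.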

May 26, 2014   RSEM v1.2.13 is online now. Allowed users to use the SAMtools in the PATH first and enabled RSEM to find its executables via a symbolic link. Changed the behavior of GTF file parsing. Now if a GTF line's feature is not "exon" and it does not contain a "gene_id" or "transcript_id" attribute, only a warning message is produced (instead of causing RSEM to fail).
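
The GTF parsing rule in the v1.2.13 note can be sketched in Python as follows. This is an illustration under stated assumptions, not RSEM's parser: parse_gtf_line and ATTR_RE are hypothetical names, the attribute column is assumed to use the usual key "value"; layout, and treating an "exon" line with a missing attribute as a hard error is an assumption inferred from the note rather than something it states.

```python
import re
import warnings

# Matches attribute pairs of the form: key "value"
ATTR_RE = re.compile(r'(\S+)\s+"([^"]*)"')


def parse_gtf_line(line):
    """Return (feature, gene_id, transcript_id), or None if the line is skipped.

    Hypothetical helper, not RSEM's implementation.
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        raise ValueError("a GTF line must have 9 tab-separated fields")
    feature = fields[2]
    attrs = dict(ATTR_RE.findall(fields[8]))
    missing = [k for k in ("gene_id", "transcript_id") if k not in attrs]
    if missing:
        if feature == "exon":
            # Assumption: exon lines still require both attributes.
            raise ValueError("exon line is missing: " + ", ".join(missing))
        # Non-exon line: warn and skip instead of failing.
        warnings.warn("skipping %s line missing: %s" % (feature, ", ".join(missing)))
        return None
    return feature, attrs["gene_id"], attrs["transcript_id"]


# Made-up example lines; the coordinates and identifiers are illustrative.
exon_line = 'chr1\tsrc\texon\t1\t100\t.\t+\t.\tgene_id "G1"; transcript_id "T1";'
gene_line = 'chr1\tsrc\tgene\t1\t100\t.\t+\t.\tgene_id "G1";'
print(parse_gtf_line(exon_line))
```

Calling parse_gtf_line(gene_line) emits a warning and returns None, which mirrors the "warn instead of fail" behavior the note describes for non-exon lines.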

Click here for full update information.

Author

RSEM is mainly developed by Bo Li, who was a member of Deweylab.

License

RSEM is licensed under the GNU General Public License.

Source Code

Documentation

README

Prebuilt RSEM Indices (RSEM v1.1.17) for Galaxy Wrapper

These indices are based on RefSeq and contain NM accession numbers only. That means they include only curated genes (no experimental, miRNA, or noncoding entries) and only mature RNAs. In addition, 125 bp poly(A) tails are added to the end of each transcript.

Mouse Indices, extracted from mouse genome mm9

Human Indices, extracted from human genome hg18

Simulation Data

Simulation Data using RefSeq set as reference

Simulation Data using Ensembl set as reference

Google Users and Announce Groups

Subscribe to RSEM Announce
Subscribe to RSEM Users

Acknowledgements

RSEM was developed with the help of Prof. Colin Dewey.


(last modified on Jun 16, 2014)