Multiple Sequence Alignment Tools

T-Coffee is a versatile Multiple Sequence Alignment method suitable for aligning most types of biological sequences. The series of protocols presented here show how the package can be used to multiply align proteins, DNA and RNA sequences. The protein section presents controlled cases for PSI-Coffee the homology extended mode suitable for remote homologues, Expresso the structure based multiple aligner and M-Coffee, a meta-version able to combine several third party aligners into one. We then show how the T-RMSD option can be used to produce a functionally informative structure based clustering. RNA alignment procedures are shown for R-Coffee a mode that produces secondary structure based MSAs. DNA alignments are illustrated with Pro-Coffee, a multiple aligner specific of promoter regions. The last sections presents the many reformatting utilities bundled with T-Coffee. The package is an open-source freeware available from www.tcoffee.org.

An extended version of this tutorial has been published in the Nature Protocols journal. Please cite the following reference: Using the T-Coffee package to build multiple sequence alignments of protein, RNA, DNA sequences and 3D structures. Taly JF, Magis C, Bussotti G,Chang JM, Di Tommaso P, Erb I, Espinosa-Carrasco J, Kemena K, Notredame C, Nature Protocols,6, 1669-1682 (2011), doi: 10.1038/nprot.2011.393.

Materials:

Presentation: An overview of T-Coffee [ppt]

Data: The list of files (input and output) required by this protocol is available from www.tcoffee.org/Projects/Datasets. They can be automatically retrieved using the following command:

t_coffee -other_pg nature_protocol.pl

This will create 4 repertories containing the input sequences necessary for the protocols we report in this section. For each part, all command lines have been collected into the file README.sh.