Multiple Sequence Alignments:
Orthology and Paralogy
Solution




  1. What is the problem with the initial alignment?

    • The initial alignment is obviously correct [aln]

    • It can easily be turned into a [tree]


    • To compute this tree:

      1. Compute the alignment using a ClustalW server.

      2. Do not forget to request PHYLIP, FASTA or MSF as the output format!

      3. Paste the output alignment into drawtree.

    • On this tree, Human appears more closely related to Mouse than to Xenopus.

    • The reason for this wrong phylogeny is that, although the proteins are homologous, they are not all orthologous: a tree built from a mix of orthologs and paralogs reflects gene duplications as well as speciation events.

    • To gain a better understanding of what is going on, one solution is to add new sequences.
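
    The step from an alignment to a tree can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses UPGMA on uncorrected p-distances, a simpler method swapped in for whatever the ClustalW server and drawtree actually compute, and the sequences and names (A, B, C, D) are hypothetical toy data, not the sequences of this exercise.

```python
from itertools import combinations


def p_distance(a, b):
    """Fraction of differing columns, skipping columns with a gap in either sequence."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable columns")
    return sum(x != y for x, y in pairs) / len(pairs)


def upgma(aligned):
    """Build a Newick topology (no branch lengths) from {name: aligned_sequence}."""
    # Each cluster maps an internal key to (newick_label, number_of_leaves).
    clusters = {name: (name, 1) for name in aligned}
    dist = {
        frozenset((a, b)): p_distance(aligned[a], aligned[b])
        for a, b in combinations(aligned, 2)
    }
    next_id = 0
    while len(clusters) > 1:
        # Pick the closest pair of clusters and merge them.
        pair = min(dist, key=dist.get)
        a, b = sorted(pair)
        del dist[pair]
        label_a, n_a = clusters.pop(a)
        label_b, n_b = clusters.pop(b)
        new = f"#{next_id}"  # internal key; assumes no sequence name starts with '#'
        next_id += 1
        # Distance to the merged cluster is the size-weighted average (UPGMA rule).
        for c in clusters:
            d_ac = dist.pop(frozenset((a, c)))
            d_bc = dist.pop(frozenset((b, c)))
            dist[frozenset((new, c))] = (n_a * d_ac + n_b * d_bc) / (n_a + n_b)
        clusters[new] = (f"({label_a},{label_b})", n_a + n_b)
    newick, _size = next(iter(clusters.values()))
    return newick + ";"


# Hypothetical toy alignment (already aligned, equal lengths).
toy = {
    "A": "MKTAYIAKQR",
    "B": "MKTAYIAKQK",
    "C": "MKSAYLAKHR",
    "D": "MRSGYLSKHE",
}

if __name__ == "__main__":
    print(upgma(toy))
```

    The resulting Newick string can be pasted into any tree viewer. Note that such a tree only groups sequences by similarity; it cannot by itself tell whether two sequences are orthologs or paralogs.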


  2. Adding in new sequences

    • This may be done by running a BLAST search against Swiss-Prot.

    • Here is a sample alignment obtained this way: [aln]

    • That alignment gave the following tree: [tree]

    • There is still a problem: SCG1_HUMAN, SCG1_RAT and SCG1_MOUSE are not in the right order.

    • The reason is that their sequences are too closely related at the protein level.

    • A possible solution would be to use the nucleotide sequences rather than the protein sequences.

    • Coding nucleotide sequences contain more neutral sites than the proteins they encode. These sites evolve faster, which makes the phylogenetic analysis of closely related sequences easier.
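
    The point about neutral sites can be illustrated with a small Python sketch. The two coding sequences below are hypothetical toy data (not the SCG1 genes): they differ only by synonymous substitutions, so they are indistinguishable at the protein level while still carrying signal at the nucleotide level. The translation uses the standard genetic code.

```python
# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = dict(
    zip((a + b + c for a in BASES for b in BASES for c in BASES), AMINO)
)


def translate(cds):
    """Translate a coding sequence whose length is a multiple of 3."""
    if len(cds) % 3:
        raise ValueError("length must be a multiple of 3")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def count_differences(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


# Hypothetical coding sequences differing only at third codon positions.
cds_1 = "ATGAAAACCGCTTAT"
cds_2 = "ATGAAGACAGCCTAC"

if __name__ == "__main__":
    prot_1, prot_2 = translate(cds_1), translate(cds_2)
    print("protein differences:   ", count_differences(prot_1, prot_2))
    print("nucleotide differences:", count_differences(cds_1, cds_2))
```

    With real data, the nucleotide sequences would first have to be aligned consistently with the protein alignment before being used to build the tree.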

  3. Conclusion

    The first part of this exercise shows that, while BLAST is useful for judging whether two sequences are homologous, more complicated questions such as orthology or paralogy require a tree as support.

Questions should be sent to C.Notredame