Clustering Methods and Algorithms in Genomics - Python Practicals

Overview

These practicals cover part of the Durbin section on HMM modelling. Through these practicals we will show you how to generate your own reference data, measure its properties, estimate a model from data with known labels, decode data with a provided HMM, and train your own HMM using Viterbi. The code is written in plain Python, with no special complications.

General Note on the Exercises

All these exercises work on the same principle. A task is given (for instance, parsing a FASTA file), and a template file is provided along with data. The template file is named x.y.foo.pb.py. It contains missing bits that are marked as missing #x. It comes with a second file named x.y.foo.pb.output, which holds the expected output. The purpose of the exercise is therefore to complete the pb file so that it reproduces the output file. To help you, we also provide a compiled solution file named x.y.foo.sol.pyc. This file is compiled Python bytecode that solves the problem; you can run it to generate new sample output.
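
One way to check your progress is to save the output of your completed pb file and compare it with the provided output file. The sketch below is only an illustration and is not part of the course material: the helper names diff_outputs and read_text are hypothetical, and it assumes that your script writes its result as plain text that you have saved to a file.

```python
import difflib
import sys


def diff_outputs(actual_text, expected_text):
    """Return unified-diff lines between two outputs (empty list if identical)."""
    expected_lines = expected_text.splitlines()
    actual_lines = actual_text.splitlines()
    # Lines starting with "-" appear only in the expected output,
    # lines starting with "+" appear only in your output.
    return list(difflib.unified_diff(
        expected_lines, actual_lines,
        fromfile="expected", tofile="actual", lineterm=""))


def read_text(path):
    """Read a whole text file into a string."""
    with open(path) as handle:
        return handle.read()


if __name__ == "__main__" and len(sys.argv) == 3:
    # Hypothetical usage: python check_output.py my.output x.y.foo.pb.output
    differences = diff_outputs(read_text(sys.argv[1]), read_text(sys.argv[2]))
    if differences:
        print("\n".join(differences))
    else:
        print("Outputs are identical.")
```

An empty diff means your completed template reproduces the reference output. If an exercise involves randomly generated data, an exact match should not be expected unless the same random seed is used.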