Databases on the Internet - hints & answers

1

1-1a Go to SWISS-PROT, use full text search with carbonic anhydrase and pick human carbonic anhydrase 2 (CAH2_HUMAN).
1-1b-d follow the links
1-1e no

1-2a Go to SWISS-PROT, use full text search with RPL36B and pick this entry (R36B_YEAST)
1-2b-c the gene is GAL4
1-2d-f follow the links, localisation is nuclear, GAL4_YEAST
1-2g follow the links

1-3a CFTR mutation database

1-4a-b FlyBase, TSH_DROME

1-5a Genome Net
1-5b KEGG
1-5c EC 1.2.3.4
1-5d follow the links OXO1_HORVU, GERMIN
1-5e BRENDA

2

This exercise illustrates some of the difficulties of searching databases. There is no accepted standard nomenclature for genes and their products, so every plausible alternative spelling should be tried and the variants combined with logical operators. The following query finds many IL-2 receptor sequences:

    (IL2 | IL-2 | 'interleukin 2' | interleukin-2) & recept*

However, some receptor entries do not contain the word 'receptor', so it is safer to search with

    IL2 | IL-2 | 'interleukin 2' | interleukin-2

and go through the output by hand.
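When a gene has many spellings, it can help to assemble the query string from a list of synonyms rather than typing it by hand. The following is a minimal sketch in Python; the function names or_group and build_query are hypothetical helpers, not part of any database interface, and the only syntax assumed is the one shown above ('|' for OR, '&' for AND, single quotes around phrases, '*' as wildcard).

```python
def or_group(terms):
    """Join synonyms with '|', quoting any term that contains a space."""
    quoted = [f"'{t}'" if " " in t else t for t in terms]
    return " | ".join(quoted)


def build_query(synonyms, required=None):
    """Return the OR of all synonyms, optionally AND-ed with one more term.

    Hypothetical helper: it only builds the query text, it does not
    contact any database.
    """
    group = or_group(synonyms)
    if required is None:
        return group
    return f"({group}) & {required}"


synonyms = ["IL2", "IL-2", "interleukin 2", "interleukin-2"]

# Narrow query: misses receptor entries that lack the word 'receptor'.
narrow_query = build_query(synonyms, required="recept*")

# Broad query: more hits, to be checked by hand.
broad_query = build_query(synonyms)

print(narrow_query)
print(broad_query)
```

The two printed strings are the two queries given above; adding a further spelling only means adding one item to the list.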

You will also notice that many EMBL entries do not contain links to Swiss-Prot. Therefore, the only way to retrieve the corresponding protein entry is to search Swiss-Prot using the EMBL accession number.
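The reverse lookup works because Swiss-Prot entries list their nucleotide cross-references on DR lines. As a rough sketch, assuming a locally saved Swiss-Prot flat file in which entries start with an ID line, cite EMBL on lines beginning "DR   EMBL;" and end with "//", the search can be imitated as follows. The function name entries_citing_embl is hypothetical, and the entry names and accession numbers in SAMPLE are invented placeholders, not real records.

```python
def entries_citing_embl(flat_file_text, embl_accession):
    """Return the IDs of entries whose DR EMBL lines cite the accession."""
    hits = []
    entry_id = None
    for line in flat_file_text.splitlines():
        if line.startswith("ID   "):
            # Second token of the ID line is the entry name.
            entry_id = line.split()[1]
        elif line.startswith("DR   "):
            # Fields are separated by ';' (database name first, accession second).
            fields = [f.strip() for f in line[2:].split(";")]
            if len(fields) > 1 and fields[0] == "EMBL" and fields[1] == embl_accession:
                if entry_id not in hits:
                    hits.append(entry_id)
        elif line.startswith("//"):
            # End of entry.
            entry_id = None
    return hits


# Invented placeholder records, for illustration only.
SAMPLE = """\
ID   TEST1_HUMAN    STANDARD;      PRT;   100 AA.
DR   EMBL; X00001; CAA00001.1; -.
DR   PROSITE; PS00000; EXAMPLE; 1.
//
ID   TEST2_HUMAN    STANDARD;      PRT;   200 AA.
DR   EMBL; X00002; CAA00002.1; -.
//
"""

found = entries_citing_embl(SAMPLE, "X00001")
print(found)
```

On the Web the same lookup is done by entering the EMBL accession number in the Swiss-Prot search form; the sketch only shows which part of the entry that search matches.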

The rest of the exercise can be completed entirely by following links from one Web page to another. There is no single "right" way to do this. In fact, you are encouraged to try alternative routes to the same information.

3
 

a.   orf->~-CodV->~-CodW->~-CodX->~-CodY->~  Five genes are in the entry sequence; the first one is truncated.
b.   ~-exon4->~-exon5->~  The first three exons of this gene are in another EMBL entry.
c.  -2 bp-~intron~-5bp-    This entry contains an intron of a tRNA gene flanked by very short fragments of the adjacent exons.