One of the key achievements in bioinformatics is the completion of the Human Genome Project. The goal of the project was to identify the chemical sequence and function of each gene in our DNA. The project began in 1990 and ran until 2003; draft sequences were published simultaneously in 2001 by a private group, Celera, and by a public consortium whose draft was assembled at UC Santa Cruz. A number of results have already surfaced, such as the identification of genetic predispositions to various diseases, although much of the interpretation of the data remains to be done.
While the project encompassed techniques and expertise from a variety of fields, this lab focuses on some of the computational techniques used to reassemble the DNA sequences. In particular, we will analyze the problem of reassembling DNA during a procedure called shotgun sequencing. In essence, there was no way to sequence an entire chromosome at once. Instead, the DNA was first randomly "cut up" into manageable pieces using restriction enzymes, and each of these pieces was then sequenced individually. Computers came in at the next step: reassembling these small sequences into the original sequence, just like a jigsaw puzzle.
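To make the "cut up" step concrete, here is a minimal Python sketch that simulates it on a short string. It is only an illustration under stated assumptions: the function name `shotgun_fragments`, the number of copies, and the piece lengths are all hypothetical choices, and real fragments are far longer. It assumes several copies of the same sequence are cut at different random points, so pieces from different copies cover the same regions.

```python
import random

def shotgun_fragments(sequence, copies=3, min_len=4, max_len=8, seed=0):
    """Cut several copies of `sequence` at random points into short pieces.

    All parameters are illustrative assumptions, not values from the lab.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    fragments = []
    for _ in range(copies):
        start = 0
        while start < len(sequence):
            # Random piece length, so cut points differ from copy to copy
            end = min(start + rng.randint(min_len, max_len), len(sequence))
            fragments.append(sequence[start:end])
            start = end
    rng.shuffle(fragments)  # the original order of the pieces is lost
    return fragments

fragments = shotgun_fragments("ATGCGTACGTTAGCCGATAAGC")
print(fragments)
```

Every piece is a substring of the original, but nothing in the output records where it came from. Recovering that information is the reassembly problem.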
To get some insight into this problem, take a look at the strips of paper your TA has handed out or will hand out. The idea is that you originally had one long, coherent string of DNA. After being cut up, it is now a set of overlapping pieces, and your job is to reassemble them completely.
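One way to approach the puzzle by computer is a greedy heuristic: repeatedly join the two pieces that share the longest overlap. The Python sketch below illustrates that idea only. It is not necessarily the method used in this lab or by the genome project, and the names `overlap` and `greedy_assemble` are hypothetical. It assumes the pieces contain no errors and that no piece lies wholly inside another. Greedy merging can also give the wrong answer when the sequence contains repeats.

```python
def overlap(a, b):
    """Length of the longest suffix of `a` that is also a prefix of `b`."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0

def greedy_assemble(fragments):
    """Repeatedly merge the pair of pieces with the largest overlap."""
    pieces = list(fragments)
    while len(pieces) > 1:
        best_k, best_i, best_j = 0, None, None
        # Compare every ordered pair (a followed by b)
        for i, a in enumerate(pieces):
            for j, b in enumerate(pieces):
                if i != j:
                    k = overlap(a, b)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # nothing overlaps any more; return the separate pieces
        # Join the two pieces, writing the shared region only once
        merged = pieces[best_i] + pieces[best_j][best_k:]
        pieces = [p for idx, p in enumerate(pieces) if idx not in (best_i, best_j)]
        pieces.append(merged)
    return pieces

strips = ["CTGATTCAG", "ATGGCGTA", "CGTACCTG"]
print(greedy_assemble(strips))  # ['ATGGCGTACCTGATTCAG']
```

Here "ATGGCGTA" and "CGTACCTG" share "CGTA" and are merged first; the result then overlaps "CTGATTCAG" on "CTG". This is much the same reasoning you use when lining up the paper strips by hand.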