Mining the full information content of mitochondrial DNA

Date of Completion

January 2008


Biology, Genetics|Biology, Bioinformatics




Mitochondrial DNA (mtDNA) analysis is a potent tool in studies of human populations and evolution and, more recently, in forensic investigations. The use of mtDNA is becoming more prevalent in forensics, not only to support convictions but also as an exclusionary tool. However, the power of mtDNA is currently restricted by insufficiently sized databases and the small number of polymorphic sites assayed. Furthermore, the elimination of error is of the highest importance to the forensic community, both for testing and for database construction. This research has three critical aims: to evaluate the fairness of the FBI mtDNA database, to improve the techniques currently used to mine information from mtDNA, and to eliminate sources of error from the mtDNA typing process. First, to investigate the deficiencies of the national FBI mtDNA dataset, a unique database comprising samples from over 400 Connecticut residents was assembled. This collection includes the geographic place of origin and the self-identified ethnicity for each sample. The Connecticut dataset demonstrates that (1) the FBI sample set is severely lacking in size and diversity, and (2) population stratification based on geography remains a potential concern. Second, the FBI database includes sequence information from only the hypervariable regions, obtained by standard dideoxy sequencing methods. This ignores potentially discriminating and informative variation throughout the coding region. We demonstrate that the Affymetrix MitoChip® v2.0 resequencing array is an excellent alternative to dideoxy sequencing. The MitoChip is accurate and sensitive, and it assays the entire mitochondrial genome quickly and efficiently. Full mitochondrial genome sequencing with the MitoChip greatly improves the power of discrimination that can be attained using mtDNA.
In addition, the MitoChip can identify informative mutations in a putative maternal relative, preserving valuable evidentiary materials. Lastly, we describe the implementation and effectiveness of an automated polymorphism detection and haplogroup prediction software tool. The polymorphism detection algorithm successfully detects all mutations within a model set of mtDNA haplotypes. This tool will reduce or eliminate errors within mtDNA databases by removing the potential for human transcription mistakes. The findings of this research improve the typing and use of mtDNA data as well as the construction of mtDNA databases in a forensic context.
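
As a rough illustration of what automated polymorphism detection involves, the sketch below compares a sample sequence against a reference and reports each substitution. It is a minimal example under stated assumptions, not the tool described in this work: the function name `detect_polymorphisms`, the toy sequences, and the `G3A`-style output format are all hypothetical. The sequences are assumed to be pre-aligned and of equal length, and insertions, deletions, and heteroplasmy are ignored.

```python
from typing import List


def detect_polymorphisms(reference: str, sample: str) -> List[str]:
    """List substitutions between two pre-aligned sequences.

    Each variant is reported as '<ref base><1-based position><observed base>'.
    Positions where the sample base is 'N' (no call) are skipped.
    This is a hypothetical helper for illustration only.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be pre-aligned to equal length")

    variants = []
    pairs = zip(reference.upper(), sample.upper())
    for position, (ref_base, obs_base) in enumerate(pairs, start=1):
        if obs_base == "N":
            # No base call at this position; nothing can be concluded.
            continue
        if ref_base != obs_base:
            variants.append(f"{ref_base}{position}{obs_base}")
    return variants


if __name__ == "__main__":
    # Toy sequences, not real mtDNA data.
    toy_reference = "ACGTACGT"
    toy_sample = "ACATACNT"
    print(detect_polymorphisms(toy_reference, toy_sample))
```

Because the variant list is generated directly from the sequence comparison, no manual transcription step is involved, which is the source of error that automation is meant to remove.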