Date of Award

12-1-2003

Degree Type

Thesis

University or Center

Clark Atlanta University (CAU)

Degree Name

M.S.

Biological Sciences

First Advisor

Professor William Seffens

Abstract

Many biological experiments require a protein sequence to be converted to the nucleic acid sequence that codes for it; that is, they require the investigator to have a means to “backtranslate” a protein from its amino acid sequence. However, the degenerate nature of the genetic code greatly frustrates this process by introducing ambiguities at the wobble bases. One possible solution to this dilemma is to predict codon usage frequencies for a target organism with an Artificial Neural Network. Accordingly, a neural network was trained on amino acid and nucleic acid sequences to determine its capacity to make accurate predictions for a twenty amino acid window. In addition, 10 different network architectures were surveyed to ascertain which one yields optimum (least error) results when trained on the same nucleic acid sequences. The winning architecture was then examined using two new training sets, partitioned into sequences with high bias and sequences with low bias for mRNA secondary structure. The more negative the bias, the more secondary structure a sequence will have, whereas a less negative bias corresponds to less secondary structure. Testing of these two training sets revealed that the neural network was able to distinguish between them; i.e., given the same network architecture, the training set with greater secondary structure was learned in fewer training cycles and produced a lower error than the training set with less secondary structure. Ultimately, this work might be beneficial as a computational tool for backtranslation in degenerate PCR cloning and in identifying unknown coding regions in genes.
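The ambiguity described in the abstract can be illustrated with a short sketch. It is not taken from the thesis and does not involve the neural network. It only counts how many nucleotide sequences could encode a given peptide under the standard genetic code. The names `CODONS_FOR`, `count_backtranslations`, and `enumerate_backtranslations` are hypothetical and chosen for this illustration.

```python
from itertools import product
from math import prod

# Standard genetic code in the conventional compact layout:
# codons ordered by first, second, then third base over T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each amino acid (one-letter code, '*' = stop) to its synonymous codons.
CODONS_FOR = {}
for bases, aa in zip(product(BASES, repeat=3), AMINO_ACIDS):
    CODONS_FOR.setdefault(aa, []).append("".join(bases))


def count_backtranslations(peptide):
    """Number of distinct DNA sequences that encode `peptide`."""
    return prod(len(CODONS_FOR[aa]) for aa in peptide)


def enumerate_backtranslations(peptide):
    """Yield every DNA sequence that encodes `peptide` (short peptides only)."""
    for combo in product(*(CODONS_FOR[aa] for aa in peptide)):
        yield "".join(combo)


if __name__ == "__main__":
    print(count_backtranslations("MKL"))
    print(list(enumerate_backtranslations("MK")))
```

Because the count is a product over residues, it grows quickly with peptide length, which is why a predictor of organism-specific codon usage is attractive for narrowing the candidates.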

Signature Location_Supplemental file.pdf (45 kB)
Notice to Users, Transmittal and Statement of Understanding

Included in

Biology Commons