RNA to protein. The program compiles but never stops running

Time: 2015-08-11 04:59:02

Tags: c++ multidimensional-array

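I recently learned about multidimensional arrays, and I have been tasked with analyzing an RNA strand and translating it into a protein sequence. I decided to use what I know about multidimensional arrays to define which amino acid each codon (a group of 3 RNA bases) translates to.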

//RNA codon to amino acid mapping
    char aminoAcid[4][4][4];
        //A = 0, C = 1, G = 2, U = 3

        //phenylalanine - F
            aminoAcid[3][3][3] = 'F';
            aminoAcid[3][3][1] = 'F';
        //Leucine - L
            aminoAcid[3][3][0] = 'L';
            aminoAcid[3][3][2] = 'L';
        //Serine - S
            aminoAcid[3][1][3] = 'S';
            aminoAcid[3][1][1] = 'S';
            aminoAcid[3][1][0] = 'S';
            aminoAcid[3][1][2] = 'S';
        //tyrosine - Y
            aminoAcid[3][0][3] = 'Y';
            aminoAcid[3][0][1] = 'Y';
        //stop codon
            aminoAcid[3][0][0] = '-';
            aminoAcid[3][0][2] = '-';
        //cysteine - C
            aminoAcid[3][2][3] = 'C';
            aminoAcid[3][2][1] = 'C';
        //stop codon
            aminoAcid[3][2][0] = '-';
        //tryptophan - W
            aminoAcid[3][2][2] = 'W';
        //leucine - L
            aminoAcid[1][3][3] = 'L';
            aminoAcid[1][3][1] = 'L';
            aminoAcid[1][3][0] = 'L';
            aminoAcid[1][3][2] = 'L';
        //proline - P
            aminoAcid[1][1][3] = 'P';
            aminoAcid[1][1][1] = 'P';
            aminoAcid[1][1][0] = 'P';
            aminoAcid[1][1][2] = 'P';
        //histidine - H
            aminoAcid[1][0][3] = 'H';
            aminoAcid[1][0][1] = 'H';
        //glutamine - Q
            aminoAcid[1][0][0] = 'Q';
            aminoAcid[1][0][2] = 'Q';
        //arginine - R 
            aminoAcid[1][2][3] = 'R';
            aminoAcid[1][2][1] = 'R';
            aminoAcid[1][2][0] = 'R';
            aminoAcid[1][2][2] = 'R';
        //isoleucine - I
            aminoAcid[0][3][3] = 'I';
            aminoAcid[0][3][1] = 'I';
            aminoAcid[0][3][0] = 'I';
        //methionine(start codon) - M
            aminoAcid[0][3][2] = 'M';
        //threonine -T
            aminoAcid[0][1][3] = 'T';
            aminoAcid[0][1][1] = 'T';
            aminoAcid[0][1][0] = 'T';
            aminoAcid[0][1][2] = 'T';
        //asparagine - N
            aminoAcid[0][0][3] = 'N';
            aminoAcid[0][0][1] = 'N';
        //lysine - K
            aminoAcid[0][0][0] = 'K';
            aminoAcid[0][0][2] = 'K';
        //serine - S
            aminoAcid[0][2][3] = 'S';
            aminoAcid[0][2][1] = 'S';
        //arginine - R
            aminoAcid[0][2][0] = 'R';
            aminoAcid[0][2][2] = 'R';
        //valine - V
            aminoAcid[2][3][3] = 'V';
            aminoAcid[2][3][1] = 'V';
            aminoAcid[2][3][0] = 'V';
            aminoAcid[2][3][2] = 'V';
        //alanine - A
            aminoAcid[2][1][3] = 'A';
            aminoAcid[2][1][1] = 'A';
            aminoAcid[2][1][0] = 'A';
            aminoAcid[2][1][2] = 'A';
        //aspartic acid - D
            aminoAcid[2][0][3] = 'D';
            aminoAcid[2][0][1] = 'D';
        //glutamic acid - E
            aminoAcid[2][0][0] = 'E';
            aminoAcid[2][0][2] = 'E';
        //glycine - G
            aminoAcid[2][2][3] = 'G';
            aminoAcid[2][2][1] = 'G';
            aminoAcid[2][2][0] = 'G';
            aminoAcid[2][2][2] = 'G';

I wrote the following function to translate the strand. For this case, note that my RNA strand is:

  

AUGCUUAUUAACUGAAAACAUAUGGGUAGUCGAUGA

string rnaAnalysis::translateRna()
{
    string protein = "";
    int firstBase, secondBase, thirdBase;

    for(int i = 0; i < newSequence.length() - 2; i+3)
    {
        if(newSequence[i] == 'A')
        {
            firstBase = 0;
        }
        else if(newSequence[i] == 'C')
        {
            firstBase = 1;
        }
        else if(newSequence[i] == 'G')
        {
            firstBase = 2;
        }
        else if(newSequence[i] == 'U')
        {
            firstBase = 3;
        }

        if(newSequence[i+1] == 'A')
        {
            secondBase = 0;
        }
        else if(newSequence[i+1] == 'C')
        {
            secondBase = 1;
        }
        else if(newSequence[i+1] == 'G')
        {
            secondBase = 2;
        }
        else if(newSequence[i+1] == 'U')
        {
            secondBase = 3;
        }

        if(newSequence[i+2] == 'A')
        {
            thirdBase = 0;
        }
        else if(newSequence[i+2] == 'C')
        {
            thirdBase = 1;
        }
        else if(newSequence[i+2] == 'G')
        {
            thirdBase = 2;
        }
        else if(newSequence[i+2] == 'U')
        {
            thirdBase = 3;
        }

        bool readSequence = true;
        if (aminoAcid[firstBase][secondBase][thirdBase] == aminoAcid[0][3][2])
        {
            readSequence = true;
        }
        else if (aminoAcid[firstBase][secondBase][thirdBase] == aminoAcid[3][0][0] || 
        aminoAcid[firstBase][secondBase][thirdBase] == aminoAcid[3][0][2] || 
        aminoAcid[firstBase][secondBase][thirdBase] == aminoAcid[3][2][0])
        {
            readSequence = false;
        }


        if(readSequence)
        {
            protein = protein + aminoAcid[firstBase][secondBase][thirdBase] + " ";
        }
        else 
        {
            continue;
        }
    }

    return protein;
}

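The bool is for the "start codon" and the "stop codons", which are essentially codons that tell you when to start recording and when to stop. newSequence is the RNA strand.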

Edit: I'm very new to this, so I understand my code may look ugly. Any feedback on how to clean it up is also very welcome.

1 answer:

Answer 0 (score: 1)

for(int i = 0; i < newSequence.length() - 2; i+3)

should be

for(int i = 0; i < newSequence.length() - 2; i += 3)

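The expression i+3 computes a value and then discards it, so your code never changes the value of i. That is why it never stops running.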
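A minimal sketch of the corrected loop shape, using a hypothetical helper named countCodons (not part of the original code) that only counts complete codons. It also writes the bound as i + 2 < seq.size(), on the assumption that the input might be shorter than two characters: length() returns an unsigned type, so length() - 2 wraps around to a huge value for such strings.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper for illustration: counts the complete codons in seq.
std::size_t countCodons(const std::string& seq)
{
    std::size_t count = 0;
    // i += 3 assigns the new value back to i, so the loop makes progress.
    // A bare i + 3 would compute the sum and throw it away.
    // The bound i + 2 < seq.size() avoids the unsigned wraparound that
    // seq.size() - 2 produces when seq has fewer than 2 characters.
    for (std::size_t i = 0; i + 2 < seq.size(); i += 3)
    {
        ++count;
    }
    return count;
}
```

With the strand from the question, which is 36 bases long, this loop visits 12 codons and then terminates.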

Your loop starts with the same block of code three times, in which you convert a letter to a 'base index'. This is an obvious place to use a function:

for (int i = 0; i < newSequence.length() - 2; i += 3)
{
    int firstBase = baseIndex(newSequence[i]);
    int secondBase = baseIndex(newSequence[i + 1]);
    int thirdBase = baseIndex(newSequence[i + 2]);
    ...

I'll leave it to you to write the baseIndex function.
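For reference, here is one way such a function might look. This is only a sketch: it assumes the encoding from the question (A = 0, C = 1, G = 2, U = 3), and the choice to return -1 for any other character is an assumption of this example, not something the original code specifies.

```cpp
// One possible baseIndex, assuming the question's encoding:
// A = 0, C = 1, G = 2, U = 3.
// Returning -1 for an unrecognized character is an assumption of this sketch;
// the caller must check for it before indexing into aminoAcid.
int baseIndex(char base)
{
    switch (base)
    {
        case 'A': return 0;
        case 'C': return 1;
        case 'G': return 2;
        case 'U': return 3;
        default:  return -1;
    }
}
```

Using a sentinel value rather than leaving the index unset means that a stray character in the strand (for example a DNA 'T') can be detected instead of silently reusing the index from the previous codon.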