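I recently learned about multidimensional arrays, and I have been tasked with analyzing an RNA strand and translating it into a protein sequence. I decided to use what I know about multidimensional arrays to define which amino acid each codon (a group of three RNA bases) translates to.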
//RNA codon to amino acid mapping
char aminoAcid[4][4][4];
//A = 0, C = 1, G = 2, U = 3
//phenylalanine - F
aminoAcid[3][3][3] = 'F';
aminoAcid[3][3][1] = 'F';
//Leucine - L
aminoAcid[3][3][0] = 'L';
aminoAcid[3][3][2] = 'L';
//Serine - S
aminoAcid[3][1][3] = 'S';
aminoAcid[3][1][1] = 'S';
aminoAcid[3][1][0] = 'S';
aminoAcid[3][1][2] = 'S';
//tyrosine - Y
aminoAcid[3][0][3] = 'Y';
aminoAcid[3][0][1] = 'Y';
//stop codon
aminoAcid[3][0][0] = '-';
aminoAcid[3][0][2] = '-';
//cysteine - C
aminoAcid[3][2][3] = 'C';
aminoAcid[3][2][1] = 'C';
//stop codon
aminoAcid[3][2][0] = '-';
//tryptophan - W
aminoAcid[3][2][2] = 'W';
//leucine - L
aminoAcid[1][3][3] = 'L';
aminoAcid[1][3][1] = 'L';
aminoAcid[1][3][0] = 'L';
aminoAcid[1][3][2] = 'L';
//proline - P
aminoAcid[1][1][3] = 'P';
aminoAcid[1][1][1] = 'P';
aminoAcid[1][1][0] = 'P';
aminoAcid[1][1][2] = 'P';
//histidine - H
aminoAcid[1][0][3] = 'H';
aminoAcid[1][0][1] = 'H';
//glutamine - Q
aminoAcid[1][0][0] = 'Q';
aminoAcid[1][0][2] = 'Q';
//arginine - R
aminoAcid[1][2][3] = 'R';
aminoAcid[1][2][1] = 'R';
aminoAcid[1][2][0] = 'R';
aminoAcid[1][2][2] = 'R';
//isoleucine - I
aminoAcid[0][3][3] = 'I';
aminoAcid[0][3][1] = 'I';
aminoAcid[0][3][0] = 'I';
//methionine(start codon) - M
aminoAcid[0][3][2] = 'M';
//threonine -T
aminoAcid[0][1][3] = 'T';
aminoAcid[0][1][1] = 'T';
aminoAcid[0][1][0] = 'T';
aminoAcid[0][1][2] = 'T';
//asparagine - N
aminoAcid[0][0][3] = 'N';
aminoAcid[0][0][1] = 'N';
//lysine - K
aminoAcid[0][0][0] = 'K';
aminoAcid[0][0][2] = 'K';
//serine - S
aminoAcid[0][2][3] = 'S';
aminoAcid[0][2][1] = 'S';
//arginine - R
aminoAcid[0][2][0] = 'R';
aminoAcid[0][2][2] = 'R';
//valine - V
aminoAcid[2][3][3] = 'V';
aminoAcid[2][3][1] = 'V';
aminoAcid[2][3][0] = 'V';
aminoAcid[2][3][2] = 'V';
//alanine - A
aminoAcid[2][1][3] = 'A';
aminoAcid[2][1][1] = 'A';
aminoAcid[2][1][0] = 'A';
aminoAcid[2][1][2] = 'A';
//aspartic acid - D
aminoAcid[2][0][3] = 'D';
aminoAcid[2][0][1] = 'D';
//glutamic acid - E
aminoAcid[2][0][0] = 'E';
aminoAcid[2][0][2] = 'E';
//glycine - G
aminoAcid[2][2][3] = 'G';
aminoAcid[2][2][1] = 'G';
aminoAcid[2][2][0] = 'G';
aminoAcid[2][2][2] = 'G';
I created the following function to translate the strand. For this case, note that my RNA strand is:
AUGCUUAUUAACUGAAAACAUAUGGGUAGUCGAUGA
string rnaAnalysis::translateRna()
{
    string protein = "";
    int firstBase, secondBase, thirdBase;
    for(int i = 0; i < newSequence.length() - 2; i+3)
    {
        if(newSequence[i] == 'A')
        {
            firstBase = 0;
        }
        else if(newSequence[i] == 'C')
        {
            firstBase = 1;
        }
        else if(newSequence[i] == 'G')
        {
            firstBase = 2;
        }
        else if(newSequence[i] == 'U')
        {
            firstBase = 3;
        }
        if(newSequence[i+1] == 'A')
        {
            secondBase = 0;
        }
        else if(newSequence[i+1] == 'C')
        {
            secondBase = 1;
        }
        else if(newSequence[i+1] == 'G')
        {
            secondBase = 2;
        }
        else if(newSequence[i+1] == 'U')
        {
            secondBase = 3;
        }
        if(newSequence[i+2] == 'A')
        {
            thirdBase = 0;
        }
        else if(newSequence[i+2] == 'C')
        {
            thirdBase = 1;
        }
        else if(newSequence[i+2] == 'G')
        {
            thirdBase = 2;
        }
        else if(newSequence[i+2] == 'U')
        {
            thirdBase = 3;
        }
        bool readSequence = true;
        if (aminoAcid[firstBase][secondBase][thirdBase] == aminoAcid[0][3][2])
        {
            readSequence = true;
        }
        else if (aminoAcid[firstBase][secondBase][thirdBase] == aminoAcid[3][0][0] ||
                 aminoAcid[firstBase][secondBase][thirdBase] == aminoAcid[3][0][2] ||
                 aminoAcid[firstBase][secondBase][thirdBase] == aminoAcid[3][2][0])
        {
            readSequence = false;
        }
        if(readSequence)
        {
            protein = protein + aminoAcid[firstBase][secondBase][thirdBase] + " ";
        }
        else
        {
            continue;
        }
    }
    return protein;
}
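The bool is for the "start codon" and the "stop codons", which are codons within the strand that tell you when to start recording and when to stop. newSequence will be the RNA strand.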
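Note that in the function as posted, readSequence is declared inside the loop body, so it is reset to true on every iteration and cannot carry state from one codon to the next. A minimal sketch of how such a flag might persist across codons is shown here. The helper name recordBetweenStartAndStop is hypothetical, and it assumes the codons have already been looked up into one-letter codes ('M' for start, '-' for stop, as in the table above) and that everything between a start and the next stop should be kept:

```cpp
#include <string>

// Hypothetical helper (not part of the original post): keeps only the
// one-letter codes that fall between a start codon ('M') and the next
// stop codon ('-'). The start 'M' itself is kept; the stop is not.
std::string recordBetweenStartAndStop(const std::string& translated)
{
    std::string protein;
    bool reading = false;  // declared outside the loop so it persists
    for (char aa : translated)
    {
        if (!reading && aa == 'M')
        {
            reading = true;   // start codon: begin recording
        }
        else if (aa == '-')
        {
            reading = false;  // stop codon: stop recording
        }
        if (reading)
        {
            protein += aa;
        }
    }
    return protein;
}
```

For the strand above, the codons look up to MLIN-KHMGSR-, and this sketch would keep MLINMGSR. Whether separate reading frames should be joined like this or reported separately depends on the assignment.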
Edit: I am pretty new to this, so I understand my code may look ugly. Any feedback on how to clean it up is also very welcome.
Answer 0 (score: 1)
for(int i = 0; i < newSequence.length() - 2; i+3)
should be
for(int i = 0; i < newSequence.length() - 2; i += 3)
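The expression i+3 computes a value and throws it away, so your code never changes the value of i, which is why it never stops running.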
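To illustrate the stepping pattern on its own, here is a minimal sketch that walks a string three characters at a time. The splitCodons helper is hypothetical and not part of the original code. It also writes the bound as i + 2 < length rather than i < length - 2, on the assumption that very short inputs are possible: length() is unsigned, so length() - 2 wraps around to a huge value when the string has fewer than two characters.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not in the original post): splits an RNA string into
// complete three-base codons, ignoring any trailing partial codon.
std::vector<std::string> splitCodons(const std::string& rna)
{
    std::vector<std::string> codons;
    // i += 3 assigns the new value back to i; a bare i + 3 would not.
    for (std::size_t i = 0; i + 2 < rna.length(); i += 3)
    {
        codons.push_back(rna.substr(i, 3));
    }
    return codons;
}
```

Called on "AUGCUUA", this would produce the two codons AUG and CUU and drop the leftover A.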
Your loop starts with the same piece of code three times, where you convert a letter to a 'base index'. This is an obvious place to use a function:
for (int i = 0; i < newSequence.length() - 2; i += 3)
{
    int firstBase = baseIndex(newSequence[i]);
    int secondBase = baseIndex(newSequence[i + 1]);
    int thirdBase = baseIndex(newSequence[i + 2]);
    ...
I'll leave writing the baseIndex function to you.
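For readers who want to compare against their own attempt, one possible shape for such a function is sketched here. It uses the encoding from the question (A = 0, C = 1, G = 2, U = 3). Returning -1 for any other character is an assumption made for this sketch, so that the caller can detect invalid input before indexing into aminoAcid.

```cpp
// Sketch of a baseIndex helper using the question's encoding:
// A = 0, C = 1, G = 2, U = 3. The -1 return for unknown characters is an
// assumption of this sketch, not something specified in the original post.
int baseIndex(char base)
{
    switch (base)
    {
        case 'A': return 0;
        case 'C': return 1;
        case 'G': return 2;
        case 'U': return 3;
        default:  return -1;  // not a valid RNA base
    }
}
```

A caller would need to check for -1 before using the result as an array index, since a negative index into aminoAcid is undefined behavior.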