Question

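My code is meant to find the most common codon in a file containing a long DNA string (e.g. CTAAATCGATGGCGATGATAAATG...). Starting from an initial position pos, every three characters form a codon. The problem is that whenever I run the code, it tells me the string index is out of range. I know the problem is in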

str = line.substring(idx, idx + 2);

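but I don't know how to fix it. Also, I'm not sure whether I'm counting the frequencies correctly. I need to increment the value of every key that is seen more than once.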

public static void findgene(String line){
    int idx, pos;
    int[] freq = new int[100];
    String str;

    //pos is the position to start at
    pos = 0;
    idx = pos;

    for(int i = 0; i < line.length(); i++){
        if(idx >= 0){
            //makes every three characters into a codon
            str = line.substring(idx, idx + 2);
            //checks if the codon was previously seen
            if(genes.containsKey(str)){
                genes.put(str, freq[i]++);
            }
            idx = idx + 2;
        }
    }
}

Answer 1

On every iteration of the loop you increment idx by 2, but you never place an upper bound on idx.

As a result, the arguments passed to substring() soon go out of range:

str = line.substring(idx, idx + 2);

What you need to do is change the condition to:

if(idx+2<=line.length()){
    //code here
}
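
To tie the bounds check to the frequency question, here is a minimal sketch of one way the whole thing could look. It is not the original poster's code: the class name CodonCounter and the methods countCodons and mostCommonCodon are hypothetical names made up for illustration. It also assumes a codon is three characters, so it uses substring(idx, idx + 3) and steps idx by 3; substring(idx, idx + 2) only returns two characters. Counting is done with Map.merge, which stores 1 for a key seen for the first time and otherwise adds 1 to the existing value, so no separate freq array is needed.

```java
import java.util.HashMap;
import java.util.Map;

class CodonCounter {
    // Hypothetical helper (not from the original post): counts every
    // non-overlapping three-letter codon in line, starting at index pos.
    // Assumes pos >= 0.
    static Map<String, Integer> countCodons(String line, int pos) {
        Map<String, Integer> counts = new HashMap<>();
        // The loop stops once fewer than three characters remain, so
        // substring() is never asked to read past the end of the string.
        for (int idx = pos; idx + 3 <= line.length(); idx += 3) {
            String codon = line.substring(idx, idx + 3);
            // merge() stores 1 for a new codon, otherwise adds 1 to its count.
            counts.merge(codon, 1, Integer::sum);
        }
        return counts;
    }

    // Returns a codon with the highest count, or null if the map is empty.
    // If several codons tie, which one is returned is unspecified.
    static String mostCommonCodon(Map<String, Integer> counts) {
        String best = null;
        int bestCount = 0;
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            if (entry.getValue() > bestCount) {
                best = entry.getKey();
                bestCount = entry.getValue();
            }
        }
        return best;
    }
}
```

With this sketch, countCodons("CTAAATCTAG", 0) splits the input into CTA, AAT and CTA, ignores the trailing G, and maps CTA to 2 and AAT to 1, so mostCommonCodon returns CTA.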

Finding string frequencies using HashMaps
