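I am trying to debug my program to find an error: when I run my code, it only prints the DNA string instead of printing the gene sequences. The problem area is the while statement in the printAll method. I need to call the findStopIndex method inside the while loop, but I would like to know why I get no gene output when I run it. Any insight would be appreciated.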
public class FindMultiGenes4 {
    public int findStopIndex(String dna, int index){
        int stop1 = dna.indexOf("tga", index);
        if (stop1 == -1 || (stop1-index) % 3 != 0){
            stop1 = dna.length();
        }
        int stop2 = dna.indexOf("taa", index);
        if (stop2 == -1 || (stop2-index) % 3 != 0){
            stop2 = dna.length();
        }
        int stop3 = dna.indexOf("tag", index);
        if (stop3 == -1 || (stop3-index) % 3 != 0){
            stop3 = dna.length();
        }
        return Math.min(stop1, Math.min(stop2,stop3));
    }
    public void printAll(String dna) {
        dna = "CATGTAATAGATGAATGACTGATAGATATGCTTGTATGCTATGAAAATGTGAAATGACCCA";
        String sequence = dna.toLowerCase();
        int index = 0;
        int newIndex = 0;
        while (true) {
            index = sequence.indexOf("atg", index);
            if (index == -1)
                break;
            if (newIndex == -1) // Check needed only if a stop codon is not guaranteed for each start codon.
                break;
            int stop = findStopIndex(dna, index);
            if (stop != sequence.length()){
                System.out.println("From " + (index ) + " to " + newIndex+3 + " Gene: " + sequence.substring(index, stop+3));
                index = sequence.substring(index, stop + 3).length();
            }
            else {
                index = index+3;
            }
        }
    }
    public void testFinder(){
        FindMultiGenes4 FMG = new FindMultiGenes4();
        String dna = "CATGTAATAGATGAATGACTGATAGATATGCTTGTATGCTATGAAAATGTGAAATGACCCA";
        FMG.printAll(dna);
        System.out.println("DNA: "+dna);
    }
}
Answer 0 (score: 0)
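The problem is in the following line:
int stop = findStopIndex(dna, index);
Here dna is the uppercase string, whereas findStopIndex searches for the lowercase stop codons tga, taa and tag. Because indexOf is case-sensitive, no stop codon is ever found, findStopIndex returns dna.length(), and nothing is printed.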
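To make the case mismatch concrete, here is a minimal sketch. The class name CaseMismatchDemo and the helper firstTaa are made up for illustration and are not part of the original program; the only library call used is String.indexOf.

```java
class CaseMismatchDemo {
    // Hypothetical helper: position of the first lowercase "taa" in s, or -1 if absent.
    static int firstTaa(String s) {
        return s.indexOf("taa");
    }

    public static void main(String[] args) {
        String dna = "CATGTAATAG";
        String sequence = dna.toLowerCase();
        // indexOf is case-sensitive, so the uppercase string never matches "taa".
        System.out.println(firstTaa(dna));      // prints -1
        System.out.println(firstTaa(sequence)); // prints 4
    }
}
```

Searching the lowercased copy, or searching for uppercase codons in an uppercased copy, both avoid the mismatch; what matters is that the string and the codons use the same case.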
The code also had a few other minor issues, which I hope I have corrected; see the modified code below.
public class FindMultiGenes4 {
    private static final String GENE_PREFIX = "ATG";
    private static final String[] GENE_SUFFIXES = {"TGA", "TAA", "TAG"};
    // Returns the index just past the nearest in-frame stop codon at or after index,
    // or dna.length() + 3 when no in-frame stop codon exists.
    public int findStopIndex(String dna, int index) {
        int minStop = dna.length();
        for (String suffix : GENE_SUFFIXES) {
            int stop = -1;
            int localIndex = index;
            do { // repeat while the match found is not a multiple of 3 away from index
                stop = dna.indexOf(suffix, localIndex);
                if (stop == -1) {
                    stop = dna.length();
                    break;
                }
                localIndex = stop + 3;
            } while ((stop - index) % 3 != 0);
            if (minStop > stop) {
                minStop = stop;
            }
        }
        return minStop + 3;
    }
    public void printAll(String dna) {
        String localDna = dna.toUpperCase();
        int index = 0;
        while (index != -1 && index + 3 < localDna.length()) {
            index = localDna.indexOf(GENE_PREFIX, index);
            if (index == -1) {
                break;
            }
            int stop = findStopIndex(localDna, index + 3);
            if (stop <= dna.length()) { // <= so that a gene ending on the last base is reported
                System.out.println("From " + (index) + " to " + stop
                        + " Gene: " + dna.substring(index, stop));
                index = stop;
            } else {
                index = index + 3; // no in-frame stop codon: keep scanning after this start codon
            }
        }
    }
    public static void main(String[] args) {
        FindMultiGenes4 FMG = new FindMultiGenes4();
        String[] dnaSamples = {"CATGTAATAGATGAATGACTGATAGATATGCTTGTATGCTATGAAAATGTGAAATGACCCA",
                "catgtaatagatgaatgactgatagatatgcttgtatgctatgaaaatgtgaaatgaccca",
                "cAtGtAaTaGaTgAaTgAcTgAtAgAtAtGcTtGtAtGcTaTgAaAaTgTgAaAtGaCcCa",
                "ATGAAATGAAAA",
                "ccatgccctaataaatgtctgtaatgtaga"};
        for (String dna : dnaSamples) {
            System.out.println("DNA: " + dna);
            FMG.printAll(dna);
            System.out.println("");
        }
    }
}
Output
DNA: CATGTAATAGATGAATGACTGATAGATATGCTTGTATGCTATGAAAATGTGAAATGACCCA
From 1 to 7 Gene: ATGTAA
From 10 to 22 Gene: ATGAATGACTGA
From 27 to 57 Gene: ATGCTTGTATGCTATGAAAATGTGAAATGA
DNA: catgtaatagatgaatgactgatagatatgcttgtatgctatgaaaatgtgaaatgaccca
From 1 to 7 Gene: atgtaa
From 10 to 22 Gene: atgaatgactga
From 27 to 57 Gene: atgcttgtatgctatgaaaatgtgaaatga
DNA: cAtGtAaTaGaTgAaTgAcTgAtAgAtAtGcTtGtAtGcTaTgAaAaTgTgAaAtGaCcCa
From 1 to 7 Gene: AtGtAa
From 10 to 22 Gene: aTgAaTgAcTgA
From 27 to 57 Gene: AtGcTtGtAtGcTaTgAaAaTgTgAaAtGa
DNA: ATGAAATGAAAA
From 0 to 9 Gene: ATGAAATGA
DNA: ccatgccctaataaatgtctgtaatgtaga
From 2 to 11 Gene: atgccctaa
From 14 to 29 Gene: atgtctgtaatgtag
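I also implemented the same algorithm with the regular expression below, and it turned out to be simpler than the code above.
Using a regular expression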
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class FindMultiGenes5 {
    /* (?i)          : case-insensitive match
     * ATG           : starts with ATG
     * (\\w{3})*?    : shortest run of characters whose length is a multiple of 3
     * (TGA|TAA|TAG) : one of TGA, TAA or TAG
     */
    private static final String GENE_REGEX = "(?i)ATG(\\w{3})*?(TGA|TAA|TAG)";
    public void regexMatch(String dna) {
        Matcher matcher = Pattern.compile(GENE_REGEX).matcher(dna);
        while (matcher.find()) {
            System.out.println("From " + matcher.start() + " to " + matcher.end() + " Gene: " + matcher.group());
        }
    }
    public static void main(String[] args) {
        FindMultiGenes5 FMG = new FindMultiGenes5();
        String[] dnaSamples = {"CATGTAATAGATGAATGACTGATAGATATGCTTGTATGCTATGAAAATGTGAAATGACCCA",
                "catgtaatagatgaatgactgatagatatgcttgtatgctatgaaaatgtgaaatgaccca",
                "cAtGtAaTaGaTgAaTgAcTgAtAgAtAtGcTtGtAtGcTaTgAaAaTgTgAaAtGaCcCa",
                "ATGAAATGAAAA",
                "ccatgccctaataaatgtctgtaatgtaga"};
        /*String[] dnaSamples = {"ATGaaabbbATGTGATAATGA".toLowerCase()};*/
        for (String dna : dnaSamples) {
            System.out.println("DNA: " + dna);
            FMG.regexMatch(dna);
            System.out.println("");
        }
    }
}
Regex output
DNA: CATGTAATAGATGAATGACTGATAGATATGCTTGTATGCTATGAAAATGTGAAATGACCCA
From 1 to 7 Gene: ATGTAA
From 10 to 22 Gene: ATGAATGACTGA
From 27 to 57 Gene: ATGCTTGTATGCTATGAAAATGTGAAATGA
DNA: catgtaatagatgaatgactgatagatatgcttgtatgctatgaaaatgtgaaatgaccca
From 1 to 7 Gene: atgtaa
From 10 to 22 Gene: atgaatgactga
From 27 to 57 Gene: atgcttgtatgctatgaaaatgtgaaatga
DNA: cAtGtAaTaGaTgAaTgAcTgAtAgAtAtGcTtGtAtGcTaTgAaAaTgTgAaAtGaCcCa
From 1 to 7 Gene: AtGtAa
From 10 to 22 Gene: aTgAaTgAcTgA
From 27 to 57 Gene: AtGcTtGtAtGcTaTgAaAaTgTgAaAtGa
DNA: ATGAAATGAAAA
From 0 to 9 Gene: ATGAAATGA
DNA: ccatgccctaataaatgtctgtaatgtaga
From 2 to 11 Gene: atgccctaa
From 14 to 29 Gene: atgtctgtaatgtag
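The reluctant quantifier *? in the pattern is what makes each match stop at the first in-frame stop codon. The sketch below compares it with the greedy * on a made-up input; the class name QuantifierDemo, the helper firstMatch and the sample string are illustrative assumptions, not part of the answer's code.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuantifierDemo {
    // Hypothetical helper: first match of regex in dna, or null when there is none.
    static String firstMatch(String regex, String dna) {
        Matcher m = Pattern.compile(regex).matcher(dna);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // Codons: ATG AAA TGA TTT TAA, so there are two in-frame stop codons.
        String dna = "ATGAAATGATTTTAA";
        String reluctant = "(?i)ATG(\\w{3})*?(TGA|TAA|TAG)";
        String greedy = "(?i)ATG(\\w{3})*(TGA|TAA|TAG)";
        System.out.println(firstMatch(reluctant, dna)); // prints ATGAAATGA
        System.out.println(firstMatch(greedy, dna));    // prints ATGAAATGATTTTAA
    }
}
```

With the greedy form, the match runs on to the last in-frame stop codon it can reach, which would merge what should be separate genes.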
Answer 1 (score: 0)
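OK, you had, and still have, several problems. Because I don't know the end goal or the rules of the algorithm, I can't help much.
However, I made a few changes that seem to work in the end. First, sequence rather than dna must be passed as the argument in the line: int stop = findStopIndex(dna, index);
which becomes
int stop = findStopIndex(sequence, index);
Next, you will notice that the newIndex variable does not do much: its value always stays 0, so the check for -1 is irrelevant. I changed the value in the output to (stop + 3). Also note the parentheses: without them, the 3 is appended to the string as text instead of being added to the number.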
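The effect of the missing parentheses can be seen in a small sketch. The class name ConcatDemo and both helper methods are made up for illustration.

```java
class ConcatDemo {
    // Evaluated left to right: ("to " + stop) is a String, then "3" is appended as text.
    static String withoutParens(int stop) {
        return "to " + stop + 3;
    }

    // The parentheses force the integer addition to happen before the concatenation.
    static String withParens(int stop) {
        return "to " + (stop + 3);
    }

    public static void main(String[] args) {
        System.out.println(withoutParens(4)); // prints "to 43"
        System.out.println(withParens(4));    // prints "to 7"
    }
}
```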
Some other nice improvements include storing your values in named fields instead of hard-coding them:
private final String[] STOP_SEQUENCES = {"tga", "taa", "tag"};
private final String START_SEQ = "atg";
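As a general rule, try to avoid duplicated code. In findStopIndex(String dna, int index), the same check is repeated three times, once per stop codon. That is fine until there are more codons to check. What if there were 50,000 stop codons?
So the method can be split up and made more generic: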
public int findStopIndex(String dna, int index) {
    int minStop = dna.length();
    for (String stopSeq : STOP_SEQUENCES) {
        // Only the first occurrence of each stop codon at or after index is examined.
        int stop = dna.indexOf(stopSeq, index);
        if (!hasStop(stop, index)) {
            stop = dna.length();
        }
        minStop = Math.min(minStop, stop);
    }
    return minStop;
}
// True when a stop codon was found and it lies in the same reading frame as index.
public boolean hasStop(int stop, int index) {
    return stop != -1 && (stop - index) % 3 == 0;
}
The printAll(String dna) method:
public void printAll(String dna) {
    String sequence = dna.toLowerCase();
    int index = 0;
    while (true) {
        index = sequence.indexOf(START_SEQ, index);
        if (index == -1) {
            break;
        }
        int stop = findStopIndex(sequence, index);
        if (stop != sequence.length()) {
            System.out.println("From " + (index) + " to " + (stop + 3) + " Gene: " + sequence.substring(index, stop + 3));
            index = stop;
        } else {
            index = index + 3;
        }
    }
}
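Note the change that was made:
index = sequence.substring(index, stop + 3).length();
is now
index = stop;
to avoid an infinite loop. The original assignment stored the length of the gene rather than a position in the sequence, so the search could jump backwards and find the same start codon again.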
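A deliberately simplified sketch of why the old assignment can loop forever: it searches for a single stop codon ("taa") with no reading-frame check, and caps the number of iterations so that it terminates. The class name LoopDemo, the method countIterations, the cap parameter and the sample string are all assumptions made for illustration.

```java
class LoopDemo {
    // Counts how many genes the loop reports before it ends or hits the cap.
    static int countIterations(String seq, boolean useLengthAsIndex, int cap) {
        int index = 0;
        int iterations = 0;
        while (iterations < cap) {
            index = seq.indexOf("atg", index);
            if (index == -1) {
                break;
            }
            int stop = seq.indexOf("taa", index); // simplified: one stop codon, no frame check
            if (stop == -1) {
                break;
            }
            iterations++;
            index = useLengthAsIndex
                    ? seq.substring(index, stop + 3).length() // a length, not a position
                    : stop;                                   // a position past the start codon
        }
        return iterations;
    }

    public static void main(String[] args) {
        String seq = "ccccccatgtaa"; // the gene "atgtaa" starts at 6 and has length 6
        System.out.println(countIterations(seq, true, 5));  // prints 5: the cap is reached
        System.out.println(countIterations(seq, false, 5)); // prints 1: the loop ends normally
    }
}
```

In this input the gene length happens to equal its start position, so the length-based update puts index back on the same start codon every time.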
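This kind of problem is easy to track down with built-in debugging tools. Any decent Java IDE should include a debugger. For more information, look up how to debug with your IDE; for example, here is how to debug with Netbeans: Netbeans Debugging
Here is one for Eclipse: Eclipse Debugging
Beyond that, even though this is a small program, printing or logging values at certain points in the program also helps a lot when tracking down unexpected output.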
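As one possible way to apply the printing advice, the sketch below records each start-codon search and writes the trace to System.err. The class name TraceDemo, the method traceStarts and the message format are hypothetical and only meant as an illustration.

```java
import java.util.ArrayList;
import java.util.List;

class TraceDemo {
    // Records where each search for "atg" began and what it returned.
    static List<String> traceStarts(String seq) {
        List<String> trace = new ArrayList<>();
        int index = 0;
        while (true) {
            int found = seq.indexOf("atg", index);
            trace.add("search from " + index + " found " + found);
            if (found == -1) {
                break;
            }
            index = found + 3; // continue after this start codon
        }
        return trace;
    }

    public static void main(String[] args) {
        for (String line : traceStarts("catgcatg")) {
            System.err.println(line); // System.err keeps debug output apart from normal output
        }
    }
}
```

If index ever stopped increasing between two lines of such a trace, that would point directly at the statement that updates it.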