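I am trying to capture the string that appears between the last pair of parentheses in a variable called genesym. So far I replace the last opening parenthesis and the last closing parenthesis with a tab character. Between those two steps I want to assign the enclosed string to the existing String genesym.
I realize that nextString() looks like a Scanner-style call, but it is the only way I know how to express what I want.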
import java.lang.*;
import java.io.*;
public class TESTING
{
public static void main(String[] args)
{
try {
BufferedReader br = new BufferedReader(new FileReader("human.rna.fna"));
BufferedWriter bw = new BufferedWriter(new FileWriter("FormattedHumanRNA"));
String line;
String genesym;
while ((line = br.readLine()) != null) {
if (line.startsWith(">")) {
// Replaces the last open parenthesis with a tab character
int openbracket = line.lastIndexOf("(");
line = new StringBuilder(line)
.replace(openbracket, openbracket + 1, "\t")
.toString();
genesym = br.nextString(); // pseudocode: this is the step I don't know how to write
// Replaces the last close parenthesis with a tab character
int closebracket = line.lastIndexOf(")");
line = new StringBuilder(line)
.replace(closebracket, closebracket + 1, "\t")
.toString();
} else {
line = line.replaceAll ("\n", "");
}
bw.write(genesym + " : " + line);
}
br.close();
bw.close();
} catch(IOException e) {
e.printStackTrace(System.err);
}
}
}
Example (my real data is much larger, about 1 million lines):
Input file:
>365 (LOC1), long non-coding RNA AGCGTCT
>22 (1*split3**) (FLJ), long RNA AAAATC
>13 (RTV), RNA ATGCG
Desired output:
LOC1 : >365 LOC1 , long non-coding RNA AGCGTCT
FLJ : >22 (1*split3**) FLJ , long RNA AAAATC
RTV : >13 RTV ,RNA ATGCG
Answer 0 (score: 0)
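Once both openbracket and closebracket are known, take the text between them. Swapping a single character for a tab does not shift the indices, so they stay valid after the first replacement. The + 1 skips the opening parenthesis itself:
String genesym = line.substring(openbracket + 1, closebracket);
Then replace whatever you want to replace.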
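To make the idea concrete, here is a minimal sketch of the extraction and tab replacement for a single header line. The class name GeneSymbolFormatter and the helper names extractGeneSymbol and formatHeader are made up for illustration and are not part of the question's code. It also assumes that a header without a complete pair of parentheses should be passed through unchanged, which the question does not specify.

```java
public class GeneSymbolFormatter {

    /** Returns the text inside the last "(...)" pair, or null if there is none. */
    static String extractGeneSymbol(String header) {
        int close = header.lastIndexOf(')');
        // Search backwards from the last ')' so the '(' belongs to the same pair.
        int open = close < 0 ? -1 : header.lastIndexOf('(', close);
        if (open < 0) {
            return null;
        }
        return header.substring(open + 1, close);
    }

    /**
     * Prefixes the header with "genesym : " and swaps the last pair of
     * parentheses for tab characters. Headers without such a pair are
     * returned unchanged (an assumption, not stated in the question).
     */
    static String formatHeader(String header) {
        String genesym = extractGeneSymbol(header);
        if (genesym == null) {
            return header;
        }
        int close = header.lastIndexOf(')');
        int open = header.lastIndexOf('(', close);
        return genesym + " : "
                + header.substring(0, open) + "\t"
                + genesym + "\t"
                + header.substring(close + 1);
    }

    public static void main(String[] args) {
        System.out.println(formatHeader(">22 (1*split3**) (FLJ), long RNA AAAATC"));
    }
}
```

In the reading loop from the question, formatHeader(line) could be called for lines that start with ">", followed by bw.newLine() after each write so that the records do not run together.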