Counting the number of times a mutation occurs in a .maf file

Time: 2016-06-12 04:29:30

Tags: java python bioinformatics

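I am trying to count the number of mutations per gene in a MAF file. I originally wrote this code in Python and it worked perfectly, but it stopped working when I translated it to Java. In the output file, the number of mutations is always one. What am I doing wrong here?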

package dev.suns.bioinformatics;

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.PrintWriter;

public class Main {

    static String filePath = "C:/Users/Matthew/Bioinformatics/Data Files/DLBC.maf";
    static String fileName = "DLBC_Info.txt";

public static void main(String[] args){
    createFile(filePath, fileName);
}

public static void createFile(String filePath, String fileName){
    BufferedReader br = null;
    String line = "";
    String delimiter = "\t";

    String geneSymbol = "";
    String newGene = "";
    int count = 0;

    try {

        PrintWriter writer = new PrintWriter(fileName);
        br = new BufferedReader(new FileReader(filePath));

        writer.println("Gene" + "\t" + "Mutations" + "\n");

        br.readLine();

        while ((line = br.readLine()) != null){

            String[] splitFile = line.split(delimiter);
            newGene = splitFile[0];

            if(geneSymbol == ""){
                geneSymbol = newGene;   
            }

            else if(newGene == geneSymbol){
                // This is where I am having trouble. I have this if-statement to check if the gene appears more than once in the .maf file, but nothing ever enters this branch.
                count++;
            }
            else{
                count++;
                writer.println(geneSymbol + "\t" + count + "\n");
                geneSymbol = newGene;
                count=0;
            }

        }
        writer.close();
    }catch(Exception e){
        e.printStackTrace();
    }
  }
}

Here are the first few lines of the output file:

Gene Mutations

A1CF 1

A2M 1

A2M 1

A2ML1 1

A4GALT 1

AADAC 1

AADACL3 1

AAED1 1

AAGAB 1

AAGAB 1

AARD 1

AARS2 1

AARS2 1

AARS2 1

1 Answer:

Answer 0: (score: 0)

In Java, you need to compare strings with the equals method. This should work:

else if(newGene.equals(geneSymbol)){
    // equals() compares the characters of the two strings, so this branch is reached for a repeated gene.
    count++;
}

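"==" checks references, that is, whether both variables point to the same String object. To compare the contents of two strings, you need to use the equals() method.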
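
To make the difference concrete, here is a minimal sketch. The class name GeneCountSketch, the method countMutations, and the sample rows are made up for illustration and are not from the question. It assumes, as the question's code does, that the gene symbol is the first tab-separated column. The counting uses a LinkedHashMap instead of the question's compare-with-previous-row loop, which is a different technique: map keys are matched with equals()/hashCode(), the rows do not need to be sorted by gene, and the last gene is included without an extra write after the loop.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class GeneCountSketch {

    // Counts rows per gene symbol, assuming the symbol is the first tab-separated column.
    static Map<String, Integer> countMutations(List<String> rows) {
        Map<String, Integer> counts = new LinkedHashMap<>(); // keeps first-seen order
        for (String row : rows) {
            String gene = row.split("\t")[0];
            counts.merge(gene, 1, Integer::sum); // keys are matched by equals(), not ==
        }
        return counts;
    }

    public static void main(String[] args) {
        // Two strings with the same characters, each created at runtime by split().
        String first = "A2M\tmissense".split("\t")[0];
        String second = "A2M\tnonsense".split("\t")[0];

        System.out.println(first == second);      // reference comparison: distinct objects in practice
        System.out.println(first.equals(second)); // content comparison: true

        // Hypothetical rows standing in for lines of a MAF file.
        List<String> rows = Arrays.asList("A1CF\tx", "A2M\ty", "A2M\tz");
        System.out.println(countMutations(rows)); // {A1CF=1, A2M=2}
    }
}
```

To write the result in the question's format, iterate over the map's entrySet() and print each key and value separated by a tab.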