Reading a file in Java produces wrong characters

Asked: 2015-11-30 00:09:15

Tags: java string stringbuilder readfile

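I am currently working through the Coursera Bioinformatics Specialization and am stuck on the reverse complement problem. I am not asking for the answer to that problem, since that would be unethical.

When I test my solution with a test dataset placed directly in the source code as a string, my answer is correct. But when I test it with a dataset read from a text file, I get a wrong answer. The dataset consists of random characters (A, T, C, G).

My main method looks like this: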

public static void main(String[] args) throws IOException
{
    String dataset = readFile("filepath/dataset_3_2 (7).txt");
    String output = reverseComplement(dataset);
    BufferedWriter writer = null;
    try
    {
        writer = new BufferedWriter( new FileWriter("ergebnis.txt"));
        writer.write(output);

    }
    catch ( IOException e)
    {
    }
    finally
    {
        try
        {
            if ( writer != null)
            writer.close( );
        }
        catch ( IOException e)
        {
        }
    }
    System.out.println(checkForWrongCharacters(dataset));
    System.out.println("Invalid characters: " + returnOthers(dataset));
}

The input dataset should contain only the letters A, G, C, and T, so I implemented two methods to check for invalid characters.

public static String returnOthers(String pattern)
{
    StringBuilder others = new StringBuilder();
    for(int i = 0; i < pattern.length(); i++)
    {
        char c = pattern.charAt(i);
        switch(c) {
        case 'A': continue;
        case 'G': continue;
        case 'T': continue;
        case 'C': continue;
        default: others.append(c);
        break;
        }
    }
    return others.toString();
}

Here is the other one:

public static boolean checkForWrongCharacters(String pattern)
{
    boolean flag = false;
    StringBuilder result = new StringBuilder();
    for(int i = 0; i < pattern.length(); i++)
    {
        String s = "";
        char c = pattern.charAt(i);
        switch(c) {
        case 'A': continue;
        case 'G': continue;
        case 'T': continue;
        case 'C': continue;
        default: s = "Z";
        break;
        }
        result.append(s);
    }
    if(result.toString().contains("Z"))
    {
        flag = true;
    }
    else
    {
        flag = false;
    }
    return flag;
}

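The method checkForWrongCharacters() returns true, which means the dataset contains characters other than A, T, C, or G. But the method returnOthers() does not return anything.

Could there be an encoding problem when I read a large text file?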
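
Since returnOthers() seems to print nothing while checkForWrongCharacters() reports a problem, the collected characters may simply be invisible on the console. A minimal sketch, assuming that is the case: report each non-ACGT character as a Unicode code point instead of appending the raw character. The class name InvisibleChars and the helper describeOthers are hypothetical and not part of the original code.

```java
public class InvisibleChars {
    // Hypothetical helper: lists every character other than A, C, G, T
    // as a Unicode code point (e.g. U+000D) so invisible ones show up.
    public static String describeOthers(String pattern) {
        StringBuilder others = new StringBuilder();
        for (int i = 0; i < pattern.length(); i++) {
            char c = pattern.charAt(i);
            if ("ACGT".indexOf(c) < 0) {
                if (others.length() > 0) {
                    others.append(' ');
                }
                others.append(String.format("U+%04X", (int) c));
            }
        }
        return others.toString();
    }

    public static void main(String[] args) {
        // Sample input with Windows and Unix line endings (illustrative only)
        System.out.println("Invalid characters: " + describeOthers("ACGT\r\nTTGA\n"));
        // prints: Invalid characters: U+000D U+000A U+000A
    }
}
```

If the output shows only U+000D and U+000A, the extra characters are a carriage return and a line feed rather than an encoding fault.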

Edit

I completely forgot to post my readFile() method...

public static String readFile(String filename) throws IOException
{
    String content = null;
    File file = new File(filename);
    FileReader reader = null;
    try {
         reader = new FileReader(file);
         char[] chars = new char[(int) file.length()];
         reader.read(chars);
         content = new String(chars);
         reader.close();
    } catch (IOException e) {
          e.printStackTrace();
    } finally {
        if(reader !=null){reader.close();}
    }
    return content;
}
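
A few properties of the readFile() method above are worth keeping in mind: FileReader decodes with the platform default charset, file.length() counts bytes rather than characters, and a single read(char[]) call is not guaranteed to fill the whole array. A hedged alternative sketch, assuming Java 7 or later and a UTF-8 (or plain ASCII) input file, reads all bytes and decodes them with an explicit charset. The class name ReadWholeFile and the temporary file used in main are for illustration only.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class ReadWholeFile {
    // Reads the entire file and decodes it as UTF-8 (assumed encoding).
    // Line terminators in the file are kept in the returned string.
    public static String readFile(String filename) throws IOException {
        byte[] bytes = Files.readAllBytes(Paths.get(filename));
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Illustrative temporary file standing in for the real dataset
        Path tmp = Files.createTempFile("dataset", ".txt");
        try {
            Files.write(tmp, "ACGT\r\n".getBytes(StandardCharsets.UTF_8));
            String content = readFile(tmp.toString());
            System.out.println(content.length()); // prints 6: four letters plus \r and \n
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

Note that this version still returns any line terminators present in the file, so the caller has to decide whether to keep or remove them.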

1 answer:

Answer 0 (score: -1)

This did the job. Carriage return and line feed characters were messing up the result.

public static void main(String[] args) throws IOException
{
    String dataset = readFile("filepath/dataset_3_2 (7).txt");
    String dataset1 = dataset.replace("\r","");
    String dataset2 = dataset1.replace("\n","");
    String output = reverseComplement(dataset2);
    BufferedWriter writer = null;
    try
    {
        writer = new BufferedWriter( new FileWriter("ergebnis.txt"));
        writer.write(output);

    }
    catch ( IOException e)
    {
    }
    finally
    {
        try
        {
            if ( writer != null)
            writer.close( );
        }
        catch ( IOException e)
        {
        }
    }
    // Check the cleaned string, not the raw file content
    System.out.println(checkForWrongCharacters(dataset2));
    System.out.println("Invalid characters: " + returnOthers(dataset2));
}
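
The two replace() calls above remove only \r and \n. As a sketch of a slightly broader variant, assuming the file may also contain spaces or tabs, a single regular expression can strip all whitespace in one step. The class name StripWhitespace and the method stripWhitespace are hypothetical.

```java
public class StripWhitespace {
    // Removes every whitespace character matched by \s
    // (space, tab, line feed, carriage return, form feed, vertical tab).
    public static String stripWhitespace(String raw) {
        return raw.replaceAll("\\s+", "");
    }

    public static void main(String[] args) {
        // Sample input with mixed line endings (illustrative only)
        String cleaned = stripWhitespace("ACGT\r\nTTGA\n");
        System.out.println(cleaned); // prints ACGTTTGA
    }
}
```

If only leading and trailing line breaks are expected, String.trim() would be a narrower choice. The regular expression also handles breaks in the middle of a sequence.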