Java says a non-empty file is empty?

Time: 2011-12-22 02:07:57

Tags: java file-io

I have a particular file that Java says is empty...

Source code

import java.io.File;
import java.io.FileNotFoundException;
import java.util.Scanner;

public class MinimumWorkingExample
{
    public static void main(String[] args) throws FileNotFoundException
    {
        String filename = "/home/tyson/Data/English-French_test/test/test.f";
        Scanner fileIn = new Scanner(new File(filename));
        System.out.println("***START***");
        while(fileIn.hasNextLine())
        {
            System.out.println(fileIn.nextLine());
        }
        System.out.println("***FINISH***");
    }
}

Output

***START***
***FINISH***

...but the file is not empty:

Console

tyson@tyson-desktop:~$ head /home/tyson/Data/English-French_test/test/test.f
<s snum=0001> 2 .  </s>
<s snum=0002> 2 .  </s>
<s snum=0003> oh , oh !  </s>
<s snum=0004> oh , oh !  </s>
<s snum=0005> oh , oh !  </s>
<s snum=0006> souvenons - nous , monsieur le Orateur , que ce sont ces secteurs de notre soci�t� qui servent de �pine dorsale � notre �conomie .  </s>
<s snum=0007> bravo !  </s>
<s snum=0008> bravo !  </s>
<s snum=0009> monsieur le Orateur , ma question se adresse � le ministre charg� de les transports .  </s>
<s snum=0010> tous deux poss�dent de nombreuses ann�es de exp�rience dans la fabrication et la distribution de les produits forestiers .  </s>
tyson@tyson-desktop:~$ 

Question

Why does this happen?

3 answers:

Answer 0: (score: 3)

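Try Scanner fileIn = new Scanner(new File(filename), "Cp1252"); since that is an encoding commonly used for French, while your system appears to use UTF-8. If the Scanner assumes it is reading UTF-8 multibyte sequences, bytes from a different encoding can cause a decoding problem.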
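A minimal sketch of that suggestion, not a definitive fix. The Scanner documentation states that when the underlying read throws an IOException, the scanner assumes the end of input has been reached and keeps the exception available through ioException(), which would explain a loop that prints nothing. The class name, the helper methods, and the temporary sample file are hypothetical; ISO-8859-1 is used because every Java platform must support it, and "Cp1252" from the answer could be passed instead. Whether the UTF-8 read actually stops early may depend on the JDK version.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class EncodingAwareRead {
    /** Writes two sample lines as ISO-8859-1 bytes to a temporary file (hypothetical test data). */
    static File writeSample() throws IOException {
        File file = File.createTempFile("sample", ".f");
        file.deleteOnExit();
        String text = "<s snum=0001> bravo !  </s>\n<s snum=0002> soci\u00e9t\u00e9  </s>\n";
        Files.write(file.toPath(), text.getBytes("ISO-8859-1"));
        return file;
    }

    /** Reads all lines with an explicit charset and reports any error the Scanner swallowed. */
    static List<String> readLines(File file, String charsetName) throws IOException {
        List<String> lines = new ArrayList<>();
        try (Scanner in = new Scanner(file, charsetName)) {
            while (in.hasNextLine()) {
                lines.add(in.nextLine());
            }
            // Scanner hides read errors; ask for the last one explicitly.
            IOException swallowed = in.ioException();
            if (swallowed != null) {
                System.err.println("Scanner stopped early: " + swallowed);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        File file = writeSample();
        System.out.println(readLines(file, "ISO-8859-1").size() + " lines read as ISO-8859-1");
        // Decoding the same bytes as UTF-8 may stop early; check stderr for the reason.
        System.out.println(readLines(file, "UTF-8").size() + " lines read as UTF-8");
    }
}
```

Checking ioException() after the loop is the useful part for debugging: it turns a silent "empty file" into a visible decoding error.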

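Answer 1: (score: 0)

Your file may not contain the Scanner's default delimiter, so the Scanner treats the whole file as a single line without an end, and hasNextLine() is therefore false. Make sure the characters you get from

Scanner.delimiter()

exist in your file. If they do not match, you can use

Scanner.useDelimiter("\\s or your regex/string here")

to set it to the right one.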
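A small sketch of inspecting and changing the delimiter, assuming an in-memory string rather than the file from the question; the class and method names are hypothetical. Note that, per the Scanner documentation, the delimiter governs token methods such as hasNext() and next(), so this shows how to check the setting rather than a confirmed cause of the empty output.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class DelimiterCheck {
    /** Splits the input into tokens using the given delimiter regex. */
    static List<String> tokens(String input, String delimiterRegex) {
        List<String> result = new ArrayList<>();
        try (Scanner in = new Scanner(input)) {
            // Show the delimiter in effect before changing it.
            System.out.println("default delimiter: " + in.delimiter());
            in.useDelimiter(delimiterRegex);
            while (in.hasNext()) {
                result.add(in.next());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Comma-separated sample input (hypothetical data).
        System.out.println(tokens("oh,oh,bravo", ","));
    }
}
```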

Answer 2: (score: 0)

According to the Java docs, a line separator is any of the following. Does your file contain any of them?

private static final String LINE_SEPARATOR_PATTERN = "\r\n|[\n\r\u2028\u2029\u0085]";
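One way to answer that question for a given piece of text is sketched below, using the same alternatives as the pattern quoted above. The class and method names are hypothetical, and the check works on a string already in memory, so it assumes the file has been decoded correctly first.

```java
import java.util.regex.Pattern;

class LineSeparatorCheck {
    // Same alternatives as the quoted pattern; the \\uXXXX escapes are
    // resolved by the regex engine rather than by the Java compiler.
    private static final Pattern LINE_SEPARATOR =
            Pattern.compile("\r\n|[\n\r\\u2028\\u2029\\u0085]");

    /** Returns true if the text contains at least one recognized line separator. */
    static boolean containsLineSeparator(String text) {
        return LINE_SEPARATOR.matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(containsLineSeparator("bravo !\nbravo !"));
        System.out.println(containsLineSeparator("no separator here"));
    }
}
```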