Question

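I am trying to read the numbers from a file in order, but there is a problem with the first number in the file. Please see the example at the end of the question.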

        public static ArrayList<String> ArraylineLengths() {
            ArrayList<String> Lines = new ArrayList<String>();
            String file = "tra.srt";
            BufferedReader br = null;
            try {
                br = new BufferedReader(new FileReader(file));

                String line;
                while((line = br.readLine()) != null) {
                    line = line.trim();
                    if(isInteger(line)) {
                        int i = Integer.parseInt(line);
                        if(i > 0) {
                            Lines.add(line);
                            System.out.println(line);
                        }
                    }
                }

            } catch(IOException ioe) {
                ioe.printStackTrace();
            } finally {
                if(br != null) {
                    try {
                        br.close();
                    } catch(IOException e) {
                        // do nothing
                    }
                }
            }
            return (Lines);

        }


        public static boolean isInteger(String s) {
            try {
                Integer.parseInt(s);
            } catch(NumberFormatException e) {
                return false;
            }
            // only got here if we didn't return false
            return true;
        }

    }

Input file:

1
00:01:09,069 --> 00:01:11,446
All right now.
Y'all fresh veggies.

2
00:01:11,571 --> 00:01:13,239
Y'all gonna be in a chopped salad.

3
00:01:13,573 --> 00:01:16,409
Very simple. I want you to take your knife.

I should get the numbers 1 2 3 and so on, but instead I get:

2 3 4 5 ... and so on.

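The first number at the top of the file is read correctly if I use substring(1,2) on that line, but I could not fix the problem with the approach from the older question at this link.

After opening the file in HxD: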

EF BB BF 31 0D 0A 30 30 3A 30 30 3A 30 31 2C 36 30 30 20 2D 2D 3E 20

30 30 20 2D 2D 3E 20 30 30 3A 30 30 3A 30 34 2C

Answer 1

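I copied and pasted your input into a file and tried your code, and it works fine.

So I think you have an invisible character at the beginning of the file, most likely a BOM.

You can use a hex editor to look at the beginning of the file and spot the offending character.

Here is what I did with my input file: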

$ hexdump -C /tmp/tra.srt | head
00000000  31 0a 30 30 3a 30 31 3a  30 39 2c 30 36 39 20 2d  |1.00:01:09,069 -|

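As you can see, the file starts with 0x31, the character 1, followed by 0x0a, i.e. \n. If you had a BOM at the beginning of the file, it would start with 0xef 0xbb 0xbf.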
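If no hex editor is at hand, the same check can be done from Java. This is only a minimal sketch: the class name BomCheck and the method startsWithUtf8Bom are made up for illustration, and the default path tra.srt is taken from the question.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

class BomCheck {
    /** Returns true if the bytes begin with the UTF-8 BOM: EF BB BF. */
    static boolean startsWithUtf8Bom(byte[] bytes) {
        // Mask with 0xFF because Java bytes are signed.
        return bytes.length >= 3
                && (bytes[0] & 0xFF) == 0xEF
                && (bytes[1] & 0xFF) == 0xBB
                && (bytes[2] & 0xFF) == 0xBF;
    }

    public static void main(String[] args) throws IOException {
        // Path is an assumption: the file name used in the question.
        String path = args.length > 0 ? args[0] : "tra.srt";
        // Reading the whole file is fine for a small subtitle file.
        byte[] bytes = Files.readAllBytes(Paths.get(path));
        System.out.println(startsWithUtf8Bom(bytes)
                ? "File starts with a UTF-8 BOM"
                : "No UTF-8 BOM at the start of the file");
    }
}
```

For the bytes shown in the question (EF BB BF 31 0D 0A ...), startsWithUtf8Bom returns true.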

If you do have a BOM, you can look at this question to see how to skip it, or you can add the following code after trimming the line:

if (line.startsWith("\uFEFF"))
    line = line.substring(1);
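Put together with the loop from the question, the idea could look roughly like the sketch below. The names SubtitleIndexes and collectIndexes are hypothetical, and it assumes the file is decoded as UTF-8, so that the three BOM bytes become the single character U+FEFF. FileReader uses the platform default charset, so the sketch passes the charset explicitly instead.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SubtitleIndexes {
    /** Collects the lines that are positive integers, ignoring a BOM on the first line. */
    static List<String> collectIndexes(List<String> lines) {
        List<String> result = new ArrayList<>();
        for (int i = 0; i < lines.size(); i++) {
            String line = lines.get(i);
            // A BOM can only sit at the very start of the file, i.e. on the first line.
            if (i == 0 && line.startsWith("\uFEFF")) {
                line = line.substring(1);
            }
            line = line.trim();
            if (isPositiveInteger(line)) {
                result.add(line);
            }
        }
        return result;
    }

    static boolean isPositiveInteger(String s) {
        try {
            return Integer.parseInt(s) > 0;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        // With a path argument the file is read as UTF-8; otherwise a small in-memory sample is used.
        List<String> lines = args.length > 0
                ? Files.readAllLines(Paths.get(args[0]), StandardCharsets.UTF_8)
                : Arrays.asList("\uFEFF1", "00:01:09,069 --> 00:01:11,446", "All right now.", "", "2");
        System.out.println(collectIndexes(lines)); // with the built-in sample: [1, 2]
    }
}
```

Checking only the first line keeps the fix narrow: a U+FEFF anywhere else in the file is left alone rather than silently dropped.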

Answer 2

You can solve the problem by adding the line

line = line.replace("\uFEFF", "");
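It may help to see why the trim() call already in the question does not take care of this. The small demo below is a sketch with made-up names (BomStripDemo, removeBom), and it assumes the first line arrives as U+FEFF followed by 1, which is what happens when a UTF-8 file with a BOM is decoded as UTF-8.

```java
class BomStripDemo {
    /** Removes every U+FEFF character from the line, as in the one-liner above. */
    static String removeBom(String line) {
        return line.replace("\uFEFF", "");
    }

    public static void main(String[] args) {
        // Assumed content of the first line after decoding: BOM character, then the digit 1.
        String firstLine = "\uFEFF1";

        // trim() only strips characters up to U+0020, so the BOM survives it.
        System.out.println(firstLine.trim().length());        // 2
        // U+FEFF is not classified as whitespace either.
        System.out.println(Character.isWhitespace('\uFEFF')); // false
        // After removing it, the line parses as an integer.
        System.out.println(Integer.parseInt(removeBom(firstLine))); // 1
    }
}
```

Note that replace removes U+FEFF wherever it occurs in the line, not just at the start, which is usually harmless for a subtitle index line.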

Reading a set of numbers from a file in order
