Read a String line by line

Time: 2009-07-08 07:34:24

Tags: java string

Given a string that isn't too long, what is the best way to read it line by line?

I know you can do:

BufferedReader reader = new BufferedReader(new StringReader(<string>));
reader.readLine();

Another way would be to take the substring on the eol:

final String eol = System.getProperty("line.separator");
// Skip past the first line: find the separator, then step over its full length
output = output.substring(output.indexOf(eol) + eol.length());

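Are there any other, perhaps simpler, ways of doing it? I have no problems with the above approaches; I'm just interested to know whether any of you know of something that looks simpler and is more efficient.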
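
For reference, the substring idea above can be extended into a full loop over every line. This is only a minimal sketch: the class name `IndexOfLines` and the helper `splitLines` are made up for illustration, and it assumes the separator is known in advance and non-empty.

```java
import java.util.ArrayList;
import java.util.List;

class IndexOfLines {
    // Hypothetical helper: collects the lines of text, cutting at each occurrence of eol.
    static List<String> splitLines(String text, String eol) {
        if (eol.isEmpty()) {
            throw new IllegalArgumentException("separator must not be empty");
        }
        List<String> lines = new ArrayList<>();
        int start = 0;
        int end;
        // indexOf returns -1 once no further separator is found
        while ((end = text.indexOf(eol, start)) >= 0) {
            lines.add(text.substring(start, end));
            start = end + eol.length();
        }
        // Whatever follows the last separator is the final line (skipped if empty)
        if (start < text.length()) {
            lines.add(text.substring(start));
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : splitLines("first\nsecond\nthird", "\n")) {
            System.out.println(line);
        }
    }
}
```

Unlike the Reader and Scanner approaches, this only recognises the one separator it is given, so mixed line endings in the same string would not be handled.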

11 answers:

Answer 0 (score: 187)

There is also Scanner. You can use it just like the BufferedReader:
Scanner scanner = new Scanner(myString);
while (scanner.hasNextLine()) {
  String line = scanner.nextLine();
  // process the line
}
scanner.close();

I think this is a slightly cleaner approach than both of the suggested ones.

Answer 1 (score: 124)

You can also use the split method of String:

String[] lines = myString.split(System.getProperty("line.separator"));

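This gives you all the lines in a handy array.

I don't know about the performance of split. It uses regular expressions.

Answer 2 (score: 37)

Since I was especially interested in the efficiency angle, I created a little test class (below). Results for 5,000,000 lines: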

Comparing line breaking performance of different solutions
Testing 5000000 lines
Split (all): 14665 ms
Split (CR only): 3752 ms
Scanner: 10005
Reader: 2060

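As usual, exact times may vary, but the ratio holds true however often I run it.

Conclusion: the "simpler" and "more efficient" requirements of the OP cannot both be satisfied at the same time. The split solution (in either incarnation) is simpler, but the Reader implementation beats the others hands down.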

import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

/**
 * Test class for splitting a string into lines at linebreaks
 */
public class LineBreakTest {
    /** Main method: pass in desired line count as first parameter (default = 10000). */
    public static void main(String[] args) {
        int lineCount = args.length == 0 ? 10000 : Integer.parseInt(args[0]);
        System.out.println("Comparing line breaking performance of different solutions");
        System.out.printf("Testing %d lines%n", lineCount);
        String text = createText(lineCount);
        testSplitAllPlatforms(text);
        testSplitWindowsOnly(text);
        testScanner(text);
        testReader(text);
    }

    private static void testSplitAllPlatforms(String text) {
        long start = System.currentTimeMillis();
        // Match Windows (CR LF), Unix (LF), and classic Mac (CR) line endings
        text.split("\r\n|\n|\r");
        System.out.printf("Split (regexp): %d%n", System.currentTimeMillis() - start);
    }

    private static void testSplitWindowsOnly(String text) {
        long start = System.currentTimeMillis();
        // Despite the method name and label, "\n" is a line feed (LF), the Unix line ending
        text.split("\n");
        System.out.printf("Split (CR only): %d%n", System.currentTimeMillis() - start);
    }

    private static void testScanner(String text) {
        long start = System.currentTimeMillis();
        List<String> result = new ArrayList<>();
        try (Scanner scanner = new Scanner(text)) {
            while (scanner.hasNextLine()) {
                result.add(scanner.nextLine());
            }
        }
        System.out.printf("Scanner: %d%n", System.currentTimeMillis() - start);
    }

    private static void testReader(String text) {
        long start = System.currentTimeMillis();
        List<String> result = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            String line = reader.readLine();
            while (line != null) {
                result.add(line);
                line = reader.readLine();
            }
        } catch (IOException exc) {
            // quit
        }
        System.out.printf("Reader: %d%n", System.currentTimeMillis() - start);
    }

    private static String createText(int lineCount) {
        StringBuilder result = new StringBuilder();
        StringBuilder lineBuilder = new StringBuilder();
        for (int i = 0; i < 20; i++) {
            lineBuilder.append("word ");
        }
        String line = lineBuilder.toString();
        for (int i = 0; i < lineCount; i++) {
            result.append(line);
            result.append("\n");
        }
        return result.toString();
    }
}

Answer 3 (score: 21)

Using Apache Commons IOUtils, you can do this nicely via:
List<String> lines = IOUtils.readLines(new StringReader(string));

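It doesn't do anything clever, but it's nice and compact. It will handle streams as well, and you can get a LineIterator too if you prefer.

Answer 4 (score: 14)

A solution using Java 8 features such as the Stream API and method references: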
new BufferedReader(new StringReader(myString))
        .lines().forEach(System.out::println);

public void someMethod(String myLongString) {

    new BufferedReader(new StringReader(myLongString))
            .lines().forEach(this::parseString);
}

private void parseString(String data) {
    //do something
}

Answer 5 (score: 7)

Since Java 11, there is a new method, String.lines:

/**
 * Returns a stream of lines extracted from this string,
 * separated by line terminators.
 * ...
 */
public Stream<String> lines() { ... }

Usage:

"line1\nline2\nlines3"
    .lines()
    .forEach(System.out::println);
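
If the lines are needed as a collection rather than printed, the stream can be collected. A small sketch, assuming Java 11 or later; the class name `StringLinesDemo` and the helper `toLines` are invented for this example:

```java
import java.util.List;
import java.util.stream.Collectors;

class StringLinesDemo {
    // Hypothetical helper: gathers the stream returned by String.lines() into a List
    static List<String> toLines(String text) {
        return text.lines().collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // The sample mixes \r\n, \r and \n terminators on purpose
        for (String line : toLines("one\r\ntwo\rthree\n")) {
            System.out.println(line);
        }
    }
}
```

String.lines() treats \n, \r and \r\n as line terminators, and a terminator at the very end of the string does not produce an extra empty line.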

Answer 6 (score: 6)

You can also use:

String[] lines = someString.split("\n");

If that doesn't work, try replacing \n with \r\n.
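
To see why the choice of separator matters, here is a short sketch (the class and method names are made up for illustration) comparing a split on "\n" alone with a split on an optional "\r" followed by "\n" when the input uses Windows line endings:

```java
class CrLfSplitDemo {
    // Splits on a bare line feed only; a preceding '\r' stays attached to the line
    static String[] splitOnLf(String text) {
        return text.split("\n");
    }

    // Splits on "\r\n" or "\n", so neither character ends up in the result
    static String[] splitOnCrLfOrLf(String text) {
        return text.split("\\r?\\n");
    }

    public static void main(String[] args) {
        String windowsText = "first\r\nsecond";
        System.out.println(splitOnLf(windowsText)[0].endsWith("\r"));       // true
        System.out.println(splitOnCrLfOrLf(windowsText)[0].endsWith("\r")); // false
    }
}
```

With "\n" alone, each line of a Windows-style string keeps a trailing carriage return, which is easy to miss when printing.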

Answer 7 (score: 6)

You can use the stream API with a StringReader wrapped in a BufferedReader, which gained a lines() method returning a stream in Java 8:

import java.util.stream.*;
import java.io.*;
class test {
    public static void main(String... a) {
        String s = "this is a \nmultiline\rstring\r\nusing different newline styles";

        new BufferedReader(new StringReader(s)).lines().forEach(
            (line) -> System.out.println("one line of the string: " + line)
        );
    }
}

This gives:

one line of the string: this is a
one line of the string: multiline
one line of the string: string
one line of the string: using different newline styles

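Just as with BufferedReader's readLine, the newline characters themselves are not included. All kinds of newline separators are supported (even within the same string).

Answer 8 (score: 2)

Or use the new try-with-resources clause combined with Scanner: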

try (Scanner scanner = new Scanner(value)) {
    while (scanner.hasNextLine()) {
        String line = scanner.nextLine();
        // process the line
    }
}

Answer 9 (score: 2)

You can try the following regular expression:

\r?\n

Code:

String input = "\nab\n\n    \n\ncd\nef\n\n\n\n\n";
String[] lines = input.split("\\r?\\n", -1);
int n = 1;
for(String line : lines) {
    System.out.printf("\tLine %02d \"%s\"%n", n++, line);
}

Output:

Line 01 ""
Line 02 "ab"
Line 03 ""
Line 04 "    "
Line 05 ""
Line 06 "cd"
Line 07 "ef"
Line 08 ""
Line 09 ""
Line 10 ""
Line 11 ""
Line 12 ""
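
The -1 limit is what keeps the trailing empty lines in the output above. A minimal sketch of the difference, with class and method names invented for this example:

```java
class SplitLimitDemo {
    // Default split: trailing empty strings are removed from the result
    static String[] splitDroppingTrailingEmpties(String text) {
        return text.split("\\r?\\n");
    }

    // A negative limit keeps every element, including trailing empty strings
    static String[] splitKeepingTrailingEmpties(String text) {
        return text.split("\\r?\\n", -1);
    }

    public static void main(String[] args) {
        String input = "a\n\nb\n\n";
        System.out.println(splitDroppingTrailingEmpties(input).length); // 3
        System.out.println(splitKeepingTrailingEmpties(input).length);  // 5
    }
}
```

Empty lines in the middle of the string are kept either way; only the trailing ones depend on the limit.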

Answer 10 (score: 1)

The easiest and most universal approach is to use the regex linebreak matcher \R, which matches any Unicode linebreak sequence:

import java.util.regex.Pattern;

Pattern NEWLINE = Pattern.compile("\\R");
String[] lines = NEWLINE.split(input);

See https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/util/regex/Pattern.html