Question

I am reading a CSV file with Java 8 streams. What is the best way to get the values of a specific column in the CSV file? For example:

firstName,lastName,age,
tom,abraham,18,
liz,abraham,15,
tonny,paul,25

I want to extract the second column, so the result set would be abraham;paul. How can this be done with Java 8 lambdas and Streams?

Answer 1

Welcome to StackOverflow :)

A CSV file can be read like any other text file: no parser is needed, and String::split and Files::readAllLines are enough:

Set<String> names = Files.readAllLines(Paths.get("file.csv")) // Read all lines of file
                         .stream()                            // Stream them
                         .skip(1)                             // Omit the column names (if any)
                         .map(s -> s.split(";")[1])           // Split by ; and get the 2nd column
                         .collect(Collectors.toSet());        // Collect Strings to Set
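
For anyone who wants to run it, here is a self-contained sketch of the same pipeline with its imports. It assumes the comma-separated sample from the question, so "," is passed as the delimiter instead of ";". The class name `SecondColumn`, the helper `secondColumn`, and the temporary file are made up for this example.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

final class SecondColumn {

    // Hypothetical helper: distinct values of the second column, header row skipped.
    // Assumes every data row contains at least two fields separated by `delimiter`.
    static Set<String> secondColumn(List<String> lines, String delimiter) {
        return lines.stream()
                .skip(1)                                              // omit the column names
                .map(line -> line.split(Pattern.quote(delimiter))[1]) // split literally, take the 2nd field
                .collect(Collectors.toSet());                         // a Set drops duplicates
    }

    public static void main(String[] args) throws IOException {
        // Write the sample from the question to a temporary file, then read it back.
        Path csv = Files.createTempFile("sample", ".csv");
        try {
            Files.write(csv, Arrays.asList(
                    "firstName,lastName,age,",
                    "tom,abraham,18,",
                    "liz,abraham,15,",
                    "tonny,paul,25"));
            System.out.println(secondColumn(Files.readAllLines(csv), ","));
        } finally {
            Files.deleteIfExists(csv);
        }
    }
}
```

With the sample data the set contains abraham and paul. A Set has no guaranteed iteration order, so the printed order may vary.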

I'm not sure about the tags. If you already have a parsed List<List<String>>, the simplest way to get the same result is:

Set<String> names = parsedList.stream()
                              .map(row -> row.get(1))        // Get the second column
                              .collect(Collectors.toSet());  // Collect to Set<String>

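Note the following:

The distinct() method is not needed, because collecting to a Set guarantees distinct elements by definition. If you insist on collecting to a List<String>, replace the last line with: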
```
 ....
 .distinct()
 .collect(Collectors.toList());
```
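This works for both of the solutions I suggested.
If the CSV pattern is irregular or the ; character is escaped, the following lines may throw an exception:
- map(s -> s.split(";")[1])
- map(row -> row.get(1))
In that case you need to use a parser.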
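
Short of a real parser, the exception on short or blank rows can at least be avoided. The following is a minimal sketch under stated assumptions: `SafeColumn` and `column` are hypothetical names, `index` is assumed to be non-negative, and quoted or escaped delimiters are still not handled, so it does not replace a CSV parser.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

final class SafeColumn {

    // Hypothetical helper: distinct, non-empty values of one column, in file order.
    // Rows that have too few fields are skipped instead of throwing.
    static List<String> column(List<String> lines, String delimiter, int index) {
        String regex = Pattern.quote(delimiter);          // treat the delimiter literally
        return lines.stream()
                .skip(1)                                  // omit the header row
                .map(line -> line.split(regex, -1))       // limit -1 keeps trailing empty fields
                .filter(fields -> fields.length > index)  // drop rows without that column
                .map(fields -> fields[index].trim())
                .filter(value -> !value.isEmpty())
                .distinct()                               // keeps the first occurrence of each value
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "firstName;lastName;age",
                "tom;abraham;18",
                "",                 // blank line
                "broken-row",       // no delimiter at all
                "tonny;paul;25");
        System.out.println(column(lines, ";", 1)); // prints [abraham, paul]
    }
}
```

Collecting to a List after distinct() keeps the values in the order they first appear in the file, which a Set does not promise.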

Answer 2

Files.readAllLines(Paths.get("my.csv")).stream()
        .skip(1)
        .map(s -> s.split(";")[1])
        .distinct()
        .collect(Collectors.toList()).forEach(System.out::println);

This is the simplest way, but in this case I would rather use a regular expression and Matcher.
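
The answer does not show the regex variant, so the following is only one guess at what it could look like. The pattern, the class name `RegexColumn`, and the helper `secondColumn` are assumptions made for this sketch. It expects ';' as the delimiter and no quoted fields.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

final class RegexColumn {

    // First field, one ';', then capture the second field up to the next ';' or end of line.
    private static final Pattern SECOND_FIELD = Pattern.compile("^[^;]*;([^;]*)");

    // Hypothetical helper: distinct values of the second column, header row skipped.
    static List<String> secondColumn(List<String> lines) {
        return lines.stream()
                .skip(1)                      // omit the header row
                .map(SECOND_FIELD::matcher)   // one Matcher per line
                .filter(Matcher::find)        // lines without a second field are skipped
                .map(m -> m.group(1))         // the captured second field
                .distinct()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "firstName;lastName;age",
                "tom;abraham;18",
                "liz;abraham;15",
                "tonny;paul;25");
        System.out.println(secondColumn(lines)); // prints [abraham, paul]
    }
}
```

Unlike split(";")[1], a line with no ';' simply fails to match and is dropped, so no ArrayIndexOutOfBoundsException is thrown.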

Answer 3

You can use the stream's map operation.

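try (Stream<String> lines = Files.lines(Paths.get("file path"))) {  // Files.lines keeps the file open, so close it
    List<String> names = lines
            .map(row -> row.split(";"))    // first map: transform each line into an array
            .map(row -> row[1])            // second map: take index 1, i.e. the second column
            .distinct()                    // distinct: remove duplicate elements
            .collect(Collectors.toList());
}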
