Question

I have an input file formatted like this:

Ontario:Brampton:43° 41' N:79° 45' W
Ontario:Toronto:43° 39' N:79° 23' W
Quebec:Montreal:45° 30' N:73° 31' W
...

I have a class that holds these values. For example:
Province: Ontario City: Brampton LatDegrees: 43 LatMinutes: 41 LatDirection: N LongDegrees: 79 .... etc.

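I have already written a method that parses this correctly, but I am trying to learn whether it can be done better with Streams and lambdas in Java 8.

If I start with the following: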

Files.lines(Paths.get(inputFile))
                
                .map(line -> line.split("\\b+")) //this delimits everything
                //.filter(x -> x.startsWith(":"))
                .flatMap(Arrays::stream)
                .forEach(System.out::println);

Could someone please help me replicate the following?

private void parseLine(String data) {
        int counter1 = 1;                       //1-2 province or city
        int counter2 = 1;                       //1-2 LatitudeDirection,LongitudeDirection
        int counter3 = 1;                       //1-4 LatitudeDegrees,LatitudeMinutes,LongitudeDegrees,LongitudeMinutes

        City city = new City();                 //create City object
        //String read = Arrays.toString(data);    //convert array element to String
        String[] splited = data.split(":");     //set delimiter
        
        for (String part : splited) {
            //System.out.println(part);
            char firstChar = part.charAt(0);    
            if(Character.isDigit(firstChar)){           //if the first char is a digit, then this part needs to be split again 
                String[] splited2 = part.split(" ");    //split second time with space delimiter
                for (String part2: splited2){
                    firstChar = part2.charAt(0);
                    if (Character.isDigit(firstChar)){                              //if the first char is a digit, then needs trimming
                        String parseDigits = part2.substring(0, part2.length()-1);  //trim trailing degrees or radians character
                        switch(counter2++){
                            case 1:
                                city.setLatitudeDegrees(Integer.parseInt(parseDigits));
                                //System.out.println("LatitudeDegrees: " + city.getLatitudeDegrees());
                                break;
                            case 2:
                                city.setLatitudeMinutes(Integer.parseInt(parseDigits));
                                //System.out.println("LatitudeMinutes: " + city.getLatitudeMinutes());
                                break;
                            case 3:
                                city.setLongitudeDegrees(Integer.parseInt(parseDigits));
                                //System.out.println("LongitudeDegrees: " + city.getLongitudeDegrees());
                                break;
                            case 4:
                                city.setLongitudeMinutes(Integer.parseInt(parseDigits));
                                //System.out.println("LongitudeMinutes: " + city.getLongitudeMinutes());
                                counter2 = 1;                       //reset counter2
                                break;
                        }
                    }else{
                        if(counter3 == 1){
                            city.setLatitudeDirection(part2.charAt(0));
                            //System.out.println("LatitudeDirection: " + city.getLatitudeDirection());
                            counter3++;                     //increment counter3 to use longitude next
                        }else{
                            city.setLongitudeDirection(part2.charAt(0));
                            //System.out.println("LongitudeDirection: " + city.getLongitudeDirection());
                            counter3 = 1;                   //reset counter 3
                            //System.out.println("Number of cities: " + cities.size());
                            cities.add(city);
                        }    
                    }
                }
            }else{
                if(counter1 == 1){
                    city.setProvince(part);
                    //System.out.println("\nProvince: " + city.getProvince());
                    counter1++;
                }else if(counter1 == 2){
                    city.setCity(part);
                    //System.out.println("City: " + city.getCity());
                    counter1 = 1;                       //reset counter1
                }
            }
        }
    }

No doubt there is a better solution than my parseLine() method, but I would really like to condense it as described above. Thanks!!

Answer 1

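Let's start with some general notes.

Your sequence .map(line -> line.split("\\b+")).flatMap(Arrays::stream) is not recommended. These two steps first create an array and then create another stream wrapping that array. You can skip the array step by using splitAsStream, although this requires you to deal with Pattern explicitly instead of having it hidden inside String.split: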

.flatMap(Pattern.compile("\\b+")::splitAsStream)

But note that in this case, splitting into words doesn't really pay off.
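As a minimal sketch of the splitAsStream variant, the following splits a few in-memory lines on a colon instead of reading a file. The class name SplitAsStreamDemo, the helper tokens, the colon delimiter, and the sample lines are all made up for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical demo class; not part of the original question or answer.
class SplitAsStreamDemo {
    // Compiled once and reused for every line (assumed delimiter: a colon).
    static final Pattern COLON = Pattern.compile(":");

    // Splits each line into tokens without building an intermediate array.
    static List<String> tokens(List<String> lines) {
        return lines.stream()
                    .flatMap(COLON::splitAsStream)
                    .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Made-up sample input.
        List<String> lines = Arrays.asList("Ontario:Brampton", "Quebec:Montreal");
        System.out.println(tokens(lines)); // prints [Ontario, Brampton, Quebec, Montreal]
    }
}
```

The result is one flat list of tokens with no record boundaries, which is why this kind of splitting is a poor fit for building one object per line.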

If you want to keep your original parseLine method, you can simply do

Files.lines(Paths.get(inputFile))
     .forEach(this::parseLine);

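and you're done.

But seriously, that's not a real solution. For pattern matching, you should use a library designed for pattern matching, e.g. the regex package (java.util.regex). You are already using it when you split via split("\\b+"), but that falls far short of what it can do for you.

Let's define the pattern: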

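(…) forms a group that captures the matching part, so that we can extract it for our result
[^:]* specifies a token of arbitrary length (*) consisting of any characters except the colon ([^:])
\d+ defines a number (d = digit, + = one or more)
[NS] and [WE] match a single character that is either N or S, or either W or E, respectively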
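The building blocks listed above can be tried out in isolation. The following is only a sketch: the class PatternPiecesDemo and its helper firstGroup are hypothetical names, the inputs are made up, and the degree sign is written as the escape \u00B0 so the source compiles regardless of file encoding.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class for trying out the individual pattern elements.
class PatternPiecesDemo {
    // Returns the text captured by group 1 if the whole input matches, else null.
    static String firstGroup(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // [^:]* stops at the first colon, so the group holds the province only.
        System.out.println(firstGroup("([^:]*):.*", "Ontario:Brampton")); // prints Ontario
        // \d+ captures the digits in front of the degree sign.
        System.out.println(firstGroup("(\\d+)\u00B0", "43\u00B0"));       // prints 43
        // A character class matches exactly one of the listed characters.
        System.out.println(firstGroup("([NS])", "N"));                    // prints N
        System.out.println(firstGroup("([WE])", "X"));                    // prints null
    }
}
```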

So the entire pattern you are looking for is

([^:]*):([^:]*):(\d+)° (\d+)' ([NS]):(\d+)° (\d+)' ([WE])

and the entire parsing routine will be:

static Pattern CITY_PATTERN=Pattern.compile(
    "([^:]*):([^:]*):(\\d+)° (\\d+)' ([NS]):(\\d+)° (\\d+)' ([WE])");

static City parseCity(String line) {
    Matcher matcher = CITY_PATTERN.matcher(line);
    if(!matcher.matches())
        throw new IllegalArgumentException(line+" doesn't match "+CITY_PATTERN);
    City city=new City();
    city.setProvince(matcher.group(1));
    city.setCity(matcher.group(2));
    city.setLatitudeDegrees(Integer.parseInt(matcher.group(3)));
    city.setLatitudeMinutes(Integer.parseInt(matcher.group(4)));
    city.setLatitudeDirection(line.charAt(matcher.start(5)));
    city.setLongitudeDegrees(Integer.parseInt(matcher.group(6)));
    city.setLongitudeMinutes(Integer.parseInt(matcher.group(7)));
    city.setLongitudeDirection(line.charAt(matcher.start(8)));
    return city;
}
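To see what the matcher delivers without needing the City class, here is a small sketch that collects all eight groups for one line. The class CityPatternDemo and the method groups are hypothetical. The sample line assumes the input contains the spaces that the pattern expects after the degree and minute signs, and \u00B0 is the escape for the degree sign.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; it reuses the pattern from the answer unchanged.
class CityPatternDemo {
    static final Pattern CITY_PATTERN = Pattern.compile(
        "([^:]*):([^:]*):(\\d+)\u00B0 (\\d+)' ([NS]):(\\d+)\u00B0 (\\d+)' ([WE])");

    // Returns the captured groups 1..8 in order, or throws if the line is malformed.
    static List<String> groups(String line) {
        Matcher matcher = CITY_PATTERN.matcher(line);
        if (!matcher.matches())
            throw new IllegalArgumentException(line + " doesn't match " + CITY_PATTERN);
        List<String> result = new ArrayList<>();
        for (int i = 1; i <= matcher.groupCount(); i++)
            result.add(matcher.group(i));
        return result;
    }

    public static void main(String[] args) {
        // Assumed sample line, based on the format described in the question.
        System.out.println(groups("Ontario:Brampton:43\u00B0 41' N:79\u00B0 45' W"));
        // prints [Ontario, Brampton, 43, 41, N, 79, 45, W]
    }
}
```

A line that deviates from the format, for example one without the spaces, fails with an IllegalArgumentException instead of producing a half-filled object.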

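And I really hope you will never call your hard-to-read method "condensed" again...

Using the parseCity routine, a clean Stream-based processing solution would look like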

List<City> cities = Files.lines(Paths.get(inputFile))
    .map(ContainingClass::parseCity).collect(Collectors.toList());

which collects the file into a new list of cities.
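One detail the one-liner leaves open: the Files.lines documentation asks that the returned stream be closed, typically with try-with-resources, so that the file is released promptly. A possible sketch of that, with the hypothetical names ReadLinesDemo and readAll and a stand-in parser in place of parseCity:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper class, not part of the original answer.
class ReadLinesDemo {
    // Reads every line of the file, converts it with the given parser and
    // closes the underlying file once the stream has been consumed.
    static <T> List<T> readAll(Path file, Function<String, T> parser) throws IOException {
        try (Stream<String> lines = Files.lines(file)) {
            return lines.map(parser).collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // Made-up sample data written to a temporary file.
        Path tmp = Files.createTempFile("cities", ".txt");
        Files.write(tmp, Arrays.asList("Ontario:Toronto", "Quebec:Montreal"));
        try {
            // Stand-in parser that only extracts the city name; a method
            // reference such as ContainingClass::parseCity would go here instead.
            List<String> names = readAll(tmp, line -> line.split(":")[1]);
            System.out.println(names); // prints [Toronto, Montreal]
        } finally {
            Files.delete(tmp);
        }
    }
}
```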
