Parse two files and generate employee data

Time: 2018-03-19 06:58:46

Tags: java sql file bufferedreader filereader

Suppose we have two log files of comma-separated values. `file1.txt` holds `employee id` and `employee name`; `file2.txt` holds the `projects` associated with each `employee id`. `file1` has unique entries, while `file2.txt` can contain many rows per employee. If a new employee has not been assigned any project, there is no entry for them in `file2`.

    File1.txt: (EmpId, EmpName)
    1,abc
    2,ac
    3,bc
    4,acc
    5,abb
    6,bbc
    7,aac
    8,aba
    9,aaa

    File2.txt: (EmpId, ProjectId)
    1,102
    2,102
    1,103
    3,101
    5,102
    1,103
    2,105
    2,200
    9,102

    Find the number of projects each employee has been assigned to.
    For new employees, if they don't have any projects, print 0;

    Output:
    1=3
    2=3
    3=1
    4=0
    5=1
    6=0
    7=0
    8=0
    9=1

I use a `BufferedReader` to read a line from `file1` and compare it with every line in `file2.txt`. Here is my code:

    public static void main(String[] args) throws IOException {
        // TODO Auto-generated method stub
        BufferedReader file1 = new BufferedReader(new FileReader("file1.txt"));
        BufferedReader file2 = new BufferedReader(new FileReader("file2.txt"));
        BufferedReader file3 = new BufferedReader(new FileReader("file2.txt"));
        HashMap<String,Integer> empProjCount = new HashMap<String, Integer>();
        int lines = 0;
        while (file2.readLine() != null) lines++;
        String line1 = file1.readLine();
        String[] line_1 = line1.split(",");
        String line2 = file3.readLine();
        String[] line_2 = line2.split(",");
        while (line1 != null && line2 != null) {
            int count = 0;
            for (int i = 1; i <= lines + 1 && line2 != null; i++) {
                if (line_1[0].equals(line_2[0])) {
                    count++;
                }
                line2 = file3.readLine();
                if (line2 != null) {
                    line_2 = line2.split(",");
                }
            }
            file3 = new BufferedReader(new FileReader("file2.txt"));
            empProjCount.put(line_1[0], count);
            line1 = file1.readLine();
            if (line1 != null) line_1 = line1.split(",");
            line2 = file3.readLine();
            if (line2 != null) line_2 = line2.split(",");
        }
        System.out.println(empProjCount);
    }

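My questions are:

  1. Is there a way to do better than O(n^2) without using any extra space?

  2. I use 3 `BufferedReader`s to read `file2.txt`, because once a line has been read the reader moves on to the next one. Is there another way to mark the current line?

  3. If we think of these files as tables, what is the best way to query for the result above?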

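3 Answers:

Answer 0: (score: 1)

1: Yes.

2: Yes:

I would do it in two iterations:

  1. Iterate over the IDs (file1) and initialise a map (empId, projectCounter).

  2. Iterate over the projects (file2) and, for each line, update (projectCounter++) the corresponding entry in the map.

  3. This way you get nearly linear execution time (in the sizes of file1 and file2).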
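The two iterations above can be sketched with plain `BufferedReader`s as follows. This is a minimal sketch, not the answerer's code: the class name `ProjectCounter` and the method `countProjects` are made up for illustration, and it assumes that rows in file2 whose ID does not appear in file1 should simply be ignored. Note that the map is extra space proportional to the number of employees.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.LinkedHashMap;
import java.util.Map;

class ProjectCounter {

    // Hypothetical helper: pass 1 seeds every employee ID with 0,
    // pass 2 increments the counter once per project row.
    static Map<String, Integer> countProjects(BufferedReader employees, BufferedReader projects)
            throws IOException {
        // LinkedHashMap keeps the employees in the order they appear in file1.
        Map<String, Integer> counts = new LinkedHashMap<>();
        String line;
        while ((line = employees.readLine()) != null) {
            if (line.trim().isEmpty()) continue;          // skip blank lines
            counts.put(line.split(",")[0].trim(), 0);
        }
        while ((line = projects.readLine()) != null) {
            if (line.trim().isEmpty()) continue;
            String empId = line.split(",")[0].trim();
            Integer current = counts.get(empId);
            // Assumption: an ID that is missing from file1 is ignored.
            if (current != null) {
                counts.put(empId, current + 1);
            }
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        // try-with-resources closes both readers even if reading fails.
        try (BufferedReader file1 = new BufferedReader(new FileReader("file1.txt"));
             BufferedReader file2 = new BufferedReader(new FileReader("file2.txt"))) {
            System.out.println(countProjects(file1, file2));
        }
    }
}
```

Each file is read exactly once, so a single reader per file is enough. With the sample data from the question this prints `{1=3, 2=3, 3=1, 4=0, 5=1, 6=0, 7=0, 8=0, 9=1}`.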

Answer 1: (score: 1)

Collect all the employee IDs from `file 1` into a `Map`, initialising each one with a project count of 0.

    // Build my map of all employees.
    Map<Integer, Integer> employeeProjectCount = Arrays.stream(file1)
            // Get empId - Split on comma, take the first field and convert to integer.
            .map(s -> Integer.valueOf(s.split(",")[0]))
            // Build a Map for the results.
            .collect(Collectors.toMap(
                    // Key is emp ID.
                    empId -> empId,
                    // Value starts at zero.
                    empId -> 0
            ));

Walk `file 2` and count the projects.

    // Walk the projects list.
    Arrays.stream(file2)
            // Get empId - Split on comma, take the first field and convert to integer (again).
            .map(s -> Integer.valueOf(s.split(",")[0]))
            // Count the projects.
            .forEach(empId -> employeeProjectCount.put(empId, employeeProjectCount.get(empId)+1));

Print it:

    // Print it.
    System.out.println(employeeProjectCount);

which gives

{1=3, 2=3, 3=1, 4=0, 5=1, 6=0, 7=0, 8=0, 9=1}

BTW: I used `String[]`s for the files:

    String[] file1 = {
            "1,abc",
            "2,ac",
            "3,bc",
            "4,acc",
            "5,abb",
            "6,bbc",
            "7,aac",
            "8,aba",
            "9,aaa",
    };
    String[] file2 = {
            "1,102",
            "2,102",
            "1,103",
            "3,101",
            "5,102",
            "1,103",
            "2,105",
            "2,200",
            "9,102",
    };
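For reference, the fragments of this answer can be assembled into one runnable class. The following is a minimal sketch rather than the answerer's exact code: the class name `EmployeeProjectCount` and the method `count` are made up for illustration, a `TreeMap` is used so the printed order does not depend on hashing, and `computeIfPresent` is swapped in for `get(empId)+1` on the assumption that a project row whose ID is missing from file 1 should be skipped instead of throwing a `NullPointerException`.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

class EmployeeProjectCount {

    // Hypothetical helper combining both steps of the answer.
    static Map<Integer, Integer> count(String[] file1, String[] file2) {
        // Step 1: every employee ID from file1 starts with a count of zero.
        Map<Integer, Integer> counts = Arrays.stream(file1)
                .map(s -> Integer.valueOf(s.split(",")[0]))
                .collect(Collectors.toMap(
                        empId -> empId,
                        empId -> 0,
                        (a, b) -> a,      // tolerate duplicate IDs in file1
                        TreeMap::new));   // sorted by employee ID
        // Step 2: one increment per project row; unknown IDs are skipped (assumption).
        Arrays.stream(file2)
                .map(s -> Integer.valueOf(s.split(",")[0]))
                .forEach(empId -> counts.computeIfPresent(empId, (id, n) -> n + 1));
        return counts;
    }

    public static void main(String[] args) {
        String[] file1 = {"1,abc", "2,ac", "3,bc", "4,acc"};
        String[] file2 = {"1,102", "2,102", "1,103", "3,101"};
        System.out.println(count(file1, file2));
    }
}
```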

Answer 2: (score: 1)

Using `Files.lines` and regular expressions:

    // Collect the IDs of all employees listed in file1.txt.
    Pattern employeePattern = Pattern.compile("(?<id>\\d+),(?<name>\\w+)");
    Set<String> employees = Files.lines(Paths.get("file1.txt"))
        .map(employeePattern::matcher).filter(Matcher::matches)
        .map(m -> m.group("id")).collect(Collectors.toSet());

    // Count the rows of file2.txt per employee ID.
    Pattern projectPattern = Pattern.compile("(?<emp>\\d+),(?<proj>\\d+)");
    Map<String,Long> projects = Files.lines(Paths.get("file2.txt"))
        .map(projectPattern::matcher).filter(Matcher::matches)
        .collect(Collectors.groupingBy(m -> m.group("emp"), Collectors.counting()));

Print the result:

    employees.stream()
        .map(emp -> emp + "=" + projects.getOrDefault(emp, 0L))
        .forEach(System.out::println);
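One caveat with `Files.lines` is that the returned stream keeps the file open until it is closed, and a `Set` built with `Collectors.toSet()` has no guaranteed order. A minimal sketch of a variant that addresses both, under some assumptions: the class name `ProjectsPerEmployee` and its methods are hypothetical, the regular expressions are replaced by a plain `split(",")` for brevity, and the ID lists are materialised in memory.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class ProjectsPerEmployee {

    // Reads the first comma-separated field of every non-blank line.
    static List<String> ids(Path file) throws IOException {
        // try-with-resources closes the underlying file handle.
        try (Stream<String> lines = Files.lines(file)) {
            return lines.filter(l -> !l.trim().isEmpty())
                    .map(l -> l.split(",")[0].trim())
                    .collect(Collectors.toList());
        }
    }

    static Map<String, Long> count(Path employeesFile, Path projectsFile) throws IOException {
        // Number of rows in the projects file per employee ID.
        Map<String, Long> perEmployee = ids(projectsFile).stream()
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
        // LinkedHashMap preserves the order of the employees file.
        Map<String, Long> result = new LinkedHashMap<>();
        for (String id : ids(employeesFile)) {
            result.put(id, perEmployee.getOrDefault(id, 0L));
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        count(Paths.get("file1.txt"), Paths.get("file2.txt"))
                .forEach((emp, n) -> System.out.println(emp + "=" + n));
    }
}
```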