Question

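I'm new to Java and I have to parse a .csv file. Each line of the file contains a student's ID, the ID of a subject they passed, and the grade they received in that subject. For example: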

Student ID,Subject ID,Grade
1,A1-102,7
1,A1-103,6
1,A1-104,5
1,A1-108,9
2,A1-101,5
2,A1-105,7

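I need to count the number of courses each student passed, in a way similar to SQL's GROUP BY, e.g.: SELECT count(*) FROM STUDENTS GROUP BY Student_ID; Assuming the csv file is open and can be read, is there a way to group the multiple entries belonging to one student?

My code: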

String csvFile = "C:\\Myfile.csv";
BufferedReader br = null;
String line;

try {
    br = new BufferedReader(new FileReader(csvFile));
    while ((line = br.readLine()) != null) {
        // what do I need to do here?
    }
} catch (FileNotFoundException e) {
    System.out.println("File not found\n");
} catch (IOException e) {
    System.out.println("An I/O exception has occurred\n");
} finally {
    if (br != null) {
        try {
            br.close();
        } catch (IOException e) {
            System.out.println("File is already closed");
        }
    }
}

Any ideas?

Edit: all students in the file have passed the corresponding subjects.

Answer 1

You can do this easily with Java 8, like so:

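import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Map;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// The statements below belong inside a method body.
Pattern comma = Pattern.compile(",");
try (Stream<String> stream = Files.lines(Paths.get("C:\\data\\sample.txt"))) {
    Map<Integer, Long> numberOfLessonsPassed = stream.skip(1).map(l -> comma.split(l))
            .map(s -> new Student(Integer.valueOf(s[0]), s[1], Integer.valueOf(s[2])))
            .filter(s -> s.getGrade() >= 5)
            .collect(Collectors.groupingBy(Student::getId, Collectors.counting()));
    System.out.println(numberOfLessonsPassed);
} catch (IOException e) {
    e.printStackTrace();
}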

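First the file is read and the header line is skipped. Each remaining line is then split on the comma pattern, and the resulting fields are mapped to a Student object. Entries with grade < 5 are filtered out. Finally, the remaining entries are grouped by student Id and the entries in each group are counted, which gives the number of subjects each student passed.

The Student model class should look something like this.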

public class Student {
    private final int id;
    private final String subjectId;
    private final int grade;

    public Student(int id, String subjectId, int grade) {
        super();
        this.id = id;
        this.subjectId = subjectId;
        this.grade = grade;
    }

    public int getId() {
        return id;
    }

    public String getSubjectId() {
        return subjectId;
    }

    public int getGrade() {
        return grade;
    }

}

I used a .txt file here; I assume you can adapt this to your .csv file.
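Since the question's edit says every row in the file is already a passed subject, the grade filter and the Student class may not be needed. The following is a minimal sketch under that assumption, grouping directly on the first column. The class name GroupByStudent and the method countByStudent are hypothetical names chosen for illustration, the rows are the sample data from the question, and the code assumes no field contains an embedded comma.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;
import java.util.stream.Collectors;

public class GroupByStudent {

    // Counts rows per student ID (the first CSV column).
    // Assumes the first line is a header and fields contain no embedded commas.
    static Map<String, Long> countByStudent(List<String> lines) {
        return lines.stream()
                .skip(1)                               // skip the header row
                .filter(l -> !l.trim().isEmpty())      // ignore blank lines
                .map(l -> l.split(",", 2)[0].trim())   // keep only the student ID
                .collect(Collectors.groupingBy(
                        Function.identity(),           // group by the ID itself
                        TreeMap::new,                  // sorted keys for stable output
                        Collectors.counting()));       // count the rows in each group
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "Student ID,Subject ID,Grade",
                "1,A1-102,7",
                "1,A1-103,6",
                "1,A1-104,5",
                "1,A1-108,9",
                "2,A1-101,5",
                "2,A1-105,7");
        System.out.println(countByStudent(lines)); // prints {1=4, 2=2}
    }
}
```

To run this against a real file, the list could be obtained with Files.readAllLines(Paths.get(...)) instead of the hard-coded sample.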

Answer 2

Here is a more verbose solution:

package com.company;

import java.io.BufferedReader;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;

public class Main {

    static String csvFile = "your path";

    public static void main(String[] args) {
        ArrayList<String> result = new ArrayList<>();
        // The upper bound of x is the highest student ID you wish to view.
        for (int x = 1; x <= 3; x++) {
            String studentIdNeeded = Integer.toString(x);
            // try-with-resources closes the reader automatically.
            try (BufferedReader br = new BufferedReader(new FileReader(csvFile))) {
                String line;
                while ((line = br.readLine()) != null) {
                    // Match the whole ID field, so multi-digit IDs also work.
                    if (line.startsWith(studentIdNeeded + ",")) {
                        result.add(line);
                    }
                }
            } catch (FileNotFoundException e) {
                System.out.println("File not found\n");
            } catch (IOException e) {
                System.out.println("An I/O exception has occurred\n");
            }
        }
        // Print once, after the rows of all requested students are collected.
        System.out.println(result);
    }

}

This produces the following result:

[1,A1-102,7, 1,A1-103,6, 1,A1-104,5, 1,A1-108,9, 2,A1-101,5, 3,A1-105,7, 3,A1-101,5]

I added a few extras, such as a third student ID for testing.

To change the number of students selected, change the upper bound of x in the for loop.
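Re-reading the file once per student ID requires knowing the IDs in advance. One alternative, offered here as a sketch and not as part of the answer above, is to read the file a single time and keep a running count per ID in a map, which is what the GROUP BY query in the question computes. The names SinglePassCount and countPassed are hypothetical; the method accepts any Reader, so a FileReader for the real file can be passed in place of the in-memory sample.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;

public class SinglePassCount {

    // Reads the CSV once and counts the rows for each student ID (first column).
    // Assumes the first line is a header and fields contain no embedded commas.
    static Map<String, Integer> countPassed(Reader source) throws IOException {
        Map<String, Integer> counts = new LinkedHashMap<>(); // keeps first-seen order
        try (BufferedReader br = new BufferedReader(source)) {
            String line = br.readLine(); // header row, discarded
            while ((line = br.readLine()) != null) {
                if (line.trim().isEmpty()) {
                    continue; // skip blank lines
                }
                String studentId = line.split(",", 2)[0].trim();
                counts.merge(studentId, 1, Integer::sum); // start at 1 or add 1
            }
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        String sample = "Student ID,Subject ID,Grade\n"
                + "1,A1-102,7\n1,A1-103,6\n1,A1-104,5\n1,A1-108,9\n"
                + "2,A1-101,5\n2,A1-105,7\n";
        System.out.println(countPassed(new StringReader(sample))); // prints {1=4, 2=2}
    }
}
```

With this approach the loop bound on x is no longer needed, because every ID that appears in the file gets its own map entry.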

Answer 3

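For organizing the data, a single ArrayList is not the best solution. I have extended the previous solution to introduce a HashMap that stores one ArrayList per student, keyed by student ID. Some things stay the same, for example the for loop still needs the exact number of students.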

// Replaces the body of main in the previous solution; also needs import java.util.HashMap.
// Master HashMap: maps each student ID to its own ArrayList of entries.
HashMap<String, ArrayList<String>> master = new HashMap<>();

// x <= 3 is for demonstration purposes; replace 3 with the
// actual number of students.
for (int x = 1; x <= 3; x++) {
    String studentIdNeeded = Integer.toString(x);
    ArrayList<String> result = new ArrayList<>();

    // try-with-resources closes the reader automatically.
    try (BufferedReader br = new BufferedReader(new FileReader(csvFile))) {
        String line;
        while ((line = br.readLine()) != null) {
            // Match the whole ID field, then keep everything after "<id>,".
            if (line.startsWith(studentIdNeeded + ",")) {
                result.add(line.substring(studentIdNeeded.length() + 1));
            }
        }
        master.put(studentIdNeeded, result);
    } catch (FileNotFoundException e) {
        System.out.println("File not found\n");
    } catch (IOException e) {
        System.out.println("An I/O exception has occurred\n");
    }
}

System.out.println("Hash Size:" + master.size());
System.out.println("Hash Contents" + master);

This code block prints these two strings:

Hash Size:3
Hash Contents{1=[A1-102,7, A1-103,6, A1-104,5, A1-108,9], 2=[A1-101,5], 3=[A1-105,7, A1-101,5]}

This solution should scale to larger data sets by keeping many ArrayLists in the HashMap.
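The same map of lists can also be built in one pass over the file, without knowing the number of students beforehand. The sketch below is a variation on the idea above, not the answer's own code: GroupEntries and groupByStudent are hypothetical names, the input is the sample data from the question held in memory, and rows without a comma are simply skipped.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class GroupEntries {

    // Groups "subject,grade" entries under their student ID in a single pass.
    // Assumes the first line is a header; rows without a comma are skipped.
    static Map<String, List<String>> groupByStudent(List<String> lines) {
        Map<String, List<String>> master = new LinkedHashMap<>(); // first-seen order
        for (String line : lines.subList(Math.min(1, lines.size()), lines.size())) {
            String[] parts = line.split(",", 2); // [student ID, rest of the row]
            if (parts.length < 2) {
                continue;
            }
            // Create the student's list on first sight, then append the entry.
            master.computeIfAbsent(parts[0].trim(), id -> new ArrayList<>()).add(parts[1]);
        }
        return master;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "Student ID,Subject ID,Grade",
                "1,A1-102,7",
                "1,A1-103,6",
                "1,A1-104,5",
                "1,A1-108,9",
                "2,A1-101,5",
                "2,A1-105,7");
        Map<String, List<String>> master = groupByStudent(lines);
        System.out.println("Hash Size:" + master.size());
        System.out.println("Hash Contents" + master);
        // Number of passed subjects per student is the size of each list.
        master.forEach((id, entries) -> System.out.println(id + " -> " + entries.size()));
    }
}
```

The size of each list is the per-student count the question asks for, so this structure covers both the grouping and the counting.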

Grouping rows in a csv file in Java
