Calculating the average from a CSV file

Time: 2015-04-22 15:50:51

Tags: java csv average

I have a CSV file in the following format:

City,Job,Salary
Delhi,Doctors,500
Delhi,Lawyers,400
Delhi,Plumbers,100
London,Doctors,800
London,Lawyers,700
London,Plumbers,300
Tokyo,Doctors,900
Tokyo,Lawyers,800
Tokyo,Plumbers,400
Lawyers,Doctors,300
Lawyers,Lawyers,400
Lawyers,Plumbers,500
Hong Kong,Doctors,1800
Hong Kong,Lawyers,1100
Hong Kong,Plumbers,1000
Moscow,Doctors,300
Moscow,Lawyers,200
Moscow,Plumbers,100
Berlin,Doctors,800
Berlin,Plumbers,900
Paris,Doctors,900
Paris,Lawyers,800
Paris,Plumbers,500
Paris,Dog catchers,400

I want to find the average of all the salaries.

This is my code:

`import java.io.*;

public class A {

public static void main(String args[])
{
A a= new A();
a.run();
}

public void run()
{
String csv="C:\\Users\\Dipayan\\Desktop\\salaries.csv";
        BufferedReader br = null;
String line = "";
int sum=0;
int count=0;
//String a=new String();


        try {

            br = new BufferedReader(new FileReader(csv));
            try {
                while ((line = br.readLine()) != null) {

                        // use comma as separator
                    String[] country = line.split(",");
                    int sal=Integer.parseInt(country[2]);
                    sum=sum+sal;
                         count++;
                //System.out.println("Salary [job= " + country[0] 
                                  //        + " , salary=" + country[2] + "]");

                }
            } catch (NumberFormatException | IOException e) {
                System.out.println("NA");
                e.printStackTrace();
            }


        } catch (FileNotFoundException e) {
            e.printStackTrace();
        }  
        System.out.println(sum/count);


        System.out.println("Done");
      }

    }` 

However, it shows this error:

java.lang.NumberFormatException: For input string: "Salary"
    at java.lang.NumberFormatException.forInputString(Unknown Source)
    at java.lang.Integer.parseInt(Unknown Source)
    at java.lang.Integer.parseInt(Unknown Source)
    at A.run(A.java:30)
    at A.main(A.java:9)
Exception in thread "main" java.lang.ArithmeticException: / by zero
    at A.run(A.java:46)
    at A.main(A.java:9)

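Is there better or shorter code for parsing a CSV file?

4 Answers:

Answer 0: (score: 1)

The first line contains the word "Salary" in the third position. Put a br.readLine() before the loop and everything should be fine.

You have: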

br = new BufferedReader(new FileReader(csv));
try {
  while ((line = br.readLine()) != null) {

Change it to:

br = new BufferedReader(new FileReader(csv));
try {
  br.readLine(); // skip the header line
  while ((line = br.readLine()) != null) {
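
To see the header skip in isolation, here is a minimal, self-contained sketch. It assumes the same City,Job,Salary layout as the question; the class name HeaderSkipSketch, the method average, and the in-memory sample rows are illustrative assumptions, not part of the original code. It also guards the empty case, because the `/ by zero` in the question happened when no row had been counted before the division.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

class HeaderSkipSketch {

    // Hypothetical helper: averages the third column, ignoring the header line.
    static double average(Reader source) {
        try (BufferedReader br = new BufferedReader(source)) {
            long sum = 0;
            int count = 0;
            br.readLine(); // consume the header line ("City,Job,Salary")
            String line;
            while ((line = br.readLine()) != null) {
                String[] fields = line.split(",");
                sum += Integer.parseInt(fields[2].trim());
                count++;
            }
            // Avoid "/ by zero" when there are no data rows, and divide as
            // double so the result is not truncated to an int.
            return count == 0 ? 0.0 : (double) sum / count;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Sample rows stand in for the real file (assumption for the demo).
        String sample = "City,Job,Salary\n"
                + "Delhi,Doctors,500\n"
                + "Delhi,Lawyers,400\n"
                + "Delhi,Plumbers,100\n";
        System.out.println(average(new StringReader(sample)));
    }
}
```

For the real file, a `new FileReader(csv)` could be passed in place of the StringReader.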

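Answer 1: (score: 1)

Skip the first line of the CSV file. Do an additional

br.readLine();

before the while loop.

You might also want to add some format checks to make sure the file you are reading is in the correct format.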
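
One way such a format check might look is sketched below. The helper name isValidRow and the exact rules (exactly three comma-separated fields, non-empty city and job, a salary made only of digits) are assumptions chosen for illustration; adjust them to whatever the file is actually supposed to contain.

```java
import java.util.regex.Pattern;

class RowFormatCheck {

    // Assumed rule: a salary is one to nine digits, so it always fits in an int.
    private static final Pattern SALARY = Pattern.compile("\\d{1,9}");

    // Hypothetical helper: true if the line looks like "City,Job,<number>".
    static boolean isValidRow(String line) {
        if (line == null) {
            return false;
        }
        // A limit of -1 keeps trailing empty fields, so "a,b," yields three fields.
        String[] fields = line.split(",", -1);
        return fields.length == 3
                && !fields[0].trim().isEmpty()
                && !fields[1].trim().isEmpty()
                && SALARY.matcher(fields[2].trim()).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValidRow("Delhi,Doctors,500")); // true
        System.out.println(isValidRow("City,Job,Salary"));   // false: salary is not numeric
        System.out.println(isValidRow("Berlin,Plumbers"));   // false: only two fields
    }
}
```

Because the header row fails the numeric check, a loop that only parses rows accepted by this check would also skip the header.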

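Answer 2: (score: 0)

A br.readLine() before the while loop will avoid the header line problem, but if your data is malformed you will get the same exception again. To make the method safer, you can change this line: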

int sal=Integer.parseInt(country[2]);

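to use a try-catch block, so the whole file is iterated even if a value is not a valid number:

int sal;
try {
    sal = Integer.parseInt(country[2]);
} catch (NumberFormatException e) {
    // if you want, show an error message here to tell the user
    // that the value is not a valid number
    continue; // skip this row so sal is never used uninitialized
}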

Answer 3: (score: 0)

First, use a CSV parser. I will use OpenCSV in this example. I have no affiliation with OpenCSV; it is just what I currently have in my POM.

First, create a class:

public class Salary {
    private String city;
    private String job;
    private long salary;

    public String getCity() {
        return city;
    }

    public void setCity(String city) {
        this.city = city;
    }

    public String getJob() {
        return job;
    }

    public void setJob(String job) {
        this.job = job;
    }

    public long getSalary() {
        return salary;
    }

    public void setSalary(long salary) {
        this.salary = salary;
    }
}

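Now, your CSV has three columns, and the CSV headers match the property names of our bean, so we can simply use a HeaderColumnNameMappingStrategy to determine which properties to set on the bean: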

final HeaderColumnNameMappingStrategy<Salary> mappingStrategy = new HeaderColumnNameMappingStrategy<>();
mappingStrategy.setType(Salary.class);

Now we just need to parse the CSV file into a List of our beans:

final CsvToBean<Salary> csvToBean = new CsvToBean<>();
try (final Reader reader = ...) {
    final List<Salary> salaries = csvToBean.parse(mappingStrategy, reader);
}

OK.

Now, how do you get the average salary out of this mess? Just use a Java 8 Stream on the result:

    final LongSummaryStatistics statistics = salaries.stream()
            .mapToLong(Salary::getSalary)
            .summaryStatistics();

Now we can get all sorts of useful information:

final long min = statistics.getMin();
final double average = statistics.getAverage();
final long max = statistics.getMax();
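
For comparison, the same statistics can be computed without a CSV library. The sketch below is a swapped-in, JDK-only variant, not the OpenCSV approach above: it assumes simple rows with no quoted fields or embedded commas, and the class name SalaryStats, the method summarize, and the sample rows are illustrative.

```java
import java.util.Arrays;
import java.util.List;
import java.util.LongSummaryStatistics;

class SalaryStats {

    // Hypothetical helper: skips the header, then summarizes the third column.
    static LongSummaryStatistics summarize(List<String> lines) {
        return lines.stream()
                .skip(1)                               // header line
                .map(line -> line.split(","))
                .filter(fields -> fields.length == 3)  // ignore rows with the wrong field count
                .mapToLong(fields -> Long.parseLong(fields[2].trim()))
                .summaryStatistics();
    }

    public static void main(String[] args) {
        // Sample rows stand in for the real file (assumption for the demo).
        List<String> lines = Arrays.asList(
                "City,Job,Salary",
                "Delhi,Doctors,500",
                "Delhi,Lawyers,400",
                "Delhi,Plumbers,100");
        LongSummaryStatistics stats = summarize(lines);
        System.out.println("min=" + stats.getMin()
                + " avg=" + stats.getAverage()
                + " max=" + stats.getMax());
    }
}
```

For a real file, the lines could come from `Files.readAllLines(path)`. `getAverage()` returns 0.0 when no values were recorded, so the division-by-zero case from the question does not arise here.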