Question

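Background: I am trying to use Apache Commons Math to compute statistics for ambulances. I can do very basic univariate statistics for a single ambulance, but I get stuck when I want to compute statistics for every ambulance in my fleet.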

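Goal: My goal is to use JDBC to produce a basic result set and then turn that information into statistics. For example, I want to take the result set and make it look like a table whose headers are the ambulance, the average for 2014, and the average for 2015. The table rows would show each ambulance and its average under each header.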

<table>
<tr><th>ambulance</th><th>average response time for year 2014</th><th>average response time for year 2015</th></tr>
<tr><td>Medic1</td><td>62</td><td>74</td></tr>
<tr><td>Medic2</td><td>83</td><td>79</td></tr>
<tr><td>Medic3</td><td>68</td><td>71</td></tr>
</table>

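Attempted pseudocode: The pseudocode looks something like this:
1.) Allocate a variable for the average response time for calendar year 2014.
2.) Loop through all the ambulances in the result set and, if the calendar year is 2014, calculate the average.
3.) Allocate a variable for the average response time for calendar year 2015.
4.) Loop through all the ambulances and, if the calendar year is 2015, calculate the average.
5.) Output the ambulance, the 2014 average response time, and the 2015 average response time.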

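Notes: This would be a good start. At least the logic and the format would then be in place for more complex analysis, such as determining year-over-year differences. But I am stuck: I am not sure how to iterate over each ambulance to produce its average.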

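I am able to write an SQL query that produces the average for each ambulance. However, I want to use Apache Commons Math because it provides skewness, kurtosis, and other measures. What you see above this paragraph is a simplified example of something more complex.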

Java code:

package EMSResearch;

import java.sql.*;
import org.apache.commons.math3.stat.descriptive.DescriptiveStatistics;

public class EMSResearch
{

    public static void main(String[] args)
    {
        Connection conn = null;
        Statement stmt = null;
        try
        {
            conn = DriverManager.getConnection("jdbc:sqlserver://MyDatabase;database=Emergencies;integratedsecurity=false;user=MyUserName;password=MyPassword");
            stmt = conn.createStatement();
            String strSelect = "SELECT EmergencyID, YearOfCall, ResponseTime, Ambulance";
            ResultSet rset = stmt.executeQuery(strSelect);

            DescriptiveStatistics ds = new DescriptiveStatistics();
/*the following code does the job of generating average response time for Medic1 for year 2015. But I want it to loop through and get all the ambulances for year 2015*/
            while (rset.next())
            {
                if (rset.getString("Ambulance").equals("Medic1") && rset.getInt("YearOfCall") == 2015)
                {
                    String event = rset.getString("I_EventNumber");
                    int year = rset.getInt("YearOfCall");
                    int responseTime = rset.getInt("ResponseTime");
                    String truck = rset.getString("Ambulance");
                    ds.addValue(responseTime);
                }
            }
            System.out.println("mean average value " + ds.getMean());


        } catch (SQLException ex)
        {
            ex.printStackTrace();
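        } finally
        {
            // Release JDBC resources even when the query fails.
            try { if (stmt != null) stmt.close(); } catch (SQLException ignored) { }
            try { if (conn != null) conn.close(); } catch (SQLException ignored) { }
        }
    }
}
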
Answer 1

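Something like this might help. If you use maps to store the data for every year and every truck, you can get at everything I think you need. This code is not fully baked, but I think the concept is sound.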

  // Needs java.sql.ResultSet, java.sql.SQLException, java.util.HashMap, java.util.Map
  // and org.apache.commons.math3.stat.descriptive.DescriptiveStatistics.
  private static void getstats(ResultSet rset) throws SQLException {
    // year -> (ambulance -> statistics for that year and ambulance)
    Map<Integer, Map<String, DescriptiveStatistics>> stats = new HashMap<>();
    while (rset.next()) {

      String event = rset.getString("I_EventNumber");
      int year = rset.getInt("YearOfCall");
      int responseTime = rset.getInt("ResponseTime");
      String truck = rset.getString("Ambulance");
      if (stats.containsKey(year)) {
        Map<String, DescriptiveStatistics> byTruck = stats.get(year);
        if (byTruck.containsKey(truck)) {
          byTruck.get(truck).addValue(responseTime);
        } else {
          // First value for this truck in a year we have already seen:
          // register the new statistics in the existing per-year map.
          DescriptiveStatistics newDs = new DescriptiveStatistics();
          newDs.addValue(responseTime);
          byTruck.put(truck, newDs);
        }

      } else {
        // First value for this year: create the per-year map as well.
        Map<String, DescriptiveStatistics> newmap = new HashMap<>();
        DescriptiveStatistics newDs = new DescriptiveStatistics();
        newDs.addValue(responseTime);
        newmap.put(truck, newDs);
        stats.put(year, newmap);
      }

    }
    for (Integer year : stats.keySet()) {
      for (String truck : stats.get(year).keySet()) {
        DescriptiveStatistics ds = stats.get(year).get(truck);
        /* do stuff with the ds for this year and this truck */

      }
    }

  }
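
On Java 8 or later, the nested if/else above can probably be condensed with Map.computeIfAbsent. The following is only a sketch under a few assumptions: the JDK's DoubleSummaryStatistics stands in for DescriptiveStatistics so that it runs without the Commons Math jar (with Commons Math you would call addValue instead of accept), and the NestedStats class, its method names, and the sample values are made up for illustration.

```java
import java.util.DoubleSummaryStatistics;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper class; not part of the original answer.
final class NestedStats {
    // year -> (ambulance -> running statistics)
    private final Map<Integer, Map<String, DoubleSummaryStatistics>> stats = new HashMap<>();

    /** Adds one response time, creating the per-year map and the statistics on first use. */
    void add(int year, String truck, double responseTime) {
        stats.computeIfAbsent(year, y -> new HashMap<>())
             .computeIfAbsent(truck, t -> new DoubleSummaryStatistics())
             .accept(responseTime);
    }

    /** Returns the mean for one year and truck, or NaN when nothing was recorded. */
    double mean(int year, String truck) {
        Map<String, DoubleSummaryStatistics> byTruck = stats.get(year);
        if (byTruck == null || !byTruck.containsKey(truck)) {
            return Double.NaN;
        }
        return byTruck.get(truck).getAverage();
    }

    public static void main(String[] args) {
        NestedStats s = new NestedStats();
        // Sample values, standing in for rows read from the ResultSet.
        s.add(2015, "Medic1", 70);
        s.add(2015, "Medic1", 78);
        s.add(2015, "Medic2", 79);
        System.out.println("Medic1 2015 mean: " + s.mean(2015, "Medic1"));
    }
}
```

In the JDBC loop you would call add(year, truck, responseTime) once per row. DoubleSummaryStatistics only tracks count, sum, min, max, and average, so for skewness or kurtosis you would keep DescriptiveStatistics and its getSkewness() and getKurtosis() methods.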

Answer 2

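As the other answer says, a Map will help you here. Just to add a bit, I would also group your data in a meaningful way. For example, your current implementation contains: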

DescriptiveStatistics ds = new DescriptiveStatistics();

while (rset.next())
{
    if (rset.getString("Ambulance").equals("Medic1") && rset.getInt("YearOfCall") == 2015)
    {
        String event = rset.getString("I_EventNumber");
        int year = rset.getInt("YearOfCall");
        int responseTime = rset.getInt("ResponseTime");
        String truck = rset.getString("Ambulance");
        ds.addValue(responseTime);
    }
}

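What you are doing now is checking whether a row meets one specific criterion and, if so, adding it to your single data set. If you want to check another criterion, you need to initialize another data set, add another if statement, and copy the code into it; that does not scale.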

Instead, consider creating an object that lets you group the data, like this:

public class DataPoint {
    // Consider private members with public getters/setters.
    public String ambulance;
    public int year;

    public DataPoint(String ambulance, int year) {
        this.ambulance = ambulance;
        this.year = year;
    }

    @Override
    public int hashCode() {
        final int prime = 31;
        int result = 1;
        result = prime * result
                + ((ambulance == null) ? 0 : ambulance.hashCode());
        result = prime * result + year;
        return result;
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj)
            return true;
        if (obj == null)
            return false;
        if (getClass() != obj.getClass())
            return false;
        DataPoint other = (DataPoint) obj;
        if (ambulance == null) {
            if (other.ambulance != null)
                return false;
        } else if (!ambulance.equals(other.ambulance))
            return false;
        if (year != other.year)
            return false;
        return true;
    }
}

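The hashCode() and equals() overrides are very important, but they are a side issue for this discussion. Basically, they ensure that a Map can look up a key and treat two distinct objects with the same parameters as equal.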
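
To make that concrete, here is a small sketch contrasting a key class that overrides equals() and hashCode() with one that does not. Key and IdentityKey are hypothetical stand-ins for DataPoint, introduced only for this illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical demo class; Key mirrors DataPoint, IdentityKey omits the overrides.
final class KeyDemo {
    /** Key with value-based equality, like DataPoint. */
    static final class Key {
        final String ambulance;
        final int year;

        Key(String ambulance, int year) {
            this.ambulance = ambulance;
            this.year = year;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Key)) return false;
            Key other = (Key) o;
            return year == other.year && Objects.equals(ambulance, other.ambulance);
        }

        @Override
        public int hashCode() {
            return Objects.hash(ambulance, year);
        }
    }

    /** Same fields, but equals/hashCode are inherited from Object (identity-based). */
    static final class IdentityKey {
        final String ambulance;
        final int year;

        IdentityKey(String ambulance, int year) {
            this.ambulance = ambulance;
            this.year = year;
        }
    }

    /** Looks up a freshly created key that has the same field values as the stored one. */
    static boolean foundWithOverrides() {
        Map<Key, String> map = new HashMap<>();
        map.put(new Key("Medic1", 2015), "stats");
        return map.containsKey(new Key("Medic1", 2015));
    }

    /** Same lookup, but the key type relies on object identity. */
    static boolean foundWithoutOverrides() {
        Map<IdentityKey, String> map = new HashMap<>();
        map.put(new IdentityKey("Medic1", 2015), "stats");
        return map.containsKey(new IdentityKey("Medic1", 2015));
    }

    public static void main(String[] args) {
        System.out.println(foundWithOverrides());    // true
        System.out.println(foundWithoutOverrides()); // false
    }
}
```

Without the overrides, every row read from the result set would create a key that matches nothing already in the map, so each data set would end up holding a single value.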

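Now, using our new DataPoint object, we can map the data we receive to a specific data set. The implementation I outlined above would therefore be replaced with: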

Map<DataPoint, DescriptiveStatistics> map = new HashMap<DataPoint, DescriptiveStatistics>();

while (rset.next())
{
    // Get parameters we differentiate based on.
    String truck = rset.getString("Ambulance");
    int year = rset.getInt("YearOfCall");

    // Create the data point.
    DataPoint point = new DataPoint(truck, year);

    // Get data set for point; if it doesn't exist, create it.
    if (map.get(point) == null) {
        map.put(point, new DescriptiveStatistics());
    }
    DescriptiveStatistics ds = map.get(point);

    // Add the data of interest to the given data set.
    int responseTime = rset.getInt("ResponseTime");
    ds.addValue(responseTime);
}

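When the while loop finishes, you have a Map populated with each distinct data point and its associated data set. From there, just iterate over the map entries and you can do whatever you want with the data sets: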

for (Entry<DataPoint, DescriptiveStatistics> entry : map.entrySet())
...
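
One possible way to fill in that loop is to pivot the entries into the ambulance-by-year table from the question. This is a sketch under assumptions, not the definitive approach: a Map.Entry<String, Integer> stands in for DataPoint, the JDK's DoubleSummaryStatistics stands in for DescriptiveStatistics so that it runs without Commons Math, and PivotReport, add, and pivot are hypothetical names.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.DoubleSummaryStatistics;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical report helper; not part of the original answer.
final class PivotReport {
    /** Records one response time under the (ambulance, year) key. */
    static void add(Map<Map.Entry<String, Integer>, DoubleSummaryStatistics> map,
                    String ambulance, int year, double responseTime) {
        Map.Entry<String, Integer> key = new SimpleImmutableEntry<>(ambulance, year);
        map.computeIfAbsent(key, k -> new DoubleSummaryStatistics()).accept(responseTime);
    }

    /** Turns (ambulance, year) -> statistics into ambulance -> (year -> mean), sorted by key. */
    static Map<String, Map<Integer, Double>> pivot(
            Map<Map.Entry<String, Integer>, DoubleSummaryStatistics> map) {
        Map<String, Map<Integer, Double>> rows = new TreeMap<>();
        for (Map.Entry<Map.Entry<String, Integer>, DoubleSummaryStatistics> entry : map.entrySet()) {
            String ambulance = entry.getKey().getKey();
            int year = entry.getKey().getValue();
            rows.computeIfAbsent(ambulance, a -> new TreeMap<>())
                .put(year, entry.getValue().getAverage());
        }
        return rows;
    }

    public static void main(String[] args) {
        Map<Map.Entry<String, Integer>, DoubleSummaryStatistics> map = new HashMap<>();
        // Sample values, standing in for rows read from the ResultSet.
        add(map, "Medic1", 2014, 62);
        add(map, "Medic1", 2015, 74);
        add(map, "Medic2", 2014, 83);
        add(map, "Medic2", 2015, 79);
        // One line per ambulance, with the yearly means in ascending year order.
        for (Map.Entry<String, Map<Integer, Double>> row : pivot(map).entrySet()) {
            System.out.println(row.getKey() + " " + row.getValue());
        }
    }
}
```

With DescriptiveStatistics as the map value, the same loop could put getMean(), getSkewness(), or getKurtosis() into each cell instead of getAverage().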

Hope that clarifies things a bit.

Loop through a result set to group by
