Pandas groupby aggregation using the max value of another column?

Time: 2020-03-14 12:10:17

Tags: python pandas

I am currently doing some analysis on a COVID dataset.

The dataset has the following form:

    Country   Province  Lat      Lon       Date                       Cases  Status
0   Thailand            15.0000  101.0000  2020-01-22 00:00:00+00:00  2      confirmed
1   Thailand            15.0000  101.0000  2020-01-23 00:00:00+00:00  3      confirmed
2   Thailand            15.0000  101.0000  2020-01-24 00:00:00+00:00  5      confirmed
3   Thailand            15.0000  101.0000  2020-01-25 00:00:00+00:00  7      confirmed
4   Thailand            15.0000  101.0000  2020-01-26 00:00:00+00:00  8      confirmed

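I want to group by country and sum the "Cases" column (call the result the "case sum" column), but I run into a problem with latitude and longitude: I want the Lat/Lon taken from the row with the maximum value in the Cases column. In other words, I want the latitude and longitude of the row with the most cases. To clarify, the use case is a country such as France, whose rows have different latitudes and longitudes (for example, French Polynesia), and for each group I only want the Lat/Lon of the region with the most cases.

I am currently running the aggregation like this: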

nonzero_cases[(nonzero_cases['Date'] == "03/13/2020")].groupby("Country").agg({"Lat":"first","Lon":"first","Cases":"sum"})

This produces:

Country     Lat     Lon     Cases
Afghanistan 33.0000 65.0000 7
Albania 41.1533 20.1683 33
Algeria 28.0339 1.6596  26
Andorra 42.5063 1.5218  1
...

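But this is not what I want, because it ignores the case counts and simply picks the first Lat/Lon in each group.

2 answers:

Answer 0: (score: 2)

Add DataFrame.sort_values on the Cases column in descending order before grouping. groupby keeps the row order within each group, so "first" then returns the values from the row with the most Cases in each group: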

print (df)
    Country   Lat    Lon                       Date  Cases     Status
0  Thailand  15.0  101.0  2020-01-22 00:00:00+00:00      2  confirmed
1  Thailand  15.0  101.0  2020-01-23 00:00:00+00:00      3  confirmed
2  Thailand  15.0  101.0  2020-01-24 00:00:00+00:00      5  confirmed
3  Thailand  15.0  101.0  2020-01-25 00:00:00+00:00      7  confirmed
4  Thailand  14.0  103.0  2020-01-26 00:00:00+00:00      8  confirmed <- changed data

df1 = (df.sort_values('Cases', ascending=False)
         .groupby("Country")
         .agg({"Lat":"first","Lon":"first","Cases":"sum"}))

print (df1)
           Lat    Lon  Cases
Country                     
Thailand  14.0  103.0     25
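
To show the same idea on a country with several regions, here is a minimal self-contained sketch. The sample rows and their coordinates and case counts are invented purely for illustration, not taken from the real dataset; only the sort_values / groupby / agg chain comes from the answer above.

```python
import pandas as pd

# Invented sample data (illustrative values only): two regions per country.
df = pd.DataFrame({
    "Country": ["France", "France", "Thailand", "Thailand"],
    "Province": ["France", "French Polynesia", "", ""],
    "Lat": [46.0, -17.0, 15.0, 14.0],
    "Lon": [2.0, -149.0, 101.0, 103.0],
    "Cases": [100, 3, 7, 8],
})

# Sort so the row with the most cases comes first within each country,
# then take the first Lat/Lon and the sum of Cases per group.
result = (
    df.sort_values("Cases", ascending=False)
      .groupby("Country")
      .agg({"Lat": "first", "Lon": "first", "Cases": "sum"})
)

print(result)
```

Note that "first" returns the first non-null value in each group, so a missing Lat or Lon in the top row would be filled from a later row.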

Answer 1: (score: 1)

Messier than jezrael's answer, but it gets the job done. Unfortunately, groupby, np.where, .loc and pd.merge probably make up 50% of the pandas I use.
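
As a rough sketch of what a groupby / .loc / merge approach could look like (this is one possible variant, not necessarily the code this answer used, and the sample data is invented for illustration):

```python
import pandas as pd

# Invented sample data (illustrative values only): two regions per country.
df = pd.DataFrame({
    "Country": ["France", "France", "Thailand", "Thailand"],
    "Lat": [46.0, -17.0, 15.0, 14.0],
    "Lon": [2.0, -149.0, 101.0, 103.0],
    "Cases": [100, 3, 7, 8],
})

# Total cases per country, kept as a regular column for merging.
totals = df.groupby("Country", as_index=False)["Cases"].sum()

# idxmax gives the index label of the row with the most cases per country;
# .loc then pulls Lat/Lon from exactly those rows (assumes a unique index).
top_idx = df.groupby("Country")["Cases"].idxmax()
top_rows = df.loc[top_idx, ["Country", "Lat", "Lon"]]

# Combine the coordinates of the top row with the per-country totals.
merged = top_rows.merge(totals, on="Country")

print(merged)
```

If two rows tie for the maximum, idxmax picks the first one it encounters.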
