Pandas: populate columns with counts from another column

Time: 2020-04-06 10:10:41

Tags: python pandas dataframe

Sample df:

     company   vehicle registration
0   company1     truck       abc123
1   company1     truck      abcdefg
2   company1       car       234cse
3   company1  forklift          NaN
4   company1     truck        93ds2
5   company2       car      rentall
6   company2       car      rental2
7   company2     truck      rentals
8   company2     truck      rental*
9   company2       car      rental5
10  company3     truck       fdsa23
11  company3     truck        asdf4
12  company3     other       fdsag3
13  company3     other          NaN
14  company3     truck      gls319d


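My goal is to count vehicles by company and vehicle type (the registration and vehicle columns will be dropped).

This is what I have tried: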

import pandas as pd

df = pd.read_csv('path to csv', header=0)

df.loc[df.vehicle == 'truck', 'trucks'] = 1
df.loc[df.vehicle == 'car', 'cars'] = 1
df.loc[df.vehicle != 'truck', 'others'] = 1
df.loc[df.vehicle != 'cars', 'others'] = 1

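From there, I assume some kind of groupby and sum would collapse the rows into one row per company with the counts in columns.

Unfortunately, this only fills the vehicle column with "1" values instead of putting those values in the respective columns.

The output I want is: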


company   trucks  cars  others
company1  3       1     1 
company2  2       3     0
company3  3       0     2

I'm sure this has probably been answered already, but my google-fu is weak this morning.

Cheers.

1 Answer:

Answer 0: (score: 5)

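First use Series.map with a dictionary of the categories you want to keep, then replace all non-matched values (NaN) with Series.fillna.

Then pass the result to crosstab, and add DataFrame.reindex if the order of the output columns matters.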
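The two steps above could be sketched roughly as follows. This is a minimal illustration, not necessarily the answerer's exact code: the sample frame is rebuilt inline from the question (the registration column is left out because it is not needed for counting), and the names `mapping`, `category`, and `counts` are assumptions chosen for the example.

```python
import pandas as pd

# Sample data rebuilt from the question (registration column omitted).
df = pd.DataFrame({
    "company": ["company1"] * 5 + ["company2"] * 5 + ["company3"] * 5,
    "vehicle": ["truck", "truck", "car", "forklift", "truck",
                "car", "car", "truck", "truck", "car",
                "truck", "truck", "other", "other", "truck"],
})

# Step 1: map the categories to keep onto their output column names.
# Anything not in the dictionary becomes NaN, which fillna turns into "others".
mapping = {"truck": "trucks", "car": "cars"}
category = df["vehicle"].map(mapping).fillna("others")

# Step 2: count occurrences per company and category, then fix the column order.
# fill_value=0 covers the case where a category never occurs in the data at all.
counts = (
    pd.crosstab(df["company"], category)
      .reindex(columns=["trucks", "cars", "others"], fill_value=0)
)

print(counts)
```

With the sample data this gives 3/1/1 for company1, 2/3/0 for company2, and 3/0/2 for company3, which matches the desired output in the question. Calling `counts.reset_index()` afterwards would turn `company` back into a regular column if that is preferred.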
