Question

I am using pandas in Python and have a DataFrame, for example:

   age  portembarked  fare  numparentschildren  passengerclass  sex
0    1             1     1                   1               1    1
1    2             2     1                   1               2    2
2    1             1     1                   1               1    2
...

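I have a list of column names that I call parents, ["age", "fare", "sex"], and a list of the value I want for each of those columns, called parent_vals: [1, 2, 2].

How can I count the rows of the DataFrame in which every listed column equals its corresponding value?

For example, I am looking for pandas notation that does something like this: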

count = df[df[parents] == parent_vals].count()

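^ This is not actually supported. For this example it would return 1. If I knew exactly what was in the parents list, I know I could do the following: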

count = df[(df["age"]==1) & (df["fare"]==2) & (df["sex"]==2)].count()

But the specific columns in parents change as I loop through a larger program, so I want to refer to the list instead.

Answer 1

IIUC, you can index the columns, compare them against the values, and then sum the result to get the count.

df
   age  portembarked  fare  numparentschildren  passengerclass  sex
0    1             1     1                   1               1    1
1    2             2     1                   1               2    2
2    1             1     1                   1               1    2

(df[parents] == [1, 2, 2]).all(1).sum()
1

If you get an "Invalid broadcasting comparison" error, the workaround seems to be to convert the list to an np.array first and then do the comparison.
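A minimal sketch of that array-based workaround, assuming the sample frame from this post and the parents / parent_vals names from the question. count_matching_rows is a hypothetical helper name, not a pandas function, and the values [1, 1, 2] are illustrative ones chosen so that a row in the sample matches. Comparing NumPy arrays makes the broadcast direction explicit: the value vector is checked against every row, one value per column.

```python
import numpy as np
import pandas as pd

# Sample data copied from the post.
df = pd.DataFrame({
    "age": [1, 2, 1],
    "portembarked": [1, 2, 1],
    "fare": [1, 1, 1],
    "numparentschildren": [1, 1, 1],
    "passengerclass": [1, 2, 1],
    "sex": [1, 2, 2],
})


def count_matching_rows(frame, columns, values):
    """Hypothetical helper: count rows where frame[columns] equals values."""
    # An (n_rows, n_cols) array compared with an (n_cols,) array
    # broadcasts the values across every row.
    matches = frame[columns].to_numpy() == np.asarray(values)
    # A row counts only if all of its selected columns match.
    return int(matches.all(axis=1).sum())


parents = ["age", "fare", "sex"]
parent_vals = [1, 1, 2]  # illustrative values, not the ones from the question
count = count_matching_rows(df, parents, parent_vals)
print(count)
```

Because the column list and the value list are ordinary arguments, the same call can be reused inside a loop where parents changes on each pass.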

Details

df[parents] == [1, 2, 2]
     age   fare   sex
0   True   True  True
1   True  False  True
2  False  False  True

(df[parents] == [1, 2, 2]).all(1)
0     True
1    False
2    False
dtype: bool
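The same row mask can also be built one column at a time, which generalizes the chained & expression from the question to a list of any length. This is only a sketch under assumptions: the frame is a cut-down copy of the sample data, the values [1, 1, 2] are illustrative, and functools.reduce with operator.and_ is one possible way to combine the masks, not the method used in the answer.

```python
from functools import reduce
import operator

import pandas as pd

# Cut-down copy of the sample data (only the columns being tested).
df = pd.DataFrame({"age": [1, 2, 1], "fare": [1, 1, 1], "sex": [1, 2, 2]})

parents = ["age", "fare", "sex"]
parent_vals = [1, 1, 2]  # illustrative values, not the ones from the question

# One boolean Series per (column, value) pair.
masks = [df[col] == val for col, val in zip(parents, parent_vals)]

# AND the masks together: a row is True only if every column matched.
combined = reduce(operator.and_, masks)

count = int(combined.sum())
print(count)
```

Each comparison here is a plain Series-versus-scalar test, so no list-to-frame broadcasting is involved.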

Pandas: how to apply conditions on multiple columns to count the rows with specific column values
