Count occurrences of a specific value across multiple columns in pandas

Time: 2021-01-03 12:49:22

Tags: python pandas dataframe

I have a dataframe:

A    B    C    D     E

12  4.5  6.1   BUY  NaN
12  BUY  BUY   5.6  NaN
BUY  4.5  6.1  BUY  NaN
12  4.5  6.1   0    NaN 

I want to count how many times 'BUY' appears in each row. Expected result:

A    B    C    D     E   score

12  4.5  6.1   BUY  NaN    1
12  BUY  BUY   5.6  NaN    2
15  4.5  6.1  BUY   NaN    1
12  4.5  6.1   0    NaN    0

I tried the following, but it just gives 0 for every row:

df['score'] = df[df == 'BUY'].sum(axis=1)

Note that BUY can only appear in columns B, C, D and E.

I tried to find a solution online, but, surprisingly, found none.

Any help would be appreciated. Thanks!

4 answers:

Answer 0 (score: 6)

You can compare and then sum:

df['score'] = (df[['B','C','D','E']] == 'BUY').sum(axis=1)

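This sums the boolean values (True counts as 1, False as 0), which gives you the correct result.

When you do df[df == 'BUY'], you are only replacing everything that is not 'BUY' with np.nan. Summing over axis=1 then doesn't work, because all that is left in the result is np.nan and 'BUY' strings. That is why you get all 0s.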
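To make the difference concrete, here is a minimal sketch. The frame is rebuilt by hand from the output printed in the answers below, so the exact dtypes are an assumption, and the names is_buy, masked and score are only illustrative. It shows that masking keeps nothing but 'BUY' and NaN, while the boolean frame can be summed directly:

```python
import numpy as np
import pandas as pd

# Sample frame rebuilt by hand from the printed output (dtypes are an assumption).
df = pd.DataFrame({
    "A": [12, 12, "BUY", 12],
    "B": [4.5, "BUY", 4.5, 4.5],
    "C": [6.1, "BUY", 6.1, 6.1],
    "D": ["BUY", 5.6, "BUY", 0],
    "E": [np.nan] * 4,
})

# Element-wise comparison gives a frame of True/False values.
is_buy = df == "BUY"

# Boolean indexing keeps the matching cells and turns everything else into NaN,
# so there is nothing numeric left to add up.
masked = df[is_buy]
only_buy_or_nan = bool((masked.isna() | (masked == "BUY")).all().all())

# Summing the booleans themselves counts the matches per row,
# here restricted to columns B to E as in the answer above.
score = is_buy[["B", "C", "D", "E"]].sum(axis=1)
print(score.tolist())
```

With this sample data, the row that has 'BUY' in column A still scores 1, because column A is left out of the selection.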

Answer 1 (score: 2)

Or you can use apply together with list.count:

df['score'] = df.apply(lambda x: x.tolist().count('BUY'), axis=1)
print(df)

Output:

     A    B    C    D   E  score
0   12  4.5  6.1  BUY NaN      1
1   12  BUY  BUY  5.6 NaN      2
2  BUY  4.5  6.1  BUY NaN      2
3   12  4.5  6.1    0 NaN      0

Answer 2 (score: 1)

Try using apply over axis=1. This passes each row in turn as a Series. You can filter the row with the condition [row == 'BUY'] and then count the number of 'BUY' entries with len():

df['score'] = df.apply(lambda row: len(row[row == 'BUY']), axis=1)
print(df)

Answer 3 (score: 1)

import numpy as np
df['score'] = np.count_nonzero(df == 'BUY', axis=1)

Output:

      A   B   C   D   E score
0    12 4.5 6.1 BUY NaN     1
1    12 BUY BUY 5.6 NaN     2
2   BUY 4.5 6.1 BUY NaN     2
3    12 4.5 6.1   0 NaN     0
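
As a quick cross-check, the sketch below runs the four approaches from the answers side by side on the same data. The frame is again rebuilt by hand from the printed output (an assumption), the variable names are illustrative, and the NumPy variant converts the boolean frame with to_numpy() first, which is a small deviation from the answer as posted:

```python
import numpy as np
import pandas as pd

# Sample frame rebuilt by hand from the printed output (dtypes are an assumption).
df = pd.DataFrame({
    "A": [12, 12, "BUY", 12],
    "B": [4.5, "BUY", 4.5, 4.5],
    "C": [6.1, "BUY", 6.1, 6.1],
    "D": ["BUY", 5.6, "BUY", 0],
    "E": [np.nan] * 4,
})

# 1. Compare, then sum the booleans along each row (here over all columns).
by_sum = (df == "BUY").sum(axis=1)

# 2. Row-wise apply with list.count.
by_list = df.apply(lambda row: row.tolist().count("BUY"), axis=1)

# 3. Row-wise apply, filtering each row and taking the length.
by_len = df.apply(lambda row: len(row[row == "BUY"]), axis=1)

# 4. NumPy count of True values per row on the boolean array.
by_numpy = np.count_nonzero((df == "BUY").to_numpy(), axis=1)

print(by_sum.tolist(), by_list.tolist(), by_len.tolist(), by_numpy.tolist())
```

All four count over every column, so a 'BUY' in column A is included. The two apply variants call a Python function once per row, so on large frames the comparison-based versions are usually the faster choice.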