I have a DataFrame:
A B C D E
12 4.5 6.1 BUY NaN
12 BUY BUY 5.6 NaN
BUY 4.5 6.1 BUY NaN
12 4.5 6.1 0 NaN
I want to count how many times "BUY" appears in each row. Expected result:
A B C D E score
12 4.5 6.1 BUY NaN 1
12 BUY BUY 5.6 NaN 2
15 4.5 6.1 BUY NaN 1
12 4.5 6.1 0 NaN 0
I tried the following, but it just gives 0 for every row:
df['score'] = df[df == 'BUY'].sum(axis=1)
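Note that BUY can only appear in columns B, C, D, and E.
I searched online for a solution but, surprisingly, found nothing.
Any help would be appreciated. Thanks!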
Answer 0 (score: 6)
You can compare and then sum:
df['score'] = (df[['B','C','D','E']] == 'BUY').sum(axis=1)
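This sums the boolean values, which gives you the correct result.
When you do df[df == 'BUY'], you are just replacing anything that is not 'BUY' with np.nan. Summing along axis=1 then doesn't work, because all that is left in the result is np.nan and 'BUY' strings. Hence you get 0 for every row.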
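To make the difference concrete, here is a minimal sketch that rebuilds the frame from the question (the construction itself is my assumption about how the data was loaded; the variable names masked, score_bcde and score_all are just for illustration). It shows what the mask leaves behind, and that comparing first yields booleans that sum cleanly.

```python
import numpy as np
import pandas as pd

# Assumed reconstruction of the DataFrame shown in the question
df = pd.DataFrame({
    'A': [12, 12, 'BUY', 12],
    'B': [4.5, 'BUY', 4.5, 4.5],
    'C': [6.1, 'BUY', 6.1, 6.1],
    'D': ['BUY', 5.6, 'BUY', 0],
    'E': [np.nan] * 4,
})

# Boolean indexing keeps the shape: every cell that is not 'BUY' becomes NaN
masked = df[df == 'BUY']
print(masked)

# Comparing first gives True/False, and True counts as 1 when summed
score_bcde = (df[['B', 'C', 'D', 'E']] == 'BUY').sum(axis=1)
print(score_bcde.tolist())  # [1, 2, 1, 0]

# The masked frame is still usable if you count the surviving cells instead
score_all = masked.notna().sum(axis=1)
print(score_all.tolist())   # [1, 2, 2, 0]
```

The two results differ on the third row only because column A is included in the second one and that row holds a BUY in column A.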
Answer 1 (score: 2)
Alternatively, you can use apply with list.count:
df['score'] = df.apply(lambda x: x.tolist().count('BUY'), axis=1)
print(df)
Output:
A B C D E score
0 12 4.5 6.1 BUY NaN 1
1 12 BUY BUY 5.6 NaN 2
2 BUY 4.5 6.1 BUY NaN 2
3 12 4.5 6.1 0 NaN 0
Answer 2 (score: 1)
Try using apply with a lambda over axis=1. This passes each row, one at a time, as a Series. You can filter the row with the condition [row == 'BUY'] and then take len() of the result:
df['score'] = df.apply(lambda row: len(row[row == 'BUY']), axis=1)
print(df)
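If it helps to see what happens for a single row, here is a small sketch under the same assumption about how the frame is built (row and hits are illustrative names, not part of the answer). It pulls out the second row by hand before running the full apply.

```python
import numpy as np
import pandas as pd

# Assumed reconstruction of the DataFrame shown in the question
df = pd.DataFrame({
    'A': [12, 12, 'BUY', 12],
    'B': [4.5, 'BUY', 4.5, 4.5],
    'C': [6.1, 'BUY', 6.1, 6.1],
    'D': ['BUY', 5.6, 'BUY', 0],
    'E': [np.nan] * 4,
})

# With axis=1, apply hands each row to the lambda as a Series
# indexed by column name; this is the second row taken manually
row = df.iloc[1]

# Boolean filtering keeps only the entries equal to 'BUY'
hits = row[row == 'BUY']
print(list(hits.index))  # ['B', 'C']
print(len(hits))         # 2

# The same filter applied to every row
df['score'] = df.apply(lambda row: len(row[row == 'BUY']), axis=1)
print(df['score'].tolist())  # [1, 2, 2, 0]
```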
Answer 3 (score: 1)
import numpy as np
df['score'] = np.count_nonzero(df == 'BUY', axis=1)
Output:
A B C D E score
0 12 4.5 6.1 BUY NaN 1
1 12 BUY BUY 5.6 NaN 2
2 BUY 4.5 6.1 BUY NaN 2
3 12 4.5 6.1 0 NaN 0