My original DataFrame looks like this (only the first rows are shown):
  categories  id products
0          A   1        a
1          B   1        a
2          C   1        a
3          A   1        b
4          B   1        b
5          A   2        c
6          B   2        c
I aggregated it with the following code:
df2 = df.groupby('id').products.nunique().reset_index().merge(
    pd.crosstab(df.id, df.categories).reset_index())
This is the resulting DataFrame; I also added n outliers to it:
   id  products  A  B   C
0   1         2  2  2   1
1   2         1  1  1   0
2   3        50  1  1  30
Now I am trying to remove the outliers from the new DataFrame:
# remove outliers
del df2['id']
df2 = df2.loc[df2['products'] <= 20, [str(i) for i in df2.columns]]
This removes the outliers, but why do I now get only NaN in the category columns?
Answer 0 (score: 0)
df2 = df2.loc[df2['products'] <= 20]
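To illustrate why dropping the column selector is enough, here is a minimal sketch that rebuilds the sample data from the question and applies the row filter on its own. The names `counts`, `per_category`, `outlier`, `filtered`, `int_labels` and `mismatched` are made up for this example. The last part is only a guess at the cause: it assumes the real category columns had non-string labels (integers here), so that `str(i)` produced labels that do not exist. `reindex` is used to show that effect because recent pandas versions raise a `KeyError` when `.loc` is given missing labels instead of returning NaN columns.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "categories": ["A", "B", "C", "A", "B", "A", "B"],
    "id": [1, 1, 1, 1, 1, 2, 2],
    "products": ["a", "a", "a", "b", "b", "c", "c"],
})

# Distinct products per id, merged with the per-category row counts.
counts = df.groupby("id")["products"].nunique().reset_index()
per_category = pd.crosstab(df["id"], df["categories"]).reset_index()
df2 = counts.merge(per_category, on="id")

# The outlier row shown in the question (id 3 with 50 products).
outlier = pd.DataFrame([{"id": 3, "products": 50, "A": 1, "B": 1, "C": 30}])
df2 = pd.concat([df2, outlier], ignore_index=True)

# Row filter only, as in the answer: every column is kept unchanged.
filtered = df2.loc[df2["products"] <= 20]

# Assumption for illustration: category columns labelled with integers.
int_labels = filtered.rename(columns={"A": 1, "B": 2, "C": 3})

# Asking for the labels '1', '2', '3' (strings) finds no matching columns,
# so reindex creates them filled with NaN.
mismatched = int_labels.reindex(columns=[str(c) for c in int_labels.columns])

print(filtered)
print(mismatched)
```

If the columns have to be restricted as well, passing the existing labels directly (for example `df2.columns` or an explicit list of them) avoids the label mismatch.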