I have the following df,

cluster_id  inv_id
1           A1
1           A1
2           A1111A
2           A1111A

I want to groupby cluster_id based on two conditions on inv_id and create a column named invalid_inv_id:

1. in each cluster, if the length of inv_id (stripped of non-numerics) is < 100, set "invalid_inv_id" to true; or
2. in each cluster, if the length of inv_id is < 3, set "invalid_inv_id" to true.

The code looks like this:

df['inv_id_stp'] = df.inv_id.str.replace(r'\D+', '', regex=True)
grouped = df.groupby('cluster_id')
df['invalid_inv_id'] = grouped['inv_id_stp'].transform(lambda x: x.str.len()) < 100
df['invalid_inv_id'] = grouped['inv_id'].transform(lambda x: x.str.len()) < 3

I would like to know how to combine the two conditions into one line of code.
Answer 0 (score: 1)
IIUC, you do not need groupby here:
(df.inv_id.str.len()<3)|(df.inv_id.str.replace(r'\D+', '', regex=True).str.len()<100)
Out[472]:
0    True
1    True
2    True
3    True
Name: inv_id, dtype: bool
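For reference, a minimal reproducible sketch of the row-wise check, assuming the four-row sample frame from the question. The intermediate names too_short and few_digits are only illustrative, and regex=True is passed explicitly because newer pandas versions treat the pattern as a literal string by default.

```python
import pandas as pd

# Sample frame copied from the question.
df = pd.DataFrame({
    "cluster_id": [1, 1, 2, 2],
    "inv_id": ["A1", "A1", "A1111A", "A1111A"],
})

# Condition 2: the raw inv_id is shorter than 3 characters.
too_short = df["inv_id"].str.len() < 3

# Condition 1: fewer than 100 characters remain after dropping non-digits.
few_digits = df["inv_id"].str.replace(r"\D+", "", regex=True).str.len() < 100

# Either condition marks the row as invalid.
df["invalid_inv_id"] = too_short | few_digits
print(df)
```

With the thresholds from the question, every row in this sample is flagged, because no inv_id comes anywhere near 100 digits.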
Since you need any within each cluster:
((df.inv_id.str.len()<3)|(df.inv_id.str.replace(r'\D+', '', regex=True).str.len()<100)).groupby(df['cluster_id']).transform('any')
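To see how the per-cluster version differs from the row-wise one, here is a rough sketch on made-up data. The helper row_mask, the frame demo, and the lowered digit threshold of 4 are all assumptions for illustration only; with the question's threshold of 100 every row would be True and the two results would look identical.

```python
import pandas as pd

def row_mask(frame, min_len=3, min_digits=100):
    """Hypothetical helper: True where a single inv_id fails either check."""
    inv = frame["inv_id"]
    digits = inv.str.replace(r"\D+", "", regex=True)
    return (inv.str.len() < min_len) | (digits.str.len() < min_digits)

# Made-up frame: cluster 1 mixes a short id with a long one.
demo = pd.DataFrame({
    "cluster_id": [1, 1, 2, 2],
    "inv_id": ["A1", "A12345", "B12345", "C67890"],
})

# Row-wise result: only the first row fails a check.
per_row = row_mask(demo, min_digits=4)

# Per-cluster result: one failing row flags its whole cluster.
per_cluster = per_row.groupby(demo["cluster_id"]).transform("any")

demo["invalid_inv_id"] = per_cluster
print(demo)
```

Here per_row is True only for the first row, while per_cluster is True for both rows of cluster 1 and False for cluster 2.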