Question

I have a list of item numbers as follows.

item_numbers = [1,2,5]

I also have a CSV file containing the ingredients for each item number.

,sugar, protein, salt, oil
0, 0.2, 0.3, 0,   0
1, 0,    0,  0.2, 0.8
2, 0.4,  0,  0,   0

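Now, for each item in the list, I want to get the ingredients whose value is greater than zero (if the value == 0, I don't need that ingredient).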

E.g., item 1 in 'item_numbers' list -> ['salt', 'oil']

Is it possible to do this with pandas?

Answer 1

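You can first select the rows with reindex, remove the all-NaN rows it adds for missing labels with dropna, and then compare with 0 using gt. Finally, use apply to build the lists. (Selecting with loc raises a KeyError in pandas 1.0 and later, because label 5 is not in the index.)

out = df.reindex(item_numbers).dropna(how='all').gt(0).apply(lambda x: x.index[x].tolist(), axis=1)
print(out)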
1    [salt, oil]
2        [sugar]
dtype: object
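
For anyone who wants to run this end to end, here is a minimal sketch that loads the sample data from the question out of a string instead of a file. The names csv_text, selected, mask and ingredients are my own, and skipinitialspace=True is an assumption about how the file is read (it strips the blanks after the commas so the column names come out as 'salt' rather than ' salt').

```python
import io
import pandas as pd

# Sample data copied from the question, read from a string instead of a file.
csv_text = """,sugar, protein, salt, oil
0, 0.2, 0.3, 0,   0
1, 0,    0,  0.2, 0.8
2, 0.4,  0,  0,   0
"""
# index_col=0 turns the unnamed first column into the row index.
df = pd.read_csv(io.StringIO(csv_text), index_col=0, skipinitialspace=True)

item_numbers = [1, 2, 5]

# reindex accepts the missing label 5 and adds an all-NaN row for it,
# which dropna(how="all") then removes again.
selected = df.reindex(item_numbers).dropna(how="all")

# True where an ingredient is present in a positive amount.
mask = selected.gt(0)

# For each row, keep the column names where the mask is True.
ingredients = mask.apply(lambda row: row.index[row.to_numpy()].tolist(), axis=1)
print(ingredients)
```

If you would rather not create and drop the NaN row at all, filtering item_numbers against df.index before selecting should give the same rows.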

If you want the values joined with a comma instead:

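import numpy as np
import pandas as pd
# boolean mask of positive values; reindex avoids a KeyError for the missing label 5
mask = df.reindex(item_numbers).dropna(how='all').gt(0)
# put 'name, ' where the mask is True and '' elsewhere, then join each row
s = np.where(mask, ['{}, '.format(x) for x in mask.columns], '')
out = pd.Series([''.join(x).strip(', ') for x in s], index=mask.index)
print(out)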
1    salt, oil
2        sugar
dtype: object
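
The same comma-joined result can probably also be had without numpy, by joining the column names row by row. This is only a sketch of an alternative, not a faster one: the mask below is built by hand to match the gt(0) result for items 1 and 2, and the name joined is mine.

```python
import pandas as pd

# Hand-built boolean mask, mirroring the result of the gt(0) step for items 1 and 2.
mask = pd.DataFrame(
    {
        "sugar": [False, True],
        "protein": [False, False],
        "salt": [True, False],
        "oil": [True, False],
    },
    index=[1, 2],
)

# For each row, join the names of the columns where the mask is True.
joined = mask.apply(lambda row: ", ".join(row.index[row.to_numpy()]), axis=1)
print(joined)
```

apply with axis=1 runs a Python function per row, so for a large frame the np.where version is likely the better choice.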

print (df.dtypes)
sugar      float64
protein    float64
salt       float64
oil        float64
dtype: object
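
The comparison with gt(0) relies on the columns being numeric, as in the dtypes shown above. If your real file contains a stray non-numeric entry, the affected column is read as text instead. A possible way to handle that, assuming you are happy to treat such entries as missing, is to coerce everything with pd.to_numeric first. The value 'unknown' below is made up purely for illustration.

```python
import io
import pandas as pd

# Hypothetical file in which one salt value is not a number.
raw = """,sugar,protein,salt,oil
0,0.2,0.3,0,0
1,0,0,0.2,0.8
2,0.4,0,unknown,0
"""
df = pd.read_csv(io.StringIO(raw), index_col=0)

# Convert every column to numbers; anything unparseable becomes NaN.
numeric = df.apply(pd.to_numeric, errors="coerce")
print(numeric.dtypes)

# NaN > 0 evaluates to False, so coerced entries are simply not reported.
mask = numeric.gt(0)
print(mask)
```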
