Question

I have a pandas DataFrame, and I want to increase every value greater than zero by some increment (say .001), but only in a subset of the columns.


Here is an example DataFrame:

df=pd.DataFrame({'a': ['abc', 'abc', 'abc', 'abc'], 'b': [2,np.nan, 0, 6], 'c': [1, 0, 2, 0]})

     a    b  c
0  abc  2.0  1
1  abc  NaN  0
2  abc  0.0  2
3  abc  6.0  0

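The desired output has every positive value in columns b and c incremented and everything else left unchanged. I tried the following, but because the first column has object dtype it fails with the error shown: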

df[df.loc[:,['b', 'c']]>0]+=1

TypeError: Cannot do inplace boolean setting on mixed-types with a non np.nan value

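Is there a way to do this kind of operation without explicitly looping over each column?

I'm sure I'm just missing a simple method, but I can't seem to find an example.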

Answer 1

You can try this:

import pandas as pd
import numpy as np

df = pd.DataFrame({'a': ['abc', 'abc', 'abc', 'abc'], 
                   'b': [2,np.nan, 0, 6], 
                   'c': [1, 0, 2, 0]})

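inc = 0.01
# Add inc to every non-object column, zeros included.
df.loc[:, df.dtypes.ne('object')] += inc
# Cells that were exactly 0 now equal inc, so reset them to 0.
# Caveat: negative values are shifted by inc as well.
df.replace({inc: 0}, inplace=True)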

print(df)

Or use np.where, as Tai suggested (this should be faster):

cols = df.columns[df.dtypes.ne('object')]
df[cols] += np.where(df[cols] >0, 0.01, 0)

This returns:

     a     b     c
0  abc  2.01  1.01
1  abc   NaN  0.00
2  abc  0.00  2.01
3  abc  6.01  0.00
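
The np.where variant can be wrapped in a small function, sketched below. The name `add_to_positive` is hypothetical, and selecting columns with `select_dtypes(include="number")` is an assumption swapped in for the `dtypes.ne('object')` test, so that the selection does not depend on how the string column is stored.

```python
import numpy as np
import pandas as pd


def add_to_positive(frame, inc):
    """Return a copy of frame with inc added to positive numeric values.

    Hypothetical helper, not part of pandas.
    """
    out = frame.copy()
    # Assumption: "the subset of columns" means every numeric column.
    num_cols = out.select_dtypes(include="number").columns
    # np.where yields inc where the value is > 0 and 0 elsewhere.
    # NaN > 0 is False, so NaN cells receive 0 and stay NaN.
    out[num_cols] = out[num_cols] + np.where(out[num_cols] > 0, inc, 0)
    return out


df = pd.DataFrame({
    "a": ["abc", "abc", "abc", "abc"],
    "b": [2, np.nan, 0, 6],
    "c": [1, 0, 2, 0],
})

result = add_to_positive(df, 0.01)
print(result)
```

Because the function works on a copy, the original frame is left untouched, which makes it easy to compare the two.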

Answer 2

You can use add together with select_dtypes:

df.add((df.select_dtypes(exclude=object)>0).astype(int)*0.0001).combine_first(df)
Out[18]: 
     a       b       c
0  abc  2.0001  1.0001
1  abc     NaN  0.0000
2  abc  0.0000  2.0001
3  abc  6.0001  0.0000
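
To make the one-liner easier to follow, here is a step-by-step sketch of the same idea. It is a variation rather than the answer's exact expression: the addition is done on the numeric sub-frame only, and `include="number"` is assumed in place of `exclude=object`. The intermediate names `num`, `step`, and `bumped` are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "a": ["abc", "abc", "abc", "abc"],
    "b": [2, np.nan, 0, 6],
    "c": [1, 0, 2, 0],
})

# Step 1: keep only the numeric columns (b and c here).
num = df.select_dtypes(include="number")

# Step 2: a frame holding 0.0001 where the value is positive, 0 elsewhere.
step = (num > 0).astype(int) * 0.0001

# Step 3: add the step to the numeric columns only.
bumped = num.add(step)

# Step 4: bring back column a, which bumped lacks, from the original frame.
result = bumped.combine_first(df)
print(result)
```

combine_first fills in whatever is missing from the first frame with values from the second, which is how the string column returns to the result.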

Answer 3

You can also add the increment only to columns b and c.

df[["b", "c"]] += np.where(df[["b", "c"]] > 0, 0.01, 0)

np.where yields 0 wherever the condition is False. Since NaN > 0 evaluates to False, NaN entries get an increment of 0 and are left as NaN.
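
The NaN handling can be checked on a single column. This is a small sketch using the values of column b; the variable names are illustrative.

```python
import numpy as np
import pandas as pd

s = pd.Series([2, np.nan, 0, 6])

# Comparing NaN with 0 gives False, so the NaN row is not selected.
positive = s > 0

# 0.01 where the value is positive, 0 everywhere else (including the NaN row).
step = np.where(positive, 0.01, 0)

# NaN + 0 is still NaN, so the missing value passes through unchanged.
result = s + step
print(result)
```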

Anton vBR's answer shows an elegant way to select the columns you need.

Changing specific values in a Pandas DataFrame (with mixed types)
