Question

I have this DataFrame:

import pandas as pd

columns = ['ID','Data']
data = [['26A20',123],
        ['12A20',123],
        ['23A20',123]]
df = pd.DataFrame.from_records(data=data, columns=columns)

>>df
      ID  Data
0  26A20   123
1  12A20   123
2  23A20   123

The task is simple: remove the A's from ID when the ID starts with 26 or 23.

df.loc[df['ID'].str.startswith(('23','26'))]['ID'] = df['ID'].str.replace('A','')

SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame. Try using .loc[row_indexer,col_indexer] = value instead

Nothing changes:

>>df
      ID  Data
0  26A20   123
1  12A20   123
2  23A20   123

I am using loc, so what am I doing wrong?

Answer 1

Remove the double ][ to avoid chained assignment. Pass the row mask and the column label to a single .loc call:

df.loc[df['ID'].str.startswith(('23','26')), 'ID'] = df['ID'].str.replace('A','')
print (df)
      ID  Data
0   2620   123
1  12A20   123
2   2320   123
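
To see why the original attempt had no effect, here is a minimal sketch that runs both forms on the question's data. The names mask, chained_result and single_result are only illustrative, and the warning is silenced purely to keep the demo quiet.

```python
import warnings
import pandas as pd

df = pd.DataFrame({'ID': ['26A20', '12A20', '23A20'], 'Data': [123, 123, 123]})
mask = df['ID'].str.startswith(('23', '26'))

# Chained indexing: df.loc[mask] builds a new DataFrame first, and the
# trailing ['ID'] = ... writes into that temporary object, not into df.
with warnings.catch_warnings():
    warnings.simplefilter('ignore')  # hide the chained-assignment warning for this demo
    df.loc[mask]['ID'] = df['ID'].str.replace('A', '')
chained_result = df['ID'].tolist()  # df is still unchanged here

# Single .loc call: rows and column are selected in one indexing operation,
# so the assignment targets df itself.
df.loc[mask, 'ID'] = df['ID'].str.replace('A', '')
single_result = df['ID'].tolist()

print(chained_result)
print(single_result)
```

Indexing with a boolean mask returns a copy, which is why the chained form silently modifies a throwaway object.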

You can also filter on both sides, so that replace runs only on the matching rows:

mask = df['ID'].str.startswith(('23','26'))
df.loc[mask, 'ID'] = df.loc[mask, 'ID'].str.replace('A','')
print (df)
      ID  Data
0   2620   123
1  12A20   123
2   2320   123
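
As a rough check that the two variants agree, the sketch below applies each to a fresh copy of the data and compares the results. make_df is a hypothetical helper introduced only for this comparison.

```python
import pandas as pd

def make_df():
    # Hypothetical helper: rebuilds the question's DataFrame from scratch.
    return pd.DataFrame({'ID': ['26A20', '12A20', '23A20'], 'Data': [123, 123, 123]})

full = make_df()
mask = full['ID'].str.startswith(('23', '26'))

# Variant 1: replace runs on every row; .loc aligns the right-hand side
# on the index and only writes the rows selected by mask.
full.loc[mask, 'ID'] = full['ID'].str.replace('A', '')

# Variant 2: replace runs only on the rows selected by mask.
filtered = make_df()
filtered.loc[mask, 'ID'] = filtered.loc[mask, 'ID'].str.replace('A', '')

same = full.equals(filtered)
print(same)
```

The second variant does less work on large frames where only a few rows match; the result is the same either way.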

Answer 2

Another option is np.where() (this needs import numpy as np):

df['ID'] = np.where(df['ID'].str.startswith(('23','26')), df['ID'].str.replace('A', ''), df['ID'])
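
For completeness, a self-contained sketch of the np.where() approach on the question's data. It also shows Series.mask as a pandas-native alternative, which is my own addition and not part of the answer above. The names starts, stripped, via_where and via_mask are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'ID': ['26A20', '12A20', '23A20'], 'Data': [123, 123, 123]})

starts = df['ID'].str.startswith(('23', '26'))  # boolean condition per row
stripped = df['ID'].str.replace('A', '')        # every ID with the A's removed

# np.where(condition, x, y): take x where condition is True, otherwise y.
via_where = np.where(starts, stripped, df['ID'])

# Series.mask(cond, other): replace values where cond is True with other.
via_mask = df['ID'].mask(starts, stripped)

df['ID'] = via_where
print(df)
```

Note that np.where returns a NumPy array rather than a Series, which is fine here because it is assigned straight back to the column.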

Replace letters in a string in pandas
