Pandas: change the previous cell value of a column based on a condition in another column

Asked: 2020-01-16 08:07:35

Tags: python-3.x pandas multiple-columns

I have a pandas dataset that looks like this: (image: dataset of words and their features)

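I want to replace the "x" in the "Gender" column under the following condition: if the "Words" column contains a word from a list of words such as "Mädchen", then "Neutral" should be written to the "Gender" column in the row of the preceding word (which is a number).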

For example, this:

Gender   Words

 x        10.
 x        Mädchen

should become:

Gender   Words

Neutral   10.
 x        Mädchen

I have tried np.where like this:

Food2_case["Gender"]= np.where(Food2_case.Words.isin(["Mädchen"]), (dropped_data.Words.str.contains('\d',regex= True) == 'A'), "x")

But I get this error:

ValueError: operands could not be broadcast together with shapes (8000,) (275988,) ()

1 Answer:

Answer 0 (score: 0)

import pandas as pd

# Create a small example dataset: after the transpose there are three rows and two columns
data = pd.DataFrame([[0, 0, 0], [10, "Madchen", 5]]).T
data.columns = ["Gender", "Words"]

# Shift the Words column up by one, so each row holds the word from the NEXT row
data.loc[:, "iswordin"] = data.Words.shift(-1)

# Do what you want to do
data.loc[data.iswordin.isin(["Madchen", "Girl", "boy", "..."]), "Gender"] = "Neutral"

# Now you can drop the "iswordin" column, which is no longer needed
data = data.drop(columns="iswordin")
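
The same idea can probably be written without the helper column, by building a boolean mask directly from the shifted "Words" column. The sketch below is a minimal illustration under some assumptions: the sample rows, the `neutral_words` list, and the regular expression for "looks like a number" are made up for the example and are not taken from the asker's real data. Because the mask and the assignment target come from the same DataFrame, their lengths always agree. The ValueError in the question most likely comes from mixing `Food2_case` (8000 rows) and `dropped_data` (275988 rows) inside a single `np.where` call.

```python
import pandas as pd

# Hypothetical sample data mirroring the example in the question
df = pd.DataFrame({
    "Gender": ["x", "x", "x", "x"],
    "Words": ["10.", "Mädchen", "der", "Mädchen"],
})

# Assumed word list; extend it with the real words of interest
neutral_words = ["Mädchen"]

# True where the NEXT row's word is in the list (the last row has no next row, so it is False)
next_is_neutral = df["Words"].shift(-1).isin(neutral_words)

# True where the current row looks like a number such as "10" or "10."
# (an assumption about how the numeric rows are formatted)
is_number = df["Words"].astype(str).str.match(r"\d+\.?$")

# Write "Neutral" only into numeric rows that directly precede a listed word
df.loc[next_is_neutral & is_number, "Gender"] = "Neutral"

print(df)
```

If the numeric check is not wanted, dropping `is_number` from the mask gives the same behaviour as the answer above: every row that precedes a listed word is marked "Neutral".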