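I have a Given_Col that contains 5 distinct values. I need to create new columns as follows:
Both A and B goes ---> col1
Neither A nor B goes ---> col1
A goes B doesn't go ---> col2
A doesn't go B goes ---> col2
No idea ---> both columns, if the value is NaN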
Given_Col             Expexted_Col1         Expexted_Col2
Both A and B goes     Both A and B goes     No idea
Neither A nor B goes  Neither A nor B goes  No idea
A goes B doesn't go   No idea               A goes B doesn't go
A doesn't go B goes   No idea               A doesn't go B goes
A goes B doesn't go   No idea               A goes B doesn't go
Neither A nor B goes  Neither A nor B goes  No idea
No idea               No idea               No idea
Both A and B goes     Both A and B goes     No idea
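I can't come up with a solution. What would be a practical way to do this?
Note: I have considered duplicating the existing column and mapping the values; would that work?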
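Answer 0: (score: 1)
I think this calls for two conditional column assignments.
Each assignment selects the valid entries for its column according to the selection criteria and falls back to 'No idea' otherwise. This could get unwieldy with more than five possible values, but it should work well for this case.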
df['Expexted_Col1'] = df.apply(lambda x: x['Given_Col'] if (x['Given_Col'] == 'Both A and B goes' or x['Given_Col'] == 'Neither A nor B goes') else 'No idea', axis = 1)
df['Expexted_Col2'] = df.apply(lambda x: x['Given_Col'] if (x['Given_Col'] == "A goes B doesn't go" or x['Given_Col'] == "A doesn't go B goes") else 'No idea', axis = 1)
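The same conditional assignment can probably be written without a row-wise apply. The following is a minimal sketch, not the answerer's code: it assumes a small sample frame shaped like the one in the question, and the list names col1_values and col2_values are made up for illustration. It relies on Series.isin to build the condition and Series.where to keep matching entries.

```python
import pandas as pd

# Sample data mirroring the question's Given_Col (assumed layout).
df = pd.DataFrame({"Given_Col": [
    "Both A and B goes", "Neither A nor B goes", "A goes B doesn't go",
    "A doesn't go B goes", "No idea",
]})

# Hypothetical names for the two groups of values.
col1_values = ["Both A and B goes", "Neither A nor B goes"]
col2_values = ["A goes B doesn't go", "A doesn't go B goes"]

# Series.where keeps entries where the condition is True and
# substitutes the fallback value everywhere else.
given = df["Given_Col"]
df["Expexted_Col1"] = given.where(given.isin(col1_values), "No idea")
df["Expexted_Col2"] = given.where(given.isin(col2_values), "No idea")
```

Because the condition is computed for the whole column at once, this tends to scale better than apply with axis=1 on large frames, and adding a sixth value only means extending one of the lists.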
Answer 1: (score: 1)
One way, using pandas.DataFrame.assign combined with fillna:
mapper = {'col1': ['Both A and B goes', 'Neither A nor B goes'],
'col2': ["A goes B doesn't go", "A doesn't go B goes"]}
s = df["Given_Col"]
new_df = df.assign(**{k: s[s.isin(v)] for k, v in mapper.items()}).fillna("No idea")
print(new_df)
Output:
              Given_Col                  col1                  col2
0     Both A and B goes     Both A and B goes               No idea
1  Neither A nor B goes  Neither A nor B goes               No idea
2   A goes B doesn't go               No idea   A goes B doesn't go
3   A doesn't go B goes               No idea   A doesn't go B goes
4   A goes B doesn't go               No idea   A goes B doesn't go
5  Neither A nor B goes  Neither A nor B goes               No idea
6               No idea               No idea               No idea
7     Both A and B goes     Both A and B goes               No idea
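The assign and fillna combination works because pandas aligns the filtered Series back onto the frame by index label. The sketch below breaks the one-liner into steps on a three-row sample frame (an assumption made for brevity); the intermediate names subset, step and result are illustrative only.

```python
import pandas as pd

# Small sample frame (assumed, shorter than the question's data).
df = pd.DataFrame({"Given_Col": [
    "Both A and B goes", "A goes B doesn't go", "No idea",
]})
s = df["Given_Col"]

# Boolean indexing keeps only the matching rows (index label 0 here).
subset = s[s.isin(["Both A and B goes", "Neither A nor B goes"])]

# assign aligns the subset on the index, so rows 1 and 2 become NaN.
step = df.assign(col1=subset)

# fillna then replaces those gaps with the default label.
result = step.fillna("No idea")
```

Note that fillna is applied to the whole frame, so any NaN already present in Given_Col would also be replaced by "No idea".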
Answer 2: (score: 0)
You can do this with a few np.where calls:
import numpy as np
# Create both columns with the default first, so the np.where fallbacks below refer to existing columns.
df['col1'] = df['col2'] = 'No idea'
df['col1'] = np.where(df['Given_Col'] == 'Both A and B goes', 'Both A and B goes', df['col1'])
df['col2'] = np.where(df['Given_Col'] == 'Both A and B goes', 'No idea', df['col2'])
df['col1'] = np.where(df['Given_Col'] == 'Neither A nor B goes', 'Neither A nor B goes', df['col1'])
df['col2'] = np.where(df['Given_Col'] == 'Neither A nor B goes', 'No idea', df['col2'])
You can continue from here for the remaining values.
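As a sketch of how the np.where approach might be carried through for both columns, the per-value comparisons can be collapsed into one isin test per column. This is not the answerer's own continuation: the sample frame and the names col1_values and col2_values are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Sample data mirroring the question (assumed layout).
df = pd.DataFrame({"Given_Col": [
    "Both A and B goes", "Neither A nor B goes", "A goes B doesn't go",
    "A doesn't go B goes", "No idea",
]})

# Hypothetical helper names, not from the original answer.
col1_values = ["Both A and B goes", "Neither A nor B goes"]
col2_values = ["A goes B doesn't go", "A doesn't go B goes"]

# np.where(condition, value_if_true, value_if_false), applied elementwise.
df["col1"] = np.where(df["Given_Col"].isin(col1_values), df["Given_Col"], "No idea")
df["col2"] = np.where(df["Given_Col"].isin(col2_values), df["Given_Col"], "No idea")
```

With this form each output column needs a single np.where call, and no column has to exist beforehand because the fallback is a constant.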