I have the following DataFrame:
import pandas as pd
import numpy as np
df = pd.DataFrame(data=[['yes',8],['yes',7],['no',np.nan],['yes',7],['no',np.nan]],columns=['passed','score'])
Out[8]:
  passed  score
0    yes    8.0
1    yes    7.0
2     no    NaN
3    yes    7.0
4     no    NaN
I want to merge the score into the passed column so that only one column remains, like this:
Out[10]:
  passed
0  yes_8
1  yes_7
2     no
3  yes_7
4     no
My attempt was df["passed"].map(str) + '_' + df["score"].map(str), but the result is not as clean as I would like.
Can you help me?
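For reference, here is a minimal sketch of what that attempt produces on the sample data. The variable name `attempt` is introduced only for illustration. Because the NaN entries force the score column to float64, the integers are rendered with a trailing ".0" and the missing scores become the literal text "nan".

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    data=[['yes', 8], ['yes', 7], ['no', np.nan], ['yes', 7], ['no', np.nan]],
    columns=['passed', 'score'],
)

# NaN makes the column float64, so 8 is stored (and printed) as 8.0,
# and str(nan) is the text 'nan'.
attempt = df["passed"].map(str) + '_' + df["score"].map(str)
print(attempt.tolist())
# ['yes_8.0', 'yes_7.0', 'no_nan', 'yes_7.0', 'no_nan']
```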
Answer 0 (score: 0)
Use df.apply with axis=1.
Demo:
import pandas as pd
import numpy as np
df = pd.DataFrame(data=[['yes',8],['yes',7],['no',np.nan],['yes',7],['no',np.nan]],columns=['passed','score'])
df["New"] = df.apply(lambda x: "yes_{}".format(int(x["score"])) if x["passed"] == "yes" else "no", axis=1)
print(df)
Output:
  passed  score    New
0    yes    8.0  yes_8
1    yes    7.0  yes_7
2     no    NaN     no
3    yes    7.0  yes_7
4     no    NaN     no
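The lambda above hard-codes the "yes" and "no" labels. A possible variant, sketched here under the assumption that a score should be appended whenever one is present, keys the decision on the missing value instead. The helper name `merge_row` is hypothetical and not part of the original answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    data=[['yes', 8], ['yes', 7], ['no', np.nan], ['yes', 7], ['no', np.nan]],
    columns=['passed', 'score'],
)

def merge_row(row):
    # Hypothetical helper: keep the label alone when the score is missing,
    # otherwise append the score as an integer.
    if pd.isna(row["score"]):
        return row["passed"]
    return "{}_{}".format(row["passed"], int(row["score"]))

# axis=1 passes each row to merge_row as a Series.
df["New"] = df.apply(merge_row, axis=1)
print(df["New"].tolist())
# ['yes_8', 'yes_7', 'no', 'yes_7', 'no']
```

This also handles a row such as ['no', 3], which the hard-coded version would reduce to "no".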
Answer 1 (score: 0)
Use + together with pandas.Series.apply:
df['merged']=df['passed'].astype(str)+'_'+df['score'].fillna(' ').astype(str)
print(df['merged'].apply(lambda x: x.replace('_ ','').split('.')[0]))
Output:
0    yes_8
1    yes_7
2       no
3    yes_7
4       no
Name: merged, dtype: object
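Note that the split('.') step assumes the label itself never contains a period. As a hedged alternative that avoids string post-processing, the sketch below uses numpy.where to pick between the bare label and the label plus score. It assumes every present score is a whole number; the names `has_score` and `score_text` are illustrative only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    data=[['yes', 8], ['yes', 7], ['no', np.nan], ['yes', 7], ['no', np.nan]],
    columns=['passed', 'score'],
)

has_score = df['score'].notna()

# Fill the gaps with a placeholder 0 so the column can be cast to int;
# the placeholder rows are discarded by np.where below.
score_text = df['score'].fillna(0).astype(int).astype(str)

df['merged'] = np.where(has_score, df['passed'] + '_' + score_text, df['passed'])
print(df['merged'].tolist())
# ['yes_8', 'yes_7', 'no', 'yes_7', 'no']
```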
Answer 2 (score: 0)
Use dropna to remove the NaN values, convert to int and then to string, and finally use add to append the result to the column:
a = '_' + df['score'].dropna().astype(int).astype(str)
df['passed'] = df['passed'].add(a, fill_value='')
print(df)
  passed  score
0  yes_8    8.0
1  yes_7    7.0
2     no    NaN
3  yes_7    7.0
4     no    NaN
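The approach works because of index alignment: after dropna, the suffix Series only has entries for the rows that have a score, and fill_value='' supplies an empty string for the rest. The sketch below spells out that alignment step with an explicit reindex, which should give the same result on this data. The names `suffix` and `merged` are introduced for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    data=[['yes', 8], ['yes', 7], ['no', np.nan], ['yes', 7], ['no', np.nan]],
    columns=['passed', 'score'],
)

# Only rows 0, 1 and 3 survive dropna, so `a` has a shorter index than df.
a = '_' + df['score'].dropna().astype(int).astype(str)
print(a.index.tolist())  # [0, 1, 3]

# Put `a` back on the full index, using '' where no score exists.
suffix = a.reindex(df.index, fill_value='')

merged = df['passed'] + suffix
print(merged.tolist())
# ['yes_8', 'yes_7', 'no', 'yes_7', 'no']
```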
Answer 3 (score: 0)
You can do it like this (note that the scores keep their ".0" suffix, because the column is float):
df['passed'] = (df['passed'] + '_' + df['score'].fillna('').astype(str)).str.rstrip('_')
Output:
    passed  score
0  yes_8.0    8.0
1  yes_7.0    7.0
2       no    NaN
3  yes_7.0    7.0
4       no    NaN
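To get from this output to the format requested in the question, one option is to strip the trailing ".0" afterwards. The sketch below does that with Series.str.replace and an anchored regular expression. It is an addition to the answer, not part of it, and it assumes all scores are whole numbers, so a value such as 7.5 would be left unchanged.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    data=[['yes', 8], ['yes', 7], ['no', np.nan], ['yes', 7], ['no', np.nan]],
    columns=['passed', 'score'],
)

# Same expression as in the answer: yields 'yes_8.0' and 'no'.
merged = (df['passed'] + '_' + df['score'].fillna('').astype(str)).str.rstrip('_')

# Remove a '.0' only when it sits at the very end of the string.
cleaned = merged.str.replace(r'\.0$', '', regex=True)
print(cleaned.tolist())
# ['yes_8', 'yes_7', 'no', 'yes_7', 'no']
```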