输入数据框如下
data = {
's_id' :[5,7,26,70.0,55,71.0,8.0,'nan','nan',4],
'r_id' : [[34, 44, 23, 11, 71], [53, 33, 73, 41], [17], [10, 31], [17], [75, 8],[7],[68],[50],[]]
}
df = pd.DataFrame.from_dict(data)
df
Out[240]:
s_id r_id
0 5 [34, 44, 23, 11, 71]
1 7 [53, 33, 73, 41]
2 26 [17]
3 70 [10, 31]
4 55 [17]
5 71 [75, 8]
6 8 [7]
7 nan [68]
8 nan [50]
9 4 []
期望的数据帧
data = {
's_id' :[5,7,26,70.0,55,71.0,8.0,'nan','nan',4],
'r_id' : [[5,34, 44, 23, 11, 71], [7,53, 33, 73, 41], [26,17], [70,10, 31], [55,17], [71,75, 8],[8,7],[68],[50],[4]]
}
df = pd.DataFrame.from_dict(data)
df
Out[241]:
s_id r_id
0 5 [5, 34, 44, 23, 11, 71]
1 7 [7, 53, 33, 73, 41]
2 26 [26, 17]
3 70 [70, 10, 31]
4 55 [55, 17]
5 71 [71, 75, 8]
6 8 [8, 7]
7 nan [68]
8 nan [50]
9 4 [4]
需要用S_id中的元素作为r_id列表列中的第一个元素填充列表列,我也有nan值,其中有些显示为浮点列,谢谢。
我尝试了以下方法,
df['r_id'] = df["s_id"].apply(lambda x : x.append(df['r_id']) )
df['r_id'] = df["s_id"].apply(lambda x : [x].append(df['r_id'].values.tolist()))
答案 0 :(得分:1)
如果nan
缺少值,请使用apply
,将data = {
's_id' :[5,7,26,70.0,55,71.0,8.0,np.nan,np.nan,4],
'r_id' : [[34, 44, 23, 11, 71], [53, 33, 73, 41],
[17], [10, 31], [17], [75, 8],[7],[68],[50],[]]
}
df = pd.DataFrame.from_dict(data)
print (df)
f = lambda x : [int(x["s_id"])] + x['r_id'] if pd.notna(x["s_id"]) else x['r_id']
df['r_id'] = df.apply(f, axis=1)
print (df)
s_id r_id
0 5.0 [5, 34, 44, 23, 11, 71]
1 7.0 [7, 53, 33, 73, 41]
2 26.0 [26, 17]
3 70.0 [70, 10, 31]
4 55.0 [55, 17]
5 71.0 [71, 75, 8]
6 8.0 [8, 7]
7 NaN [68]
8 NaN [50]
9 4.0 [4]
的值转换为一个元素列表,然后转换为整数,并过滤掉误码值:
NaN
另一个想法是过滤器列并将功能应用于非m = df["s_id"].notna()
f = lambda x : [int(x["s_id"])] + x['r_id']
df.loc[m, 'r_id'] = df[m].apply(f, axis=1)
print (df)
s_id r_id
0 5.0 [5, 34, 44, 23, 11, 71]
1 7.0 [7, 53, 33, 73, 41]
2 26.0 [26, 17]
3 70.0 [70, 10, 31]
4 55.0 [55, 17]
5 71.0 [71, 75, 8]
6 8.0 [8, 7]
7 NaN [68]
8 NaN [50]
9 4.0 [4]
的行:
Crashes.SendingErrorReport += async (sender, e) =>
{
// Your code, e.g. to present a custom UI.
};