I have the following pandas DataFrame:
df =
1.0          2.0          3.0          4.0          5.0
(1083, 596)                            (1050, 164)  (1050, 164)
(1081, 595)                            (1050, 164)  (1080, 162)
(1081, 594)                            (1049, 163)  (1070, 164)
(1082, 593)
             (1050, 164)
             (1050, 164)
             (1049, 163)
             (1049, 163)
                          (1052, 463)
                          (1051, 468)
                          (1054, 465)
                          (1057, 463)
I need a brand-new DataFrame, df2, that contains 3 columns: 1.0, 2.0 (combining 2.0 and 4.0), and 3.0 (combining 3.0 and 5.0).
The result would be:
df2 =
1.0          2.0          3.0
(1083, 596)  (1050, 164)  (1050, 164)
(1081, 595)  (1050, 164)  (1080, 162)
(1081, 594)  (1049, 163)  (1070, 164)
(1082, 593)
             (1050, 164)
             (1050, 164)
             (1049, 163)
             (1049, 163)
                          (1052, 463)
                          (1051, 468)
                          (1054, 465)
                          (1057, 463)
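You can expect that the merged columns have no overlapping values: if one column has a valid value in a given row, the other column is NaN in that row.
I tried: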
df.fillna(0)
df2['2.0']=df['2.0']+df['4.0']
but it does not work properly. Is there a simple and efficient way to do this?
Answer 0: (score: 1)
Basically it is just copy and paste. I think this works.
# copy values over to your other columns
# note: [0:3,'2.0'] gets the first 4 rows (index 0 to 3) of column '2.0'
# then you set it equal to the first 4 rows of column '4.0'
df.loc[0:3,'2.0'] = df.loc[0:3,'4.0']
df.loc[0:3,'3.0'] = df.loc[0:3,'5.0']
# just get the three columns you need
df2 = df[['1.0','2.0','3.0']]
            1.0          2.0          3.0
0   (1083, 596)  (1050, 164)  (1050, 164)
1   (1081, 595)  (1050, 164)  (1080, 162)
2   (1081, 594)  (1049, 163)  (1070, 164)
3   (1082, 593)          NaN          NaN
4           NaN  (1050, 164)          NaN
5           NaN  (1050, 164)          NaN
6           NaN  (1049, 163)          NaN
7           NaN  (1049, 163)          NaN
8           NaN          NaN          NaN
9           NaN          NaN  (1052, 463)
10          NaN          NaN  (1051, 468)
11          NaN          NaN  (1054, 465)
12          NaN          NaN  (1057, 463)
If your column names are actually floats rather than strings, remove the quotes around the column label, so that df.loc[0:3,'2.0'] becomes df.loc[0:3,2.0], like so:
df.loc[0:3,2.0] = df.loc[0:3,4.0]
df.loc[0:3,3.0] = df.loc[0:3,5.0]
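To try the float-label variant end to end, here is a minimal sketch. The sample frame below is hypothetical and much smaller than the one in the question, so the row range is 0:1 instead of 0:3; adjust it to wherever columns 4.0 and 5.0 actually hold values in your data.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with float column labels (1.0 ... 5.0).
df = pd.DataFrame({
    1.0: [(1083, 596), (1081, 595), np.nan, np.nan],
    2.0: [np.nan, np.nan, (1050, 164), np.nan],
    3.0: [np.nan, np.nan, np.nan, (1052, 463)],
    4.0: [(1050, 164), (1050, 164), np.nan, np.nan],
    5.0: [(1050, 164), (1080, 162), np.nan, np.nan],
})

# .loc slices are label-based and inclusive, so 0:1 covers rows 0 and 1.
df.loc[0:1, 2.0] = df.loc[0:1, 4.0]
df.loc[0:1, 3.0] = df.loc[0:1, 5.0]

# Keep only the three target columns; .copy() makes df2 independent of df.
df2 = df[[1.0, 2.0, 3.0]].copy()
print(df2)
```

Note that this modifies df in place before df2 is taken from it, so work on a copy of df if the original needs to stay intact.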
Answer 1: (score: 1)
You can use DataFrame.where() and DataFrame.isnull() to combine the values you were trying to add together:
# start df2 from column '1.0' only
df2 = df[["1.0"]].copy()
# keep df['2.0'] where it is not null, otherwise take the value from df['4.0']
df2["2.0"] = df["2.0"].where(~df["2.0"].isnull(), df["4.0"])
df2["3.0"] = df["3.0"].where(~df["3.0"].isnull(), df["5.0"])
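As a rough, self-contained sketch of the where() idea, the example below first checks the no-overlap assumption from the question and then merges each pair of columns. The sample frame is hypothetical, and the column labels are assumed to be strings.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; column labels are strings here.
df = pd.DataFrame({
    "1.0": [(1083, 596), np.nan, np.nan],
    "2.0": [np.nan, (1050, 164), np.nan],
    "3.0": [np.nan, np.nan, (1052, 463)],
    "4.0": [(1050, 164), np.nan, np.nan],
    "5.0": [(1080, 162), np.nan, np.nan],
})

# Check the no-overlap assumption: no row has a value in both columns of a pair.
for left, right in [("2.0", "4.0"), ("3.0", "5.0")]:
    assert not (df[left].notna() & df[right].notna()).any()

df2 = df[["1.0"]].copy()
# Keep the left column where it is not null, otherwise fall back to the right one.
df2["2.0"] = df["2.0"].where(df["2.0"].notna(), df["4.0"])
df2["3.0"] = df["3.0"].where(df["3.0"].notna(), df["5.0"])
print(df2)
```

If the overlap check fails on real data, where() silently keeps the left column's value, so it is worth running the check before relying on the result.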
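Answer 2: (score: 1)
Assuming the blanks in df are NaN, you just need to drop column '1.0', shift the remaining columns '2.0, 3.0, 4.0, 5.0' left by 2 positions, and call combine_first on df with the result. Finally, use iloc to keep the first three columns: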
df2 = df.combine_first(df.drop(columns='1.0').shift(-2, axis=1)).iloc[:,:3]
Out[297]:
            1.0          2.0          3.0
0   (1083, 596)  (1050, 164)  (1050, 164)
1   (1081, 595)  (1050, 164)  (1080, 162)
2   (1081, 594)  (1049, 163)  (1070, 164)
3   (1082, 593)          NaN          NaN
4           NaN  (1050, 164)          NaN
5           NaN  (1050, 164)          NaN
6           NaN  (1049, 163)          NaN
7           NaN  (1049, 163)          NaN
8           NaN          NaN  (1052, 463)
9           NaN          NaN  (1051, 468)
10          NaN          NaN  (1054, 465)
11          NaN          NaN  (1057, 463)
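If the column shift feels too implicit, a possible alternative (not the approach of this answer) is to call Series.combine_first on each pair of columns by name. This is only a sketch on a hypothetical sample frame with string column labels:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; blanks are represented as NaN.
df = pd.DataFrame({
    "1.0": [(1083, 596), (1082, 593), np.nan, np.nan],
    "2.0": [np.nan, np.nan, (1049, 163), np.nan],
    "3.0": [np.nan, np.nan, np.nan, (1057, 463)],
    "4.0": [(1050, 164), np.nan, np.nan, np.nan],
    "5.0": [(1070, 164), np.nan, np.nan, np.nan],
})

# combine_first keeps the caller's values and fills its NaN slots
# with the values from the other Series, aligned on the index.
df2 = pd.DataFrame({
    "1.0": df["1.0"],
    "2.0": df["2.0"].combine_first(df["4.0"]),
    "3.0": df["3.0"].combine_first(df["5.0"]),
})
print(df2)
```

Pairing the columns by name avoids relying on column order, at the cost of listing each pair explicitly.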