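I have a DataFrame whose values end up split across several columns when I create it. I want to merge columns 2-4 (labelled 1, 2 and 3) into a single column.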
fish_frame: 0 1 2 3 \
0 735-8 NaN NaN NaN
1 NaN NaN NaN LIVE WGT
2 GBE COD NaN NaN 600
3 GBW COD NaN 11,189 NaN
4 GOM COD NaN 0 NaN
5 POLLOCK NaN NaN 1,103
6 WHAKE NaN NaN 12
7 GBE HADDOCK NaN 10,730 NaN
8 GBW HADDOCK NaN 64,147 NaN
9 GOM HADDOCK NaN 0 NaN
10 REDFISH NaN NaN 0
11 WITCH FLOUNDER NaN 370 NaN
12 PLAICE NaN NaN 622
13 GB WINTER FLOUNDER 54,315 NaN NaN
14 GOM WINTER FLOUNDER 653 NaN NaN
15 SNEMA WINTER FLOUNDER 14,601 NaN NaN
16 GB YELLOWTAIL NaN 1,663 NaN
17 SNEMA YELLOWTAIL NaN 1,370 NaN
18 CCGOM YELLOWTAIL 1,812 NaN NaN
4 6 package_deal_column Package_Price
0 NaN NaN Package Deal - $40,753.69 nan
1 NaN TOTAL Package Deal - $40,753.69 nan
2 NaN NaN Package Deal - $40,753.69 None
3 NaN NaN Package Deal - $40,753.69 None
4 Package Deal - $40,753.69 None Package Deal - $40,753.69 None
5 NaN NaN Package Deal - $40,753.69 None
6 NaN NaN Package Deal - $40,753.69 None
7 NaN NaN Package Deal - $40,753.69 None
8 NaN NaN Package Deal - $40,753.69 None
9 NaN NaN Package Deal - $40,753.69 None
10 NaN NaN Package Deal - $40,753.69 None
11 NaN NaN Package Deal - $40,753.69 None
12 NaN NaN Package Deal - $40,753.69 None
13 NaN None Package Deal - $40,753.69 None
14 NaN None Package Deal - $40,753.69 None
15 NaN None Package Deal - $40,753.69 None
16 NaN NaN Package Deal - $40,753.69 None
17 NaN NaN Package Deal - $40,753.69 None
18 NaN None Package Deal - $40,753.69 None
As you can see, the weight values are split across several columns.
Following the approach in this answer (pandas merge columns in same dataframe), I tried:
pd.DataFrame({'Column 2': pd.concat([fish_frame[1], fish_frame[2], fish_frame[3]])}).sort_index()
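But this did not merge them. I'm not sure whether that is because I used the command incorrectly or because my problem is more specific.
Also, although I did not expect it to solve the problem, I did try: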
fish_frame = fish_frame.dropna(axis=1, how='all')
But it did not change the df.
Any help with this would be greatly appreciated.
fish_frame = fish_frame.set_index(0).stack()
produced the following:
fish_frame: 735-8 package_deal_column Package Deal - $40,753.69
Package_Price nan
NaN 3 LIVE WGT
6 TOTAL
package_deal_column Package Deal - $40,753.69
Package_Price nan
GBE COD 3 600
package_deal_column Package Deal - $40,753.69
GBW COD 2 11,189
package_deal_column Package Deal - $40,753.69
GOM COD 2 0
4 Package Deal - $40,753.69
package_deal_column Package Deal - $40,753.69
POLLOCK 3 1,103
package_deal_column Package Deal - $40,753.69
WHAKE 3 12
package_deal_column Package Deal - $40,753.69
GBE HADDOCK 2 10,730
package_deal_column Package Deal - $40,753.69
GBW HADDOCK 2 64,147
package_deal_column Package Deal - $40,753.69
GOM HADDOCK 2 0
package_deal_column Package Deal - $40,753.69
REDFISH 3 0
package_deal_column Package Deal - $40,753.69
WITCH FLOUNDER 2 370
package_deal_column Package Deal - $40,753.69
PLAICE 3 622
package_deal_column Package Deal - $40,753.69
GB WINTER FLOUNDER 1 54,315
package_deal_column Package Deal - $40,753.69
GOM WINTER FLOUNDER 1 653
package_deal_column Package Deal - $40,753.69
SNEMA WINTER FLOUNDER 1 14,601
package_deal_column Package Deal - $40,753.69
GB YELLOWTAIL 2 1,663
package_deal_column Package Deal - $40,753.69
SNEMA YELLOWTAIL 2 1,370
package_deal_column Package Deal - $40,753.69
CCGOM YELLOWTAIL 1 1,812
package_deal_column Package Deal - $40,753.69
Answer 0 (score: 0)
Use set_index and stack:
Which call you need depends on whether your column labels are integers or strings:
df.set_index('0').stack()
OR
df.set_index(0).stack()
Output:
0
NaN 3 LIVE WGT
GBE COD 3 600
GBW COD 2 11,189
GOM COD 2 0
POLLOCK 3 1,103
WHAKE 3 12
GBE HADDOCK 2 10,730
GBW HADDOCK 2 64,147
GOM HADDOCK 2 0
REDFISH 3 0
WITCH FLOUNDER 2 370
PLAICE 3 622
GB WINTER FLOUNDER 1 54,315
GOM WINTER FLOUNDER 1 653
SNEMA WINTER FLOUNDER 1 14,601
GB YELLOWTAIL 2 1,663
SNEMA YELLOWTAIL 2 1,370
CCGOM YELLOWTAIL 1 1,812
dtype: object
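To see why this collapses the columns, here is a minimal sketch on a small hypothetical frame (three rows I made up to mirror the question's layout, not the real data). stack() moves the column labels into an inner index level, so each species is left with the one cell that actually holds a weight. The explicit dropna() is an addition of mine: depending on the pandas version, stack() may keep the missing cells instead of dropping them.

```python
import numpy as np
import pandas as pd

# Hypothetical sample: column 0 holds the species,
# and exactly one of columns 1-3 holds its weight.
df = pd.DataFrame({
    0: ["GBE COD", "GBW COD", "GB WINTER FLOUNDER"],
    1: [np.nan, np.nan, "54,315"],
    2: [np.nan, "11,189", np.nan],
    3: ["600", np.nan, np.nan],
})

# stack() pivots the remaining columns into an inner index level;
# dropna() removes empty cells in case this pandas version keeps them.
stacked = df.set_index(0).stack().dropna()
print(stacked)
```

The result is a Series indexed by (species, original column label), which is why the output above shows the column number next to each weight.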
Then rename, drop the extra level, and so on:
df.set_index('0').stack().reset_index(name='weights').drop('level_1',axis=1)
Output:
0 weights
0 NaN LIVE WGT
1 GBE COD 600
2 GBW COD 11,189
3 GOM COD 0
4 POLLOCK 1,103
5 WHAKE 12
6 GBE HADDOCK 10,730
7 GBW HADDOCK 64,147
8 GOM HADDOCK 0
9 REDFISH 0
10 WITCH FLOUNDER 370
11 PLAICE 622
12 GB WINTER FLOUNDER 54,315
13 GOM WINTER FLOUNDER 653
14 SNEMA WINTER FLOUNDER 14,601
15 GB YELLOWTAIL 1,663
16 SNEMA YELLOWTAIL 1,370
17 CCGOM YELLOWTAIL 1,812
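The stacked output in the question also picked up package_deal_column and the other non-weight columns, so it may help to select only the weight columns before stacking. The following is a sketch under that assumption, again on a hypothetical three-row sample. The names species and weights are my own choices, and the final conversion to integers is optional.

```python
import numpy as np
import pandas as pd

# Hypothetical sample: like the question's frame, it carries an extra non-weight column.
fish_frame = pd.DataFrame({
    0: ["GBE COD", "GBW COD", "GB WINTER FLOUNDER"],
    1: [np.nan, np.nan, "54,315"],
    2: [np.nan, "11,189", np.nan],
    3: ["600", np.nan, np.nan],
    "package_deal_column": ["Package Deal - $40,753.69"] * 3,
})

weights = (
    fish_frame.set_index(0)[[1, 2, 3]]  # keep only the columns that hold weights
    .stack()
    .dropna()                           # explicit, in case stack() keeps missing cells
    .droplevel(1)                       # discard the original column label
    .rename_axis("species")
    .reset_index(name="weights")
)

# Optional: strip the thousands separators and convert to integers.
weights["weights"] = weights["weights"].str.replace(",", "", regex=False).astype(int)
print(weights)
```

Using droplevel(1) instead of drop('level_1', axis=1) avoids depending on the automatically generated column name, but either should give the same two-column result.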