I have a dataframe like this:
  High income Middle income  Low income
1      Norway   Switzerland       Qatar
2       Macao         India  Luxembourg
I need something like this:
  High income Middle income Low income
1      Norway          none       none
2        none   Switzerland       none
...
I used pivot:
df = df.pivot(values='Country Name', index=None, columns='Income Group')
and got something similar.
Can someone suggest a better solution than pivoting here, so that I don't have to deal with the none values?
Answer 0 (score: 1)
The trick is to introduce a new column, index, whose values come from groupby/cumcount. cumcount returns a cumulative count, which numbers the items within each group:
df['index'] = df.groupby('Income Group').cumcount()
#   Country Name   Income Group index
# 1       Norway    High income     0
# 2  Switzerland  Middle income     0
# 3        Qatar     Low income     0
# 4   Luxembourg     Low income     1
# 5        Macao    High income     1
# 6        India  Middle income     1
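As a small standalone illustration of that numbering, the sketch below runs cumcount on a toy Series. The names groups and position and the labels 'a' and 'b' are made up for this example and are not part of the question's data.

```python
import pandas as pd

# Toy group labels (hypothetical, only for illustration)
groups = pd.Series(['a', 'b', 'a', 'a', 'b'])

# cumcount numbers the rows of each group 0, 1, 2, ... in order of appearance
position = groups.groupby(groups).cumcount()

print(position.tolist())  # [0, 0, 1, 2, 1]
```

Each 'a' gets 0, 1, 2 and each 'b' gets 0, 1, so rows from different groups end up sharing the same counter values.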
Once you have the index column, the desired result can be obtained by pivoting:
import pandas as pd
df = pd.DataFrame({
    'Country Name': ['Norway', 'Switzerland', 'Qatar', 'Luxembourg', 'Macao', 'India'],
    'Income Group': ['High income', 'Middle income', 'Low income',
                     'Low income', 'High income', 'Middle income']})
df['index'] = df.groupby('Income Group').cumcount() + 1
result = df.pivot(index='index', columns='Income Group', values='Country Name')
result.index.name = result.columns.name = None
print(result)
yields
  High income  Low income Middle income
1      Norway       Qatar   Switzerland
2       Macao  Luxembourg         India
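A minimal sketch, using the same sample data, of why the helper column removes the gaps and how the columns could be put back in the order the question uses. The names sparse, dense and order are illustrative, and the reindex step is an addition that is not part of the answer above.

```python
import pandas as pd

df = pd.DataFrame({
    'Country Name': ['Norway', 'Switzerland', 'Qatar', 'Luxembourg', 'Macao', 'India'],
    'Income Group': ['High income', 'Middle income', 'Low income',
                     'Low income', 'High income', 'Middle income'],
})

# Without a helper column every row keeps its own label, so each
# country sits alone in its row and the other cells are missing.
sparse = df.pivot(columns='Income Group', values='Country Name')

# With the per-group counter as the row label, countries from
# different groups share rows and no cell is left empty here.
df['index'] = df.groupby('Income Group').cumcount() + 1
dense = df.pivot(index='index', columns='Income Group', values='Country Name')

# pivot sorts the column labels; reindex restores the order from the question.
order = ['High income', 'Middle income', 'Low income']
dense = dense.reindex(columns=order)
dense.index.name = dense.columns.name = None

print(sparse.isna().sum().sum())  # number of missing cells in the plain pivot
print(dense)
```

If the groups had different sizes, the shorter columns would still end with missing values; something like dense.fillna('') could blank those cells for display.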