I can't sort the names in the first and second columns of my data frame alphabetically.
The data frame looks like this:
      Boys         Females
Rank
1     Michael      Jennifer
2     Christopher  Jessica
3     Matthew      Amanda
4     Jason        Sarah
5     David        Melissa
6     Joshua       Amy
7     James        Nicole
8     John         Stephanie
9     Robert       Elizabeth
10    Daniel       Heather
11    Joseph       Michelle
12    Justin       Rebecca
13    Ryan         Kimberly
14    Brian        Tiffany
I would like it to look like this (with the boys' and females' names each sorted alphabetically):
Rank  Boys         Rank  Females
14    Brian        3     Amanda
2     Christopher  6     Amy
10    Daniel       9     Elizabeth
5     David        10    Heather
7     James        1     Jennifer
I have played with sort and sort_values, but the columns did not change. This is my original code:
import pandas as pd
df = pd.read_html("file:///C:/Python27/babyname999.html")  # returns a list of DataFrames
df2 = df[0]  # take the first DataFrame from that list
# rename the columns of the DataFrame
df2.rename(columns={'0': 'Rank', '1': 'Boys', '2': 'Females'}, inplace=True)
del df2['Unnamed: 0']
df2.set_index('Rank', inplace=True)  # set the index of the DataFrame to 'Rank'
I'm getting nowhere. Any suggestions?
Thanks!
Answer 0 (score: 3)
Here is a working example of the sorting.
import pandas as pd
from io import StringIO
data_file = StringIO(u"""Rank Boys Females
1 Michael Jennifer
2 Christopher Jessica
3 Matthew Amanda
4 Jason Sarah
5 David Melissa
6 Joshua Amy
7 James Nicole
8 John Stephanie
9 Robert Elizabeth
10 Daniel Heather
11 Joseph Michelle
12 Justin Rebecca
13 Ryan Kimberly
14 Brian Tiffany""")
df = pd.read_table(data_file, sep=r'\s+')  # whitespace-delimited; delim_whitespace is deprecated in recent pandas
boys = df[['Rank','Boys']].sort_values(['Boys']).rename(columns={'Rank': 'Rank_boys'})
females = df[['Rank','Females']].sort_values(['Females']).rename(columns={'Rank': 'Rank_females'})
result = pd.concat([boys.reset_index(drop=True), females.reset_index(drop=True)], axis=1)
The result will be:
    Rank_boys  Boys         Rank_females  Females
0   14         Brian        3             Amanda
1   2          Christopher  6             Amy
2   10         Daniel       9             Elizabeth
3   5          David        10            Heather
4   7          James        1             Jennifer
5   4          Jason        2             Jessica
6   8          John         13            Kimberly
7   11         Joseph       5             Melissa
8   6          Joshua       11            Michelle
9   12         Justin       7             Nicole
10  3          Matthew      12            Rebecca
11  1          Michael      4             Sarah
12  9          Robert       8             Stephanie
13  13         Ryan         14            Tiffany
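As a side note on why the original sort_values attempts seemed to have no effect: sort_values returns a new DataFrame and leaves the original untouched unless the result is assigned back (or inplace=True is passed). It also moves whole rows, so sorting by Boys drags Females along, which is why the example above sorts each column separately. A minimal sketch on a made-up three-row frame (the data and variable names here are illustrative only):

```python
import pandas as pd

# Illustrative sample only: the first three rows of the question's data.
df = pd.DataFrame(
    {"Boys": ["Michael", "Christopher", "Matthew"],
     "Females": ["Jennifer", "Jessica", "Amanda"]},
    index=pd.Index([1, 2, 3], name="Rank"),
)

# The sorted copy is discarded here, so df itself does not change.
df.sort_values("Boys")
unchanged = df["Boys"].tolist()

# Keeping the returned frame gives the sorted rows.
# Note that Females is reordered together with Boys (rows move as a unit).
sorted_df = df.sort_values("Boys")
print(sorted_df)
```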
Answer 1 (score: 2)
IIUC (it is hard to tell without a clearly posted expected/desired DF), you can do it this way:
df = (pd.read_html("file:///C:/Python27/babyname999.html")[0]
        .rename(columns={'0': 'Rank', '1': 'Boys', '2': 'Females'})
        .drop(columns='Unnamed: 0')  # positional axis argument is not accepted by recent pandas
        .set_index('Rank')
     )
Then:
In [86]: df['Rank_Boys'], df['Rank_Females'] = df.sort_values('Boys').index, df.sort_values('Females').index
In [87]: df
Out[87]:
    Boys         Females    Rank_Boys  Rank_Females
1   Michael      Jennifer   14         3
2   Christopher  Jessica    2          6
3   Matthew      Amanda     10         9
4   Jason        Sarah      5          10
5   David        Melissa    7          1
6   Joshua       Amy        4          2
7   James        Nicole     8          13
8   John         Stephanie  11         5
9   Robert       Elizabeth  6          11
10  Daniel       Heather    12         7
11  Joseph       Michelle   3          12
12  Justin       Rebecca    1          4
13  Ryan         Kimberly   9          8
14  Brian        Tiffany    13         14
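Note that this keeps Boys and Females in their original order and only adds the alphabetically ordered ranks as new columns. If the names themselves should be sorted too, one possible sketch is a small helper that sorts each column on its own and keeps its rank next to it. The helper name sort_each_column and the four-row sample frame are assumptions made for illustration, not part of the original post:

```python
import pandas as pd

def sort_each_column(df, columns):
    """Hypothetical helper: sort every listed column independently,
    keeping the original index (the rank) beside each sorted column."""
    parts = []
    for col in columns:
        part = (
            df[[col]]
            .sort_values(col)            # alphabetical order for this column only
            .rename_axis(f"Rank_{col}")  # name the index so it becomes its own column
            .reset_index()               # move the rank out of the index
        )
        parts.append(part)
    # All parts now share a 0..n-1 index, so they line up side by side.
    return pd.concat(parts, axis=1)

# Illustrative sample only: the first four rows of the question's data.
df = pd.DataFrame(
    {"Boys": ["Michael", "Christopher", "Matthew", "Jason"],
     "Females": ["Jennifer", "Jessica", "Amanda", "Sarah"]},
    index=pd.Index([1, 2, 3, 4], name="Rank"),
)

result = sort_each_column(df, ["Boys", "Females"])
print(result)
```

The same helper should work for any number of columns, since each one is sorted and re-indexed on its own before the pieces are concatenated.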
Answer 2 (score: 2)