import pandas as pd

animals = pd.DataFrame({'animal': ['Dog','Cat','Cat','Cat','Dog','Dog','Cat','Dog','Cat','Cat','Dog','Dog','Cat'],
'age':[2,1,5,7,5,3,4,6,6,9,3,2,10],
'weight':[10,4,3,15,12,5,6,3,7.1,10,12,6,4],
'length':[1,0.45,0.49,0.50,1.2,1.16,0.40,1.2,0.45,0.50,0.75,1.1,0.43]})
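Suppose I have a DataFrame like this one. I want to create a smaller DataFrame that contains only the cats, sorted by age. How can I do that?
Answer 0 (score: 3)
You can do it like this: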
res = animals[animals['animal'].eq('Cat')].sort_values(by='age')
print(res)
Output
    animal  age  weight  length
1      Cat    1     4.0    0.45
6      Cat    4     6.0    0.40
2      Cat    5     3.0    0.49
8      Cat    6     7.1    0.45
3      Cat    7    15.0    0.50
9      Cat    9    10.0    0.50
12     Cat   10     4.0    0.43
If you only want the animal and age columns, do the following:
res = animals[animals['animal'].eq('Cat')].filter(items=['animal', 'age']).sort_values(by='age')
print(res)
Output
    animal  age
1      Cat    1
6      Cat    4
2      Cat    5
8      Cat    6
3      Cat    7
9      Cat    9
12     Cat   10
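The filter-then-sort steps in this answer can be put together as one self-contained script. This is only a sketch: the names cats and cats_renumbered are chosen for illustration, and the reset_index call is an optional extra that is not part of the answer above.

```python
import pandas as pd

animals = pd.DataFrame({
    'animal': ['Dog', 'Cat', 'Cat', 'Cat', 'Dog', 'Dog', 'Cat',
               'Dog', 'Cat', 'Cat', 'Dog', 'Dog', 'Cat'],
    'age': [2, 1, 5, 7, 5, 3, 4, 6, 6, 9, 3, 2, 10],
    'weight': [10, 4, 3, 15, 12, 5, 6, 3, 7.1, 10, 12, 6, 4],
    'length': [1, 0.45, 0.49, 0.50, 1.2, 1.16, 0.40,
               1.2, 0.45, 0.50, 0.75, 1.1, 0.43],
})

# The boolean mask keeps only the cat rows; sort_values orders them by age.
cats = animals[animals['animal'].eq('Cat')].sort_values(by='age')

# Optional addition (not in the answer): renumber the rows 0..n-1
# instead of keeping the original row labels.
cats_renumbered = cats.reset_index(drop=True)

print(cats)
print(cats_renumbered)
```

Without reset_index the result keeps the row labels of the original frame, which is why the outputs above start at 1 rather than 0.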
Answer 1 (score: 1)
You just need to filter out the rows that do not contain 'Cat':
animals = pd.DataFrame({'animal': ['Dog','Cat','Cat','Cat','Dog','Dog','Cat','Dog','Cat','Cat','Dog','Dog','Cat'],
'age':[2,1,5,7,5,3,4,6,6,9,3,2,10],
'weight':[10,4,3,15,12,5,6,3,7.1,10,12,6,4],
'length':[1,0.45,0.49,0.50,1.2,1.16,0.40,1.2,0.45,0.50,0.75,1.1,0.43]})
animals = animals[animals['animal'] == 'Cat'].sort_values(['age'])
animals
>>>
    animal  age  weight  length
1      Cat    1     4.0    0.45
6      Cat    4     6.0    0.40
2      Cat    5     3.0    0.49
8      Cat    6     7.1    0.45
3      Cat    7    15.0    0.50
9      Cat    9    10.0    0.50
12     Cat   10     4.0    0.43
To get only the relevant data ('animal' and 'age'):
animals[['animal','age']]
>>>
    animal  age
1      Cat    1
6      Cat    4
2      Cat    5
8      Cat    6
3      Cat    7
9      Cat    9
12     Cat   10
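Note that the snippet in this answer rebinds animals to the filtered result, so the dog rows are no longer reachable under that name. Below is a minimal sketch of a variant that leaves the original frame intact. It assumes a new name, cats (illustrative, not from the answer), and uses .loc to pick the rows and the two columns in one step.

```python
import pandas as pd

animals = pd.DataFrame({
    'animal': ['Dog', 'Cat', 'Cat', 'Cat', 'Dog', 'Dog', 'Cat',
               'Dog', 'Cat', 'Cat', 'Dog', 'Dog', 'Cat'],
    'age': [2, 1, 5, 7, 5, 3, 4, 6, 6, 9, 3, 2, 10],
    'weight': [10, 4, 3, 15, 12, 5, 6, 3, 7.1, 10, 12, 6, 4],
    'length': [1, 0.45, 0.49, 0.50, 1.2, 1.16, 0.40,
               1.2, 0.45, 0.50, 0.75, 1.1, 0.43],
})

# Row selection (boolean mask) and column selection (list of labels)
# in a single .loc call, followed by the sort.
# `cats` is a hypothetical name; `animals` is left unchanged.
cats = animals.loc[animals['animal'] == 'Cat', ['animal', 'age']].sort_values('age')

print(cats)
print(len(animals))  # the original frame still holds every row
```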
Answer 2 (score: 1)
You can use DataFrame.query here (df below stands for the animals DataFrame from the question).
df.query("animal=='Cat'").sort_values('age')
# Alternative
# df.query("animal.eq('Cat')").sort_values('age')
    animal  age  weight  length
1      Cat    1     4.0    0.45
6      Cat    4     6.0    0.40
2      Cat    5     3.0    0.49
8      Cat    6     7.1    0.45
3      Cat    7    15.0    0.50
9      Cat    9    10.0    0.50
12     Cat   10     4.0    0.43
If you only want animal and age:
df[['animal', 'age']].query("animal=='Cat'").sort_values('age')
    animal  age
1      Cat    1
6      Cat    4
2      Cat    5
8      Cat    6
3      Cat    7
9      Cat    9
12     Cat   10
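When the value to match is held in a Python variable, query can refer to it with the @ prefix. The sketch below assumes a variable named species (hypothetical, not from the answer) and also checks that the query result matches the boolean-mask approach used in the other answers.

```python
import pandas as pd

animals = pd.DataFrame({
    'animal': ['Dog', 'Cat', 'Cat', 'Cat', 'Dog', 'Dog', 'Cat',
               'Dog', 'Cat', 'Cat', 'Dog', 'Dog', 'Cat'],
    'age': [2, 1, 5, 7, 5, 3, 4, 6, 6, 9, 3, 2, 10],
    'weight': [10, 4, 3, 15, 12, 5, 6, 3, 7.1, 10, 12, 6, 4],
    'length': [1, 0.45, 0.49, 0.50, 1.2, 1.16, 0.40,
               1.2, 0.45, 0.50, 0.75, 1.1, 0.43],
})

species = 'Cat'  # hypothetical variable holding the value to filter on

# @species is resolved from the surrounding Python scope.
via_query = animals.query("animal == @species").sort_values('age')

# Same selection written with a boolean mask, for comparison.
via_mask = animals[animals['animal'] == species].sort_values('age')

print(via_query.equals(via_mask))
```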