import pandas as pd
import numpy as np

# Create a dictionary of Series
d = {'Name': pd.Series(['Tom', 'James', 'Ricky', 'Vin', 'Steve', 'Smith', 'Jack',
                        'Lee', 'David', 'Gasper', 'Betina', 'Andres']),
     'Age': pd.Series([25, 26, 25, 23, 30, 29, 23, 34, 40, 30, 51, 46]),
     'Rating': pd.Series([4.23, 3.24, 3.98, 2.56, 3.20, 4.6, 3.8, 3.78,
                          2.98, 4.80, 4.10, 3.65])}

# Create a DataFrame
df = pd.DataFrame(d)
print(df.describe(include='all'))
If I run this code, I get the following output:
          Name        Age     Rating
count       12  12.000000  12.000000
unique      12        NaN        NaN
top     Betina        NaN        NaN
freq         1        NaN        NaN
mean       NaN  31.833333   3.743333
std        NaN   9.232682   0.661628
min        NaN  23.000000   2.560000
25%        NaN  25.000000   3.230000
50%        NaN  29.500000   3.790000
75%        NaN  35.500000   4.132500
max        NaN  51.000000   4.800000
The top row changes every time I run the code. What does top represent in the output?
Answer 0 (score: 3)
What does top represent in the output?
If you run:
df.Name.value_counts()
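you will see each distinct value in the Name column along with its count. top is the value with the highest count, and freq is that count. When several values are tied for the highest count (as in the question, where every name occurs exactly once), pandas picks one of them arbitrarily, which is why top can differ between runs.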
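The tie in the question's data can be checked directly. The following is a minimal sketch, not part of the original answer; the names `names`, `counts`, `max_count`, and `tied` are illustrative. It lists every value that shares the highest count, so whichever one describe() reports as top must be among them.

```python
import pandas as pd

# The Name column from the question: every name occurs exactly once.
names = pd.Series(['Tom', 'James', 'Ricky', 'Vin', 'Steve', 'Smith', 'Jack',
                   'Lee', 'David', 'Gasper', 'Betina', 'Andres'])

counts = names.value_counts()       # count per distinct value
max_count = int(counts.max())       # the highest count in the column

# All values that share the highest count (illustrative helper variable).
tied = sorted(counts[counts == max_count].index)

print(max_count)                    # highest count
print(tied)                         # every candidate for "top"
print(names.describe()['top'] in tied)
```

If a stable answer is needed when values are tied, one option is to work with the full list of tied values (as above, or via Series.mode()) rather than relying on the single top entry.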
Example:
d ={'Name':pd.Series(['Tom','Steve','Ricky','Vin','Steve','Smith','Jack',
'Lee','David','Gasper','Betina','Andres']),
'Age':pd.Series([25,26,25,23,30,29,23,34,40,30,51,46]),
'Rating':pd.Series([4.23,3.24,3.98,2.56,3.20,4.6,3.8,3.78,
2.98,4.80,4.10,
3.65])
}
#Create a DataFrame
df = pd.DataFrame(d)
print(df.describe(include='all'))
         Name        Age     Rating
count      12  12.000000  12.000000
unique     11        NaN        NaN
top     Steve        NaN        NaN
freq        2        NaN        NaN
mean      NaN  31.833333   3.743333
std       NaN   9.232682   0.661628
min       NaN  23.000000   2.560000
25%       NaN  25.000000   3.230000
50%       NaN  29.500000   3.790000
75%       NaN  35.500000   4.132500
max       NaN  51.000000   4.800000
print(df.Name.value_counts())
Steve 2
Ricky 1
Tom 1
Andres 1
Jack 1
Smith 1
Lee 1
Betina 1
Vin 1
Gasper 1
David 1
Since Steve has the highest count in the Name column, it is reported as top.
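The relationship between value_counts() and the top and freq rows can be sketched as follows. This is a rough illustration rather than how describe() is necessarily implemented internally; the variable names `names`, `counts`, `top`, `freq`, and `summary` are assumptions made for the example.

```python
import pandas as pd

# The Name column from the example above: 'Steve' appears twice.
names = pd.Series(['Tom', 'Steve', 'Ricky', 'Vin', 'Steve', 'Smith', 'Jack',
                   'Lee', 'David', 'Gasper', 'Betina', 'Andres'])

# value_counts() sorts by count in descending order by default,
# so the first entry is the most frequent value.
counts = names.value_counts()
top = counts.index[0]          # most frequent value
freq = int(counts.iloc[0])     # how often it occurs

# describe() on a non-numeric Series reports count, unique, top, freq.
summary = names.describe()

print(top, freq)
print(summary['top'], summary['freq'])
```

Because Steve is the only value with a count of 2, there is no tie here and both approaches agree.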