I'm trying to learn Pandas, and I've run into a small problem with groupby and nunique.
The data frame is created below:
import pandas as pd
import numpy as np
chicago_dataset = pd.read_csv(
    "https://raw.githubusercontent.com/gjreda/gregreda.com/master/content/notebooks/data/city-of-chicago-salaries.csv",
    # Strip the '$' sign so the salary column is parsed as a float.
    converters={'Employee Annual Salary': lambda x: float(x.replace('$', ''))},
)
The column headers:
chicago_dataset.columns
Out[63]: Index(['Name', 'Position Title', 'Department', 'Employee Annual Salary'], dtype='object')
Now I want to group by Department and count the distinct Position Title values in each group. How would I do something like that?
by_dept = chicago_dataset.groupby('Department')
I can do it for Name with the attribute syntax below, but Position Title is two words, so it can't be written as an attribute.
by_dept.Name.nunique()
Answer 0 (score: 1)
There are two different ways to access a column:
df.valid_python_name
df['any string with spaces and # etc.']
Use the second one, which works for any column label:
by_dept['Position Title'].nunique()
Example:
>>> by_dept['Position Title'].nunique().head()
Department
ADMIN HEARNG 15
ANIMAL CONTRL 19
AVIATION 125
BOARD OF ELECTION 23
BOARD OF ETHICS 9
Name: Position Title, dtype: int64
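If you want to try this without downloading the CSV, here is a minimal sketch on a tiny made-up frame. The rows and names below are hypothetical stand-ins, not values from the real Chicago dataset. It just shows that attribute access and bracket access both work on a groupby object, and that only the bracket form handles a label containing a space.

```python
import pandas as pd

# Hypothetical miniature stand-in for the salaries data (invented rows).
df = pd.DataFrame({
    "Name": ["A", "B", "C", "D", "E"],
    "Position Title": ["CLERK", "CLERK", "PILOT", "VET", "VET"],
    "Department": ["AVIATION", "AVIATION", "AVIATION",
                   "ANIMAL CONTRL", "ANIMAL CONTRL"],
})

by_dept = df.groupby("Department")

# Attribute access: only possible because "Name" is a valid Python identifier.
names_per_dept = by_dept.Name.nunique()

# Bracket access: works for any column label, including ones with spaces.
titles_per_dept = by_dept["Position Title"].nunique()

print(names_per_dept)
print(titles_per_dept)
```

The bracket form also works for Name (by_dept['Name']), so it is probably the safer habit when column labels come from an external file.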