I have the following function, which prints each state together with its associated counties:
def answer():
    # Note: set_index returns a new DataFrame; the result is not stored here.
    census_df.set_index(['STNAME', 'CTYNAME'])
    for name, state, cname in zip(census_df['STNAME'], census_df['STATE'], census_df['CTYNAME']):
        print(name, state, cname)
Alabama 1 Tallapoosa County
Alabama 1 Tuscaloosa County
Alabama 1 Walker County
Alabama 1 Washington County
Alabama 1 Wilcox County
Alabama 1 Winston County
Alaska 2 Alaska
Alaska 2 Aleutians East Borough
Alaska 2 Aleutians West Census Area
Alaska 2 Anchorage Municipality
Alaska 2 Bethel Census Area
Alaska 2 Bristol Bay Borough
Alaska 2 Denali Borough
Alaska 2 Dillingham Census Area
Alaska 2 Fairbanks North Star Borough
I want to find the state with the most counties. I can iterate and count for each state like this:
counter = 0
counter2 = 0
for name, state, cname in zip(census_df['STNAME'], census_df['STATE'], census_df['CTYNAME']):
    # One counter per state code
    if state == 1:
        counter += 1
    if state == 2:
        counter2 += 1
print(counter)
print(counter2)
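And so on. I could put the state numbers in a range (rng = range(1, 56)) and iterate over it, but creating 56 lists is a nightmare. Is there an easier way to do this?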
Answer 0 (score: 2)
Pandas lets us do this kind of operation without loops or explicit iteration:
In [21]: df.STNAME.value_counts()
Out[21]:
Alaska 9
Alabama 6
Name: STNAME, dtype: int64
In [24]: df.STNAME.value_counts().head(1)
Out[24]:
Alaska 9
Name: STNAME, dtype: int64
or
In [18]: df.groupby('STNAME')['CTYNAME'].count()
Out[18]:
STNAME
Alabama 6
Alaska 9
Name: CTYNAME, dtype: int64
In [19]: df.groupby('STNAME')['CTYNAME'].count().idxmax()
Out[19]: 'Alaska'
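One caveat worth checking: the sample output in the question contains a row such as "Alaska 2 Alaska", which looks like a state-level summary row rather than a county, and the counts above include it. The sketch below is a minimal, self-contained illustration of filtering such rows out before counting. The sample data is made up, and the rule "CTYNAME equals STNAME means a state-level row" is an assumption that may not hold for every row of the real data; if the dataset has a dedicated summary-level column, filtering on that would likely be more reliable.

```python
import pandas as pd

# Hypothetical sample using the same column names as the question.
census_df = pd.DataFrame({
    "STNAME": ["Alabama", "Alabama", "Alabama", "Alabama",
               "Alaska", "Alaska", "Alaska"],
    "STATE": [1, 1, 1, 1, 2, 2, 2],
    "CTYNAME": ["Alabama", "Walker County", "Wilcox County", "Winston County",
                "Alaska", "Bethel Census Area", "Denali Borough"],
})

# Assumption: a row whose county name equals the state name is a
# state-level summary row, not a county, so it is dropped here.
counties_only = census_df[census_df["CTYNAME"] != census_df["STNAME"]]

# Count the remaining rows per state, then pick the state with the largest count.
county_counts = counties_only.groupby("STNAME")["CTYNAME"].count()
top_state = county_counts.idxmax()

print(county_counts.to_dict())
print(top_state)
```

Because every state would carry one such summary row, the ranking is usually unchanged, but the per-state counts are each lower by one once those rows are removed.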