pandas: IndexError when iterating over a DataFrame column

Asked: 2017-05-22 19:41:07

Tags: python pandas

I have this DataFrame, and I'm trying to iterate over one of its columns, 'Party'. It looks like this:

   Year          President       Party     Value
0  1920     Woodrow Wilson  Democratic       NaN
1  1921  Warren G. Harding  Republican  0.127172
2  1922  Warren G. Harding  Republican  0.217386

My code is as follows:

df_Democrat = pd.DataFrame()
df_Republican = pd.DataFrame()
for i in range(1,96):
    if table.columns['Party']=='Democratic':
        df_Democrat['Year']= table['Year']
        df_Democrat['Return']= table['Value']
    else:
        similar code for Republicans

But because of the if statement, I keep getting the following error:

  

IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices

Any suggestions would be greatly appreciated. Thank you very much!

2 Answers:

Answer 0: (score: 0)

This code should give you the output you want:

import pandas as pd

df = pd.DataFrame({'year': [1920,1921,1922,1923,1924,1925,1926],
    'pres': ['jon doe1','jon doe2','jon doe3','jon doe4','jon doe5','jon doe6','jon doe7'],
    'party': ['dem','repub','dem','repub','dem','repub','repub'],
    'value': [18.61, 17.60, 18.27, 16.18, 16.81, 16.37, 67.07]})

# Each comparison yields a boolean mask; .loc keeps the rows where it is True
repub = df.loc[df.party == 'repub']
dem = df.loc[df.party == 'dem']

Output (printing `repub`):

   party      pres  value  year
1  repub  jon doe2  17.60  1921
3  repub  jon doe4  16.18  1923
5  repub  jon doe6  16.37  1925
6  repub  jon doe7  67.07  1926
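The same boolean-mask idea can be applied to the frame from the question. The original loop most likely fails because `table.columns` is the Index of column labels, not the column data, so indexing it with the string 'Party' raises the IndexError. The sketch below is only an illustration under assumptions: the three sample rows are copied from the question, `table` is the asker's variable name, and the `is_democrat` helper name is made up here.

```python
import numpy as np
import pandas as pd

# Sample rows copied from the question; the real table has more rows.
table = pd.DataFrame({
    'Year': [1920, 1921, 1922],
    'President': ['Woodrow Wilson', 'Warren G. Harding', 'Warren G. Harding'],
    'Party': ['Democratic', 'Republican', 'Republican'],
    'Value': [np.nan, 0.127172, 0.217386],
})

# table['Party'] is the column itself; the comparison gives one True/False per row.
is_democrat = table['Party'] == 'Democratic'  # hypothetical helper name

# Keep only the wanted columns and rename 'Value' to 'Return', as in the question.
df_Democrat = table.loc[is_democrat, ['Year', 'Value']].rename(columns={'Value': 'Return'})
df_Republican = table.loc[~is_democrat, ['Year', 'Value']].rename(columns={'Value': 'Return'})

print(df_Democrat)
print(df_Republican)
```

No explicit `for` loop is needed, because the comparison is evaluated for every row at once.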

Answer 1: (score: 0)

Setup

import numpy as np
import pandas as pd

df=pd.DataFrame({'Party': {0: 'Democratic', 1: 'Republican', 2: 'Republican'},
 'President': {0: 'WoodrowWilson', 1: 'WarrenG.Harding', 2: 'WarrenG.Harding'},
 'Value': {0: np.nan, 1: 0.12717200000000001, 2: 0.21738600000000002},
 'Year': {0: 1920, 1: 1921, 2: 1922}})

df
Out[1243]: 
        Party        President     Value  Year
0  Democratic    WoodrowWilson       NaN  1920
1  Republican  WarrenG.Harding  0.127172  1921
2  Republican  WarrenG.Harding  0.217386  1922

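# You can do this without a loop using groupby.
# A list of columns (double brackets) is used for the selection; newer pandas versions reject the bare tuple form.
df_Democrat = df.rename(columns={'Value':'Return'}).groupby('Party')[['Party','Year','Return']].get_group('Democratic')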
Out[1238]: 
        Party  Year  Return
0  Democratic  1920     NaN

df_Republican = df.rename(columns={'Value':'Return'}).groupby('Party')[['Party','Year','Return']].get_group('Republican')
Out[1239]: 
        Party  Year    Return
1  Republican  1921  0.127172
2  Republican  1922  0.217386
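If there are more than two parties, or the party names are not known in advance, one possible variation is to iterate over the groups and collect them in a dict. This is a sketch rather than part of the original answer; the `frames` and `renamed` names are made up for the illustration, and the data repeats the setup above.

```python
import numpy as np
import pandas as pd

# Same sample data as the setup above.
df = pd.DataFrame({
    'Party': ['Democratic', 'Republican', 'Republican'],
    'President': ['WoodrowWilson', 'WarrenG.Harding', 'WarrenG.Harding'],
    'Value': [np.nan, 0.127172, 0.217386],
    'Year': [1920, 1921, 1922],
})

renamed = df.rename(columns={'Value': 'Return'})

# Iterating over a groupby yields (group key, sub-DataFrame) pairs.
frames = {party: group[['Year', 'Return']]
          for party, group in renamed.groupby('Party')}

df_Democrat = frames['Democratic']
df_Republican = frames['Republican']
print(df_Republican)
```

Each value in `frames` keeps the original row index, so the rows can still be matched back to `df` if needed.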