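I am running into a problem with the groupby method that is similar to the one this person posted on StackOverflow:
pandas group StopIteration error
I am trying to do something simpler with the groupby method, but I get a similar StopIteration error: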
Traceback (most recent call last):
  File "prepare_data_TJ2012_v1p0.py", line 107, in <module>
    grouped = df.groupby('hh').apply(f)
  File "/Users/shafiquejamal/allfiles/htdocs/venvs/easyframes-py3/lib/python3.4/site-packages/pandas/core/groupby.py", line 637, in apply
    return self._python_apply_general(f)
  File "/Users/shafiquejamal/allfiles/htdocs/venvs/easyframes-py3/lib/python3.4/site-packages/pandas/core/groupby.py", line 644, in _python_apply_general
    not_indexed_same=mutated)
  File "/Users/shafiquejamal/allfiles/htdocs/venvs/easyframes-py3/lib/python3.4/site-packages/pandas/core/groupby.py", line 2657, in _wrap_applied_output
    v = next(v for v in values if v is not None)
StopIteration
Here is the code that generates it:
import pandas as pd

df = pd.DataFrame(
    {'educ': {0: 'pri', 1: 'bach', 2: 'pri', 3: 'hi', 4: 'bach', 5: 'sec',
              6: 'hi', 7: 'hi', 8: 'pri', 9: 'pri'},
     'hh': {0: 1, 1: 1, 2: 1, 3: 2, 4: 3, 5: 3, 6: 4, 7: 4, 8: 4, 9: 4},
     'id': {0: 1, 1: 2, 2: 3, 3: 1, 4: 1, 5: 2, 6: 1, 7: 2, 8: 3, 9: 4},
     'has_car': {0: 1, 1: 1, 2: 1, 3: 1, 4: 0, 5: 0, 6: 1, 7: 1, 8: 1, 9: 1},
     'weighthh': {0: 2, 1: 2, 2: 2, 3: 3, 4: 2, 5: 2, 6: 3, 7: 3, 8: 3, 9: 3},
     'house_rooms': {0: 3, 1: 3, 2: 3, 3: 2, 4: 1, 5: 1, 6: 3, 7: 3, 8: 3, 9: 3},
     'prov': {0: 'BC', 1: 'BC', 2: 'BC', 3: 'Alberta', 4: 'BC', 5: 'BC', 6: 'Alberta',
              7: 'Alberta', 8: 'Alberta', 9: 'Alberta'},
     'age': {0: 44, 1: 43, 2: 13, 3: 70, 4: 23, 5: 20, 6: 37, 7: 35, 8: 8, 9: 15},
     'fridge': {0: 'yes', 1: 'yes', 2: 'yes', 3: 'no', 4: 'yes', 5: 'yes', 6: 'no',
                7: 'no', 8: 'no', 9: 'no'},
     'male': {0: 1, 1: 0, 2: 1, 3: 1, 4: 1, 5: 0, 6: 1, 7: 0, 8: 0, 9: 0}})
print(df)
print('-- groupby dataframes ---')

def f(df):
    print('-------------------------')
    print('DataFrame')
    print(df)
    s = df['age']
    print(s)
    print('----> Not nulls:')
    s_notnulls = ~s.isnull()
    print(s_notnulls)
    print('----> Number of non-nulls: %d' % len(s_notnulls[s_notnulls==True]))

df.groupby('hh').apply(f)
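I want to perform an operation on a column, by group, if another column has at least one non-null value.
I am using pandas==0.14.1. It looks as though the loop over the groups runs too long. Is this a bug? (Or am I using the groupby method incorrectly...)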
Answer 0 (score: 8)
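You are getting this error because the function you are passing does not return anything. If all you care about is the printed output, you can return df like this: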
def f(df):
    print('-------------------------')
    print('DataFrame')
    print(df)
    s = df['age']
    print(s)
    print('----> Not nulls:')
    s_notnulls = ~s.isnull()
    print(s_notnulls)
    print('----> Number of non-nulls: %d' % len(s_notnulls[s_notnulls==True]))
    return df
The apply call will then run without error.
In [295]: df.groupby('hh').apply(f)
-------------------------
DataFrame
   age  educ  fridge  has_car  hh  house_rooms  id  male  prov  weighthh
0   44   pri     yes        1   1            3   1     1    BC         2
1   43  bach     yes        1   1            3   2     0    BC         2
2   13   pri     yes        1   1            3   3     1    BC         2
.....
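To see why a missing return value ends in StopIteration rather than a clearer message, it may help to reproduce the failing line from the traceback outside pandas. The sketch below is plain Python and only imitates that one expression. The helper name first_non_none and the two sample lists are made up for illustration, and it assumes (based on the traceback, not on the pandas source) that the values being scanned are the per-group results of f.

```python
def first_non_none(values):
    # Same expression as the last frame of the traceback:
    # take the first result that is not None.
    return next(v for v in values if v is not None)

# A function without a return statement gives None for every group,
# so the generator is empty and next() raises StopIteration.
results_without_return = [None, None, None, None]
try:
    first_non_none(results_without_return)
    raised = False
except StopIteration:
    raised = True

# Once the function returns something, a non-None value is found.
results_with_return = ['group 1', 'group 2']
first = first_non_none(results_with_return)

print(raised, first)
```

So the loop over the groups is probably not running too long. The error most likely comes from the step after the loop, where the collected results are all None.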