Incrementing a column in a pandas DataFrame

Date: 2018-07-07 13:07:07

Tags: python pandas

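I am trying to merge several pandas DataFrames to build one aggregate DataFrame. Part of what I need to do is count, for each row, how many of the original DataFrames had data for it. I need those rows to contain 0 rather than NaN, with the understanding that a 0 in the 'Finish' column means there was originally no data there.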

Here is what I tried:

daytona_stats = pd.merge(entry_list, track1_cut, 
                         on='Driver', how='left').fillna(0)
print(entry_list.head())
print(track1_cut.head())
print(daytona_stats.head())

if daytona_stats['Finish'] > 0:
    daytona_stats['races'] += 1

This returns:

            Driver         ...          avg_quality_passes
0        Joey Gase         ...                         0.0
1   Jamie McMurray         ...                         0.0
2  Brad Keselowski         ...                         0.0
3    Austin Dillon         ...                         0.0
4    Kevin Harvick         ...                         0.0

[5 rows x 6 columns]
           Driver  Finish       ...        Pct. Top 15 Laps  Quality Passes
0   Austin Dillon       1       ...                    40.6              67
1   Bubba Wallace       2       ...                    78.3             161
2    Denny Hamlin       3       ...                    66.7             101
3     Joey Logano       4       ...                    74.9             133
4  Chris Buescher       5       ...                    40.1              52

[5 rows x 5 columns]
            Driver  races       ...        Pct. Top 15 Laps  Quality Passes
0        Joey Gase    0.0       ...                     0.0             0.0
1   Jamie McMurray    0.0       ...                     0.0             2.0
2  Brad Keselowski    0.0       ...                    39.6           133.0
3    Austin Dillon    0.0       ...                    40.6            67.0
4    Kevin Harvick    0.0       ...                    44.0           171.0

[5 rows x 10 columns]
Traceback (most recent call last):
  File "C:\EclipseWorkspaces\csse120\Personal\Personal_Projects\Daytona_Projections.py", line 48, in <module>
    if daytona_stats['Finish'] > 0:
  File "C:\Users\burusj\AppData\Local\Programs\Python\Python36\lib\site-packages\pandas\core\generic.py", line 1573, in __nonzero__
    .format(self.__class__.__name__))
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all(). 

1 answer:

Answer 0 (score: 1)

I think this should work:

daytona_stats.loc[daytona_stats['Finish'] > 0,'races'] += 1

instead of:

if daytona_stats['Finish'] > 0:
    daytona_stats['races'] += 1
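
To see why the `if` version fails while the `.loc` version works, here is a minimal sketch on made-up data (the frame below is a hypothetical stand-in for the merged result, not the asker's real data). The comparison `daytona_stats['Finish'] > 0` yields a boolean Series with one value per row, which `if` cannot reduce to a single True or False; `.loc` instead uses that Series as a row mask.

```python
import pandas as pd

# Hypothetical stand-in for the merged frame; the values are made up.
daytona_stats = pd.DataFrame({
    'Driver': ['Joey Gase', 'Austin Dillon', 'Kevin Harvick'],
    'Finish': [0.0, 1.0, 31.0],
    'races': [0.0, 0.0, 0.0],
})

# The comparison is element-wise: one boolean per row.
mask = daytona_stats['Finish'] > 0

# Using the whole Series as an `if` condition raises ValueError,
# which is the error shown in the question's traceback.
try:
    if mask:
        pass
    raised = False
except ValueError:
    raised = True

# Increment 'races' only in the rows where the mask is True.
daytona_stats.loc[mask, 'races'] += 1

print(daytona_stats)
```

Only the rows with a nonzero 'Finish' are incremented; the row that was filled with 0 keeps its old count.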

You can read more about how this works in the pandas tutorial on indexing.
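
As a possible variation (my own assumption, not part of the answer above), the count could be taken before calling `fillna(0)`, while rows without data are still NaN. That way the count does not rely on 0 doubling as a "no data" marker. A rough sketch with made-up frames that reuse the question's variable names:

```python
import pandas as pd

# Hypothetical miniature versions of the question's frames.
entry_list = pd.DataFrame({
    'Driver': ['Joey Gase', 'Austin Dillon'],
    'races': [0, 0],
})
track1_cut = pd.DataFrame({
    'Driver': ['Austin Dillon'],
    'Finish': [1],
})

# Left merge without fillna: drivers missing from track1_cut get NaN.
merged = pd.merge(entry_list, track1_cut, on='Driver', how='left')

# notna() is True where the driver actually had a row in track1_cut.
merged['races'] += merged['Finish'].notna().astype(int)

# Only now replace the remaining NaN values with 0.
merged = merged.fillna(0)

print(merged)
```

The same pattern could be repeated for each additional track frame, assuming each merge brings in its own distinctly named finish column.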