Summing rows in a Pandas DataFrame returns NaN

Asked: 2017-01-05 17:07:21

Tags: python-2.7 pandas numpy dataframe nan

I am trying to get the sum of each row in a Pandas DataFrame:

new_df['cash_change'] = new_df.sum(axis=0)

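But my result keeps coming back as NaN.

I think it may have something to do with converting the positions to Decimal for the multiplication: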

pos_to_dec = np.array([Decimal(d) for d in security.signals['positions'].values])

I need that conversion in order to multiply my columns. However, I do cast the result back to float afterwards:

cash_change[security.symbol] = cash_change[security.symbol].astype(float)

Here is the full method. Its goal is to perform some column multiplication for each security and then sum up the totals at the end:

def get_cash_change(self):
    """
    Calculate daily cash to be transacted every day. Cash change depends on
    the position (either buy or sell) multiplied by the adjusted closing price
    of the equity multiplied by the trade amount.
    :return:
    """
    cash_change = pd.DataFrame(index=self.positions.index)
    try:

        for security in self.market_on_close_securities:
            # First convert all the positions from floating-point to decimals
            pos_to_dec = np.array([Decimal(d) for d in security.signals['positions'].values])

            cash_change['positions'] = pos_to_dec
            cash_change['bars'] = security.bars['adj_close_price'].values

            # Perform calculation for cash change
            cash_change[security.symbol] = cash_change['positions'] * cash_change['bars'] * self.trade_amount

            cash_change[security.symbol] = cash_change[security.symbol].astype(float)

            # Clean up for next security
            cash_change.drop('positions', axis=1, inplace=True)
            cash_change.drop('bars', axis=1, inplace=True)

    except InvalidOperation as e :
        print("Invalid input : " + str(e))

    # Sum each equities change in cash
    new_df = cash_change.dropna()

    new_df['cash_change'] = new_df.sum(axis=0)

    return cash_change

My new_df DataFrame ends up looking like this:

                MTD       ESS      SIG       SNA  cash_change
price_date                                                   
2000-01-04      0.0      0.00     0.00      0.00          NaN
2000-01-05      0.0      0.00     0.00      0.00          NaN
2000-01-06      0.0      0.00     0.00      0.00          NaN
2000-01-07      0.0      0.00     0.00      0.00          NaN
2000-01-10      0.0      0.00     0.00      0.00          NaN
2000-01-11      0.0      0.00     0.00      0.00          NaN
2000-01-12      0.0      0.00     0.00      0.00          NaN
2000-01-13      0.0      0.00     0.00      0.00          NaN
2000-01-14      0.0      0.00     0.00      0.00          NaN
2000-01-18      0.0      0.00     0.00      0.00          NaN
2000-01-19      0.0      0.00     0.00      0.00          NaN
2000-01-20      0.0      0.00     0.00      0.00          NaN
2000-01-21      0.0      0.00     0.00      0.00          NaN
2000-01-24      0.0   1747.83  1446.71      0.00          NaN
2000-01-25   3419.0      0.00     0.00      0.00          NaN
2000-01-26      0.0      0.00     0.00   1660.38          NaN
2000-01-27      0.0      0.00 -1293.27      0.00          NaN
2000-01-28      0.0      0.00     0.00      0.00          NaN

Any suggestions on what I am doing wrong? Or is there perhaps another way to sum the columns of each row?

2 Answers:

Answer 0: (score: 4)

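When you pass axis=0 to DataFrame.sum, the summation runs along the index (vertically, if that is easier to picture). You therefore get only 4 values, one for each of the DataFrame's 4 columns. You then assign that result to a new column of the DataFrame. The result is indexed by the column names (MTD, ESS, SIG, SNA), while the DataFrame is indexed by price_date. Column assignment aligns on index labels, none of them match, and so you get a column made up entirely of NaN.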
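To make the alignment problem concrete, here is a minimal sketch on made-up data. The frame `df`, its two columns, and the string date labels are assumptions for illustration only (the real new_df has four columns and a price_date index), but the mechanism is the same.

```python
import pandas as pd

# Hypothetical toy frame, loosely shaped like new_df in the question
df = pd.DataFrame(
    {"MTD": [0.0, 3419.0], "ESS": [1747.83, 0.0]},
    index=["2000-01-24", "2000-01-25"],
)

# axis=0 collapses the rows: one total per column,
# so the result is indexed by the column names
col_totals = df.sum(axis=0)
print(col_totals.index.tolist())

# Assigning a Series aligns it on index labels. The date labels never
# match 'MTD' or 'ESS', so every row of the new column is NaN.
df["cash_change"] = col_totals
print(df["cash_change"].isna().all())
```

The first print shows the column names as the index of the result, and the second confirms that no value survived the alignment.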

What you actually want is to sum across the columns (horizontally).

Change that line to:

new_df['cash_change'] = new_df.sum(axis=1)  # sum row-wise across each column

You will now get finite values for the computed sums instead of NaN.

Answer 1: (score: 1)
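As a quick check, a sketch along the following lines shows the row-wise sum lining up with the frame's own index. The data is again made up for illustration (three of the columns and two of the dates from the question's output), not the asker's actual frame.

```python
import pandas as pd

# Hypothetical toy frame using a few values from the question's output
df = pd.DataFrame(
    {"MTD": [0.0, 3419.0], "ESS": [1747.83, 0.0], "SIG": [1446.71, 0.0]},
    index=["2000-01-24", "2000-01-25"],
)

# axis=1 collapses the columns: one total per row,
# indexed by the same labels as df itself
row_totals = df.sum(axis=1)

# The indexes match, so the assignment fills every row
df["cash_change"] = row_totals
print(df)  # cash_change is roughly 3194.54 and 3419.0
```

Note that running the assignment a second time on the same frame would include the existing cash_change column in the sum, so compute the total before adding the column or select the security columns explicitly.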

new_df['cash_change'] = new_df.sum(axis=1)