pandas: function with shift: "truth value is ambiguous"

Time: 2018-07-18 20:28:19

Tags: python pandas

I have a DataFrame with a date index and a price column, like this:

DATE   |     PRICE
01-01-2018    100
02-01-2018    101
03-01-2018    97

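I wrote a function to calculate the difference between a row's price and the price from 3 rows ("days") earlier. (I know there are other pandas methods that can do this, but this function is a stub that I want to extend later.)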

def case1(x):

  prevrow = x.shift(3)
  if np.isnan(prevrow['price']):
      pass
  else:
      if x['price'] > prevrow['price']:
          diff = prevrow['price'] - x['price']             
          print('The diff is {}').format(diff)

However, when I try to run it (case1(df)), I get the

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

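error. It appears to be triggered by the 3 NaN values that the shift at the start of the function produces. But adding a check for NaN values still gives the same error message.

Does anyone know what I am doing wrong?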

1 answer:

Answer 0: (score: 1)

To make the behavior easier to see, consider a larger DataFrame:

DATE        |  price
01-01-2018     100
02-01-2018     101
03-01-2018     97
04-01-2018     102
05-01-2018     100
06-01-2018     107
07-01-2018     38

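Your code has a few problems. The main one is that you are performing a boolean operation on an array rather than on a single value: x.shift(3) shifts the whole DataFrame, so prevrow['price'] is a Series, np.isnan(prevrow['price']) is a Series of booleans, and an if statement cannot reduce that to a single True or False. Also, .format must be called on the string inside print(...), not on the result of print. Solution: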

import numpy as np

def case1(x):
    # New df with a new column for shift prices
    df = x.assign(price_prevrow= x.shift(3)['price'])

    if np.isnan(df['price_prevrow']).all(): # Check ALL values
        pass
    else:
        # Slice df to get only rows with price greater than price_prevrow
        df = df.loc[df['price'] > df['price_prevrow']]

        # Calculate difference
        diff = df['price_prevrow'] - df['price']

        # Print all differences
        for d in diff:
            print('The diff is {}'.format(d))

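The code above creates a new DataFrame with an extra column holding the shifted prices, then slices it down to the rows whose price is greater than the shifted price. After that, computing the difference is straightforward.

Output: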

The diff is -2.0
The diff is -10.0
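
For reference, here is a minimal sketch that reproduces the original error and shows the element-wise alternative. It reuses the prices from the larger DataFrame above; the daily date_range index and the variable names (prev, mask, rose, diffs) are assumptions made for illustration, not part of the original question.

```python
import numpy as np
import pandas as pd

# Prices copied from the larger example above.
# Assumption: the index is one row per calendar day starting 2018-01-01.
df = pd.DataFrame(
    {"price": [100, 101, 97, 102, 100, 107, 38]},
    index=pd.date_range("2018-01-01", periods=7, freq="D"),
)

# Price from 3 rows earlier; the first 3 entries are NaN.
prev = df["price"].shift(3)

# np.isnan applied to a Series returns a boolean Series, one flag per row.
mask = np.isnan(prev)

# Using that Series directly as an `if` condition raises the ValueError
# from the question, because there is no single truth value to test.
try:
    if mask:
        pass
except ValueError as exc:
    error_message = str(exc)

# Reducing the Series to one bool makes the condition well defined.
all_missing = bool(mask.all())  # every shifted price is NaN?
any_missing = bool(mask.any())  # at least one shifted price is NaN?

# Element-wise comparison: comparisons against NaN evaluate to False,
# so the first 3 rows drop out without an explicit NaN check.
rose = df["price"] > prev
diffs = (prev - df["price"])[rose]

for d in diffs:
    print("The diff is {}".format(d))
```

Whether .all() or .any() is the right reduction depends on what the check is meant to express. For per-row logic, the boolean mask usually replaces the if statement entirely, as in the last step of the sketch.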