Calculation on a DataFrame

Time: 2018-03-09 07:55:06

Tags: python pandas dataframe

Suppose I have a DataFrame like this:

data = pd.DataFrame({'stcode': ['001', '002', '001', '002', '001', '002', '001', '002', '001', '002'],
      'trade_dt': ['20170101', '20170101', '20170102', '20170102', '20170103', '20170103', '20170104', '20170104', '20170105', '20170105'],
      'close': [1, 3, 5, 1, 2, 3, 5, 1, 2, 2],
      'trend': []})

For each stock, I want to compute a trend from its closing prices using this rule:

if close[i+1] > close[i]: trend[i] = 1
elif close[i+1] < close[i]: trend[i] = -1
else: trend[i] = 0

and then store the result in data['trend']. How can I do this?

3 answers:

Answer 0: (score: 1)

You can do:

In [157]: s = data.close.diff()  # data.close - data.close.shift()

In [158]: data['trend'] = np.where(s.gt(0), 1, np.where(s.lt(0), -1, 0))

In [159]: data
Out[159]:
   close stcode  trade_dt  trend
0      1    001  20170101      0
1      3    002  20170101      1
2      5    001  20170102      1
3      1    002  20170102     -1
4      2    001  20170103      1
5      3    002  20170103      1
6      5    001  20170104      1
7      1    002  20170104     -1
8      2    001  20170105      1
9      2    002  20170105      0
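
Note that diff() here compares each row with the previous row across both stocks, whereas the rule in the question compares each close with the next close of the same stock. The following is a minimal sketch of that per-stock, forward-looking variant. It assumes the rows are already ordered by trade_dt within each stcode, and it assigns 0 to the last row of each stock (the question does not say what should happen there).

```python
import numpy as np
import pandas as pd

data = pd.DataFrame({
    'stcode': ['001', '002', '001', '002', '001', '002', '001', '002', '001', '002'],
    'trade_dt': ['20170101', '20170101', '20170102', '20170102', '20170103',
                 '20170103', '20170104', '20170104', '20170105', '20170105'],
    'close': [1, 3, 5, 1, 2, 3, 5, 1, 2, 2],
})

# Next close of the same stock; NaN on the last row of each stock.
next_close = data.groupby('stcode')['close'].shift(-1)

# The sign of (next - current) is 1, -1 or 0.
# Assumption: rows with no next close (NaN) are given a trend of 0.
data['trend'] = np.sign(next_close - data['close']).fillna(0).astype(int)
```

If the rows are not sorted, calling data.sort_values(['stcode', 'trade_dt']) first should make the shift line up with the trading dates.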

Answer 1: (score: 1)

As MrT mentioned, the empty trend column makes this DataFrame invalid. I fixed it by filling the column with np.nan.

So:

import pandas as pd
import numpy as np

data = pd.DataFrame({'stcode': ['001', '002', '001', '002', '001', '002', '001', '002', '001', '002'], 
                     'trade_dt': ['20170101', '20170101', '20170102', '20170102', '20170103', '20170103', '20170104', '20170104', '20170105', '20170105'],
                     'close': [1, 3, 5, 1, 2, 3, 5, 1, 2, 2],
                     'trend': np.nan})

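# Difference from the previous row's close (NaN on the first row)
data['diff'] = data['close'].diff()
# Mark rises and falls; all other rows keep their initial NaN
data.loc[data['diff'] > 0, 'trend'] = 1
data.loc[data['diff'] < 0, 'trend'] = -1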
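
With this approach, rows where the difference is zero (and the first row, where it is undefined) are left as NaN rather than 0. One way to cover all three cases in a single step is np.select with a default value. This is only a sketch of that alternative, not part of the original answer, and it keeps the same previous-row comparison used above.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame({
    'stcode': ['001', '002', '001', '002', '001', '002', '001', '002', '001', '002'],
    'trade_dt': ['20170101', '20170101', '20170102', '20170102', '20170103',
                 '20170103', '20170104', '20170104', '20170105', '20170105'],
    'close': [1, 3, 5, 1, 2, 3, 5, 1, 2, 2],
})

# Difference from the previous row's close (NaN on the first row)
diff = data['close'].diff()

# The first matching condition wins; rows matching neither condition
# (a difference of 0, or NaN) fall through to the default of 0.
data['trend'] = np.select([diff > 0, diff < 0], [1, -1], default=0)
```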

Answer 2: (score: 0)

Here you go, but Yorian's answer above is the better one: it does not loop over every record in the DataFrame, so it is more efficient.

import pandas

data = pandas.DataFrame({
    'stcode': ['001', '002', '001', '002', '001', '002', '001', '002', '001', '002'],
    'trade_dt': ['20170101', '20170101', '20170102', '20170102', '20170103',
                 '20170103', '20170104', '20170104', '20170105', '20170105'],
    'close': [1, 3, 5, 1, 2, 3, 5, 1, 2, 2],
})
data['trend'] = 0  # default when the next close is equal, or there is no next row
for i in data.index:
    if i + 1 in data.index:
        # Compare the next row's close with the current row's close
        if data.loc[i + 1, 'close'] > data.loc[i, 'close']:
            data.loc[i, 'trend'] = 1
        elif data.loc[i + 1, 'close'] < data.loc[i, 'close']:
            data.loc[i, 'trend'] = -1