Pandas: reversing diff()

Time: 2018-04-18 15:00:42

Tags: python arrays python-3.x numpy

I have differenced my values, but I cannot use diffinv() to un-difference them:

    ds_sqrt=np.sqrt(ds)
    ds_sqrt=pd.DataFrame(ds_sqrt)
    ds_diff=ds_sqrt.diff().values

Can anyone tell me how to undo this?

5 answers:

Answer 0: (score: 2)

You can use diff_inv from pmdarima. Docs link

import numpy as np
import pandas as pd

# generating a random table
np.random.seed(10)
vals = np.random.randint(1, 10, 6)
df_t = pd.DataFrame({"a": vals})

# creating two columns with diff 1 and diff 2
df_t['dif_1'] = df_t.a.diff(1)
df_t['dif_2'] = df_t.a.diff(2)

df_t

    a   dif_1   dif_2
  0 5   NaN     NaN
  1 1   -4.0    NaN
  2 2   1.0    -3.0
  3 1   -1.0    0.0
  4 2   1.0     0.0
  5 9   7.0     8.0

Then create a function that returns an array with the diff inverted.

from pmdarima.utils import diff_inv

def inv_diff(df_orig_column, df_diff_column, periods):
    # Build the input for diff_inv: the first n values (n = periods) of the
    # original data, followed by the diff values for the given periods
    value = np.array(df_orig_column[:periods].tolist() + df_diff_column[periods:].tolist())

    # Invert the diff and drop the leading zeros that diff_inv prepends
    inv_diff_vals = diff_inv(value, periods, 1)[periods:]
    return inv_diff_vals

Usage example:

# df_orig_column - column with the original values
# df_diff_column - column with the differenced values
# periods - periods passed to pd.diff()
inv_diff(df_t.a, df_t.dif_2, 2)

Output:

array([5., 1., 2., 1., 2., 9.])
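If installing pmdarima is not an option, the same lagged inversion can be sketched with numpy alone. This is only an illustration under assumptions: the helper name `inv_diff_lagged` and its parameters are hypothetical, and it expects the first `periods` original values to be passed in explicitly.

```python
import numpy as np

def inv_diff_lagged(first_values, diffs, periods):
    """Undo a lag-`periods` difference (hypothetical helper, not a library API).

    first_values: the first `periods` original values
    diffs: the differenced values that follow the leading NaNs
    """
    # Put the known starting values in front of the differences.
    seeded = np.concatenate([np.asarray(first_values, dtype=float),
                             np.asarray(diffs, dtype=float)])
    restored = np.empty_like(seeded)
    # Positions i, i + periods, i + 2*periods, ... form an independent
    # lag-1 series, so a cumulative sum restores each one.
    for i in range(periods):
        restored[i::periods] = np.cumsum(seeded[i::periods])
    return restored

# Same numbers as the dif_2 column above: first two originals, then the diffs.
restored = inv_diff_lagged([5, 1], [-3.0, 0.0, 0.0, 8.0], periods=2)
print(restored)  # [5. 1. 2. 1. 2. 9.]
```

The strided slices avoid an explicit element-by-element loop over the data; only the `periods` interleaved sub-series are iterated.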

Answer 1: (score: 1)

You can do this with numpy. Algorithm courtesy of @Divakar.

Of course, you need to know the first item of the series for this to work.

df = pd.DataFrame({'A': np.random.randint(0, 10, 10)})
df['B'] = df['A'].diff()

x, x_diff = df['A'].iloc[0], df['B'].iloc[1:]
df['C'] = np.r_[x, x_diff].cumsum().astype(int)

#    A    B  C
# 0  8  NaN  8
# 1  5 -3.0  5
# 2  4 -1.0  4
# 3  3 -1.0  3
# 4  9  6.0  9
# 5  7 -2.0  7
# 6  4 -3.0  4
# 7  0 -4.0  0
# 8  8  8.0  8
# 9  1 -7.0  1
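Applied to the square-root case from the question, the same idea might look like the sketch below. The sample values in `ds` are made up for illustration, and it assumes the data itself contains no NaN, so the only NaN after diff() is the leading one.

```python
import numpy as np
import pandas as pd

ds = pd.Series([4.0, 9.0, 16.0, 25.0])   # made-up sample data
ds_sqrt = np.sqrt(ds)
ds_diff = ds_sqrt.diff()

# The first value has to be saved before differencing; diff() discards it.
first = ds_sqrt.iloc[0]

# Replace the leading NaN with the saved first value, then accumulate.
sqrt_restored = ds_diff.fillna(first).cumsum()

# Undo the square root as well to get back to the original scale.
ds_restored = sqrt_restored ** 2
print(ds_restored.tolist())
```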

Answer 2: (score: 0)

Here is a working example.

First, let's import the packages we need

import numpy as np
import pandas as pd

import pmdarima as pm

import matplotlib.pyplot as plt
import seaborn as sns
sns.set()

Then, let's create a simple discrete cosine wave

period = 5
cycles = 7
x = np.cos(np.linspace(0, 2*np.pi*cycles, period*cycles+1))
X = pd.DataFrame(x)

and plot it

fig, ax = plt.subplots(figsize=(12, 5))
ax.plot(X, marker='.')
ax.set(
    xticks=X.index
)
ax.axvline(0, color='r', ls='--')
ax.axvline(period, color='r', ls='--')
ax.set(
    title='Original data'
)
plt.show()


Note that the period is 5. Now let's remove this "seasonality" by differencing with a lag of 5

X_diff = X.diff(periods=period)
# NOTE: the first `period` observations
#       are needed for back transformation
X_diff.iloc[:period] = X[:period]

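Note that we have to keep the first period observations to allow the back transformation. If you don't want them inside the differenced data, you have to store them somewhere else and concatenate them back when you want to back-transform.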
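The "store them elsewhere and concatenate later" variant could be sketched as follows. To keep it self-contained it uses only pandas and numpy rather than pmdarima, and the names `head`, `seeded`, and `groups` are introduced here for illustration.

```python
import numpy as np
import pandas as pd

period = 5
cycles = 7
x = pd.Series(np.cos(np.linspace(0, 2 * np.pi * cycles, period * cycles + 1)))

# Store the first `period` observations separately ...
head = x.iloc[:period]
# ... so the differenced series contains no original values at all.
x_diff = x.diff(periods=period).iloc[period:]

# When back-transforming, put the stored head in front of the differences.
seeded = pd.concat([head, x_diff])

# Rows sharing the same position modulo `period` form one lag-1 series;
# a grouped cumulative sum restores each of them.
groups = np.arange(len(seeded)) % period
x_restored = seeded.groupby(groups).cumsum()

print(np.allclose(x_restored, x))
```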

fig, ax = plt.subplots(figsize=(12, 5))
ax.axvline(0, color='r', ls='--')
ax.axvline(period-1, color='r', ls='--')
ax.plot(X_diff, marker='.')
ax.annotate(
    'Keep these original data\nto allow back transformation',
    xy=(period-1, .5), xytext=(10, .5),
    arrowprops=dict(color='k')
)
ax.set(
    title='Transformed data'
)
plt.show()


Now let's back-transform the data with pmdarima.utils.diff_inv

X_diff_inv = pm.utils.diff_inv(X_diff, lag=period)[period:]

Note that we discard the first period results, which are 0 and not needed.

fig, ax = plt.subplots(figsize=(12, 5))
ax.axvline(0, color='r', ls='--')
ax.axvline(period-1, color='r', ls='--')
ax.plot(X_diff_inv, marker='.')
ax.set(
    title='Back transformed data'
)
plt.show()


Answer 3: (score: 0)

Reverse the diff in one line with pandas

import pandas as pd

df = pd.DataFrame([10, 15, 14, 18], columns = ['Age'])
df['Age_diff'] = df.Age.diff()

df['reverse_diff'] = df['Age'].shift(1) + df['Age_diff']

print(df)

    Age  Age_diff  reverse_diff
0   10       NaN           NaN
1   15       5.0          15.0
2   14      -1.0          14.0
3   18       4.0          18.0  

Answer 4: (score: -3)

df.cumsum()

Example:
data = {'a':[1,6,3,9,5], 'b':[13,1,2,5,23]}
df = pd.DataFrame(data)

df = 
    a   b
0   1   13
1   6   1
2   3   2
3   9   5
4   5   23

df.diff()

     a     b
0  NaN   NaN
1  5.0 -12.0
2 -3.0   1.0
3  6.0   3.0
4 -4.0  18.0

df.cumsum()

    a   b
0   1  13
1   7  14
2  10  16
3  19  21
4  24  44
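Note that the cumsum() output above is the running sum of the original frame, not the original frame itself. cumsum() only inverts diff() when it is applied to the differenced frame with its first row restored. A minimal sketch on the same data, with the variable names `d` and `restored` chosen here for illustration:

```python
import pandas as pd

data = {'a': [1, 6, 3, 9, 5], 'b': [13, 1, 2, 5, 23]}
df = pd.DataFrame(data)

d = df.diff()
# diff() leaves NaN in the first row; put the original first row back.
d.iloc[0] = df.iloc[0]

# The cumulative sum of the seeded differences rebuilds the original values.
restored = d.cumsum().astype(int)
print(restored)
```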