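For my experiments, I have three time-series datasets with different characteristics, all in the following format: the first column is the timestamp and the second column is the value.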
0.086206438,10
0.086425551,12
0.089227066,20
0.089262508,24
0.089744425,30
0.090036815,40
0.090054172,28
0.090377569,28
0.090514071,28
0.090762872,28
0.090912691,27
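For reproducibility, I have shared the three time-series datasets I am using here.
Working on column 2, I want to read the current row and compare it with the value in the previous row. If it is greater, I keep comparing. If the current value is smaller than the previous row's value, I want to divide the current (smaller) value by the previous (larger) value. To make this concrete: in the sample records above, the seventh row (28) is smaller than the sixth row (40), so the result would be 28/40 = 0.7.
Here is my sample code.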
import csv

import numpy as np
import pandas as pd
import scipy.stats
import matplotlib.pyplot as plt
import seaborn as sns
from scipy.stats import norm
from statsmodels.graphics.tsaplots import plot_acf, acf

protocols = {}
types = {"data1": "data1.csv", "data2": "data2.csv", "data3": "data3.csv"}

for protname, fname in types.items():
    col_time = []
    col_window = []
    with open(fname, mode='r', encoding='utf-8-sig') as f:
        reader = csv.reader(f, delimiter=",")
        for i in reader:
            col_time.append(float(i[0]))
            col_window.append(int(i[1]))
    col_time, col_window = np.array(col_time), np.array(col_window)
    diff_time = np.diff(col_time)
    diff_window = np.diff(col_window)
    diff_time = diff_time[diff_window > 0]
    diff_window = diff_window[diff_window > 0]  # To keep only the increased values
    protocols[protname] = {
        "col_time": col_time,
        "col_window": col_window,
        "diff_time": diff_time,
        "diff_window": diff_window,
    }

# Plot the quotient values
for protname, fname in types.items():
    col_time, col_window = protocols[protname]["col_time"], protocols[protname]["col_window"]
    rt = np.exp(np.diff(np.log(col_window)))
    plt.plot(np.diff(col_time), rt, ".", markersize=4, label=protname, alpha=0.1)
    plt.ylim(0, 1.0001)
    plt.xlim(0, 0.003)
    plt.title(protname)
    plt.xlabel("time")
    plt.ylabel("difference")
    plt.legend()
    plt.show()
This gives me the following plot.
However, when I do
rt = np.exp(np.diff(np.log(col_window)))
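it divides every row by the previous row, which is not what I want. As explained in the example above, I only want to divide the current row's value in column 2 by the previous row's value in column 2 when the current value is smaller than the previous one. Finally, I want to plot the quotients against the timestamp differences (col_time in my code above). How can I fix this?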
Answer 0 (score: 2)
Unless you specifically need the csv module, I would recommend using the numpy function loadtxt to load the file, i.e.
col_time,col_window = np.loadtxt(fname,delimiter=',').T
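This one line replaces the first 8 lines of the for loop. Note that the transpose (.T) is necessary to turn the original data shape (N rows by 2 columns) into a 2-row by N-column shape that can be unpacked into col_time and col_window. Also note that loadtxt automatically loads the data into numpy.ndarray objects. By default these are float arrays, so col_window holds floats rather than ints.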
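As a minimal sketch of the loading step, the snippet below reads a few rows of the sample data from an in-memory buffer instead of a file. The `sample` buffer is a hypothetical stand-in for `data1.csv`; `unpack=True` is an existing `loadtxt` parameter that performs the transpose for you.

```python
import io

import numpy as np

# Hypothetical in-memory stand-in for one of the CSV files
# (the first three rows of the sample data from the question).
sample = io.StringIO(
    "0.086206438,10\n"
    "0.086425551,12\n"
    "0.089227066,20\n"
)

data = np.loadtxt(sample, delimiter=",")  # shape (3, 2): N rows, 2 columns
col_time, col_window = data.T             # transpose to (2, 3), then unpack the two rows

# Equivalent: let loadtxt transpose the result itself.
sample.seek(0)
col_time2, col_window2 = np.loadtxt(sample, delimiter=",", unpack=True)

print(data.shape, col_window.dtype)
```

Either form should work for the real files by passing the file name in place of `sample`.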
As for your actual question, I would use slicing and masking:
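trailing_window = col_window[:-1]  # "past" values at a given index
leading_window = col_window[1:]  # "current" values at a given index
decreasing_mask = leading_window < trailing_window
quotient = leading_window[decreasing_mask] / trailing_window[decreasing_mask]
# The mask has N - 1 entries, so apply it to the N - 1 timestamp differences
# (what the question plots against); col_time itself has N entries.
quotient_times = np.diff(col_time)[decreasing_mask]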
quotient can then be plotted against quotient_times.
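To check the slicing-and-masking idea against the question's own numbers, here is a small worked example on the eleven sample rows. It assumes, as the question asks, that each quotient is paired with the time step `np.diff(col_time)`; the names `all_ratios` and `quotient_dt` are introduced here for illustration only.

```python
import numpy as np

# The eleven sample rows from the question.
col_time = np.array([0.086206438, 0.086425551, 0.089227066, 0.089262508,
                     0.089744425, 0.090036815, 0.090054172, 0.090377569,
                     0.090514071, 0.090762872, 0.090912691])
col_window = np.array([10, 12, 20, 24, 30, 40, 28, 28, 28, 28, 27], dtype=float)

trailing_window = col_window[:-1]  # previous-row values
leading_window = col_window[1:]    # current-row values
decreasing_mask = leading_window < trailing_window

# Ratio for every consecutive pair: the same thing the original
# np.exp(np.diff(np.log(col_window))) expression computes.
all_ratios = leading_window / trailing_window

# Keep only the pairs where the value dropped.
quotient = all_ratios[decreasing_mask]
quotient_dt = np.diff(col_time)[decreasing_mask]

print(quotient)     # two values: 28/40 and 27/28
print(quotient_dt)  # the time steps over which those two drops happened
```

Only two of the ten consecutive pairs survive the mask: the drop from 40 to 28 and the drop from 28 to 27. Repeated values (28 followed by 28) are excluded because the comparison is strict.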
An alternative is to use the numpy function where to get the indices at which the mask is True:
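trailing_window = col_window[:-1]  # "past" values at a given index
leading_window = col_window[1:]  # "current" values at a given index
decreasing_inds = np.where(leading_window < trailing_window)[0]
quotient = leading_window[decreasing_inds] / trailing_window[decreasing_inds]
# Index the timestamp differences, which is what the question plots against;
# col_time[decreasing_inds] would give the absolute time of the previous row instead.
quotient_times = np.diff(col_time)[decreasing_inds]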
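As a quick sanity check, the sketch below shows on the question's sample values that the boolean mask and the indices from `np.where` select the same elements. `np.flatnonzero` is an existing numpy function used here only for comparison; it is not part of the original answer.

```python
import numpy as np

col_window = np.array([10, 12, 20, 24, 30, 40, 28, 28, 28, 28, 27], dtype=float)
trailing_window = col_window[:-1]
leading_window = col_window[1:]

decreasing_mask = leading_window < trailing_window
decreasing_inds = np.where(decreasing_mask)[0]

# Each index is the position of the *previous* (larger) row of a drop;
# adding 1 gives the position of the smaller, current row.
print(decreasing_inds)  # [5 9]

# Both selections pick the same elements.
same = np.array_equal(leading_window[decreasing_mask],
                      leading_window[decreasing_inds])
```

The mask form is slightly shorter; the index form is handy if you also need to know where in the series each drop occurred.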
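Keep in mind that all of the code above still happens inside the first for loop, but what was rt is now computed inside that loop as quotient. So, after computing quotient_times, do the plotting (also inside the first loop):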
# Next line opens a new, empty figure window
plt.figure()
# Updated plotting call using the names from this answer
plt.plot(quotient_times, quotient, '.', ms=4, label=protname, alpha=0.1)
plt.ylim(0, 1.0001)
plt.xlim(0, 0.003)
plt.title(protname)
plt.xlabel("time")
plt.ylabel("quotient")
plt.legend()
# You may not need this `plt.show()` line
plt.show()
# To save the figure, one option would be the following:
# plt.savefig(protname+'.png')
Note that you may need to move the plt.show() line out of the loop.
Putting it all together:
import numpy as np
import matplotlib.pyplot as plt

protocols = {}
types = {"data1": "data1.csv", "data2": "data2.csv", "data3": "data3.csv"}

for protname, fname in types.items():
    col_time, col_window = np.loadtxt(fname, delimiter=',').T
    trailing_window = col_window[:-1]  # "past" values at a given index
    leading_window = col_window[1:]  # "current" values at a given index
    decreasing_inds = np.where(leading_window < trailing_window)[0]
    quotient = leading_window[decreasing_inds] / trailing_window[decreasing_inds]
    # Timestamp differences for the rows where the value dropped
    quotient_times = np.diff(col_time)[decreasing_inds]
    # Still save the values in case computation needs to happen later
    # in the script
    protocols[protname] = {
        "col_time": col_time,
        "col_window": col_window,
        "quotient_times": quotient_times,
        "quotient": quotient,
    }
    # Next line opens a new, empty figure window
    plt.figure()
    plt.plot(quotient_times, quotient, ".", markersize=4, label=protname, alpha=0.1)
    plt.ylim(0, 1.0001)
    plt.xlim(0, 0.003)
    plt.title(protname)
    plt.xlabel("time")
    plt.ylabel("quotient")
    plt.legend()
    # To save the figure, one option would be the following:
    # plt.savefig(protname+'.png')

# This may still be unnecessary, especially if called as a script
# (just save the plots to `png`).
plt.show()