Traversing multiple dataframes simultaneously

Time: 2016-09-07 13:28:20

Tags: python pandas dataframe

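I have three dataframes for three users, all with the same column names: time, compass data, accelerometer data, gyroscope data, and camera panning information. I want to traverse all the dataframes simultaneously to check which user performed camera panning at a particular time, and return that user (that is, the dataframe in which panning was detected at that time). I tried using dask to achieve parallelism, but in vain. Below is my code.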

import pandas as pd
import glob
import numpy as np
import math
from scipy.signal import butter, lfilter
order=3
fs=30
cutoff=4.0

data=[]
gx=[]
gy=[]
g_x2=[]
g_y2=[]


dataList = glob.glob(r'C:\Users\chaitanya\Desktop\Thesis\*.csv')
for csv in dataList:
    data.append(pd.read_csv(csv))
for i in range(0, len(data)):
    data[i] = data[i].groupby("Time").agg(lambda x: x.value_counts().index[0])
    data[i].reset_index(level=0, inplace=True)

def butter_lowpass(cutoff,fs,order=5):
    # Normalise the cutoff by the Nyquist frequency (half the sampling rate).
    nyq=0.5 * fs
    nor=cutoff / nyq
    b,a=butter(order,nor,btype='low', analog=False)
    return b,a
def lowpass_filter(data,cutoff,fs,order=5):
    b,a=butter_lowpass(cutoff,fs,order=order)
    y=lfilter(b,a,data)
    return y

for i in range(0,len(data)):
    gx.append(lowpass_filter(data[i]["Gyro_X"],cutoff,fs,order))
    gy.append(lowpass_filter(data[i]["Gyro_Y"],cutoff,fs,order))

    g_x2.append(gx[i]*gx[i])
    g_y2.append(gy[i]*gy[i])


g_rad=[[] for _ in range(len(data))]
g_ang=[[] for _ in range(len(data))]

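for i in range(0,len(data)):
    for j in range(0,len(data[i])):
        # atan2 covers the full -180..180 degree range; atan(gy/gx) only
        # spans -90..90, so the 180-degree check further down could never match.
        g_ang[i].append(math.degrees(math.atan2(gy[i][j],gx[i][j])))

    data[i]["Ang"]=g_ang[i]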


panning=[[] for _ in range(len(data))]
for i in range(0,len(data)):
    for j in data[i]["Ang"]:
        if 0-30<=j<=0+30:
            panning[i].append("Panning")
        elif 180-30<=j<=180+30:
            panning[i].append("left")
        else:
            panning[i].append("None")
    data[i]["Panning"]=panning[i]
result=[[] for _ in range(len(data))]
for i in range (0,len(data)):
    result[i].append(data[i].loc[data[i]['Panning']=='Panning','Ang'])

1 Answer:

Answer 0: (score: 1)

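I'm going to assume you want to traverse the dataframes simultaneously in time. In any case, you want all three dataframes to be indexed along the dimension you want to traverse.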
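
As a minimal sketch of that indexing step, assuming each user's dataframe has a `Time` column as in the question: the variable name `user1` and the sample values below are made up for illustration, and only the column names `Time` and `Gyro_X` come from the question.

```python
import pandas as pd

# Hypothetical data for one user; the values are invented for this sketch.
user1 = pd.DataFrame({
    "Time": ["2016-08-31 00:00:03", "2016-08-31 00:00:01"],
    "Gyro_X": [0.4, 0.1],
})

# Parse the timestamps and make them the index, so that rows from
# different users can later be aligned on time.
user1["Time"] = pd.to_datetime(user1["Time"])
user1 = user1.set_index("Time").sort_index()

print(user1)
```

Doing the same for each user's dataframe gives them a common time index to align on.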

I'll generate 3 dataframes whose rows represent random seconds within a 9-second period.

Then I'll align them with pd.concat and ffill, so that any gap can be filled with the last known data.

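import numpy as np
import pandas as pd

# 10 timestamps one second apart ('s' is the seconds frequency alias).
seconds = pd.date_range('2016-08-31', periods=10, freq='s')

n = 6
ssec = seconds.to_series()
sidx = ssec.sample(n).index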

df1 = pd.DataFrame(np.random.randint(1, 10, (n, 3)),
                   ssec.sample(n).index.sort_values(),
                   ['compass', 'accel', 'gyro'])

df2 = pd.DataFrame(np.random.randint(1, 10, (n, 3)),
                   ssec.sample(n).index.sort_values(),
                   ['compass', 'accel', 'gyro'])

df3 = pd.DataFrame(np.random.randint(1, 10, (n, 3)),
                   ssec.sample(n).index.sort_values(),
                   ['compass', 'accel', 'gyro'])

# Sort by time before forward-filling so each gap takes the latest earlier value.
df4 = pd.concat([df1, df2, df3], axis=1, keys=['df1', 'df2', 'df3']).sort_index().ffill()
df4


You can then walk through the rows with iterrows()

for tstamp, row in df4.iterrows():
    print(tstamp)
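
To connect this back to the question, here is a rough sketch of how the same alignment could report which users are panning at each timestamp. It assumes each user's dataframe already has a time index and a `Panning` column holding the labels from the question. The names `user1`, `user2`, `user3`, `labels`, and `who`, and all the sample values, are hypothetical.

```python
import pandas as pd

# Hypothetical per-user "Panning" labels indexed by time (values made up).
t = pd.to_datetime(["2016-08-31 00:00:00",
                    "2016-08-31 00:00:01",
                    "2016-08-31 00:00:02"])
user1 = pd.DataFrame({"Panning": ["Panning", "None", "None"]}, index=t)
user2 = pd.DataFrame({"Panning": ["None", "Panning"]}, index=t[[0, 2]])
user3 = pd.DataFrame({"Panning": ["None", "None", "Panning"]}, index=t)

# One column per user, aligned on time; gaps take the last known label.
labels = pd.concat(
    {"user1": user1["Panning"],
     "user2": user2["Panning"],
     "user3": user3["Panning"]},
    axis=1,
).sort_index().ffill()

# True wherever a user is labelled as panning at that timestamp.
is_panning = labels.eq("Panning")

# Walk the aligned rows and collect the panning users per timestamp.
who = {ts: [user for user, flag in row.items() if flag]
       for ts, row in is_panning.iterrows()}

for ts, users in who.items():
    print(ts, users)
```

Whether forward-filling a label is appropriate depends on how long a gap you are willing to bridge; leaving out `.ffill()` would treat missing timestamps as "not panning" instead.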