Concatenating dataframes one by one

Date: 2018-09-17 19:37:34

Tags: python pandas dataframe concatenation

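The code below reads the files, stores each one in a dataframe, then concatenates all of them and resamples the data per second after concatenation. This uses too much memory, so I want to do it step by step instead. For example, I read two files, concatenate them, and resample the result. Then I read the next file, concatenate it with the result from the first two, resample again, and so on, for 10 files. How should I change the code? Can anyone help me? Here is my code: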

import pandas as pd
import os
#import matplotlib.pyplot as plt
#df1 = pd.read_hdf("E:\examples\hdf files\conew1.h5", 'df')
#df2 = pd.read_hdf("E:\examples\hdf files\conew2.h5", 'df')
#df3 = pd.read_hdf("E:\examples\hdf files\conew3.h5", 'df')
hdfdirectory = "E:\examples\hdf files"
number_of_dfs=1
df=None
for fi in os.listdir(hdfdirectory):

    hdfpath =  os.path.join(hdfdirectory, fi)
    print hdfpath
    df1 = pd.read_hdf(hdfpath, 'df')

    for i in range(number_of_dfs):
        if df is None:
            df=pd.DataFrame({'timestamp':df1.timestamp , 'url' : df1.url})
            dft  = df.set_index('timestamp').resample('S').count()
        else:
            temp=pd.DataFrame({'timestamp':df1.timestamp , 'url' :df1.url})
            tempt  = temp.set_index('timestamp').resample('S').count()
            df=pd.concat([dft,tempt])

1 answer:

Answer 0 (score: 0)

I tried to create an example to illustrate my point. You might have to tweak it a little, but you will get the idea.

import os
import pandas as pd

hdfdirectory = r"E:\examples\hdf files"
df = None  # running per-second URL counts for all files read so far
for fi in os.listdir(hdfdirectory):

    hdfpath = os.path.join(hdfdirectory, fi)
    print(hdfpath)
    df1 = pd.read_hdf(hdfpath, 'df')

    # Count URLs per second for this file only
    temp = pd.DataFrame({'timestamp': df1.timestamp, 'url': df1.url})
    tempt = temp.set_index('timestamp').resample('s').count()

    if df is None:
        # First file: its counts start the running result
        df = tempt
    else:
        # Later files: append this file's counts to the running result
        df = pd.concat([df, tempt])
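
One thing to watch in the loop above: it appends each file's per-second counts to the running result, so if two files cover the same second, that second ends up as two separate rows. Below is a minimal sketch of one way to handle this, assuming the counts for a shared second should be added together. The helper name `add_file_counts` is hypothetical, and the two small in-memory frames stand in for the HDF files, which are not available here.

```python
import pandas as pd


def add_file_counts(running, frame):
    """Fold one file's rows into the running per-second URL counts.

    running: None, or a DataFrame of counts indexed by second.
    frame:   one file's raw rows with 'timestamp' and 'url' columns.
    """
    # Per-second counts for this file only
    counts = frame[["timestamp", "url"]].set_index("timestamp").resample("s").count()
    if running is None:
        return counts
    # Concatenate, then resample again so that a second present in both
    # pieces is summed into a single row instead of appearing twice.
    return pd.concat([running, counts]).sort_index().resample("s").sum()


# Two synthetic "files" that overlap at 19:37:35
file1 = pd.DataFrame({
    "timestamp": pd.to_datetime(
        ["2018-09-17 19:37:34", "2018-09-17 19:37:34", "2018-09-17 19:37:35"]),
    "url": ["a", "b", "c"],
})
file2 = pd.DataFrame({
    "timestamp": pd.to_datetime(["2018-09-17 19:37:35", "2018-09-17 19:37:37"]),
    "url": ["d", "e"],
})

running = None
for frame in (file1, file2):
    running = add_file_counts(running, frame)

print(running)
```

Only the running counts and the current file are held in memory at any time, which is the point of doing the work step by step. In the real loop, `frame` would be the result of `pd.read_hdf(hdfpath, 'df')`.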