Question

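I want to merge several CSV files into a single data frame. The code below does what I want, but only the last file ends up merged. How can I change the code so that all the files are merged?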

import pandas as pd

V_timeSeries = pd.read_csv('timeSeries.csv')

#merge each file with time series on centiseconds
raw_files=['labelled_raw1.csv','labelled_raw2.csv','labelled_raw3.csv','labelled_raw4.csv','labelled_raw5.csv','labelled_raw6.csv' ]
first=True
for file in raw_files:
    V_raw=pd.read_csv(file)
    V_walk = V_raw.merge(V_timeSeries, on='centiseconds', how='outer') 
V_walk = V_walk.fillna(method='ffill') #where the dataframes have been merged, many rows will have NA, so the value is taken from the previous filled row and copied down. Now each centisecond is labelled with an activity rather than only once every 3000 centiseconds.
V_walk = V_walk.loc[(V_walk['walking'] == 1) & (V_walk['imputed'] == 0) & (V_walk['moderate'] == 0) & (V_walk['sedentary'] == 0) & (V_walk['sleep'] == 0) & (V_walk['tasks-light'] == 0)]
V_walk = V_walk.drop(['acceleration', 'imputed', 'moderate', 'sedentary', 'sleep', 'tasks-light','MET'], axis=1)
print("walking_identified")

Answer 1

You keep overwriting V_walk on every pass through the loop, so only the result for the last file is kept.

for file in raw_files:
    V_raw = pd.read_csv(file)
    # merge into the accumulated frame instead of overwriting a temporary one
    V_timeSeries = V_timeSeries.merge(V_raw, on='centiseconds', how='outer')
V_walk = V_timeSeries  # if you want to use V_walk as the new name
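If the raw files all share the same columns, repeated merges leave pandas to disambiguate the overlapping columns with suffixes such as `_x` and `_y`. An alternative worth considering is to stack the raw files with `pd.concat` first and merge with the time series only once. The following is a minimal sketch under that assumption; the in-memory CSV strings and the name `V_raw_all` are hypothetical stand-ins for the real files, not part of the original code.

```python
import io
import pandas as pd

# Hypothetical stand-ins for timeSeries.csv and the labelled_raw*.csv files.
time_series_csv = io.StringIO(
    "centiseconds,acceleration\n0,0.1\n1,0.2\n2,0.3\n3,0.4\n"
)
raw_csvs = [
    io.StringIO("centiseconds,walking\n0,1\n"),
    io.StringIO("centiseconds,walking\n2,0\n"),
]

V_timeSeries = pd.read_csv(time_series_csv)

# Read every labelled file and stack the rows into one frame
# (assumes all raw files have the same columns).
V_raw_all = pd.concat([pd.read_csv(f) for f in raw_csvs], ignore_index=True)

# Merge once, then sort so that forward-filling follows time order.
V_walk = (
    V_raw_all.merge(V_timeSeries, on="centiseconds", how="outer")
    .sort_values("centiseconds")
    .reset_index(drop=True)
)

# Copy each label down to the unlabelled centiseconds that follow it.
V_walk["walking"] = V_walk["walking"].ffill()
print(V_walk)
```

With the real data, `raw_csvs` would be replaced by the `raw_files` list of file names, and the filtering and `drop` steps from the question would follow unchanged.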
