I have more than 500 CSV files (in one folder) that share the same table structure and hold annual electricity load data. I want to concatenate all of them, but because every file has the same timestamp index, I need to add a USER ID column as the first column so that I can build a MultiIndex from USER ID and timestamp.
Sample data for one consumer, USER ID 0_sb_171:
[Image: Sample data for USER 0_sb_171]
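This is what the resulting DataFrame should look like:
[Image: Sample data with two consumers 0_sb_171 & 1_sb_183]
I am trying to use map and pd.concat. I created a function, read_and_add_filename, that takes a file path as input, reads it as a CSV, and adds the USER ID as a column. All the CSVs read this way are then concatenated into one pandas DataFrame.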
def read_and_add_filename(input_t):
    df = pd.read_csv(input_t)
    df['USER ID'] = Path(input_t).stem
#Main code
import pandas as pd
import os
import glob
from pathlib import Path
from dt_str_to_dt import read_and_add_filename
input_csv_path = ('C:/Users/zebaa/Dropbox/India.Indo/DER/Storage coordination/CSV files/User load csv')
#reads in all files present at input_csv_path, runs function and
df_concat = pd.concat(map( read_and_add_filename, glob.glob(os.path.join(input_csv_path, "*-id_*.csv"))))
I get the following error:
Traceback (most recent call last):
  File "C:\Users\zebaa\Documents\Python code\Data_visualization.py", line 10, in <module>
    df_concat = pd.concat(map( read_and_add_filename, glob.glob(os.path.join(input_csv_path, "*-id_*.csv"))))
  File "C:\Users\zebaa\AppData\Roaming\Python\Python37\site-packages\pandas\core\reshape\concat.py", line 228, in concat
    copy=copy, sort=sort)
  File "C:\Users\zebaa\AppData\Roaming\Python\Python37\site-packages\pandas\core\reshape\concat.py", line 280, in __init__
    raise ValueError('All objects passed were None')
ValueError: All objects passed were None
ValueError: All objects passed were None
[Finished in 30.0s]
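
A likely cause, judging only from the code shown: read_and_add_filename has no return statement, so it returns None for every file, and pd.concat then receives nothing but None values. The sketch below shows one way the function could return the frame and build the (USER ID, timestamp) MultiIndex. The column name `timestamp`, the helper `concat_user_files`, and the default pattern `*.csv` are assumptions made for illustration, since the real CSV header is not shown.

```python
import glob
import os
from pathlib import Path

import pandas as pd


def read_and_add_filename(input_t, time_col="timestamp"):
    """Read one CSV and tag every row with the file stem as USER ID.

    time_col is an assumed column name; replace it with the real
    timestamp column of the load files.
    """
    df = pd.read_csv(input_t)
    # Put USER ID in front as the first column, taken from the file name.
    df.insert(0, "USER ID", Path(input_t).stem)
    # Returning the frame is the key change: without it the function
    # yields None for every file.
    return df.set_index(["USER ID", time_col])


def concat_user_files(folder, pattern="*.csv"):
    """Hypothetical helper: concatenate every matching CSV in folder."""
    paths = sorted(glob.glob(os.path.join(folder, pattern)))
    return pd.concat(map(read_and_add_filename, paths))
```

With the original setup this would be called as `concat_user_files(input_csv_path, "*-id_*.csv")`. If the glob pattern matches no files, pd.concat raises a different ValueError ("No objects to concatenate"), which helps tell the two problems apart.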