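I am trying to read 6 files into 7 different dataframes, but I can't figure out how to do it. The file names can be completely arbitrary: I know what the files are called, but they don't follow a pattern like data1.csv, data2.csv.
I tried something like this: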
import sys
import os
import numpy as np
import pandas as pd
from datetime import datetime, timedelta
f1='Norway.csv'
f='Canada.csv'
f='Chile.csv'
Norway = pd.read_csv(Norway.csv)
Canada = pd.read_csv(Canada.csv)
Chile = pd.read_csv(Chile.csv )
I need to read multiple files into separate dataframes. When I process a single file, it works fine:
file='Norway.csv'
Norway = pd.read_csv(file)
I get this error message:
NameError: name 'norway' is not defined
Answer 0: (score: 1)
You can read all the .csv files into a single dataframe.
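import glob
import pandas as pd
all_files = glob.glob("*.csv")  # assumption: the CSV files are in the current directory
dfs = []
for file_ in all_files:
    df = pd.read_csv(file_, index_col=None, header=0)
    dfs.append(df)
# concatenate all dfs into one
big_df = pd.concat(dfs, ignore_index=True)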
Then split the big dataframe into several (7 in your case). For example:
import numpy as np
num_chunks = 3
df1,df2,df3 = np.array_split(big_df,num_chunks)
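Note that np.array_split cuts the rows into roughly equal chunks, so the pieces will not necessarily line up with the original files. If one dataframe per file is what you want back, a possible sketch is to label each frame while concatenating and then group on that label. The frames and the level names "source" and "row" below are made-up stand-ins for illustration, not part of the original question.

```python
import pandas as pd

# Hypothetical stand-ins for the frames read from Norway.csv and Canada.csv
frames = {
    "Norway": pd.DataFrame({"value": [1, 2, 3]}),
    "Canada": pd.DataFrame({"value": [10, 20]}),
}

# Passing a dict to pd.concat uses its keys as the outer index level
big_df = pd.concat(frames, names=["source", "row"])

# Split back into one frame per source file
per_source = {
    name: group.droplevel("source")
    for name, group in big_df.groupby(level="source")
}
```

Each entry of per_source then holds only the rows that came from that file, whatever the file sizes were.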
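Hope this helps.
Answer 1: (score: 0)
After googling for a while, I decided to combine the answers to several different questions into a solution for this one. This solution will not work for every possible case; you will have to adapt it to your own situation.
Check out the solution to this question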
# import libraries
import pandas as pd
import numpy as np
import glob
import os
# Declare a function for extracting a string between two characters
def find_between( s, first, last ):
    try:
        start = s.index( first ) + len( first )
        end = s.index( last, start )
        return s[start:end]
    except ValueError:
        return ""
path = '/path/to/folder/containing/your/data/sets' # use your path
all_files = glob.glob(path + "/*.csv")
list_of_dfs = [pd.read_csv(filename, encoding = "ISO-8859-1") for filename in all_files]
list_of_filenames = [find_between(filename, 'sets/', '.csv') for filename in all_files] # sets is the last word in your path
# Create a dictionary with table names as the keys and data frames as the values
dfnames_and_dfvalues = dict(zip(list_of_filenames, list_of_dfs))
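Once the dictionary exists, each dataframe is looked up by its file name, for example dfnames_and_dfvalues['Norway']. As a possible variation, the name can be taken from the path itself with pathlib instead of searching for the text between 'sets/' and '.csv', which avoids depending on the folder name. The helper read_csv_folder and the demo files below are hypothetical, written only to keep the sketch self-contained.

```python
import tempfile
from pathlib import Path

import pandas as pd


def read_csv_folder(folder):
    """Return {file name without extension: DataFrame} for every .csv in folder."""
    return {p.stem: pd.read_csv(p) for p in sorted(Path(folder).glob("*.csv"))}


# Demo on a throwaway folder containing two made-up files
with tempfile.TemporaryDirectory() as tmp:
    pd.DataFrame({"value": [1, 2]}).to_csv(Path(tmp, "Norway.csv"), index=False)
    pd.DataFrame({"value": [3]}).to_csv(Path(tmp, "Chile.csv"), index=False)
    dfs_by_name = read_csv_folder(tmp)

# Look up a dataframe by the name of the file it came from
norway = dfs_by_name["Norway"]
```

Keeping the frames in a dictionary rather than in separate variables means the code does not need to change when files are added or renamed.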