I am reading multiple CSV files and combining them into a single CSV file. The expected result of the merge looks like this:
0 4 6 8 10 12
1 2 5 4 2 1
5 3 0 1 5 10
....
In the code below, I intend the column labels to run 0, 4, 6, 8, 10, 12.
for indx, file in enumerate(files_File1):
    if file.endswith('csv'):  # only process CSV files in the designated folder
        filepath = os.path.join(folder_File1, file)  # full path to the file
        current = pd.read_csv(filepath, header=None)  # read it with no header row
        if indx == 0:
            mydata_File1 = current.copy()
            mydata_File1.columns.values[1] = 4
            print(mydata_File1.columns.values)
        else:
            mydata_File1[2*indx+4] = current.iloc[:,1]
            print(mydata_File1.columns.values)
However, the result looks like this, with the column labels running 0, 2, 4, 6, 8, 10, 12.
0 4 2 6 8 10 12
1 2 5 4 2 1
5 3 0 1 5 10
....
I'm not sure what is causing the column named "2".
Any ideas?
Answer 0 (score: 0)
If you really just want to concatenate the .csv files, you don't need pandas.
#! python3
import glob

folder_File1 = r"C:\Users\Public\Documents\Python\CombineCSVFiles"
csv_only = r"\*.csv"
files_File1 = glob.glob(f'{folder_File1}{csv_only}')
new_csv = f'{folder_File1}\\newcsv.csv'

lines = []
for file in files_File1:
    with open(file) as filein:
        if filein.name == new_csv:
            pass  # skip the output file left over from a previous run
        else:
            for line in filein:
                line = line.strip()  # or some other preprocessing
                lines.append(line)  # storing everything in memory!

# write all collected lines out, one per row
with open(new_csv, 'w') as out_file:
    out_file.writelines(line + '\n' for line in lines)
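
The "storing everything in memory" caveat above can be avoided by writing each line as soon as it is read. What follows is only a rough sketch of that idea, not a drop-in replacement: `merge_csv_streaming` is a name I made up, the demo files `a.csv` and `b.csv` and their contents are hypothetical, and unlike the code above it strips only line endings rather than all surrounding whitespace.

```python
import glob
import os
import tempfile


def merge_csv_streaming(folder, out_name="newcsv.csv"):
    """Append every *.csv in `folder` to `out_name`, one line at a time."""
    out_path = os.path.join(folder, out_name)
    # List the inputs first and drop the output file, so a rerun never reads its own output.
    inputs = sorted(
        p for p in glob.glob(os.path.join(folder, "*.csv"))
        if os.path.abspath(p) != os.path.abspath(out_path)
    )
    with open(out_path, "w", newline="") as out_file:
        for path in inputs:
            with open(path, newline="") as filein:
                for line in filein:
                    # Normalise the line ending so a file without a trailing
                    # newline does not run into the next file's first row.
                    out_file.write(line.rstrip("\r\n") + "\n")
    return out_path


# Small demo in a temporary folder (hypothetical file names and contents).
with tempfile.TemporaryDirectory() as demo_folder:
    for name, text in [("a.csv", "1,2\n5,3\n"), ("b.csv", "1,5\n5,0")]:
        with open(os.path.join(demo_folder, name), "w") as f:
            f.write(text)
    with open(merge_csv_streaming(demo_folder)) as f:
        merged_text = f.read()

print(merged_text)
```

Only one line is held in memory at a time, so this should scale to inputs that would not fit in a list.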
Answer 1 (score: 0)
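If you do need pandas for some reason, you can use the code below. Note that your code references mydata_File1.columns.values, which holds the column names, not the values stored in the columns. If this does not answer your question, please provide more complete details, as requested in @juanpa.arrivillaga's comment.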
#! python3
import glob
import pandas as pd

folder_File1 = r"C:\Users\Public\Documents\Python\CombineCSVFiles"
csv_only = r"\*.csv"
files_File1 = glob.glob(f'{folder_File1}{csv_only}')
new_csv = f'{folder_File1}\\newcsv.csv'

frames = []
for file in files_File1:
    if file == new_csv:
        continue  # skip the output file left over from a previous run
    current = pd.read_csv(file, header=None)  # read one CSV from the designated folder
    print(current)
    frames.append(current)

# DataFrame.append was removed in pandas 2.0; pd.concat stacks the rows instead
mydata_File1 = pd.concat(frames, ignore_index=True)
print(mydata_File1.values)
mydata_File1.to_csv(new_csv)
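
If the goal is the side-by-side layout from the question rather than stacked rows, here is a minimal sketch under some assumptions. I can't say for certain where the "2" label comes from, but writing into `columns.values` in place is not a supported way to rename a column, so the sketch uses `rename` instead. The in-memory strings in `csv_texts` are hypothetical stand-ins for the files on disk, and `combined` is just an illustrative name.

```python
import io

import pandas as pd

# Hypothetical stand-ins for the CSV files: two columns each, no header row.
csv_texts = ["1,2\n5,3\n", "1,5\n5,0\n", "1,4\n5,1\n"]

combined = None
for indx, text in enumerate(csv_texts):
    current = pd.read_csv(io.StringIO(text), header=None)
    if indx == 0:
        # rename() returns a frame with new column labels; nothing is mutated in place.
        combined = current.rename(columns={1: 4})
    else:
        # Same labelling rule as in the question: 6, 8, 10, ...
        combined[2 * indx + 4] = current.iloc[:, 1]

print(combined.columns.tolist())
print(combined)
```

Assigning a whole new list with `combined.columns = [...]` would work as well; the point is to replace the labels through pandas rather than editing the array behind them.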