I use the code below to combine all my CSV files; each file has 10,000 rows:
billing_report_2014-02-01.csv, billing_report_2014-02-02.csv, and so on:
fout = open("out.csv", "a")
# days 1-9 need a leading zero in the file name
for num in range(1, 10):
    print(num)
    for line in open("billing_report_2014-02-0" + str(num) + ".csv"):
        fout.write(line)
# days 10-28
for num in range(10, 29):
    print(num)
    for line in open("billing_report_2014-02-" + str(num) + ".csv"):
        fout.write(line)
fout.close()
But now I want to add a new date column to out.csv. How can I add the date column so that every row I append from billing_report_2014-02-01 gets the value "2014-02-01", and every row I append from billing_report_2014-02-02 gets the value "2014-02-02"?
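Answer 0 (score: 3)
Make a list of the file names you want to process, then take the date from each name, build a generator over the input file that strips the trailing newline, and add a new field with the date... for example: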
filenames = [
    'billing_report_2014-02-01.csv',
    'billing_report_2014-02-02.csv'
]
with open('out.csv', 'w') as fout:
    for filename in filenames:
        # 'billing_report_2014-02-01.csv' -> '2014-02-01'
        to_append = filename.rpartition('_')[2].partition('.')[0]
        with open(filename) as fin:
            # strip the line ending, then append the date as a new last field
            fout.writelines('{},{}\n'.format(line.rstrip('\r\n'), to_append) for line in fin)
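If the files may contain quoted fields, the same idea can be sketched with the csv module instead of string formatting. This is only a sketch under assumptions: it targets Python 3, the names date_from_filename and combine_with_date are made up for illustration, and it assumes every file name contains a date in YYYY-MM-DD form and that the files have no header row.

```python
import csv
import os
import re

# Hypothetical pattern: matches the 'YYYY-MM-DD' part of a report file name.
DATE_RE = re.compile(r'(\d{4}-\d{2}-\d{2})')

def date_from_filename(filename):
    # Look only at the base name so directory names cannot interfere.
    match = DATE_RE.search(os.path.basename(filename))
    if match is None:
        raise ValueError('no date found in %r' % filename)
    return match.group(1)

def combine_with_date(filenames, out_path):
    # newline='' lets the csv module control line endings (Python 3).
    with open(out_path, 'w', newline='') as fout:
        writer = csv.writer(fout)
        for filename in filenames:
            date = date_from_filename(filename)
            with open(filename, newline='') as fin:
                for row in csv.reader(fin):
                    # Append the date as a new last column.
                    writer.writerow(row + [date])
```

It would be called as combine_with_date(filenames, 'out.csv') with the same filenames list as above.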
Answer 1 (score: 2)
I think you can simply append the date at the end of each line:
for line in open("billing_report_2014-02-0"+str(num)+".csv"):
    # strip the line ending first, otherwise the date lands on the next line
    fout.write(line.rstrip('\r\n') + ',DATE INFORMATION\n')
I'm assuming your CSV really is comma-separated; if it is tab-separated, the character should be \t instead.
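To make the separator easy to switch, the append step could be wrapped in a small helper. This is just a sketch; append_field and its delimiter parameter are names I made up for illustration.

```python
def append_field(line, value, delimiter=','):
    # Strip only the line ending, so trailing empty fields are preserved,
    # then add the new field and put the newline back.
    return line.rstrip('\r\n') + delimiter + value + '\n'
```

For a tab-separated file you would call append_field(line, '2014-02-01', '\t').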
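You can also use an intermediate step by modifying the line first:
line = line.rstrip('\r\n') + ',DATE INFORMATION\n'
Since you want to add the date from the file name, just build it from the loop variable (this is for days 1-9; for days 10-28 use '2014-02-' without the extra zero):
line = line.rstrip('\r\n') + ',2014-02-0' + str(num) + '\n'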
If the stray comma always occurs in the string ",LLC", you can use the replace function; see the example below:
>>> string = "100, 90101, California, Example company,LLC, other data"
>>> string.replace(',LLC',';LLC')
'100, 90101, California, Example company;LLC, other data'
>>>
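As an alternative to swapping the comma for a semicolon, and only if you control how the rows are written, the csv module can quote a field that contains a comma so it survives a round trip. A rough sketch, assuming Python 3; to_csv_line is a made-up helper name:

```python
import csv
import io

def to_csv_line(fields):
    # csv.writer quotes any field that contains the delimiter.
    buf = io.StringIO()
    csv.writer(buf, lineterminator='\n').writerow(fields)
    return buf.getvalue()

line = to_csv_line(['100', 'Example company,LLC', 'other data'])
# Reading the line back yields the original three fields.
fields = next(csv.reader([line]))
```

This does not help with input files that already contain unquoted commas; for those, the replace approach above still applies.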
Putting it all together, and taking some inspiration from @Jon Clements (kudos!):
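def combine_and_add_date(year, month, startday, endday, replace_dict):
    # Append each daily report to out.csv, adding its date as a last column.
    with open("out.csv", "a") as fout:
        for num in range(startday, endday + 1):
            daynum = str(num).zfill(2)  # zero-pad single-digit days: 1 -> '01'
            date_info = year + '-' + month + '-' + daynum
            source_name = 'billing_report_' + date_info + '.csv'
            with open(source_name) as fin:
                for line in fin:
                    # strip the line ending so the date stays on the same row
                    line = line.rstrip('\r\n')
                    for key in replace_dict:
                        # str.replace returns a new string, so reassign it
                        line = line.replace(key, replace_dict[key])
                    fout.write(line + ',' + date_info + '\n')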
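I hope this works (bear in mind I'm a newbie...). You should use it like this; note that the dictionary is there to let you make all kinds of replacements: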
combine_and_add_date("2014","02",1,28, {',LLC': ';LLC', ',PLC':';PLC'})
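If the reports ever span more than one month, the date strings and file names could be generated with datetime instead of a day counter. Again just a sketch; report_files is a made-up name, and it assumes the file naming pattern from the question.

```python
from datetime import date, timedelta

def report_files(start, end):
    # Yield (date_string, file_name) for each day from start to end inclusive.
    day = start
    while day <= end:
        stamp = day.isoformat()  # 'YYYY-MM-DD', already zero-padded
        yield stamp, 'billing_report_{}.csv'.format(stamp)
        day += timedelta(days=1)
```

For example, list(report_files(date(2014, 2, 1), date(2014, 2, 28))) gives the 28 February pairs, and the same call works across a month boundary.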
Fingers crossed.