Here is my data:
3116,97.208.97.100,13123721655,1806,919831533099,5/3/2016 11:29:22 PM,300000372932,2070200100101,919831533099,D,1,274,5/3/2016 11:29:55 PM,5/3/2016 11:34:04 PM,249,26,0,NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN,0,,2,131286102781596,0,0,4701,2190,60,10950,4701,249,5/3/2016 11:34:04 PM,200,NPS_AS7,97.56.30.7,5/3/2016 11:29:22PM,1,7145,10.236.51.243,0,0,0,3,1,3,765292,776242,0,0,0,0,0,0,0,0,0,0,0,0,0,0,N,,
I want to split column 31 (5/3/2016 11:34:04 PM) into two columns, date and time (fields[30].split(" ")[0]+","+fields[30].split(" ")[1]), so that I can partition by date, but it throws an "index out of range" error.
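One thing worth noting: splitting that field on a single space actually yields three tokens on the sample row, because the AM/PM marker is also separated by a space. A quick check in isolation:

```python
# The 31st field (index 30) of the sample row above.
ts = "5/3/2016 11:34:04 PM"

parts = ts.split(" ")
print(parts)  # ['5/3/2016', '11:34:04', 'PM']

# Joining only tokens [0] and [1] silently drops the "PM" marker.
print(parts[0] + "," + parts[1])  # 5/3/2016,11:34:04
```

So on this sample row indices [0] and [1] both exist and do not raise an error on their own.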
Here is my code:
import os
import datetime as da
cdr_path = "/home/rosa/CDR/cgi/"
dest_path = "/home/rosa/CDR/cgo/"
date = da.datetime.today().strftime("%d%m%Y_%H")
out_file = dest_path+"hadoop_output_"+date+".log"
def preparefile(cdr_path, dest_path):
    with open(out_file, "w") as fo:  # open output file for writing
        for filename in os.listdir(cdr_path):  # list all CDR files for reading
            file_name = cdr_path + filename
            with open(file_name, "r") as fi:
                for line in fi:
                    fields = line.rstrip("\n").split(",")
                    outline = ",".join(fields[:30])
                    # split column 31 into date and time
                    outline += "," + fields[30].split(" ")[0] + "," + fields[30].split(" ")[1] + ","
                    outline += ",".join(fields[31:63]) + "\n"
                    fo.write(outline)
            os.remove(file_name)

preparefile(cdr_path, dest_path)
I don't know why it doesn't split. Can someone help me figure it out?
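One possibility I have not yet ruled out (this is an assumption, not confirmed against my files): a blank or truncated line in one of the input files. The slice fields[:30] never raises on a short list, but the direct index fields[30] does. A minimal sketch of that failure mode:

```python
# Assumption: one of the CDR files contains a blank (or very short) line.
line = "\n"  # hypothetical trailing blank line

fields = line.rstrip("\n").split(",")
print(len(fields))     # 1 -- splitting "" still yields one element
print(fields[:30])     # [''] -- slicing past the end is safe

try:
    fields[30]         # direct indexing past the end is not safe
except IndexError as e:
    print("IndexError:", e)
```

If that is the cause, skipping lines with fewer than the expected number of fields before indexing would avoid the error.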