I'm trying to convert a log file to CSV using Python (mainly this example):
import csv
with open('/home/user/Downloads/my.log') as file:
    lines = file.read().splitlines()
lines = [lines[x:x+2] for x in range(0, len(lines), 3)]
with open('test.csv', 'w+') as csvfile:
    w = csv.writer(csvfile)
    w.writerows(lines)
print("done")
My original log file looks like this:
2017-08-09 -> 11:30:01
Temp=29.0* Humidity=30.0%
2017-08-09 -> 11:40:01
Temp=29.0* Humidity=33.0%
With the sample code above, I can convert it to this format:
2017-08-08 -> 08:50:01,Temp=33.0* Humidity=38.0%
2017-08-08 -> 09:00:01,Temp=37.0* Humidity=40.0%
But I need my final CSV to look like this:
2017-08-08,08:50:01,33.0*,38.0%
2017-08-08,09:00:01,37.0*,40.0%
I tried lines = lines.replace("->", ",") and got:
AttributeError: 'list' object has no attribute 'replace'
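As far as I understand, Python cannot replace text in a file that is being processed in memory. So what should I do? What method can I use to clean up the final text?
My Python knowledge is not advanced; I'm still learning. So please correct me if there is a mistake or I missed a step.
Thanks in advance.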
Answer 0 (score: 2)
Update: full modified code
with open('/home/user/Downloads/my.log') as file, open("output.csv", "w") as outfile:
    lines = file.readlines()
    # Join each timestamp line with its reading line; every third line is skipped
    lines = [','.join([i.strip() for i in lines[x:x+2]]) for x in range(0, len(lines), 3)]
    lines = [i.replace("->",",").replace(" ","").replace("Temp=","").replace("Humidity=",",") for i in lines]
    # Each write needs its own newline, otherwise all rows end up on one line
    outfile.write("Date,Time,Temp,Humidity\n")
    for line in lines:
        outfile.write(line + "\n")
Output:
Date,Time,Temp,Humidity
2017-08-08,08:50:01,33.0*,38.0%
2017-08-08,09:00:01,37.0*,40.0%
Working example:
import io
# The blank line between records is an assumption: the step of 3 below relies on it.
datafile = """2017-08-09 -> 11:30:01
Temp=29.0* Humidity=30.0%

2017-08-09 -> 11:40:01
Temp=29.0* Humidity=33.0%"""
lines = io.StringIO(datafile).readlines()
lines = [','.join([i.strip() for i in lines[x:x+2]]) for x in range(0, len(lines), 3)]
lines = [i.replace("->",",").replace(" ","").replace("Temp=","").replace("Humidity=",",") for i in lines]
print(lines)
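The slicing above depends on exactly one blank line following each record. As a hedged alternative, the sketch below drops empty lines first and then pairs the remaining lines two at a time, so the spacing between records no longer matters. The function name `log_to_rows` and the inline sample text are made up for illustration; they are not part of the original answer.

```python
def log_to_rows(text):
    """Turn two-line log records into [date, time, temp, humidity] rows."""
    # Drop blank lines so the pairing does not depend on record spacing.
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    rows = []
    # Pair each timestamp line with the reading line that follows it.
    for stamp, reading in zip(lines[0::2], lines[1::2]):
        date, time = [part.strip() for part in stamp.split("->")]
        temp, humidity = reading.split()
        rows.append([
            date,
            time,
            temp.replace("Temp=", ""),
            humidity.replace("Humidity=", ""),
        ])
    return rows


# Hypothetical sample in the layout shown in the question.
sample = (
    "2017-08-09 -> 11:30:01\n"
    "Temp=29.0* Humidity=30.0%\n"
    "\n"
    "2017-08-09 -> 11:40:01\n"
    "Temp=29.0* Humidity=33.0%\n"
)
rows = log_to_rows(sample)
print(rows)
```

Returning lists of fields rather than pre-joined strings means the result can be passed straight to csv.writer(...).writerows(rows).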
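Answer 1 (score: 2)
This creates a list of strings:
lines = file.read().splitlines()
And this then creates a list of lists of strings:
lines = [lines[x:x+2] for x in range(0, len(lines), 3)]
replace works on strings, not on lists. There are several ways to deal with that: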
# 1. do replace on original string, before splitting.
lines = file.read().replace("->", ",")
# 2. or do replace on elements of list, before creating list of lists
lines = file.read().splitlines()
lines = [i.replace("->",",") for i in lines]
# 3. or replace on each element in list of lists
# (not implemented)
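The third option is left unimplemented above. One way it could look, as a rough sketch: a nested list comprehension that calls replace on every string inside every inner list. The variable names `raw` and `pairs` and the sample text (with an assumed blank line between records) are illustrative only.

```python
# Hypothetical sample; a blank line between records is assumed,
# which is what the step of 3 in the slicing relies on.
raw = (
    "2017-08-09 -> 11:30:01\n"
    "Temp=29.0* Humidity=30.0%\n"
    "\n"
    "2017-08-09 -> 11:40:01\n"
    "Temp=29.0* Humidity=33.0%"
)

lines = raw.splitlines()
pairs = [lines[x:x+2] for x in range(0, len(lines), 3)]

# 3. replace on each string inside each inner list
pairs = [[item.replace("->", ",") for item in pair] for pair in pairs]
print(pairs)
```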
Answer 2 (score: 1)
You are calling replace as if it were a list method, but lines is a list containing strings. To transform each string, use map or a list comprehension:
lines = [i.replace("->",",") for i in lines]
or
lines = map(lambda x: x.replace("->",","), lines)
You can chain the replacements to get the final result:
lines = map(lambda x: x.replace("->",",").replace("Temp=", "").replace(" Humidity=", ",").replace(" ", ""),
lines)
Here is a live example.
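Two details are worth spelling out when applying the chained version. On Python 3, map returns a lazy iterator rather than a list, so it is wrapped in list() here. And csv.writer expects each row to be a sequence of fields; a plain string passed as a row would be written one character per column. The sketch below assumes the two lines of each record have already been joined with a comma; the names `joined` and `clean` are hypothetical, and an in-memory buffer stands in for the output file.

```python
import csv
import io

# Hypothetical input: each record already joined into one string.
joined = [
    "2017-08-08 -> 08:50:01,Temp=33.0* Humidity=38.0%",
    "2017-08-08 -> 09:00:01,Temp=37.0* Humidity=40.0%",
]


def clean(line):
    """Apply the chained replacements from the answer to one record."""
    return (line.replace("->", ",")
                .replace("Temp=", "")
                .replace(" Humidity=", ",")
                .replace(" ", ""))


# list() forces evaluation; on Python 3 map alone is a lazy iterator.
cleaned = list(map(clean, joined))

# Split into fields so csv.writer gets one list per row, not one string.
rows = [line.split(",") for line in cleaned]

buffer = io.StringIO()
csv.writer(buffer, lineterminator="\n").writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)
```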
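Answer 3 (score: 1)
Are you open to using regular expressions? If so, use the following pattern to extract groups 1, 2, 3 and 4 from each string:
(\d{4}-\d{2}-\d{2}) -> (\d{2}:\d{2}:\d{2}),Temp=([0-9.]+)\*\s+Humidity=([0-9.]+)
Use the re library: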
import re
with open('/home/user/Downloads/my.log') as file:
    lines = file.read().splitlines()
# Join each timestamp line with its reading line so the pattern sees one string
lines = [','.join(lines[x:x+2]) for x in range(0, len(lines), 3)]
# Raw string, so the backslashes reach the regex engine unchanged
pattern = r'(\d{4}-\d{2}-\d{2}) -> (\d{2}:\d{2}:\d{2}),Temp=([0-9.]+)\*\s+Humidity=([0-9.]+)'
with open("output.csv", "w") as f:
    f.write("Date,Time,Temp,Humidity\n")
    print("Date,Time,Temp,Humidity")
    for line in lines:
        m = re.search(pattern, line)
        if m is None:
            continue  # skip records that do not match the expected layout
        row = "{}, {}, {}, {}".format(m.group(1), m.group(2), m.group(3), m.group(4))
        f.write(row + "\n")
        print(row)
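To see what the pattern captures without touching any files, it can be tried on a single joined record. The sketch below is a variation rather than the answer's exact pattern: it uses named groups (the names `date`, `time`, `temp` and `humidity` are chosen here for readability) and also consumes the trailing %. Note that, like the original pattern, it captures only the numbers, so the * and % from the log do not appear in the fields.

```python
import re

# Variation on the answer's pattern with named groups (names are assumptions).
RECORD = re.compile(
    r"(?P<date>\d{4}-\d{2}-\d{2}) -> (?P<time>\d{2}:\d{2}:\d{2}),"
    r"Temp=(?P<temp>[0-9.]+)\*\s+Humidity=(?P<humidity>[0-9.]+)%"
)

# Hypothetical joined record in the layout shown in the question.
sample = "2017-08-08 -> 08:50:01,Temp=33.0* Humidity=38.0%"

match = RECORD.search(sample)
fields = match.groupdict() if match else {}
print(fields)

# A line that does not follow the layout yields None rather than an error,
# which is why the result should be checked before calling group().
bad = RECORD.search("not a log record")
print(bad)
```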