Replacing strings when writing from a log to CSV

Time: 2017-08-09 07:20:51

Tags: python string csv replace

I am trying to convert a log file to CSV using Python (mainly based on this example):

import csv

with open('/home/user/Downloads/my.log') as file:
    lines = file.read().splitlines()
    lines = [lines[x:x+2] for x in range(0, len(lines), 3)]

    with open('test.csv', 'w+') as csvfile:
        w = csv.writer(csvfile)
        w.writerows(lines)
        print("done")

My original log file looks like this:

2017-08-09 -> 11:30:01
Temp=29.0*  Humidity=30.0%

2017-08-09 -> 11:40:01
Temp=29.0*  Humidity=33.0%

With the example code above, I can convert it to this format:

2017-08-08 -> 08:50:01,Temp=33.0*  Humidity=38.0%
2017-08-08 -> 09:00:01,Temp=37.0*  Humidity=40.0%

But I need my final CSV to look like this:

2017-08-08,08:50:01,33.0*,38.0%, 
2017-08-08,09:00:01,37.0*,40.0%

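I tried lines = lines.replace("->",",") and got:

AttributeError: 'list' object has no attribute 'replace'

As far as I understand, Python cannot replace text in a file that is being processed in memory. So what should I do? What method can I use to clean up the final text?

My Python knowledge is not advanced and I am still learning, so if I made a mistake or missed a step, please correct me.

Thanks in advance.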

4 answers:

Answer 0 (score: 2)

Update: full modified code:

with open('/home/user/Downloads/my.log') as file, open("output.csv", "w") as outfile:
    lines = file.readlines()
    # Join each two-line record (timestamp line + reading line); the blank third line is skipped
    lines = [','.join([i.strip() for i in lines[x:x+2]]) for x in range(0, len(lines), 3)]
    # Turn "->" into a comma, drop spaces and labels, and separate temperature from humidity
    lines = [i.replace("->",",").replace(" ","").replace("Temp=","").replace("Humidity=",",") for i in lines]

    outfile.write("Date,Time,Temp,Humidity\n")

    for line in lines:
        # The records were stripped above, so the line break has to be added back
        outfile.write(line + "\n")

Output:

Date,Time,Temp,Humidity
2017-08-08,08:50:01,33.0*,38.0%
2017-08-08,09:00:01,37.0*,40.0%

Working example:

import io

datafile="""2017-08-09 -> 11:30:01
Temp=29.0*  Humidity=30.0%

2017-08-09 -> 11:40:01
Temp=29.0*  Humidity=33.0%"""

lines = io.StringIO(datafile).readlines()
lines = [','.join([i.strip() for i in lines[x:x+2]]) for x in range(0, len(lines), 3)]
lines = [i.replace("->",",").replace(" ","").replace("Temp=","").replace("Humidity=",",") for i in lines]
lines

Returns:

['2017-08-09,11:30:01,29.0*,30.0%', '2017-08-09,11:40:01,29.0*,33.0%']
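If you would rather keep using csv.writer as in the question, one option is to build each row as a list of separate fields instead of one pre-joined string. This matters because csv.writer puts quotes around any single field that itself contains a comma. The sketch below is only an illustration: parse_record is a hypothetical helper name, the sample text is copied from the question, and an in-memory io.StringIO stands in for the real output file.

```python
import csv
import io


def parse_record(stamp_line, reading_line):
    """Turn one two-line log record into [date, time, temp, humidity]."""
    date, time = [part.strip() for part in stamp_line.split("->")]
    temp, humidity = reading_line.split()
    return [date, time, temp.replace("Temp=", ""), humidity.replace("Humidity=", "")]


# Sample data copied from the question.
log_text = """2017-08-09 -> 11:30:01
Temp=29.0*  Humidity=30.0%

2017-08-09 -> 11:40:01
Temp=29.0*  Humidity=33.0%"""

lines = log_text.splitlines()
# Records are three lines apart: timestamp, readings, blank separator.
rows = [parse_record(lines[x], lines[x + 1]) for x in range(0, len(lines), 3)]

buffer = io.StringIO()  # assumption: stands in for open("output.csv", "w", newline="")
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow(["Date", "Time", "Temp", "Humidity"])
writer.writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)
```

This assumes every record has exactly the two-line layout shown in the question; a record with a missing line would need extra checks.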

Answer 1 (score: 2)

This creates a list of strings:

lines = file.read().splitlines()

This then creates a list of lists of strings:

lines = [lines[x:x+2] for x in range(0, len(lines), 3)]

replace works on strings, not on lists. There are several ways to fix this:

# 1. do replace on original string, before splitting.
lines = file.read().replace("->", ",") 

# 2. or do replace on elements of list, before creating list of lists
lines = file.read().splitlines()
lines = [i.replace("->",",") for i in lines]

# 3. or replace on each element in list of lists 
# (not implemented)
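For completeness, option 3 could look roughly like the sketch below. It is only an illustration: replace_in_groups is a made-up helper name, and the sample lines are copied from the question rather than read from the real log file.

```python
def replace_in_groups(groups, old, new):
    """Apply str.replace to every string inside a list of lists."""
    return [[item.replace(old, new) for item in group] for group in groups]


# Sample log lines copied from the question.
lines = [
    "2017-08-09 -> 11:30:01",
    "Temp=29.0*  Humidity=30.0%",
    "",
    "2017-08-09 -> 11:40:01",
    "Temp=29.0*  Humidity=33.0%",
]

# Same grouping as in the question: two lines per record, skip the blank third.
groups = [lines[x:x+2] for x in range(0, len(lines), 3)]
cleaned = replace_in_groups(groups, " -> ", ",")
print(cleaned[0])
```

The inner loop is what was missing in the question: replace is called on each string, never on the list itself.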

Answer 2 (score: 1)

You are calling replace as a list method, but lines is a list containing strings. To transform it, use a list comprehension or map:

lines = [i.replace("->",",") for i in lines]

lines = map(lambda x: x.replace("->",","), lines)

You can chain the transformations to get the final result:

lines = map(lambda x: x.replace("->",",").replace("Temp=", "").replace("  Humidity=", ",").replace(" ", ""), 
            lines)

Here is a live example.
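One thing to watch for if you are on Python 3: map returns a lazy iterator there, not a list, so it can only be consumed once and cannot be indexed. A small sketch of the chained version, assuming the two lines of a record have already been joined with a comma (clean is a hypothetical helper name):

```python
def clean(line):
    """Chain the replacements from the answer on one joined record."""
    return (line.replace("->", ",")
                .replace("Temp=", "")
                .replace("  Humidity=", ",")
                .replace(" ", ""))


# One record from the question, with its two lines already joined by a comma.
joined = "2017-08-09 -> 11:30:01,Temp=29.0*  Humidity=30.0%"

lazy = map(clean, [joined])  # Python 3: an iterator, nothing has run yet
result = list(lazy)          # materialize it before indexing or reusing
print(result)
```

Wrapping the call in list() gives the same kind of object as the list comprehension version.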

Answer 3 (score: 1)

Are you open to using regular expressions? If so, use the following pattern to extract groups 1, 2, 3 and 4 from each string:

(\d{4}-\d{2}-\d{2}) -> (\d{2}:\d{2}:\d{2}),Temp=([0-9.]+)\*\s+Humidity=([0-9.]+)

see it working here

You would use the re library:

import re

with open('/home/user/Downloads/my.log') as file:
    lines = file.read().splitlines()
    # Join each two-line record into one string so the pattern can match it
    lines = [','.join(lines[x:x+2]) for x in range(0, len(lines), 3)]

    # Raw string so the backslashes reach the regex engine unchanged
    pattern = r'(\d{4}-\d{2}-\d{2}) -> (\d{2}:\d{2}:\d{2}),Temp=([0-9.]+)\*\s+Humidity=([0-9.]+)'

    with open("output.csv", "w") as f:
        f.write("Date,Time,Temp,Humidity\n")
        print("Date,Time,Temp,Humidity")
        for line in lines:
            m = re.search(pattern, line)
            if m is None:
                continue  # skip records that do not match the expected layout
            f.write("{},{},{},{}\n".format(m.group(1), m.group(2), m.group(3), m.group(4)))
            print("{},{},{},{}".format(m.group(1), m.group(2), m.group(3), m.group(4)))
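To check what the four groups contain before touching any files, the pattern can be tried on a single joined line. This is just a sketch: extract_fields is a hypothetical helper name, and the sample string is one converted line from the question.

```python
import re

# Same pattern as above, compiled once and split over two lines for readability.
PATTERN = re.compile(
    r"(\d{4}-\d{2}-\d{2}) -> (\d{2}:\d{2}:\d{2}),"
    r"Temp=([0-9.]+)\*\s+Humidity=([0-9.]+)"
)


def extract_fields(joined_line):
    """Return (date, time, temp, humidity), or None if the line does not match."""
    m = PATTERN.search(joined_line)
    return m.groups() if m else None


sample = "2017-08-08 -> 08:50:01,Temp=33.0*  Humidity=38.0%"
fields = extract_fields(sample)
print(fields)

# A line that does not follow the log layout yields None instead of raising.
no_match = extract_fields("not a log record")
```

Note that the groups capture only the numbers, so the trailing * and % from the log are dropped. If you need them in the CSV, as in the desired output in the question, they would have to be added back when writing.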