Python: list the lines of a text file and remove lines that match a duplicate value

Time: 2015-02-27 15:46:23

Tags: python

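I am currently iterating over a text file and getting the output below. For my script to work, I want to remove the duplicate lines that contain, for example, 181, and keep only one of them; see the example below.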

Log file to parse.

{"id": "242", "status": 61313850, "time": "2015-02-26T08:46:14.070298", "item": 181, }
{"id": "242", "status": 61313850, "time": "2015-02-26T08:46:14.070298", "item": 181, }
{"id": "242", "status": 61313850, "time": "2015-02-26T08:46:14.070298", "item": 181, }
{"id": "242", "status": 61313850, "time": "2015-02-26T08:46:14.070298", "item": 181, }
{"id": "242", "status": 61313850, "time": "2015-02-26T08:46:14.070298", "item": 181, }
{"id": "242", "status": 61313851, "time": "2015-02-26T08:46:14.070298", "item": 180, }

Python code.

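#!/usr/bin/env python

with open("tras.json") as infile:
    for line in infile:
        # Whitespace-split tokens 4-5 are the "time" key and its value.
        if "time" in line:
            time = line.split()[4:6]

        # Tokens 6-7 are the "item" key and its value.
        if "item" in line:
            item = line.split()[6:8]
            print(time + item)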

Current output.

['"time":', '"2015-02-26T08:46:14.070298",', '"item":', '181,']
['"time":', '"2015-02-26T08:46:14.070298",', '"item":', '181,']
['"time":', '"2015-02-26T08:46:14.070298",', '"item":', '181,']
['"time":', '"2015-02-26T08:46:14.070298",', '"item":', '181,']
['"time":', '"2015-02-26T08:46:14.070298",', '"item":', '181,']
['"time":', '"2015-02-26T08:46:14.070298",', '"item":', '180,']

Desired output.

['"time":', '"2015-02-26T08:46:14.070298",', '"item":', '181,']
['"time":', '"2015-02-26T08:46:14.070298",', '"item":', '180,']

Cheers,

Philip

1 answer:

Answer 0: (score: 0)

A complete answer would require knowing more about your problem domain, but I hope this sample code is useful:

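# dataList is assumed to hold the time + item lists built in the question.
foundNumbers = set()  # item values seen so far
clearedData = []      # first row found for each distinct item value
for dataItem in dataList:
    # The last element of each row is the item value, e.g. '181,'.
    if dataItem[-1] not in foundNumbers:
        foundNumbers.add(dataItem[-1])
        clearedData.append(dataItem)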
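
To tie the snippet above back to the question, here is a minimal sketch that applies the same set-based check while the lines are being read. The helper name `unique_by_item` and the in-memory `sample` list are illustrative assumptions, not part of the original post, and the sketch assumes every line has the same token layout as the log shown in the question.

```python
def unique_by_item(lines):
    """Yield the time + item tokens once per distinct item value.

    Hypothetical helper: assumes each line splits on whitespace into the
    same token positions as the log lines in the question.
    """
    seen_items = set()
    for line in lines:
        # Skip lines that lack either field.
        if "time" not in line or "item" not in line:
            continue
        # Tokens 4-5 are the "time" pair, tokens 6-7 are the "item" pair.
        record = line.split()[4:8]
        item_value = record[-1]
        # Keep only the first row seen for each item value.
        if item_value not in seen_items:
            seen_items.add(item_value)
            yield record


# Sample data copied from the log in the question (shortened to three lines).
sample = [
    '{"id": "242", "status": 61313850, "time": "2015-02-26T08:46:14.070298", "item": 181, }',
    '{"id": "242", "status": 61313850, "time": "2015-02-26T08:46:14.070298", "item": 181, }',
    '{"id": "242", "status": 61313851, "time": "2015-02-26T08:46:14.070298", "item": 180, }',
]

for record in unique_by_item(sample):
    print(record)
```

Because an open file is itself an iterable of lines, the file handle from `open("tras.json")` can be passed to the helper in place of `sample`. The first row for each item value is kept and the original order is preserved; if a different row should win (for example the latest timestamp), the check would need to change.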