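I am currently iterating over a text file and getting the output below. For my script to work properly, I want to remove the duplicate lines that contain, for example, 181 and keep just one of them; see the example below.
The log file to parse: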
{"id": "242", "status": 61313850, "time": "2015-02-26T08:46:14.070298", "item": 181, }
{"id": "242", "status": 61313850, "time": "2015-02-26T08:46:14.070298", "item": 181, }
{"id": "242", "status": 61313850, "time": "2015-02-26T08:46:14.070298", "item": 181, }
{"id": "242", "status": 61313850, "time": "2015-02-26T08:46:14.070298", "item": 181, }
{"id": "242", "status": 61313850, "time": "2015-02-26T08:46:14.070298", "item": 181, }
{"id": "242", "status": 61313851, "time": "2015-02-26T08:46:14.070298", "item": 180, }
Python code:
#!/usr/bin/env python
with open("tras.json") as infile:
    for line in infile:
        if "time" in line:
            time = line.split()[4:6]
        if "item" in line:
            item = line.split()[6:8]
        print(time + item)
Current output:
['"time":', '"2015-02-26T08:46:14.070298",', '"item":', '181,']
['"time":', '"2015-02-26T08:46:14.070298",', '"item":', '181,']
['"time":', '"2015-02-26T08:46:14.070298",', '"item":', '181,']
['"time":', '"2015-02-26T08:46:14.070298",', '"item":', '181,']
['"time":', '"2015-02-26T08:46:14.070298",', '"item":', '181,']
['"time":', '"2015-02-26T08:46:14.070298",', '"item":', '180,']
Desired output:
['"time":', '"2015-02-26T08:46:14.070298",', '"item":', '181,']
['"time":', '"2015-02-26T08:46:14.070298",', '"item":', '180,']
Cheers,
Philip
Answer 0 (score: 0)
A complete answer would require knowing more about your problem domain, but I hope this sample code is useful:
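# dataList is assumed to hold the parsed rows (e.g. the time + item lists from the question)
foundNumbers = set()    # item values that have already been seen
clearedData = list()    # rows kept: the first occurrence of each item value
for dataItem in dataList:
    # the last element of each row is the item value, e.g. '181,'
    if dataItem[-1] not in foundNumbers:
        foundNumbers.add(dataItem[-1])
        clearedData.append(dataItem)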
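Applied to the log lines in the question, the same idea could look roughly like the sketch below. It rests on a few assumptions: the function name `dedupe_by_item` and the `sample` list are made up for illustration, every line is assumed to have the same whitespace-separated layout as in the question, and it is written for Python 3.

```python
def dedupe_by_item(lines):
    """Yield the time/item fields of each line, skipping repeated item values."""
    seen_items = set()  # item values already emitted
    for line in lines:
        fields = line.split()
        # Skip lines that do not carry both fields (assumed layout, see lead-in).
        if '"time":' not in fields or '"item":' not in fields:
            continue
        row = fields[4:8]   # ['"time":', <time value>, '"item":', <item value>]
        item = row[-1]      # e.g. '181,'
        if item not in seen_items:
            seen_items.add(item)
            yield row


# Hypothetical in-memory sample mirroring the log file in the question.
sample = [
    '{"id": "242", "status": 61313850, "time": "2015-02-26T08:46:14.070298", "item": 181, }',
    '{"id": "242", "status": 61313850, "time": "2015-02-26T08:46:14.070298", "item": 181, }',
    '{"id": "242", "status": 61313851, "time": "2015-02-26T08:46:14.070298", "item": 180, }',
]

for row in dedupe_by_item(sample):
    print(row)
```

Because the function accepts any iterable of lines, an open file object (for example the `infile` from the question) can be passed in place of `sample`. Only the first line seen for each item value is kept, so if the timestamps differ between duplicates, the earliest line in the file wins.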