Python: check whether list items are in a text file

Time: 2014-03-09 12:23:05

Tags: python json text

import json

output_file = open("processed.txt", "a")

with open('text.json') as json_file:
    object_list = json.load(json_file)

receive_txids = []
for object in object_list:
    if object['category'] == 'receive':
        receive_txids.append(object['txid'])

with open('list.txt', "r") as list_file:
    for txid in receive_txids:
        if txid not in list_file:
            print("for ea txid " + txid)
            print("NEW TRANSACTION")
            output_file.write(txid + '\n')

This is my code.

These are the contents of list.txt:

db8cd79b4a0eafc6368bb65d2ff34d7d9c3d2016bee8528ab945a3d4bbad982d
2a93476e65b5500bc3d69856ddd512854d4939f1aabf488ac2806ec346a898a3
45b629a779e6e0bf6d160c37833a27f1f2cc1bfa34632d166cccae83e69eb6fe
bf5b7bc1aeaf7fb43f5d39b549278bee6665872bd74274dd5fad80d043002a3e
1e5f49fa1d0df059b9d7da8452cde9fb5a312c823401f5ed4ed4eafb5f98c1b0
7dc7cd4afcebaf8f17575be8b9acf06adcaadfe7fa5528453246307aa36e6ea0
aefdb89b461c118529bec78b35fed46cc5d7050b39902552fa2408361284c746
ec6abb67828c79cbf0b74131f0acfddc509efc9743bed0811d2316007cdcc482

text.json looks like this:

[
    {
        "account" : "",
        "address" : "D8xWhR8LqSdSLTxRWwouQ3EiSnvcjLmdo6",
        "category" : "receive",
        "amount" : 1000.00000000,
        "confirmations" : 1963,
        "blockhash" : "4569322b4c8c98fba3ef4c7bda91b53b4ee82d268eae2ff7658bc0d3753c00ff",
        "blockindex" : 2,
        "blocktime" : 1394242415,
        "txid" : "45b629a779e6e0bf6d160c37833a27f1f2cc1bfa34632d166cccae83e69eb6fe",
        "time" : 1394242265,
        "timereceived" : 1394242265
    },
    {
        "account" : "",
        "address" : "D8xWhR8LqSdSLTxRWwouQ3EiSnvcjLmdo6",
        "category" : "receive",
        "amount" : 11.00000000,
        "confirmations" : 1194,
        "blockhash" : "eff3d32177bf19629fe0f8076807acbb02b34aedcbce1c27a19ce9872daecb7c",
        "blockindex" : 6,
        "blocktime" : 1394290663,
        "txid" : "bf5b7bc1aeaf7fb43f5d39b549278bee6665872bd74274dd5fad80d043002a3e",
        "time" : 1394290582,
        "timereceived" : 1394290582
    },
    {
        "account" : "",
        "address" : "DKLMkLZmiSVXtEavDpQ4dasjZvC178QoM9",
        "category" : "receive",
        "amount" : 1.00000000,
        "confirmations" : 1183,
        "blockhash" : "7b5d3ebeb994dbff0940504db9e407bd90cad8a5a1ace05dcba4bc508ca27aff",
        "blockindex" : 9,
        "blocktime" : 1394291510,
        "txid" : "1e5f49fa1d0df059b9d7da8452cde9fb5a312c823401f5ed4ed4eafb5f98c1b0",
        "time" : 1394291510,
        "timereceived" : 1394291578
    },
    {
        "account" : "",
        "address" : "DKLMkLZmiSVXtEavDpQ4dasjZvC178QoM9",
        "category" : "receive",
        "amount" : 1.00000000,
        "confirmations" : 1179,
        "blockhash" : "4d9bd6d2988bc749022c41d125f1134796aa314e0d0bde34eba855ad88e76a7f",
        "blockindex" : 21,
        "blocktime" : 1394291642,
        "txid" : "7dc7cd4afcebaf8f17575be8b9acf06adcaadfe7fa5528453246307aa36e6ea0",
        "time" : 1394291629,
        "timereceived" : 1394291629
    },
    {
        "account" : "",
        "address" : "DKLMkLZmiSVXtEavDpQ4dasjZvC178QoM9",
        "category" : "receive",
        "amount" : 1.00000000,
        "confirmations" : 1179,
        "blockhash" : "4d9bd6d2988bc749022c41d125f1134796aa314e0d0bde34eba855ad88e76a7f",
        "blockindex" : 20,
        "blocktime" : 1394291642,
        "txid" : "aefdb89b461c118529bec78b35fed46cc5d7050b39902552fa2408361284c746",
        "time" : 1394291637,
        "timereceived" : 1394291637
    },
    {
        "account" : "",
        "address" : "DKLMkLZmiSVXtEavDpQ4dasjZvC178QoM9",
        "category" : "receive",
        "amount" : 11.00000000,
        "confirmations" : 34,
        "blockhash" : "df34d9d44e87cd3315755d3e7794b10729fc3f5853c218ec237c43a89d918eb7",
        "blockindex" : 5,
        "blocktime" : 1394364125,
        "txid" : "ec6abb67828c79cbf0b74131f0acfddc509efc9743bed0811d2316007cdcc482",
        "time" : 1394348464,
        "timereceived" : 1394348464
    }
]

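I can't for the life of me figure out why this doesn't work. It prints "NEW TRANSACTION" on every iteration.

I want to check, for each txid (transaction ID) in text.json, whether it already exists in list.txt. If it doesn't, I want to write it to processed.txt.

2 answers:

Answer 0 (score: 2)

File objects don't support containment tests in any useful way: `in` falls back to iterating over the lines, each of which still ends with a newline, and that iteration consumes the file. You need to read your text file first.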
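To make the failure concrete, here is a small sketch that uses a throwaway temporary file in place of list.txt. The file contents and the names `demo_path`, `bare_match`, `second_match` and `fresh_match` are made up for illustration.

```python
import os
import tempfile

# Hypothetical two-line file standing in for list.txt.
fd, demo_path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as f:
    f.write("aaa\nbbb\n")

with open(demo_path) as list_file:
    # Each line is compared with its trailing newline, so a bare ID never matches.
    bare_match = "aaa" in list_file
    # The first test read the file to its end; nothing is left to compare against.
    second_match = "aaa\n" in list_file

with open(demo_path) as list_file:
    # On a fresh file object the newline-terminated form does match.
    fresh_match = "aaa\n" in list_file

os.remove(demo_path)
print(bare_match, second_match, fresh_match)  # False False True
```

This is why the loop in the question reports every txid as new, whether or not it appears in list.txt.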

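The simplest approach is to put all the transaction IDs in a set, then use a set operation to remove every txid found in list.txt. What remains are the new transactions to write to the file: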

import json

with open('text.json') as json_file:
    receive_txids = {o['txid'] for o in json.load(json_file) if o['category'] == 'receive'}

with open('list.txt', "r") as list_file:
    # Strip line endings so the IDs compare equal to the JSON values.
    receive_txids -= {line.strip() for line in list_file}

with open('processed.txt', "w") as output_file:
    for txid in receive_txids:
        output_file.write(txid + '\n')

If you still need access to the original JSON objects, use a dictionary keyed by txid, then delete every entry whose txid is found in the file:

import json

with open('text.json') as json_file:
    receive_txids = {o['txid']: o for o in json.load(json_file) if o['category'] == 'receive'}

with open('list.txt', "r") as list_file:
    for line in list_file:
        txid = line.strip()
        if txid in receive_txids:
            del receive_txids[txid]
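Once the known IDs have been removed, the dictionary holds only the unseen transactions, so fields such as the amount are still available. A minimal sketch with in-memory stand-ins follows; the sample records, `known_txids` and `new_by_txid` are invented for illustration and are not part of the original answer.

```python
# Hypothetical records shaped like the entries in text.json.
records = [
    {"category": "receive", "txid": "aaa", "amount": 1000.0},
    {"category": "receive", "txid": "bbb", "amount": 11.0},
    {"category": "send", "txid": "ccc", "amount": -5.0},
]
# Hypothetical IDs already present in list.txt.
known_txids = {"aaa"}

# Index the 'receive' records by txid, then drop the ones already known.
new_by_txid = {r["txid"]: r for r in records if r["category"] == "receive"}
for txid in known_txids:
    new_by_txid.pop(txid, None)  # the default avoids a KeyError for IDs never indexed

for txid, record in sorted(new_by_txid.items()):
    print(txid, record["amount"])  # bbb 11.0
```

Using `pop` with a default is an alternative to the `if txid in ...: del ...` pair shown above; both leave the same entries behind.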

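Answer 1 (score: 0)

open() returns a file object, which is an iterator over lines rather than a list, so the in operator only works on it once. Any later in test on the same object finds nothing, because the file has already been read to its end.

In addition, each line still contains its trailing line separator, so you should use rstrip('\n\r') to remove it.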
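As a quick illustration of the difference between removing only line endings and removing all surrounding whitespace, consider the following sketch. The sample lines are made up.

```python
# Hypothetical raw lines as they might come out of a file, with mixed line endings.
raw_lines = ["aaa\n", "bbb\r\n", " ccc \n"]

# rstrip('\n\r') removes only trailing line-ending characters.
only_line_endings = [line.rstrip("\n\r") for line in raw_lines]
# strip() with no argument removes all leading and trailing whitespace.
all_whitespace = [line.strip() for line in raw_lines]

print(only_line_endings)  # ['aaa', 'bbb', ' ccc ']
print(all_whitespace)     # ['aaa', 'bbb', 'ccc']
```

For a file of hexadecimal transaction IDs either call is likely fine; plain `strip()` is more forgiving if a line picked up stray spaces.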

To check quickly whether an item is in a collection, you should use a set.

Something like this should work:

# receive_txids is the list built from text.json in the question.
transaction_ids = set()
with open('list.txt', 'r') as list_file:
    for line in list_file:
        transaction_ids.add(line.rstrip('\n\r'))

for txid in receive_txids:
    if txid not in transaction_ids:
        print("NEW TRANSACTION")
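Putting the pieces together, one possible self-contained version is sketched below. The function names `find_new_txids` and `append_txids` and their parameters are not from the original posts; they are only one way to package the same idea.

```python
import json


def find_new_txids(json_path, list_path):
    """Return 'receive' txids from json_path that are not listed in list_path."""
    with open(list_path) as list_file:
        # Strip line endings so IDs compare equal to the JSON values; skip blank lines.
        known = {line.strip() for line in list_file if line.strip()}
    with open(json_path) as json_file:
        transactions = json.load(json_file)
    # Keep the JSON order and skip anything already known.
    return [t["txid"] for t in transactions
            if t["category"] == "receive" and t["txid"] not in known]


def append_txids(txids, output_path):
    """Append each txid on its own line, as the question's script does."""
    with open(output_path, "a") as output_file:
        for txid in txids:
            output_file.write(txid + "\n")
```

With the file names from the question, this would be called as `append_txids(find_new_txids("text.json", "list.txt"), "processed.txt")`. If the files contain exactly the sample data shown in the question, every receive txid already appears in list.txt, so the result is an empty list and nothing is appended.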