Parsing the same JSON string multiple times in Python

Time: 2017-12-06 06:52:06

Tags: python json

I need to parse the following JSON output so that I can extract the Title entries:

[{"Title":"000webhost","Name":"000webhost","Domain":"000webhost.com","BreachDate":"2015-03-01","AddedDate":"2015-10-26T23:35:45Z","ModifiedDate":"2015-10-26T23:35:45Z","PwnCount":13545468,"Description":"In approximately March 2015, the free web hosting provider <a href=\"http://www.troyhunt.com/2015/10/breaches-traders-plain-text-passwords.html\" target=\"_blank\" rel=\"noopener\">000webhost suffered a major data breach</a> that exposed over 13 million customer records. The data was sold and traded before 000webhost was alerted in October. The breach included names, email addresses and plain text passwords.","DataClasses":["Email addresses","IP addresses","Names","Passwords"],"IsVerified":true,"IsFabricated":false,"IsSensitive":false,"IsActive":true,"IsRetired":false,"IsSpamList":false,"LogoType":"png"},{"Title":"Lifeboat","Name":"Lifeboat","Domain":"lbsg.net","BreachDate":"2016-01-01","AddedDate":"2016-04-25T21:51:50Z","ModifiedDate":"2016-04-25T21:51:50Z","PwnCount":7089395,"Description":"In January 2016, the Minecraft community known as Lifeboat <a href=\"https://motherboard.vice.com/read/another-day-another-hack-7-million-emails-and-hashed-passwords-for-minecraft\" target=\"_blank\" rel=\"noopener\">was hacked and more than 7 million accounts leaked</a>. Lifeboat knew of the incident for three months before the breach was made public but elected not to advise customers. The leaked data included usernames, email addresses and passwords stored as straight MD5 hashes.","DataClasses":["Email addresses","Passwords","Usernames"],"IsVerified":true,"IsFabricated":false,"IsSensitive":false,"IsActive":true,"IsRetired":false,"IsSpamList":false,"LogoType":"svg"}]

To parse it, I am using the following code:

cat $myfile | python -c "import sys, json; print json.load(sys.stdin)[0]['Title']"

But this produces the output:

000webhost

whereas I need the output:

000webhost

Lifeboat

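2 answers:

Answer 0: (score: 2)

If you want to display all the titles, you need to loop over the items in the array. At the moment you are asking only for the first item, [0].

You can use a list comprehension to extract the titles: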

[item['Title'] for item in json.load(sys.stdin)]

Then loop over them to print each title:

for title in [item['Title'] for item in json.load(sys.stdin)]: print title

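A for statement cannot follow a semicolon on the same line, so for the complete command-line version join the titles with newlines and print them in one call:

cat "$myfile" | python -c "import sys, json; print('\n'.join([item['Title'] for item in json.load(sys.stdin)]))"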
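As a minimal sketch of the difference between indexing [0] and iterating, the snippet below parses a shortened stand-in for the data in the question. Keeping only the Title and Domain keys is an assumption made for brevity, and the names sample, records, first_title and titles are illustrative.

```python
import json

# Shortened stand-in for the breach data in the question (assumption:
# only two keys per record are kept so the example stays small).
sample = (
    '[{"Title": "000webhost", "Domain": "000webhost.com"},'
    ' {"Title": "Lifeboat", "Domain": "lbsg.net"}]'
)

records = json.loads(sample)  # a list of dicts, one per breach

# Indexing [0] selects a single record, so only one title comes back.
first_title = records[0]['Title']

# Iterating over the list collects the title of every record, in order.
titles = [item['Title'] for item in records]

print(first_title)
for title in titles:
    print(title)
```

The same comprehension works on the result of json.load(sys.stdin), since both calls return the parsed list.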

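Answer 1: (score: 0)

You really should do this with a proper script. Also, this is a useless use of cat, and you should put Bash parameter expansions inside double quotes to prevent word splitting. If you are sure the path contains no whitespace you can omit the quotes, but that is not a good habit.

In any case, this code works on both Python 2 and Python 3.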

python -c "import sys,json;print('\n'.join([u['Title']for u in json.load(open(sys.argv[1]))]))" "$myfile"

Output

000webhost
Lifeboat

Here is how to write it as a proper script.

import sys
import json

with open(sys.argv[1]) as f:
    data = json.load(f)
print('\n'.join([u['Title'] for u in data]))
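
If some records might lack a Title key, one possible variation is to move the extraction into a function that takes any file-like object. This is only a sketch: extract_titles and fake_file are hypothetical names, the sample data is made up for illustration, and skipping records without a Title is an assumption about the desired behaviour rather than something the question asks for.

```python
import io
import json


def extract_titles(fp):
    """Return the 'Title' value of every record in the JSON array read from fp."""
    records = json.load(fp)
    # Records without a 'Title' key are skipped instead of raising KeyError.
    return [record['Title'] for record in records if 'Title' in record]


# Stand-in for an open file or sys.stdin (made-up data for illustration).
fake_file = io.StringIO(
    u'[{"Title": "000webhost"}, {"Name": "no title here"}, {"Title": "Lifeboat"}]'
)
titles = extract_titles(fake_file)
print('\n'.join(titles))
```

In a real script the same function could be called with the object returned by open(sys.argv[1]) or with sys.stdin.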