I downloaded this set of 200k Jeopardy questions and answers. I thought it would be fun to plug it into a trivia bot. The file is only about 50 MB, but it has no line breaks that I can see.
I just want to pull all the questions and answers out of this monster into a file of question/answer pairs.
Here is part of the file:
[{"category": "HISTORY", "air_date": "2004-12-31", "question": "'For the last 8 years of his life, Galileo was under house arrest for espousing this man's theory'", "value": "$200", "answer": "Copernicus", "round": "Jeopardy!", "show_number": "4680"},
{"category": "ESPN's TOP 10 ALL-TIME ATHLETES", "air_date": "2004-12-31", "question": "'No. 2: 1912 Olympian; football star at Carlisle Indian School; 6 MLB seasons with the Reds, Giants & Braves'", "value": "$200", "answer": "Jim Thorpe", "round": "Jeopardy!", "show_number": "4680"},
{"category": "EVERYBODY TALKS ABOUT IT...", "air_date": "2004-12-31", "question": "'The city of Yuma in this state has a record average of 4,055 hours of sunshine each year'", "value": "$200", "answer": "Arizona", "round": "Jeopardy!", "show_number": "4680"},
...
I know I can't go line by line, and I know I can't load the whole thing into memory. But I also know that the question is the first thing in quotes after "question": and the answer is the first thing in quotes directly after "answer":
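Answer 0 (score: 0)
For each dictionary in the list, get the 'question' and 'answer' keys:
for entry in d:  # d is the list of dicts parsed from the file
    print(entry['question'], entry['answer'])
Output: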
'For the last 8 years of his life, Galileo was under house arrest for espousing this man's theory' Copernicus
'No. 2: 1912 Olympian; football star at Carlisle Indian School; 6 MLB seasons with the Reds, Giants & Braves' Jim Thorpe
'The city of Yuma in this state has a record average of 4,055 hours of sunshine each year' Arizona
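The loop above assumes `d` already holds the parsed list. A minimal sketch of how that might look end to end, assuming the file is a single JSON array of objects like the sample in the question and that a 50 MB file fits in memory (it usually does on current machines). The function name `extract_pairs`, the file names, and the tab-separated output format are illustrative choices, not something the question specifies:

```python
import json


def extract_pairs(json_path, out_path):
    """Write one 'question<TAB>answer' line per entry and return the entry count."""
    # Assumption: the whole file is one JSON array, as in the sample shown.
    with open(json_path, encoding="utf-8") as src:
        d = json.load(src)

    with open(out_path, "w", encoding="utf-8") as dst:
        for entry in d:
            # Each entry is a dict; keep only the two keys of interest.
            dst.write(f"{entry['question']}\t{entry['answer']}\n")

    return len(d)
```

It could be called as `extract_pairs("jeopardy.json", "pairs.txt")`, where both file names are hypothetical. Because `json.load` parses the structure itself, the missing line breaks in the input do not matter.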