Extracting a value from a dictionary nested in a list, which is nested in another dictionary... nested in yet another dictionary

Date: 2020-06-18 22:25:10

Tags: python

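I'm currently trying to do some scraping for a project using Google CSE. This is pretty much my first time web scraping. I took a Python class in school a few quarters ago, and scraping was supposed to be our last topic, but we never actually got to it. Anyway...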

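Here's what I'm trying to do:

Use Google CSE to pull Google News results for "bird watching" and "bird feeding". From the query results, I want to extract each article's title, link, and publication date. Then I'd like to write everything to a CSV.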

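Here's what I have working so far (with a lot of help from https://gist.github.com/nikhilkumarsingh/5bce182ed57ae73f6cbde52fe846991b, which is great if anyone else is looking for an introduction to CSE!):

Returning the title and link with a for loop over the query results. For now I'm just printing them to make sure I get results; I'll write to the CSV later. My query result object is a dictionary named `result`, and it looks like this (I apologize for the large amount of code, but my question is about nesting, so I figured this was the clearest way to explain it):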

    {'kind': 'customsearch#search', 'url': {'type': 'application/json',
 'template': 'https://www.googleapis.com/customsearch/v1?q=
{searchTerms}&num={count?}&start={startIndex?}&lr={language?}&safe=
{safe?}&cx={cx?}&sort={sort?}&filter={filter?}&gl={gl?}&cr=
{cr?}&googlehost={googleHost?}&c2coff={disableCnTwTranslation?}&hq=
{hq?}&hl={hl?}&siteSearch={siteSearch?}&siteSearchFilter=
{siteSearchFilter?}&exactTerms={exactTerms?}&excludeTerms=
{excludeTerms?}&linkSite={linkSite?}&orTerms={orTerms?}&relatedSite=
{relatedSite?}&dateRestrict={dateRestrict?}&lowRange=
{lowRange?}&highRange={highRange?}&searchType={searchType}&fileType=
{fileType?}&rights={rights?}&imgSize={imgSize?}&imgType=
{imgType?}&imgColorType={imgColorType?}&imgDominantColor=
{imgDominantColor?}&alt=json'}, 'queries': {'request': [{'title': 'Google 
Custom Search - bird watching', 'totalResults': '104000', 'searchTerms': 
'bird watching', 'count': 10, 'startIndex': 1, 'inputEncoding': 'utf8', 
'outputEncoding': 'utf8', 'safe': 'off', 'cx': 
'017465438656188383295:ul7lxhkonwq'}], 'nextPage': [{'title': 'Google 
Custom Search - bird watching', 'totalResults': '104000', 'searchTerms': 
'bird watching', 'count': 10, 'startIndex': 11, 'inputEncoding': 'utf8',
 'outputEncoding': 'utf8', 'safe': 'off', 'cx': 
'017465438656188383295:ul7lxhkonwq'}]}, 'context': {'title': 'google 
news'}, 'searchInformation': {'searchTime': 0.491713, 
'formattedSearchTime': '0.49', 'totalResults': '104000', 'formattedTotalResults': '104,000'}, 'items': [{'kind': 
'customsearch#result', 'title': 'Amy Cooper: White woman who called police 
on a black man in ...', 'htmlTitle': 'Amy Cooper: White woman who called 
police on a black man in ...', 'link': 
'https://news.google.com/articles/CAIiEDCQPCzyU2erjQLyLr_nLqUqGQgEKhAIACoH
CAowocv1CjCSptoCMPrTpgU?hl=en-US&gl=US&ceid=US%3Aen', 'displayLink': 
'news.google.com', 'snippet': 'May 26, 2020 ... White woman who called 
police on a black man bird-watching in Central Park \nhas been fired. By 
Amir Vera and Laura Ly, CNN. Updated 4:21\xa0...', 'htmlSnippet': 'May 26, 
2020 <b>...</b> White woman who called police on a black man <b>bird</b>-
<b>watching</b> in Central Park <br>\nhas been fired. By Amir Vera and 
Laura Ly, CNN. Updated 4:21&nbsp;...', 'formattedUrl': 
'https://news.google.com/.../CAIiEDCQPCzyU2erjQLyLr_ 
nLqUqGQgEKhAIACoHCAowocv1CjCSptoCMPrTpgU?...', 'htmlFormattedUrl': 
'https://news.google.com/.../CAIiEDCQPCzyU2erjQLyLr_ 
nLqUqGQgEKhAIACoHCAowocv1CjCSptoCMPrTpgU?...', 'pagemap': {'thumbnail': 
[{'src': 'https://cdn.cnn.com/cnnnext/dam/assets/200526102231-02-central-
park-video-dog-video-african-american-trnd-screengrab-super-tease.jpg'}],
 'metatags': [{'template-top': 'us,news,art-vid-vls-col,col-top-news', 
'og:image': 'https://cdn.cnn.com/cnnnext/dam/assets/200526102231-02-
central-park-video-dog-video-african-american-trnd-screengrab-super-
tease.jpg', 'twitter:card': 'summary_large_image', 'og:image:width': 
'1100', 'theme-color': '#000000', 'og:site_name': 'CNN', 'section': 'us', 
'vr:canonical': 'https://www.cnn.com/2020/05/26/us/central-park-video-dog-
video-african-american-trnd/index.html', 'article:content-tier': 'free', 
'og:description': 'The white woman who called police on a black man in 
Central Park during an encounter involving her unleashed dog has been 
fired from her job, her employer said Tuesday.', 'twitter:image': 
'https://cdn.cnn.com/cnnnext/dam/assets/200526102231-02-central-park-
video-dog-video-african-american-trnd-screengrab-super-tease.jpg', 'og:pubdate': '2020-05-26T06:19:40Z', 'lastmod': '2020-05-26T20:21:18Z', 'pubdate': '2020-05-26T06:19:40Z', 'twitter:title': 'White woman who called police on a black man bird-watching in Central Park has been fired', 'og:type': 'article', 'thumbnail': 'https://cdn.cnn.com/cnnnext/dam/assets/200526102231-02-central-park-video-dog-video-african-american-trnd-screengrab-super-tease.jpg', 
'author': 'Amir Vera and Laura Ly, CNN', 'og:title': 'White woman who 
called police on a black man bird-watching in Central Park has been 
fired', 'og:image:height': '619', 'fb:pages': '5550296508,18793419640', 
'referrer': 'unsafe-url', 'fb:app_id': '80401312489', 'viewport': 
'width=device-width, initial-scale=1.0, minimum-scale=1.0', 
'twitter:description': 'The white woman who called police on a black man 
in Central Park during an encounter involving her unleashed dog has been 
fired from her job, her employer said Tuesday.', 'og:url': 
'https://www.cnn.com/2020/05/26/us/central-park-video-dog-video-african-
american-trnd/index.html', 'article:opinion': 'false'}], 'cse_image': 
[{'src': 'https://cdn.cnn.com/cnnnext/dam/assets/200526102231-02-central-
park-video-dog-video-african-american-trnd-screengrab-super-tease.jpg', 
'width': '299', 'type': '1', 'height': '168'}], 'newsarticle': [{'image': 
'https://cdn.cnn.com/cnnnext/dam/assets/200526102231-02-central-park-
video-dog-video-african-american-trnd-screengrab-super-tease.jpg', 
'keywords': 'us, Amy Cooper: White woman who called police on a black man 
in Central Park has been fired - CNN', 'author': 'Amir Vera and Laura Ly, 
CNN', 'ispartof': 'news', 'description': 'The white woman who called 
police on a black man in Central Park during an encounter involving her 
unleashed dog has been fired from her job, her employer said Tuesday.', 
'datecreated': '2020-05-26T06:19:40Z', 'url': 
'https://www.cnn.com/2020/05/26/us/central-park-video-dog-video-african-
american-trnd/index.html', 'articlebody': '(CNN)The white woman who called
 police on a black man in Central Park during an encounter involving her 
unleashed dog has been fired from her job, her employer said 
Tuesday."Following our internal...', 'datemodified': '2020-05-
26T20:21:18Z', 'articlesection': 'us', 'alternativeheadline': 'White woman who called police on a black man bird-watching in Central Park has been 
fired', 'headline': 'Amy Cooper: White woman who called police on a black 
man in Central Park has been fired - CNN', 'datepublished': '2020-05-
26T06:19:40Z', 'thumbnailurl': 
'https://cdn.cnn.com/cnnnext/dam/assets/200526102231-02-central-park-
video-dog-video-african-american-trnd-screengrab-super-tease.jpg'}]}}

My code for extracting the links and titles is as follows:

for item in result['items']:
    print(item['title'], item['link'])

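Here's where I'm stuck:

The key for the article's publication date, 'pubdate', is nested inside several dictionaries and lists, and I'm having a hard time pulling it out in a loop. Nesting (whether in loops or in data structures) is probably my biggest weakness in coding.

The key that contains all the information I'm interested in is 'items', whose value is a list of dictionaries: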


'items': [{'kind': 'customsearch#result', 'title': 'Amy Cooper: White 
woman who called police on a black man in ...', 'htmlTitle': 'Amy Cooper: 
White woman who called police on a black man in ...', 'link': 
'https://news.google.com/articles/CAIiEDCQPCzyU2erjQLyLr_nLqUqGQgEKhAIACoH
CAowocv1CjCSptoCMPrTpgU?hl=en-US&gl=US&ceid=US%3Aen', 'displayLink': 
'news.google.com', 'snippet': 'May 26, 2020 ... White woman who called 
police on a black man bird-watching in Central Park \nhas been fired. By 
Amir Vera and Laura Ly, CNN. Updated 4:21\xa0...', 'htmlSnippet': 'May 26,
 2020 <b>...</b> White woman who called police on a black man <b>bird</b>-
<b>watching</b> in Central Park <br>\nhas been fired. By Amir Vera and 
Laura Ly, CNN. Updated 4:21&nbsp;...', 'formattedUrl': 
'https://news.google.com/.../CAIiEDCQPCzyU2erjQLyLr_ 
nLqUqGQgEKhAIACoHCAowocv1CjCSptoCMPrTpgU?...', 'htmlFormattedUrl': 
'https://news.google.com/.../CAIiEDCQPCzyU2erjQLyLr_ 
nLqUqGQgEKhAIACoHCAowocv1CjCSptoCMPrTpgU?...', 'pagemap': {'thumbnail': 
[{'src': 'https://cdn.cnn.com/cnnnext/dam/assets/200526102231-02-central-
park-video-dog-video-african-american-trnd-screengrab-super-tease.jpg'}], 
'metatags': [{'template-top': 'us,news,art-vid-vls-col,col-top-news', 
'og:image': 'https://cdn.cnn.com/cnnnext/dam/assets/200526102231-02-
central-park-video-dog-video-african-american-trnd-screengrab-super-
tease.jpg', 'twitter:card': 'summary_large_image', 'og:image:width': 
'1100', 'theme-color': '#000000', 'og:site_name': 'CNN', 'section': 'us', 
'vr:canonical': 'https://www.cnn.com/2020/05/26/us/central-park-video-dog-
video-african-american-trnd/index.html', 'article:content-tier': 'free', 
'og:description': 'The white woman who called police on a black man in 
Central Park during an encounter involving her unleashed dog has been
 fired from her job, her employer said Tuesday.', 'twitter:image': 
'https://cdn.cnn.com/cnnnext/dam/assets/200526102231-02-central-park-
video-dog-video-african-american-trnd-screengrab-super-tease.jpg', 
'og:pubdate': '2020-05-26T06:19:40Z', 'lastmod': '2020-05-26T20:21:18Z', 
'pubdate': '2020-05-26T06:19:40Z', 'twitter:title': 'White woman who 
called police on a black man bird-watching in Central Park has been 
fired', 'og:type': 'article', 'thumbnail': 
'https://cdn.cnn.com/cnnnext/dam/assets/200526102231-02-central-park-
video-dog-video-african-american-trnd-screengrab-super-tease.jpg', 
'author': 'Amir Vera and Laura Ly, CNN', 'og:title': 'White woman who 
called police on a black man bird-watching in Central Park has been 
fired', 'og:image:height': '619', 'fb:pages': '5550296508,18793419640', 
'referrer': 'unsafe-url', 'fb:app_id': '80401312489', 'viewport': 
'width=device-width, initial-scale=1.0, minimum-scale=1.0', 
'twitter:description': 'The white woman who called police on a black man 
in Central Park during an encounter involving her unleashed dog has been 
fired from her job, her employer said Tuesday.', 'og:url': 
'https://www.cnn.com/2020/05/26/us/central-park-video-dog-video-african-
american-trnd/index.html', 'article:opinion': 'false'}]

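In the first dictionary in the list, i.e. result['items'][0], we have the key 'pagemap', whose value is another dictionary. In that dictionary we have the key 'metatags', whose value is a list of dictionaries. The first element of this list is a dictionary containing the key whose value I'm after, 'pubdate' (I've put a few blank lines in the code block so you can find it easily):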


'metatags': [{'template-top': 'us,news,art-vid-vls-col,col-top-news', 
'og:image': 'https://cdn.cnn.com/cnnnext/dam/assets/200526102231-02-
central-park-video-dog-video-african-american-trnd-screengrab-super-
tease.jpg', 'twitter:card': 'summary_large_image', 'og:image:width': 
'1100', 'theme-color': '#000000', 'og:site_name': 'CNN', 'section': 'us',
 'vr:canonical': 'https://www.cnn.com/2020/05/26/us/central-park-video-
dog-video-african-american-trnd/index.html', 'article:content-tier': 
'free', 'og:description': 'The white woman who called police on a black 
man in Central Park during an encounter involving her unleashed dog has 
been fired from her job, her employer said Tuesday.', 'twitter:image':
 'https://cdn.cnn.com/cnnnext/dam/assets/200526102231-02-central-park-
video-dog-video-african-american-trnd-screengrab-super-tease.jpg', 
'og:pubdate': '2020-05-26T06:19:40Z', 'lastmod': '2020-05-26T20:21:18Z',



'pubdate': '2020-05-26T06:19:40Z', 'twitter:title': 'White woman who 
called police on a black man bird-watching in Central Park has been 
fired', 'og:type': 'article', 'thumbnail': 
'https://cdn.cnn.com/cnnnext/dam/assets/200526102231-02-central-park-
video-dog-video-african-american-trnd-screengrab-super-tease.jpg', 
'author': 'Amir Vera and Laura Ly, CNN', 'og:title': 'White woman who 
called police on a black man bird-watching in Central Park has been 
fired', 'og:image:height': '619', 'fb:pages': '5550296508,18793419640', 
'referrer': 'unsafe-url', 'fb:app_id': '80401312489', 'viewport': 
'width=device-width, initial-scale=1.0, minimum-scale=1.0', 
'twitter:description': 'The white woman who called police on a black man 
in Central Park during an encounter involving her unleashed dog has been 
fired from her job, her employer said Tuesday.', 'og:url': 
'https://www.cnn.com/2020/05/26/us/central-park-video-dog-video-african-
american-trnd/index.html', 'article:opinion': 'false'}]
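
The path described above can be traced one level at a time. The following is a minimal sketch that uses a trimmed-down stand-in for `result`; the `sample` dictionary is an assumption made for illustration and keeps only the keys on the path to 'pubdate':

```python
# Trimmed-down stand-in for the API response; only the keys on the
# path to 'pubdate' are kept (hypothetical sample for illustration).
sample = {
    'items': [
        {
            'title': 'Amy Cooper: White woman who called police on a black man in ...',
            'pagemap': {
                'metatags': [
                    {'pubdate': '2020-05-26T06:19:40Z'},
                ],
            },
        },
    ],
}

first_item = sample['items'][0]    # list -> first dict
pagemap = first_item['pagemap']    # dict -> nested dict
metatags = pagemap['metatags']     # dict -> list of dicts
first_meta = metatags[0]           # list -> first dict
pubdate = first_meta['pubdate']    # dict -> the string we want

# The same lookup written as a single chained expression.
assert pubdate == sample['items'][0]['pagemap']['metatags'][0]['pubdate']
print(pubdate)
```

Each pair of square brackets steps one level deeper: a string key when the current object is a dictionary, an integer index when it is a list.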


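Hopefully you were able to follow me through this rather gnarly nested structure...

So ideally, what I'm looking for is a loop that gives me back something like this: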

Amy Cooper: White woman who called police on a black man in ... https://news.google.com/articles/CAIiEDCQPCzyU2erjQLyLr_nLqUqGQgEKhAIACoHCAowocv1CjCSptoCMPrTpgU?hl=en-US&gl=US&ceid=US%3Aen
2020-05-26T06:19:40Z

And so on for the next story in the query results.

The closest I've gotten is:

for item in result['items']:
        print(item['title'], item['link'])
        for date in result['items'][0]['pagemap']['metatags']:
            print (date['pubdate'])

This is close, but it only returns the first story's date, even though the loop moves on to the next story:

Amy Cooper: White woman who called police on a black man in ... https://news.google.com/articles/CAIiEDCQPCzyU2erjQLyLr_nLqUqGQgEKhAIACoHCAowocv1CjCSptoCMPrTpgU?hl=en-US&gl=US&ceid=US%3Aen
2020-05-26T06:19:40Z
Christian Cooper shouldn't need a Harvard degree to survive birding ... https://news.google.com/articles/CAIiEOCKmxd9S5s5cwM5xs0AivoqGAgEKg8IACoHCAowjtSUCjC30XQwzqe5AQ?hl=en-US&gl=US&ceid=US%3Aen
2020-05-26T06:19:40Z
People called police on this black birdwatcher so many times that he ... https://news.google.com/articles/CAIiEOkNNX95htD_KKDYihI5JcoqGAgEKg8IACoHCAowjtSUCjC30XQwzqe5AQ?hl=en-US&gl=US&ceid=US%3Aen
2020-05-26T06:19:40Z
A black man bird-watching in Central Park asked a white woman to ... https://news.google.com/articles/CAIiENZfU5G5gfmzo2CysHOaY0sqFQgEKg0IACoGCAowuLUIMNFnMLnhAg?hl=en-US&gl=US&ceid=US%3Aen
2020-05-26T06:19:40Z
What's a Tough Call in Bird Watching? Identifying a Gull - WSJ https://news.google.com/articles/CAIiEMKd4gQ1olRNd5T2Ndlpiu8qGAgEKg8IACoHCAow1tzJATDnyxUwuK20AQ
2020-05-26T06:19:40Z
Any advice, tips, help, or words of nested for loop wisdom would be greatly appreciated!!!!

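1 Answer:

Answer 0 (score: 1)

You are accessing the first element of the list, result['items'][0], on every iteration instead of the current item. Working code: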

for item in result['items']:
    print(item['title'], item['link'])
    for date in item['pagemap']['metatags']:
        print(date.get('pubdate', 'Pubdate is not specified'))
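
Building on the loop above, here is a rough sketch of how the rows could be collected and written to a CSV, which was the stated end goal. The helper names `extract_rows` and `write_rows`, the column names, and the sample data are all assumptions made for illustration. The sketch also assumes that some results may lack 'pagemap', 'metatags', or 'pubdate', so it uses `.get()` with defaults rather than direct indexing.

```python
import csv
import io


def extract_rows(result):
    """Yield (title, link, pubdate) for each item in a CSE response dict.

    pubdate is '' when the item has no 'pagemap', no 'metatags',
    or no 'pubdate' key (an assumption about how gaps should be handled).
    """
    for item in result.get('items', []):
        metatags = item.get('pagemap', {}).get('metatags', [])
        # Take the first metatags entry that carries a 'pubdate', if any.
        pubdate = next(
            (meta['pubdate'] for meta in metatags if 'pubdate' in meta),
            '',
        )
        yield item.get('title', ''), item.get('link', ''), pubdate


def write_rows(rows, fileobj):
    """Write a header line plus the given rows to an open text file object."""
    writer = csv.writer(fileobj)
    writer.writerow(['title', 'link', 'pubdate'])
    writer.writerows(rows)


# Hypothetical two-item response; the second item has no metatags at all.
sample = {
    'items': [
        {'title': 'Story A', 'link': 'https://example.com/a',
         'pagemap': {'metatags': [{'pubdate': '2020-05-26T06:19:40Z'}]}},
        {'title': 'Story B', 'link': 'https://example.com/b'},
    ],
}

rows = list(extract_rows(sample))
buffer = io.StringIO()  # stands in for a real file on disk
write_rows(rows, buffer)
print(buffer.getvalue())
```

To write to disk instead of an in-memory buffer, a file opened with `open('results.csv', 'w', newline='')` can be passed as `fileobj`; the file name here is only an example.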