Python-based PDF search using the Bing Web Search API

Time: 2019-03-28 18:22:45

Tags: python-2.7 bing-api

I want to use the Bing API to retrieve relevant PDFs for a list of keywords, but I am running into some problems.

Keyword list:

"Hubble Space Telescope", "William Herschel", "Planetary Camera", "Milky Way"

I am using the following code:

import requests

subscription_key = "My Key"
assert subscription_key

search_url = "https://api.cognitive.microsoft.com/bing/v7.0/search"
search_term = ["Hubble Space Telescope", "William Herschel", "Planetary Camera", "Milky Way"]

result=[]
for i in range(len(search_term)):
    headers = {"Ocp-Apim-Subscription-Key" : subscription_key}
    params  = {"q": search_term[i],"filetype":"pdf", "textDecorations":True, "textFormat":"HTML"}
    response = requests.get(search_url, headers=headers, params=params)
    response.raise_for_status()
    search_results = response.json()
    print search_results
    result.append(search_results)


from IPython.display import HTML

for i in range(len(result)):
    rows = "\n".join(["""<tr><td><a href=\"{0}\">{1}</a></td><td>{2}</td>
                    </tr>""".format(v["url"].encode("utf-8"),v["name"].encode("utf-8"),v["snippet"].encode("utf-8")) \
                      for v in result[i]["webPages"]["value"]])

HTML("<table>{0}</table>".format(rows))

print rows

Although I added "filetype": "pdf" to params, I do not get any PDFs back.

Can anyone suggest how to proceed?

1 answer:

Answer 0: (score: 0)

Include filetype:pdf as part of the search term (the q value), not as a separate query string parameter.
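A minimal sketch of that change, assuming the same endpoint and header as in the question. The helper names build_params and search_pdfs are made up for illustration and are not part of any Bing SDK. The only real difference from the question's code is that the filetype operator is appended to q and the separate "filetype" key is dropped.

```python
# Endpoint taken from the question.
SEARCH_URL = "https://api.cognitive.microsoft.com/bing/v7.0/search"


def build_params(term):
    """Build the query parameters with the filetype operator folded into q.

    Hypothetical helper: the operator goes inside the search term itself,
    and there is no standalone "filetype" key.
    """
    return {
        "q": "{0} filetype:pdf".format(term),
        "textDecorations": True,
        "textFormat": "HTML",
    }


def search_pdfs(term, subscription_key):
    """Send one search request and return the list of web page results."""
    # Third-party dependency; imported here so build_params can be used
    # on its own without requests installed.
    import requests

    headers = {"Ocp-Apim-Subscription-Key": subscription_key}
    response = requests.get(SEARCH_URL, headers=headers, params=build_params(term))
    response.raise_for_status()
    # "webPages" may be missing when a query has no hits, so fall back to [].
    return response.json().get("webPages", {}).get("value", [])
```

With a valid key, search_pdfs("Milky Way", subscription_key) should return entries carrying the same "url", "name" and "snippet" fields that the question's table code reads, so the rest of that code can stay as it is.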