Question

I am trying to get all the meanings under the "noun" heading for a word entered by the user. This is my code so far:

import requests
from bs4 import BeautifulSoup
word=raw_input("Enter word: ").lower()
url=('http://www.dictionary.com/browse/'+word)
r=requests.get(url)
soup=BeautifulSoup(r.content,"html.parser")
try:
    meaning=soup.find("div",attrs={"class":"def-content"}).get_text()
    print "Meaning of",word,"is: "
    print meaning
except AttributeError:
    print "Sorry, we were not able to find the word."
finally:
    print "Thank you for using our dictionary."

Now suppose the user enters the word "today". My output will be:

   this present day:                 Today is beautiful.

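I don't understand why it leaves so much space, and why the part

"Today is beautiful."

does not go onto the next line.
Anyway, when you look up the word on the site, you can see that there are 2 meanings, but my program shows only one.
I want the output to be: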

1.this present day:
Today is beautiful.
2.this present time or age:
the world of today.

Can anyone explain what I am doing wrong and how I can fix it? I have no idea what is wrong, so please don't think I haven't tried.

Answer 1

With the code above you are getting only the first noun meaning. I have rewritten the code as follows:

from bs4 import BeautifulSoup
import requests

word = raw_input("Enter word: ").lower()
url = ('http://www.dictionary.com/browse/' + word)
r = requests.get(url)
bsObj = BeautifulSoup(r.content, "lxml")

nouns = bsObj.find("section", {"class": "def-pbk ce-spot"})

data = nouns.findAll('div', {'class': 'def-content'})
count = 1

for item in data:
    temp = ' '.join(item.get_text().strip().split())
    print str(count) + '. ' + temp
    count += 1

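Explanation:

Assuming the website shows the noun meanings first, I retrieve the first section, which contains the complete noun data. I then find all the meanings under that section, store them in the data variable, and iterate over them in a loop to get the text of each meaning. To remove all the extra whitespace, I split the fetched text and join it back together with single spaces, adding a number at the beginning.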
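
The split-and-join step can be tried on its own, without touching the network. This is only a sketch: the two strings below are hypothetical stand-ins for what get_text() might return (the first is modelled on the output shown in the question), and normalize is a helper name introduced here for illustration. It is written for Python 3; the single-argument print calls also work on Python 2.

```python
def normalize(text):
    """Collapse every run of whitespace (spaces, tabs, newlines) into one space."""
    # str.split() with no arguments splits on any whitespace and drops
    # empty pieces, so leading and trailing whitespace disappears too.
    return " ".join(text.split())


# Hypothetical raw strings, standing in for item.get_text() results.
raw_meanings = [
    "\n   this present day:                 Today is beautiful.\n",
    "\n   this present time or age:\n        the world of today.\n",
]

# enumerate(..., start=1) replaces the manual count variable.
numbered = [
    "%d. %s" % (index, normalize(raw))
    for index, raw in enumerate(raw_meanings, start=1)
]

for line in numbered:
    print(line)
# 1. this present day: Today is beautiful.
# 2. this present time or age: the world of today.
```

Note that this puts each meaning and its example on a single line. If the example sentence should sit on its own line, as in the question's desired output, the pieces have to be kept separate before joining.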

Answer 2

You can strip the whitespace from the text by passing strip=True to get_text().

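The reason you are not getting all the text is that your selector is wrong; you should make its scope bigger. I added separator='\n' to get_text() to format the output. If you have any questions, you can read the BeautifulSoup documentation.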
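
A minimal sketch of those two get_text() arguments, run against a hand-written HTML fragment rather than the live page. The markup is an assumption: it borrows the class names used in the other answer, and the real page may well be structured differently. Python 3 syntax.

```python
from bs4 import BeautifulSoup

# Hypothetical fragment: one section holding two definitions, each with
# an example sentence nested inside it. Not copied from the real site.
html = """
<section class="def-pbk ce-spot">
  <div class="def-content">
    this present day:
    <span>Today is beautiful.</span>
  </div>
  <div class="def-content">
    this present time or age:
    <span>the world of today.</span>
  </div>
</section>
"""

soup = BeautifulSoup(html, "html.parser")

# Select the whole section (the wider scope), not just the first div.
section = soup.find("section", class_="def-pbk")

# strip=True trims each text fragment and skips whitespace-only ones;
# separator="\n" puts every remaining fragment on its own line.
text = section.get_text(separator="\n", strip=True)
print(text)
# this present day:
# Today is beautiful.
# this present time or age:
# the world of today.
```

Because the separator goes between text fragments, the example sentence lands on its own line only if it sits in its own element, as it does in this fragment.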
