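I want to extract the text inside the <p> tags of a div that is nested inside another div (both selected by class), but it is not working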
<div class="sques_quiz">
<div class="wp_quiz_question testclass"><span class="quesno">2. </span></p>
<p>What is capital of India?</p>
</div>
<div type="A" class="wp_quiz_question_options">[A] Delhi<br />[B] Kolkata<br
/>[C] Mumbai<br />[D] None of the above
</div>
<p><a class="wp_basic_quiz_showans_btn"
onclick="if(jQuery(this).hasClass('showanswer')){ jQuery(this).html('Show
Answer').removeClass('showanswer'); jQuery('.ques_answer_3652').slideUp();
}else { jQuery(this).html('Hide Answer').addClass('showanswer');
jQuery('.ques_answer_3652').slideDown();}">Show Answer</a></p>
<div class="wp_basic_quiz_answer ques_answer_3652" style="display:none;">
<div class="ques_answer"><b>Correct Answer:</b> A [Delhi ]</div>
<div class="answer_hint"><b>Answer Explanation:</b></p>
<p> Delhi is the capital city of india</p>
</div>
</div>
</div>
My code so far is:
for foo in soup.find_all('div', attrs={'class': 'sques_quiz'}):
    bar = foo.find("div", attrs={'class': 'wp_quiz_question testclass'})
    for a in bar.find('p'):
        print(a)
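It fails at bar.find('p') with: TypeError: 'NoneType' object is not iterable
I want the output to be:
What is capital of India?
[A] Delhi
[B] Kolkata
[C] Mumbai
[D] None of the above
Correct Answer: A [Delhi ]
Answer Explanation: Delhi is the capital city of india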
Answer 0 (score: 1)
This is not an elegant solution, but I think it gets the job done.
Here is the code:
from bs4 import BeautifulSoup
import requests

site = 'https://www.gktoday.in/gk-current-affairs-quiz-october-4-2017/'
request = requests.get(site).text
soup = BeautifulSoup(request, 'html.parser')

# Index of the current question; used to pick the matching entries
# from the page-wide lists of options, answers and hints.
answer_row = 0
for foo in soup.find_all('div', attrs={'class': 'sques_quiz'}):
    # Print the text of the first <p> that follows the start of the quiz div.
    print(foo.find_next('p').text)
    # Despite its name, `question` holds the answer options.
    question = soup.find_all('div', {'class': 'wp_quiz_question_options'})[answer_row].text
    answer = soup.find_all('div', {'class': 'ques_answer'})[answer_row].text
    answer_hint = soup.find_all('div', {'class': 'answer_hint'})[answer_row]
    # Append the text of the next <p> after the hint div.
    answer_hint = answer_hint.text + answer_hint.find_next('p').text
    print(question)
    print(answer)
    print(answer_hint)
    print('')
    answer_row += 1
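Answer 1 (score: 1)
As mentioned in the comments, you need to use find_all to iterate, because find returns only the first element it finds (or None when nothing matches).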
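To make the difference concrete, here is a minimal sketch that runs on an inline HTML string instead of the live page. The markup is a simplified, hypothetical version of the snippet in the question, and the built-in 'html.parser' is used so no extra parser is needed.

```python
from bs4 import BeautifulSoup

# Hypothetical, simplified markup modelled on the question's snippet.
html = """
<div class="sques_quiz">
  <div class="wp_quiz_question testclass"><span class="quesno">2. </span>
    <p>What is capital of India?</p>
  </div>
  <div class="ques_answer"><b>Correct Answer:</b> A [Delhi ]</div>
</div>
"""

soup = BeautifulSoup(html, "html.parser")
block = soup.find("div", class_="sques_quiz")

# find() returns a single Tag, or None when nothing matches.
first_p = block.find("p")
missing = block.find("table")

# find_all() always returns a list-like ResultSet (possibly empty),
# so it is always safe to loop over.
all_p = block.find_all("p")
no_tables = block.find_all("table")

# Guard against None before using the result of find().
question_text = first_p.get_text(strip=True) if first_p is not None else ""
print(question_text)
```

Looping over the result of find() when it is None is what raises the TypeError from the question; checking for None first, or looping over find_all() instead, avoids it.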
This code gives the result in the format you want:
import requests
from bs4 import BeautifulSoup

r = requests.get('https://www.gktoday.in/gk-current-affairs-quiz-october-4-2017/')
soup = BeautifulSoup(r.text, 'lxml')

# [:1] limits the loop to the first question on the page.
for div in soup.find_all('div', {'class': 'sques_quiz'})[:1]:
    ques = div.find('div', {'class': 'wp_quiz_question testclass'}).find('p').text.strip()
    # Split on '[': options[0] is whatever precedes the first '[',
    # options[1:] are the four choices without their leading '['.
    options = div.find('div', {'class': 'wp_quiz_question_options'}).text.split('[')
    ans = div.find('div', {'class': 'ques_answer'}).text.strip()
    exp = div.find('div', {'class': 'answer_hint'}).text.strip()
    print(ques)
    print('[' + options[1])
    print('[' + options[2])
    print('[' + options[3])
    print('[' + options[4])
    print(ans)
    print(exp)
This is the result:
Who of the following have won the Nobel Prize in Chemistry 2017?
[A] Jean-Pierre Sauvage, Fraser Stoddart and Ben Feringa
[B] Tomas Lindahl and Paul L. Modrich
[C] Brian K. Kobilka and Robert J. Lefkowitz
[D] Jacques Dubochet, Joachim Frank and Richard Henderson
Correct Answer: D [Jacques Dubochet, Joachim Frank and Richard Henderson]
Answer Explanation:
The Nobel Prize in Chemistry 2017 was awarded to Jacques Dubochet, Joachim Frank and Richard Henderson for developing cryo-electron microscopy for the high-resolution structure determination of biomolecules in solution. It’s a method of simplifying and improving the imaging of biomolecules.
To print all the results on the page, change the line for div in soup.find_all('div', {'class':'sques_quiz'})[:1]:
to: for div in soup.find_all('div', {'class':'sques_quiz'}):
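If you want every question as structured data rather than printed lines, something along these lines might work. It is only a sketch under assumptions: parse_quiz and SAMPLE are hypothetical names, the sample markup is a cleaned-up version of the snippet in the question rather than the live page, and the options are split with a regular expression instead of fixed indices, so a question with fewer or more than four choices does not raise an IndexError.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical sample, modelled on the markup shown in the question.
SAMPLE = """
<div class="sques_quiz">
  <div class="wp_quiz_question testclass"><span class="quesno">2. </span>
    <p>What is capital of India?</p>
  </div>
  <div type="A" class="wp_quiz_question_options">[A] Delhi<br />[B] Kolkata<br
  />[C] Mumbai<br />[D] None of the above
  </div>
  <div class="wp_basic_quiz_answer ques_answer_3652" style="display:none;">
    <div class="ques_answer"><b>Correct Answer:</b> A [Delhi ]</div>
    <div class="answer_hint"><b>Answer Explanation:</b>
      <p> Delhi is the capital city of india</p>
    </div>
  </div>
</div>
"""


def parse_quiz(html):
    """Return one dict per 'sques_quiz' block found in the HTML string."""
    soup = BeautifulSoup(html, "html.parser")
    results = []
    for div in soup.find_all("div", class_="sques_quiz"):
        question = div.find("div", class_="wp_quiz_question")
        options = div.find("div", class_="wp_quiz_question_options")
        answer = div.find("div", class_="ques_answer")
        hint = div.find("div", class_="answer_hint")
        # Skip incomplete blocks instead of failing on None.
        if any(tag is None for tag in (question, options, answer, hint)):
            continue
        q_p = question.find("p")
        q_source = q_p if q_p is not None else question
        option_text = options.get_text(" ", strip=True)
        results.append({
            "question": q_source.get_text(" ", strip=True),
            # Each option starts with a bracketed letter such as "[A]".
            "options": [o.strip() for o in re.findall(r"\[[A-Z]\][^\[]*", option_text)],
            "answer": answer.get_text(" ", strip=True),
            "explanation": hint.get_text(" ", strip=True),
        })
    return results


quizzes = parse_quiz(SAMPLE)
for quiz in quizzes:
    print(quiz["question"])
    print("\n".join(quiz["options"]))
    print(quiz["answer"])
    print(quiz["explanation"])
```

To run it against the real page, you would pass the text of the requests response (r.text in the code above) to parse_quiz. Note that the regular expression assumes the option text itself contains no '[' character.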