In Python

Date: 2017-11-11 11:49:32

Tags: python-3.x

I want to extract the text inside the "p" tags from a class nested within another class, but it does not work.

 <div class="sques_quiz">
 <div class="wp_quiz_question testclass"><span class="quesno">2. </span></p>
 <p>What is capital of India?</p>
 </div>
 <div type="A" class="wp_quiz_question_options">[A] Delhi<br />[B] Kolkata<br />[C] Mumbai<br />[D] None of the above
            </div>
 <p><a class="wp_basic_quiz_showans_btn" 
 onclick="if(jQuery(this).hasClass('showanswer')){ jQuery(this).html('Show 
 Answer').removeClass('showanswer'); jQuery('.ques_answer_3652').slideUp(); 
  }else { jQuery(this).html('Hide Answer').addClass('showanswer'); 
  jQuery('.ques_answer_3652').slideDown();}">Show Answer</a></p>
 <div class="wp_basic_quiz_answer ques_answer_3652" style="display:none;">
 <div class="ques_answer"><b>Correct Answer:</b> A [Delhi ]</div>
 <div class="answer_hint"><b>Answer Explanation:</b></p>
  <p>   Delhi is the capital city of india</p>
  </div>
  </div>
  </div>

My code so far is:

    for foo in soup.find_all('div', attrs={'class': 'sques_quiz'}):
        bar = foo.find("div", attrs={'class': 'wp_quiz_question testclass'})
        for a in bar.find('p'):
            print(a)  

It raises an error at bar.find('p'): TypeError: 'NoneType' object is not iterable

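I want the output to be:

What is capital of India?

[A] Delhi

[B] Kolkata

[C] Mumbai

[D] None of the above

Correct Answer: A [Delhi ]

Answer Explanation: Delhi is the capital city of india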

2 answers:

Answer 0 (score: 1)

This is not an elegant solution, but I think it gets the job done.


from bs4 import BeautifulSoup
import requests

site = 'https://www.gktoday.in/gk-current-affairs-quiz-october-4-2017/'
html = requests.get(site).text

soup = BeautifulSoup(html, 'html.parser')

# answer_row indexes the page-wide lists below; this assumes the n-th options,
# answer and hint blocks belong to the n-th question block.
answer_row = 0
for foo in soup.find_all('div', attrs={'class': 'sques_quiz'}):
    # Question text: the first <p> found after the start of this block.
    print(foo.find_next('p').text)
    options = soup.find_all('div', {'class': 'wp_quiz_question_options'})[answer_row].text
    answer = soup.find_all('div', {'class': 'ques_answer'})[answer_row].text
    answer_hint = soup.find_all('div', {'class': 'answer_hint'})[answer_row]
    # The explanation text sits in the <p> that follows the hint heading.
    answer_hint = answer_hint.text + answer_hint.find_next('p').text
    print(options)
    print(answer)
    print(answer_hint)
    print('')
    answer_row += 1
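
The counter-based loop above repeats each page-wide search on every iteration and relies on the lists staying aligned. As a rough sketch of the same idea with the searches done once and paired with zip, the snippet below uses a small, invented two-question page (the HTML and the second question are assumptions for illustration, not taken from the real site). Note that zip stops at the shorter list, so a missing block would still shift the pairing silently.

```python
from bs4 import BeautifulSoup

# Hypothetical two-question page, invented for illustration only.
html = """
<div class="sques_quiz">
  <div class="wp_quiz_question_options">[A] Delhi<br />[B] Kolkata</div>
  <div class="ques_answer">Correct Answer: A [Delhi]</div>
</div>
<div class="sques_quiz">
  <div class="wp_quiz_question_options">[A] Paris<br />[B] Rome</div>
  <div class="ques_answer">Correct Answer: A [Paris]</div>
</div>
"""
soup = BeautifulSoup(html, "html.parser")

# Each page-wide search runs once, outside the loop.
options = soup.find_all("div", class_="wp_quiz_question_options")
answers = soup.find_all("div", class_="ques_answer")

# Pair the n-th options block with the n-th answer block.
pairs = [
    (opt.get_text(" ", strip=True), ans.get_text(strip=True))
    for opt, ans in zip(options, answers)
]

for opt_text, ans_text in pairs:
    print(opt_text)
    print(ans_text)
    print("")
```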

Answer 1 (score: 1)

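As mentioned in the comments, you need to use find_all if you want to iterate, because find returns only the first element it finds (or None when nothing matches).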
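
To make the difference concrete, here is a minimal sketch on a tiny invented snippet (the HTML and the class names box and empty are assumptions for illustration only). It shows that find gives back a single Tag or None, while find_all always gives back something you can loop over, even when it is empty.

```python
from bs4 import BeautifulSoup

# Hypothetical snippet, invented for illustration only.
html = """
<div class="box">
  <p>first</p>
  <p>second</p>
</div>
<div class="empty"></div>
"""

soup = BeautifulSoup(html, "html.parser")

box = soup.find("div", class_="box")
empty = soup.find("div", class_="empty")

first_p = box.find("p")            # a single Tag: the first match only
all_p = box.find_all("p")          # a list-like ResultSet with every match
missing = empty.find("p")          # None, because nothing matched
none_found = empty.find_all("p")   # an empty ResultSet, still safe to loop over

print(first_p.get_text())
print([p.get_text() for p in all_p])
print(missing)
print(len(none_found))
```

Looping over the result of find when it is None is what produces "TypeError: 'NoneType' object is not iterable".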

The following code gives the result in the format you want:

import requests
from bs4 import BeautifulSoup

r = requests.get('https://www.gktoday.in/gk-current-affairs-quiz-october-4-2017/')
soup = BeautifulSoup(r.text, 'lxml')
# [:1] limits the loop to the first question block on the page.
for div in soup.find_all('div', {'class':'sques_quiz'})[:1]:
    ques = div.find('div', {'class':'wp_quiz_question testclass'}).find('p').text.strip()
    # Splitting on '[' separates the options; element 0 is the text before '[A]'.
    options = div.find('div', {'class':'wp_quiz_question_options'}).text.split('[')
    ans = div.find('div', {'class':'ques_answer'}).text.strip()
    exp = div.find('div', {'class':'answer_hint'}).text.strip()
    print(ques)
    print('['+options[1])
    print('['+options[2])
    print('['+options[3])
    print('['+options[4])
    print(ans)
    print(exp)

This is the result:

Who of the following have won the Nobel Prize in Chemistry 2017?
[A] Jean-Pierre Sauvage, Fraser Stoddart and Ben Feringa 
[B] Tomas Lindahl and Paul L. Modrich
[C] Brian K. Kobilka and Robert J. Lefkowitz
[D] Jacques Dubochet, Joachim Frank and Richard Henderson

Correct Answer: D [Jacques Dubochet, Joachim Frank and Richard 
Henderson]
Answer Explanation:
The Nobel Prize in Chemistry 2017 was awarded to Jacques Dubochet, 
Joachim Frank and Richard Henderson for developing cryo-electron  
microscopy for the high-resolution structure determination of 
biomolecules in solution. It’s a method of simplifying and improving the 
imaging of biomolecules.

To print every result on the page, change the line for div in soup.find_all('div', {'class':'sques_quiz'})[:1]: to: for div in soup.find_all('div', {'class':'sques_quiz'}):
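
For trying this out without a network request, the following sketch runs against a tidied copy of the markup from the question (stray closing tags dropped, which is an assumption about what the page intends). The helper names clean and parse_block are hypothetical, and the None checks are an addition so that a block with a missing part yields an empty value instead of raising.

```python
from bs4 import BeautifulSoup

# Tidied copy of the question's sample markup (an assumption, not the live page).
html = """
<div class="sques_quiz">
  <div class="wp_quiz_question testclass"><span class="quesno">2. </span>
    <p>What is capital of India?</p>
  </div>
  <div type="A" class="wp_quiz_question_options">[A] Delhi<br />[B] Kolkata<br />[C] Mumbai<br />[D] None of the above</div>
  <div class="wp_basic_quiz_answer ques_answer_3652" style="display:none;">
    <div class="ques_answer"><b>Correct Answer:</b> A [Delhi ]</div>
    <div class="answer_hint"><b>Answer Explanation:</b>
      <p>Delhi is the capital city of india</p>
    </div>
  </div>
</div>
"""


def clean(tag):
    """Return the tag's text joined with single spaces, or '' if tag is None."""
    return tag.get_text(" ", strip=True) if tag is not None else ""


def parse_block(block):
    """Extract one question from a sques_quiz block (hypothetical helper)."""
    q_div = block.find("div", class_="wp_quiz_question")
    q_p = q_div.find("p") if q_div is not None else None
    opts = block.find("div", class_="wp_quiz_question_options")
    return {
        "question": clean(q_p),
        # stripped_strings yields each text node between the <br /> tags.
        "options": list(opts.stripped_strings) if opts is not None else [],
        "answer": clean(block.find("div", class_="ques_answer")),
        "hint": clean(block.find("div", class_="answer_hint")),
    }


soup = BeautifulSoup(html, "html.parser")
for block in soup.find_all("div", class_="sques_quiz"):
    item = parse_block(block)
    print(item["question"])
    for option in item["options"]:
        print(option)
    print(item["answer"])
    print(item["hint"])
```

Scoping every lookup to the current block keeps each question's parts together even if another block on the page lacks an answer or a hint.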