How can I parse all of the HTML code with Python?

Time: 2018-03-14 22:56:26

Tags: python html python-3.x parsing beautifulsoup

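I need to parse part of the HTML from a website that requires authorization. But when I try, my script cannot find all the tags. This is the part I need: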

<tbody>              
    <td class="ng-binding">name</td>
    <td class="ng-binding">name</td>
    <td class="ng-binding">name</td>
    <td class="ng-binding">name</td>
    <td></td>
</tr><!-- end ngIf: bsks -->
<!-- ngIf: (bsks | size)>0 --><tr class="bsstr ng-scope" ng-if="(bsks | size)>0">
    <td></td>
    <td></td>
    <td></td>
    <td><b class="ng-binding">сумма</b></td>
    <td></td>
</tr><!-- end ngIf: (bsks | size)>0 -->
<!-- ngIf: (bsks | size) === 0 -->
<!-- ngRepeat: item in bsks | orderBy: date --><!-- ngIf: (bsks | size) > 0 --><tr class="bsstr ng-scope" ng-repeat="item in bsks | orderBy: date" ng-if="(bsks | size) > 0">
    <td>

I am a beginner, so please help me parse this part. How can I get all the tags I need? The site also has a separate authorization page (url = self.BASE_URL + 'api/v1/login/auth?info=1'):

import urllib.request

import requests
from bs4 import BeautifulSoup


class Auth:
    BASE_URL = 'http.............'

    def auth(self):
        # Log in through the separate authorization endpoint.
        params = {
            'user': u'g1625719',
            'pass': u'472001',
            'from_site': 1,
            'dev': u'16e753be3dc097354e3328e47c3701a9'
        }
        session = requests.Session()
        url = self.BASE_URL + 'api/v1/login/auth?info=1'
        r = session.post(url, params)
        print(r.text)

    def get_url(self):
        # Fetch the checklist page. This request goes through urllib,
        # not through the session created in auth().
        url = self.BASE_URL + '#!/line/cart/checklist/'
        print(url)
        response = urllib.request.urlopen(url)
        return response.read()

    def parse(self):
        # Look for the container div in the downloaded HTML.
        soup = BeautifulSoup(self.get_url(), 'html.parser')
        table = soup.body.find('div', {'class': 'example-animate-container'})
        print(table)

This does not work correctly.

1 answer:

Answer 0 (score: 0)

Try using find_all (https://www.crummy.com/software/BeautifulSoup/bs4/doc/#searching-the-tree), which returns every matching tag rather than only the first one.
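
As a minimal sketch of the difference between find and find_all, the snippet below parses a small HTML string. The markup is a hypothetical, already-rendered table modeled on the fragment in the question, not the real page, and the variable names are only illustrative.

```python
from bs4 import BeautifulSoup

# Hypothetical, already-rendered markup modeled on the fragment in the question.
html = """
<table>
  <tbody>
    <tr class="bsstr ng-scope">
      <td class="ng-binding">name</td>
      <td class="ng-binding">name</td>
      <td></td>
    </tr>
    <tr class="bsstr ng-scope">
      <td></td>
      <td><b class="ng-binding">сумма</b></td>
    </tr>
  </tbody>
</table>
"""

soup = BeautifulSoup(html, "html.parser")

# find() returns only the first match; find_all() returns every match as a list.
first_cell = soup.find("td", class_="ng-binding")
all_cells = soup.find_all("td", class_="ng-binding")

# class_ matches any single class in a multi-valued class attribute.
rows = soup.find_all("tr", class_="bsstr")

cell_texts = [td.get_text(strip=True) for td in all_cells]
print(cell_texts)
print(len(rows))
```

Note that find_all can only return tags that are present in the HTML string it is given. The ng-binding, ng-if and ngRepeat markers suggest that the table is rendered in the browser by AngularJS, so HTML downloaded without running the page's JavaScript may not contain these rows at all.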
