Question

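I am scraping a web table using Python and Beautiful Soup. The web page is in Chinese. In the table, each row has four elements (all in Chinese). When I do the following, it prints the correct Chinese characters: all four elements of every row.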

    table = soup.find('table', {'class': "tc_table"})
    trs = table.find_all('tr')
    for tr in trs:
        ls = []
        for td in tr.find_all('td'):
            ls.append(td.text)

        ls = [x.encode('utf-8') for x in ls]
        for i in ls:
            print(i)

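However, what I need is only the third and fourth elements of each row. So I modified my code to the following, but it raises an IndexError:

    File "taocan.py", line 83, in get_data_from_link
      print(ls[2])
    IndexError: list index out of range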

    table = soup.find('table', {'class': "tc_table"})
    trs = table.find_all('tr')
    for tr in trs:
        ls = []
        for td in tr.find_all('td'):
            ls.append(td.text)

        ls = [x.encode('utf-8') for x in ls]
        print(ls[2])
        print(ls[3])

Can anyone help me solve this problem?
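
One possible cause, offered as a guess because the page's HTML is not shown: some `tr` rows may contain fewer than four `td` cells, for example a header row built from `th` cells or a row with merged cells. The first version never indexes the list, so such rows pass silently, while `ls[2]` fails on them. The sketch below skips short rows before indexing. The sample markup, the name `third_and_fourth_cells`, and the use of Python 3 (where `print` handles Chinese text without `.encode('utf-8')`) are all assumptions for illustration, not the real page or the original script.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup: a header row made of <th> cells, two normal rows,
# and one merged row. The real page's structure is not shown in the question.
SAMPLE_HTML = """
<table class="tc_table">
  <tr><th>名称</th><th>类型</th><th>价格</th><th>流量</th></tr>
  <tr><td>套餐A</td><td>月付</td><td>58元</td><td>1GB</td></tr>
  <tr><td colspan="4">备注</td></tr>
  <tr><td>套餐B</td><td>月付</td><td>88元</td><td>2GB</td></tr>
</table>
"""


def third_and_fourth_cells(html):
    """Return (third, fourth) cell text for every row with at least four <td> cells."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", {"class": "tc_table"})
    results = []
    for tr in table.find_all("tr"):
        cells = [td.get_text(strip=True) for td in tr.find_all("td")]
        if len(cells) < 4:
            # Header rows (<th> only) or merged rows would raise IndexError at cells[2].
            continue
        results.append((cells[2], cells[3]))
    return results


for third, fourth in third_and_fourth_cells(SAMPLE_HTML):
    print(third, fourth)
```

If the real table does turn out to have such rows, printing `len(ls)` for each row in the original loop would show which ones are short.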

Web table scraping: cannot ... from unicode list

0 answers: