Python: nested for loops over a range

Time: 2017-07-25 22:31:52

Tags: python loops range

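I want the following commands to fetch the date from each address in this range, but I can't seem to get the code to run more than once. I am using Python 3. As you can see below, i is appended to the site's URL so that it reads http://zinc.docking.org/substance/10, http://zinc.docking.org/substance/11, and so on. Here is the code: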

import bs4 as bs
import urllib.request
site = "http://zinc.docking.org/substance/"
for i in range(10, 16): 
    site1 = str("%s%i" % (site, i))
    sauce = urllib.request.urlopen(site1).read()
    soup = bs.BeautifulSoup(sauce, 'lxml')
    table1 = soup.find("table", attrs={"class": "substance-properties"})
for row in table1.findAll('tr'):
    row1 = row.findAll('td')
ate = row1[0].getText()
print(ate)

This is my output:

$python3 Date.py
November 11th, 2005

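However, the script should give me 3 dates. This code works, so I know row1[0] actually contains a value. I suspect there is some simple formatting mistake, but I don't know where to start troubleshooting. When I format it "correctly", this is the code: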

import bs4 as bs
import urllib.request
import pandas as pd
import csv
site = "http://zinc.docking.org/substance/"
for i in range(10, 16): 
    site1 = str("%s%i" % (site, i))
    sauce = urllib.request.urlopen(site1).read()
    soup = bs.BeautifulSoup(sauce, 'lxml')
    table1 = soup.find("table", attrs={"class": "substance-properties"})
    table2 = soup.find("table", attrs={"class": "protomers"})
    for row in table1.findAll('tr'):
        row1 = row.findAll('td')
        ate = row1[0].getText()
        print(ate)

The error I get is as follows:

Traceback (most recent call last):
  File "Stack.py", line 11, in <module>
    ate = row1[1].getText()
IndexError: list index out of range

The first version runs, so I know row1[0] does contain a value. Any ideas?

1 answer:

Answer 0 (score: 1)

You probably want to fix the indentation:

import bs4 as bs
import urllib.request
site = "http://zinc.docking.org/substance/"
for i in range(10, 16): 
    site1 = str("%s%i" % (site, i))
    sauce = urllib.request.urlopen(site1).read()
    soup = bs.BeautifulSoup(sauce, 'lxml')
    table1 = soup.find("table", attrs={"class": "substance-properties"})
    for row in table1.findAll('tr'):
        row1 = row.findAll('td')
        Date = row1[0].getText()
        print(Date)
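
To see why the indentation matters, here is a minimal sketch that needs no network access. The lists `outside` and `inside` are hypothetical names introduced only for illustration; the URL pattern is the one from the question. Anything that is not indented under the `for` line runs once, after the loop has finished, and only sees the last value.

```python
# Stand-in for the scraping loop: it only builds the URLs, nothing is downloaded.
site = "http://zinc.docking.org/substance/"

# Mis-indented: the append is NOT part of the loop body,
# so it runs a single time, with the URL from the last iteration.
outside = []
for i in range(10, 16):
    url = "%s%i" % (site, i)
outside.append(url)

# Correctly indented: the append belongs to the loop body,
# so it runs once per value of i.
inside = []
for i in range(10, 16):
    url = "%s%i" % (site, i)
    inside.append(url)

print(len(outside), outside[0])  # one entry, ending in /15
print(len(inside))               # six entries, /10 through /15
```

This is the same effect as in the first script in the question: the row loop and the print sit outside the `for i` loop, so only the last page that was fetched is processed.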

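Edit: you should rename the Date variable. Date is not actually a reserved word in Python, but capitalized names are conventionally used for classes, and by convention variable names are lowercase (for example, date).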
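
Regarding the IndexError in the question: one plausible cause, which nothing in this thread confirms, is that some rows of the table contain no td cells at all (for example a header row made only of th cells). In that case `row.findAll('td')` returns an empty list and indexing it fails. The sketch below models each row as a plain list of cell texts so that it runs without bs4 or network access; the sample data and the helper name `first_cells` are assumptions made for illustration.

```python
def first_cells(rows):
    """Return the first cell of every row, skipping rows that have no cells."""
    result = []
    for cells in rows:
        if not cells:  # an empty list would raise IndexError on cells[0]
            continue
        result.append(cells[0])
    return result


# Hypothetical data: the first row mimics a header row with no td cells.
rows = [
    [],
    ["November 11th, 2005", "example"],
]

print(first_cells(rows))  # ['November 11th, 2005']
```

In the scraping loop, the equivalent check would be an `if not row1: continue` placed right after `row1 = row.findAll('td')`, before `row1[0]` is accessed.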