Printing data from a static XML file

Time: 2018-05-15 17:50:11

Tags: python urllib3

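I'm trying to find product names in an XML file that I downloaded. I've worked out how to display every result with a while loop. My problem is that I only want to display the first 10 results. I also need to be able to access each result individually.

For example: print(read_xml_code.start_tag_5) would print the 5th product in the XML file, and print(read_xml_code.start_tag_10) would print the 10th.

Here is my code so far: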

# Define the Static webpage XML file
static_webpage_1 = 'StaticStock/acoustic_guitar.html'


def Find_static_webpage_product_name():
    # Open and read the contents of the first XML file
    read_xml_code = open(static_webpage_1, encoding="utf8").read()
    # Find and print the static page title.
    start_tag = '<title><![CDATA['
    end_tag = ']]></title>'
    end_position = 0
    starting_position = read_xml_code.find(start_tag, end_position)
    end_position = read_xml_code.find(end_tag, starting_position)
    while starting_position != -1 and end_position!= -1:
        print(read_xml_code[starting_position + len(start_tag) : end_position]+ '\n')
        starting_position = read_xml_code.find(start_tag, end_position)
        end_position = read_xml_code.find(end_tag, starting_position)

#call function
Find_static_webpage_product_name()

1 Answer:

Answer 0: (score: 0)

The Python standard library (Python 3) includes an HTML parser: https://docs.python.org/3/library/html.parser.html

You can simply wait for the tag event and do the counting with a member variable.
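A minimal sketch of that idea follows. The class name TitleCollector, the limit parameter, and the sample string are made up for illustration. It assumes the titles are plain text inside <title> tags; CDATA-wrapped titles like the ones in the question may be reported through a different handler depending on the Python version, so they would need extra handling.

```python
from html.parser import HTMLParser


class TitleCollector(HTMLParser):
    """Collects the text of the first `limit` <title> elements (hypothetical helper)."""

    def __init__(self, limit=10):
        super().__init__()
        self.limit = limit      # maximum number of titles to keep
        self.titles = []        # collected titles, in document order
        self._in_title = False  # True while inside a <title> we are collecting
        self._buffer = []       # text chunks of the current title

    def handle_starttag(self, tag, attrs):
        # Start collecting only while we are below the limit.
        if tag == "title" and len(self.titles) < self.limit:
            self._in_title = True
            self._buffer = []

    def handle_data(self, data):
        # Text may arrive in several chunks, so buffer it.
        if self._in_title:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self.titles.append("".join(self._buffer).strip())
            self._in_title = False


# Made-up sample data, not the asker's file.
sample = "<item><title>Guitar A</title></item><item><title>Guitar B</title></item>"
collector = TitleCollector(limit=10)
collector.feed(sample)
collector.close()
print(collector.titles[0])  # first collected title
```

Because the results end up in a list, the fifth product would be collector.titles[4] and the tenth collector.titles[9], provided that many titles were found.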

Also, don't forget to close the resource (with open(static_webpage_1, encoding="utf8") as f: ...)
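If you would rather keep the string-search approach from the question, the sketch below shows one way to stop after 10 results, keep them in a list, and close the file with a with block. The function names find_product_names and read_product_names are hypothetical, and the tag strings are copied from the question's code.

```python
START_TAG = "<title><![CDATA["
END_TAG = "]]></title>"


def find_product_names(xml_text, limit=10):
    """Return up to `limit` strings found between START_TAG and END_TAG."""
    names = []
    position = 0
    while len(names) < limit:
        start = xml_text.find(START_TAG, position)
        if start == -1:
            break  # no more opening tags
        end = xml_text.find(END_TAG, start)
        if end == -1:
            break  # opening tag without a matching close
        names.append(xml_text[start + len(START_TAG):end])
        position = end + len(END_TAG)  # continue after this title
    return names


def read_product_names(path, limit=10):
    """Read the file at `path` and return its first `limit` product names."""
    # The with block closes the file even if reading fails.
    with open(path, encoding="utf8") as f:
        return find_product_names(f.read(), limit)
```

With names = read_product_names(static_webpage_1), the 5th product is names[4] and the 10th is names[9]. Indexing raises IndexError if the file contains fewer products than that.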