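Unable to convert scraped data from a list into regular strings

Time: 2017-05-11 20:00:10

Tags: python web-scraping

When I run my scraper, it returns the results as lists. However, I would like the output to be shown as regular strings in two columns. Thanks for any suggestions.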


I get results like this: [image]

However, I expect the output to look like this: [image]

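3 answers:

Answer 0 (score: 2)

Your Title and Link variables do not each hold a single element. Both are lists, one containing all the titles and the other all the links, because those XPath expressions match multiple elements.

So, to get a list of (title, link) pairs, you need to zip() them together: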

pairs = zip(titles, links)
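
To make the pairing concrete, here is a minimal sketch that runs without any network access. The two lists are hypothetical stand-ins for the XPath results; the sample titles and paths are made up for illustration and are not taken from the site.

```python
# Hypothetical sample data standing in for the two XPath result lists.
titles = ["Excel VBA", "SQL Server"]
links = ["/videos/excel-vba/", "/videos/sql-server/"]

# zip() pairs the items position by position. In Python 3 it returns an
# iterator, so list() is used here to materialise the pairs.
pairs = list(zip(titles, links))

for title, link in pairs:
    print(title, link)
```

Note that zip() stops at the end of the shorter list, so if one XPath expression matched fewer elements than the other, the extra items would be dropped silently.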

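Once that is done, you can iterate over the pairs with a for loop and print each title left-aligned in a fixed-width field, so that the output lines up in columns: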

print('{:<70}{}'.format(title, link))

(For details on how to print left-aligned items, see this answer.)
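
If you would rather not hard-code the column width, one option is to derive it from the longest title. The following is only a sketch under assumptions: the sample data is hypothetical, and the two-character padding is an arbitrary choice.

```python
# Hypothetical sample data; in the scraper these would come from the XPath queries.
titles = ["Excel VBA", "SQL Server Reporting Services"]
links = ["/videos/excel-vba/", "/videos/ssrs/"]

# Column width: the longest title plus two spaces of padding (arbitrary choice).
width = max(len(title) for title in titles) + 2

# '{:<{width}}' left-aligns the title in a field that is `width` characters wide;
# the nested {width} is filled in from the keyword argument.
rows = ['{:<{width}}{}'.format(title, link, width=width)
        for title, link in zip(titles, links)]

for row in rows:
    print(row)
```

With this approach every link starts in the same column regardless of title length. Be aware that max() raises ValueError on an empty list, so a real scraper would want to check that the XPath query returned something first.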

Putting it all together:

import requests
from lxml import html

url = "http://www.wiseowl.co.uk/videos/"


def startpoint(page_url):
    response = requests.get(page_url)
    tree = html.fromstring(response.text)
    titles = tree.xpath("//p[@class='woVideoListDefaultSeriesTitle']/a/text()")
    links = tree.xpath("//p[@class='woVideoListDefaultSeriesTitle']/a/@href")
    pairs = zip(titles, links)

    for title, link in pairs:
        # Replace '70' with whatever you expect the maximum title length to be
        print('{:<70}{}'.format(title, link))

startpoint(url)

Answer 1 (score: 1)

Try iterating over the two lists in parallel, like this:

import requests
from lxml import html

url="http://www.wiseowl.co.uk/videos/"
def Startpoint(links):
    response = requests.get(links)
    tree = html.fromstring(response.text)
    Title= tree.xpath("//p[@class='woVideoListDefaultSeriesTitle']/a/text()")
    Link=tree.xpath("//p[@class='woVideoListDefaultSeriesTitle']/a/@href")
    for i,j in zip(Title, Link):
        print('{:<70}{}'.format(i,j))

Startpoint(url)

Answer 2 (score: 1)

You can iterate over each link element and print its title and URL.

import requests
from lxml import html

url="http://www.wiseowl.co.uk/videos/"
def Startpoint(links):
    response = requests.get(links)
    tree = html.fromstring(response.text)
    links = tree.xpath("//p[@class='woVideoListDefaultSeriesTitle']/a")
    for link in links:
        print('{title:<70}{url}'.format(title=link.text, url=link.attrib['href']))

Startpoint(url)