Question

First of all, I apologize if the title is not very clear; I wasn't sure how to describe what I'm trying to do in a title.

I'm scraping some information from a website. I already get the information I want, but when I run the script the output looks like this:

Ivern Jungle
Starting Items                                                  
Hunter's Talisman
Refillable Potion
Warding Totem
First Goal                                                      


Stalker's Blade
Tracker's Knife
Boots of Speed
Hunter's Potion
Vision Ward
Sweeping Lens
Second Goal

when I want it to look like this:

Ivern Jungle

Starting Items                                                  
Hunter's Talisman
Refillable Potion
Warding Totem


First Goal                              
Stalker's Blade
Tracker's Knife
Boots of Speed
Hunter's Potion
Vision Ward
Sweeping Lens
Second Goal

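I've tried several things in the code, and this is the only way I've managed to get it to behave roughly as I want. "Ivern Jungle" is a title, "Starting Items" is another title, and "First Goal" is another; I retrieve each title first, followed by the rest of the information (the items). This is my current code: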

        for build_names in guide_page.xpath(".//div[@class='build-container box-shadow-lb']"
                                            "/div[1]/div[1]/div[1]/div[1]/div[1]"):

            for title in build_names.xpath("div[1]/h2/text() | div[3]/div[1]/div/h2/text() | "
                                           "div[3]/div[1]/div/div/div/a/div[2]/span/text()"):
                print(title)

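I get most of the information from the `title` for loop, because that is the only way I managed to get the order right. If there is a more efficient way to do this, please let me know.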

By the way, this information comes from one specific website, but the site can change; from another specific website I get output like this:

Kled The Talker # Title
Kled Tank/Ad Top    # Title                                             
Mercury's Treads
The Black Cleaver
Titanic Hydra
Frozen Mallet
Dead Man's Plate
Guardian Angel
Kled Ad/LifeSteal   # Title                                             
Mercury's Treads
The Black Cleaver
Ravenous Hydra
Death's Dance
Maw of Malmortius
Guardian Angel

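As you can see, there are no blank lines in between. If you go to the first website, you can see that in the items section each title has a note to its right; I think those notes are what put the blank lines in the output, because the second website has no notes. So that is my main question: how can I format the output? If I haven't explained myself clearly, please tell me and I'll update the question. Thanks! :)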

Answer 1

You can navigate the tree more easily by making more use of class attributes. That way, you can rewrite your script like this:

for div in page.xpath('//div[contains(@class, "item-wrap")]'):
    print("\n{bar}\n{title}\n{bar}".format(
        bar="#"*20, 
        title=div.xpath('.//h2/text()')[0].strip()))
    print('\n'.join(x.strip() for x in div.xpath(
        './/div[contains(@class, "main-items")]//span/text()')))

Output excerpt:

####################
Starting Items
####################
Hunter's Talisman
Refillable Potion
Warding Totem

####################
First Goal
####################
Stalker's Blade
Tracker's Knife
Boots of Speed
Hunter's Potion
Vision Ward
Sweeping Lens

####################
Second Goal
####################
Rod of Ages
Boots of Mobility
Ionian Boots of Lucidity
Boots of Swiftness
Sorcerer's Shoes
Oracle Alteration

These XPath expressions work just as well on the second page you linked.
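
For experimenting offline, here is a minimal sketch of the same per-section grouping, run against a made-up fragment rather than the real page. The markup in `SAMPLE` and the helper names `has_class`, `collect_sections` and `format_sections` are assumptions for illustration only. It uses the standard library's `xml.etree.ElementTree` instead of lxml so that it runs without extra packages; ElementTree's XPath subset has no `contains()`, so the class check is done in Python, and it only parses well-formed markup, which real pages often are not. With lxml you would keep the XPath expressions shown above.

    import xml.etree.ElementTree as ET

    # Hypothetical fragment: the class names mirror the ones used above,
    # but the structure is invented and much simpler than the real page.
    SAMPLE = """
    <div class="build-container">
      <div class="item-wrap float-left">
        <h2>Starting Items   </h2>
        <div class="main-items">
          <a><span>Hunter's Talisman</span></a>
          <a><span>Refillable Potion</span></a>
        </div>
        <div class="notes"><span>   </span></div>
      </div>
      <div class="item-wrap float-left">
        <h2>First Goal   </h2>
        <div class="main-items">
          <a><span>Stalker's Blade</span></a>
          <a><span>Boots of Speed</span></a>
        </div>
      </div>
    </div>
    """

    def has_class(element, name):
        """Return True if `name` is one of the element's class tokens."""
        return name in element.get("class", "").split()

    def collect_sections(root):
        """Return [(title, [item, ...]), ...] in document order."""
        sections = []
        for div in root.iter("div"):
            if not has_class(div, "item-wrap"):
                continue
            title = div.findtext(".//h2", default="").strip()
            items = []
            for inner in div.iter("div"):
                if has_class(inner, "main-items"):
                    for span in inner.iter("span"):
                        text = (span.text or "").strip()
                        if text:  # skip whitespace-only nodes
                            items.append(text)
            sections.append((title, items))
        return sections

    def format_sections(sections):
        """Join each title with its items, one blank line between sections."""
        blocks = ["\n".join([title, *items]) for title, items in sections]
        return "\n\n".join(blocks)

    root = ET.fromstring(SAMPLE)
    sections = collect_sections(root)
    print(format_sections(sections))

Because each title is collected together with its own items, the blank lines in the output come only from `format_sections`, not from stray whitespace in the page, which is why the whitespace-only note in the sample does not show up.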

Python xpath - getting information in the correct order
