Navigation

Time: 2016-10-15 14:16:53

Tags: python string for-loop

I have the following code in Python (in PyCharm Community Edition):

def defer_tags(sentence):
    for letter in sentence:
        print("current letter = ", letter)
        if letter == '<':
            end_tag = sentence.find('>')
            sentence = sentence[end_tag+1:]
            print("new_sentence = ", sentence)

defer_tags("<h1>Hello")

It produced the following output:

current letter =  <
new_sentence =  Hello
current letter =  h
current letter =  1
current letter =  >
current letter =  H
current letter =  e
current letter =  l
current letter =  l
current letter =  o

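Why does the loop (letter) walk through the entire original string (sentence), even though the value of sentence is changed inside the loop?

I print the value of sentence after changing it, but the change is not reflected in the loop iterations.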
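A minimal sketch of why this seems to happen: the `for` statement evaluates `sentence` once, obtains an iterator over that original string object, and keeps using it. Assigning a new value to the name `sentence` inside the loop only rebinds the name. The function names `visited_letters` and `visited_letters_by_index` are hypothetical, introduced only for illustration; the second is an index-based variant that re-reads `sentence` on every step and therefore does skip the tag.

```python
def visited_letters(sentence):
    """Collect the letters a for loop visits while `sentence` is rebound inside it."""
    visited = []
    for letter in sentence:  # the iterator is created once, over the original string
        visited.append(letter)
        if letter == '<':
            end_tag = sentence.find('>')
            sentence = sentence[end_tag + 1:]  # rebinds the name; the iterator is unaffected
    return visited


def visited_letters_by_index(sentence):
    """Hypothetical alternative: a while loop that looks at the current `sentence` each step."""
    visited = []
    i = 0
    while i < len(sentence):
        letter = sentence[i]
        visited.append(letter)
        if letter == '<':
            end_tag = sentence.find('>', i)
            if end_tag == -1:
                break  # unclosed tag: stop rather than guess
            sentence = sentence[end_tag + 1:]  # drop everything up to and including the tag
            i = 0  # restart at the beginning of the shortened string
            continue
        i += 1
    return visited


print(visited_letters("<h1>Hello"))           # every character of the original string
print(visited_letters_by_index("<h1>Hello"))  # '<' followed by the letters of 'Hello'
```

With the index-based version the loop condition and the lookup both use whatever `sentence` currently refers to, which is the behaviour the question expected from the `for` loop.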

2 answers:

Answer 0: (score: 0)

To be explicit, try using Beautiful Soup:

>>> from BeautifulSoup import BeautifulSoup
>>> soup = BeautifulSoup('<h1>Hello<h1>')
>>> soup.text
u'Hello'
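If installing a third-party package is not an option, a roughly similar result can be sketched with the standard library's `html.parser` module (Python 3). The `TextCollector` class and the `strip_tags` helper are hypothetical names used only for this illustration, and this is not a description of how Beautiful Soup works internally.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Hypothetical helper: accumulates the text found outside of tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        # HTMLParser calls this for each run of text between tags.
        self.chunks.append(data)


def strip_tags(markup):
    """Return the text content of `markup` with all tags removed."""
    parser = TextCollector()
    parser.feed(markup)
    parser.close()
    return ''.join(parser.chunks)


print(strip_tags('<h1>Hello<h1>'))
```

Unlike a hand-written loop, the parser handles unclosed or nested tags without any index bookkeeping.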

Answer 1: (score: -1)

A better way to capture phrases from between tags is to use re.

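import re

def defer_tags(sentence):
    # [^<>]+ keeps each match inside a single pair of tags; a greedy .+
    # would run from the first '>' to the last '<'.
    return re.findall(r'>([^<>]+)<', sentence)

defer_tags('<h1>Hello<h1>')
# ['Hello']
defer_tags('<h1>Hello</h1><h2>Ahoy</h2>')
# ['Hello', 'Ahoy']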

This works as long as the tags are complete, e.g. <h2>Hello</h2><h1>Ahoy</h1> <h2>XX</h2>
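One caveat worth checking when using `re` this way: a greedy `.+` can run from the first `>` to the last `<`, so it may swallow intermediate tags when the string contains more than one element. The sketch below compares it with a negated character class; `GREEDY`, `PER_TAG`, and `tag_texts` are hypothetical names chosen for this illustration.

```python
import re

# Greedy: .+ grabs as much as possible between the first '>' and the last '<'.
GREEDY = re.compile(r'>(.+)<')
# Negated class: the captured text may not contain '<' or '>', so it stays within one element.
PER_TAG = re.compile(r'>([^<>]+)<')


def tag_texts(sentence):
    """Return the text found between a '>' and the next '<'."""
    return PER_TAG.findall(sentence)


sample = '<h1>Hello</h1><h2>Ahoy</h2>'
print(GREEDY.findall(sample))  # ['Hello</h1><h2>Ahoy']
print(tag_texts(sample))       # ['Hello', 'Ahoy']
```

Note that whitespace sitting between two tags also counts as text for this pattern, so it may need filtering depending on the input.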