Question

Here is my code:

    for name in doc_preparate.cssselect('.dbl1:first-child'):
        if name.text != u"Продукция":
            print name.text

I don't know why it doesn't work. This is the output:

Артрозан
Продукция
Пенталгин
Продукция
Пенталгин
Продукция
Пенталгин
Продукция
Пенталгин
Продукция
...

P.S.

I also tried:

    for name in doc_preparate.cssselect('.dbl1:first-child'):
        print type(name.text)
        if u"Продукция" not in name.text:
            print name.text

But that doesn't work either :(

How can I fix this?

Answer 1

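This may be because you are comparing the strings with the equality operator. There is a hidden catch here: a string is a sequence of characters. This is more obvious in C, where comparing two strings with == gives the wrong result, because you are comparing the pointer to the first string with the pointer to the second rather than their contents.

Python is smart enough to make the obvious comparison operator compare contents, but it still returns False unless the two strings are exactly the same. If your data is padded with whitespace to a fixed number of characters, the strings differ internally even though they look the same when printed.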

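    whitespace = 'Python   '
    plain = 'Python'

These two do not compare equal. To check whether one string is contained in the other, use

    plain in whitespace

But note that this also returns True for both of the following:

    'Python' in 'Python    '
    'Python' in 'PythonAnd other stuff   '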

See the Python documentation on strings for more information and alternative approaches.
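If padding really is the cause (an assumption, since the question doesn't show the raw values), one alternative is to strip surrounding whitespace and then compare with ==. Here is a minimal sketch. `names_to_print` and the sample list are made up for illustration and stand in for the `name.text` values; note that lxml gives `None` for an element with no text, so that case is skipped too.

```python
# -*- coding: utf-8 -*-

def names_to_print(texts, skip=u"Продукция"):
    """Return the texts that differ from `skip` once surrounding whitespace is removed.

    Hypothetical helper: `texts` stands in for the name.text values from the question.
    """
    kept = []
    for text in texts:
        if text is None:  # an element without text content yields None
            continue
        cleaned = text.strip()
        # Skip empty strings and anything equal to the unwanted heading.
        if cleaned and cleaned != skip:
            kept.append(cleaned)
    return kept

# Sample values (assumed, not taken from the real page).
samples = [u"Артрозан", u"Продукция\n    ", u"  Пенталгин", None, u" Продукция"]
result = names_to_print(samples)
```

Printing `repr(name.text)` instead of `name.text` is a quick way to see whether stray spaces or newlines are actually there.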

Answer 2

Check the type of name.text.

    Python 2.6.5 (r265:79063, Apr 16 2010, 13:57:41)
    [GCC 4.4.3] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> a = "allo"
    >>> b = u"allo"
    >>> type(a)
    <type 'str'>
    >>> type(b)
    <type 'unicode'>
    >>>

Make sure that name.text is also of type unicode. In Python 3, all strings are Unicode.
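If the types turn out to differ, one way to line them up is to decode byte strings to Unicode before comparing. This is only a sketch: `to_text` and `same_text` are hypothetical helpers, and UTF-8 is an assumed encoding that may not match your page.

```python
# -*- coding: utf-8 -*-

def to_text(value, encoding="utf-8"):
    """Return `value` as Unicode text (hypothetical helper; UTF-8 is an assumed default)."""
    if value is None:
        return u""
    if isinstance(value, bytes):  # bytes is an alias of str on Python 2.6+
        return value.decode(encoding)
    return value

def same_text(left, right):
    """Compare two values after converting both to Unicode text."""
    return to_text(left) == to_text(right)

raw = u"Продукция".encode("utf-8")  # a byte string, as it might arrive undecoded
matches = same_text(raw, u"Продукция")
```

With both sides converted, the comparison no longer depends on which string type each value started as.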

How to compare a Unicode string with an lxml element and a simple string?
