Question

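I have a list of words like s = ['a','\xe7\xbe\x8e\xe7','b'], and I want to remove members such as '\xe7\xbe\x8e\xe7', but I can't think of any useful way to do it. I have never dealt with encoded or decoded words like this before. I'd appreciate any suggestions in Python. Thanks!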

Answer 1

def is_ascii(s):
    # True only if every character is in the ASCII range (code point < 128)
    return all(ord(c) < 128 for c in s)

s = [e for e in s if is_ascii(e)]

Try this. It removes the entries that contain non-ASCII characters (such as \xe7\xbe\x8e\xe7). Hope this helps!
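To make the behaviour concrete, here is a small sketch of the same filter on slightly larger sample data. The names `words`, `kept` and `dropped`, and the extra entry 'caf\xe9', are made up for illustration. It shows that an entry is dropped as soon as it contains a single non-ASCII character.

```python
def is_ascii(text):
    # True only if every character has a code point below 128.
    return all(ord(ch) < 128 for ch in text)

# Hypothetical sample data: 'caf\xe9' contains just one non-ASCII character.
words = ['a', '\xe7\xbe\x8e\xe7', 'caf\xe9', 'b']

kept = [w for w in words if is_ascii(w)]
dropped = [w for w in words if not is_ascii(w)]

print(kept)  # ['a', 'b']
```

Building a new list this way leaves the original `words` untouched, which is handy if you want to inspect what was filtered out.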

Answer 2

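You can use the isalnum method to check whether each word in the list is alphanumeric. If a word is alphanumeric, keep it; otherwise discard it. This can be done with a list comprehension: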

>>> s = ['a','\xe7\xbe\x8e\xe7','b']
>>> [a for a in s if a.isalnum()]
['a', 'b']

Note: isalnum checks whether the string is alphanumeric, i.e. contains only letters and/or digits. If you only want to allow letters, use isalpha instead.
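One caveat worth knowing: on Python 3, where strings are Unicode, isalnum and isalpha also accept non-ASCII letters such as 'é'. The sketch below assumes Python 3.7 or later (for str.isascii) and combines the two checks. The sample list and variable names are invented for the example.

```python
# Hypothetical sample data; '\xe9' is the single letter 'é'.
words = ['a', 'b2', '\xe9', '\xe7\xbe\x8e\xe7']

# isalnum alone keeps the non-ASCII letter 'é' on Python 3.
alnum_only = [w for w in words if w.isalnum()]

# Requiring ASCII as well restricts the result to plain letters and digits.
ascii_alnum = [w for w in words if w.isascii() and w.isalnum()]

# Same idea with isalpha, which rejects entries containing digits.
ascii_alpha = [w for w in words if w.isascii() and w.isalpha()]

print(ascii_alnum)  # ['a', 'b2']
print(ascii_alpha)  # ['a']
```

Whether 'é' should survive depends on what you count as a "word", so pick the check that matches your data.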

Answer 3

Try this:

s = ['a','\xe7\xbe\x8e\xe7','b']
for i in range(s.count("\xe7\xbe\x8e\xe7")):
    s.remove('\xe7\xbe\x8e\xe7')

All occurrences of '\xe7\xbe\x8e\xe7' will then be removed from the list.
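If you already know the exact value you want to drop, a list comprehension is a possible alternative to calling remove repeatedly. This is only a sketch, and the names `unwanted` and `cleaned` are made up for illustration.

```python
words = ['a', '\xe7\xbe\x8e\xe7', 'b', '\xe7\xbe\x8e\xe7']
unwanted = '\xe7\xbe\x8e\xe7'

# Keep every entry that is not equal to the unwanted value.
cleaned = [w for w in words if w != unwanted]

print(cleaned)  # ['a', 'b']
```

This makes a single pass over the list, whereas each remove call scans from the start again. Note that it builds a new list rather than modifying `words` in place.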