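I have a list of unicode elements from which I want to remove ')' and \n, as well as whitespace. Basically, I want to create a "clean" copy of the list.
I tried following this SO solution, Remove specific characters from a string in python, and the Python docs on strings for 2.7.
I build my list with bs4 (imports omitted to keep this short).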
def isNotBlank(myString):
    if myString and myString.strip():
        return True
    return False
names = soup.find_all('span', class_="TextLarge")
bucket_list = []
for name in names:
    for item in name.contents:
        for value in item.split('('):
            if isNotBlank(value):
                bucket_list.append(value)
translation_table = dict.fromkeys(map(ord, ')(@\\n#$'), None)
[x.translate(translation_table) for x in bucket_list ]
So print(names) returns
[<span class="TextLarge">Mossfun (11) (Rtg:103)</span>, <span class="TextLarge">58.0</span>, <span class="TextLarge scratched">Atmospherical (8)
(Rtg:99)</span>, <span class="TextLarge">56.5</span>, <span class="TextLarge scratched">Chloe In Paris (7)
(Rtg:97)</span>, <span class="TextLarge">55.5</span>, <span class="TextLarge">Bound For Earth (5) (Rtg:92)</span>, <span class="TextLarge">55.5</span>, <span class="TextLarge">Fine Bubbles (4) (Rtg:91)</span>, <span class="TextLarge">55.5</span>, <span class="TextLarge">Brook Road (9) (Rtg:90)</span>, <span class="TextLarge">55.5</span>, <span class="TextLarge">Shamalia (10) (Rtg:89)</span>, <span class="TextLarge">55.5</span>, <span class="TextLarge scratched">Tawteen (6) (Rtg:88)</span>, <span class="TextLarge">55.5</span>, <span class="TextLarge">Ygritte (2) (Rtg:77)</span>, <span class="TextLarge">55.5</span>, <span class="TextLarge">Tahni Dancer (1) (Rtg:76)</span>, <span class="TextLarge">55.5</span>, <span class="TextLarge">All Salsa (3) (Rtg:72)</span>, <span class="TextLarge">55.5</span>]
and bucket_list returns
[u'Mossfun ', u'11) ', u'Rtg:103)', u'58.0', u'Atmospherical ', u'8) \n ', u'Rtg:99)', u'56.5', u'Chloe In Paris ', u'7) \n ', u'Rtg:97)', u'55.5', u'Bound For Earth ', u'5) ', u'Rtg:92)', u'55.5', u'Fine Bubbles ', u'4) ', u'Rtg:91)', u'55.5', u'Brook Road ', u'9) ', u'Rtg:90)', u'55.5', u'Shamalia ', u'10) ', u'Rtg:89)', u'55.5', u'Tawteen ', u'6) ', u'Rtg:88)', u'55.5', u'Ygritte ', u'2) ', u'Rtg:77)', u'55.5', u'Tahni Dancer ', u'1) ', u'Rtg:76)', u'55.5', u'All Salsa ', u'3) ', u'Rtg:72)', u'55.5']
Desired output:
[['Mossfun', 11, 103, 58.0], ['Atmospherical', 8, 99, 56.5]]
At the moment, all of the characters pass through the translation unchanged.
Answer 0 (score: 1)
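You are ignoring the return value here; the translation itself works (although it does not actually handle the newlines: '\\n' is a backslash followed by the letter n, which is why every n disappears in the output below):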
>>> bucket_list = [u'Mossfun ', u'11) ', u'Rtg:103)', u'58.0', u'Atmospherical ', u'8) \n ', u'Rtg:99)', u'56.5', u'Chloe In Paris ', u'7) \n ', u'Rtg:97)', u'55.5', u'Bound For Earth ', u'5) ', u'Rtg:92)', u'55.5', u'Fine Bubbles ', u'4) ', u'Rtg:91)', u'55.5', u'Brook Road ', u'9) ', u'Rtg:90)', u'55.5', u'Shamalia ', u'10) ', u'Rtg:89)', u'55.5', u'Tawteen ', u'6) ', u'Rtg:88)', u'55.5', u'Ygritte ', u'2) ', u'Rtg:77)', u'55.5', u'Tahni Dancer ', u'1) ', u'Rtg:76)', u'55.5', u'All Salsa ', u'3) ', u'Rtg:72)', u'55.5']
>>> translation_table = dict.fromkeys(map(ord, ')(@\\n#$'), None)
>>> [x.translate(translation_table) for x in bucket_list ]
['Mossfu ', '11 ', 'Rtg:103', '58.0', 'Atmospherical ', '8 \n ', 'Rtg:99', '56.5', 'Chloe I Paris ', '7 \n ', 'Rtg:97', '55.5', 'Boud For Earth ', '5 ', 'Rtg:92', '55.5', 'Fie Bubbles ', '4 ', 'Rtg:91', '55.5', 'Brook Road ', '9 ', 'Rtg:90', '55.5', 'Shamalia ', '10 ', 'Rtg:89', '55.5', 'Tawtee ', '6 ', 'Rtg:88', '55.5', 'Ygritte ', '2 ', 'Rtg:77', '55.5', 'Tahi Dacer ', '1 ', 'Rtg:76', '55.5', 'All Salsa ', '3 ', 'Rtg:72', '55.5']
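But the result is stored in a new list; the original strings are not changed in place, because they are immutable. Assign the result back to bucket_list, and use \n instead of \\n to fix the newline issue: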
translation_table = dict.fromkeys(map(ord, ')(@\n#$'), None)
bucket_list = [x.translate(translation_table) for x in bucket_list ]
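As a small illustration of both points, here is a sketch that compares the two tables on a single string. The names wrong_table, right_table and sample are made up for this example and are not from the question:

```python
# '\\n' is two characters (a backslash and the letter n);
# '\n' is a single newline character.
wrong_table = dict.fromkeys(map(ord, u')(@\\n#$'), None)
right_table = dict.fromkeys(map(ord, u')(@\n#$'), None)

# Hypothetical sample, shaped like the items in bucket_list.
sample = u'Chloe In Paris 7) \n '

# Removes ')' and every letter 'n', but keeps the newline.
wrong = sample.translate(wrong_table)
# Removes ')' and the newline, and leaves the letters alone.
right = sample.translate(right_table)

print(repr(wrong))
print(repr(right))
# translate() returns a new string; sample itself is untouched.
print(repr(sample))
```

Only the spaces are left over with the second table, which is what the str.strip() call further down deals with.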
You probably want to throw in str.strip() as well to get rid of the remaining whitespace; the result would then be:
>>> [x.translate(translation_table).strip() for x in bucket_list ]
['Mossfun', '11', 'Rtg:103', '58.0', 'Atmospherical', '8', 'Rtg:99', '56.5', 'Chloe In Paris', '7', 'Rtg:97', '55.5', 'Bound For Earth', '5', 'Rtg:92', '55.5', 'Fine Bubbles', '4', 'Rtg:91', '55.5', 'Brook Road', '9', 'Rtg:90', '55.5', 'Shamalia', '10', 'Rtg:89', '55.5', 'Tawteen', '6', 'Rtg:88', '55.5', 'Ygritte', '2', 'Rtg:77', '55.5', 'Tahni Dancer', '1', 'Rtg:76', '55.5', 'All Salsa', '3', 'Rtg:72', '55.5']
In this specific case, str.strip() will also take care of the newlines.
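To get from the cleaned flat list to the nested list asked for in the question, one possible sketch is below. It assumes that every runner contributes exactly four consecutive items (name, number, rating, weight) and that the rating always looks like 'Rtg:<digits>'; to_row is a hypothetical helper, not something from the question, and only the first two runners are included to keep it short:

```python
bucket_list = [u'Mossfun ', u'11) ', u'Rtg:103)', u'58.0',
               u'Atmospherical ', u'8) \n ', u'Rtg:99)', u'56.5']

translation_table = dict.fromkeys(map(ord, u')(@\n#$'), None)
cleaned = [x.translate(translation_table).strip() for x in bucket_list]


def to_row(name, number, rating, weight):
    # Hypothetical helper: assumes rating has the form 'Rtg:<digits>'.
    return [name, int(number), int(rating.split(':', 1)[1]), float(weight)]


# Assumption: items always come in groups of four per runner.
rows = [to_row(*cleaned[i:i + 4]) for i in range(0, len(cleaned), 4)]
print(rows)
```

If the page ever yields a group with a missing field, to_row will raise an error rather than silently misalign the rows, so the grouping assumption is worth checking against the real data.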