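How to use multiple replaces together, or all at once, in Python

Time: 2018-06-11 23:52:30

Tags: python web-scraping

I have some messy text containing a lot of markup whitespace and non-ASCII characters, like this: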

val =

"\nRated\xa0\n           I have been to this place for dinner tonight.
        \nWell I didn't found anything extraordinary there but indeed a meal worth 
        the price. The number of barbeque item and other both were good.\n\nFood: 3.5/5\"

To clean this up, I am using

  val.text.replace('\t', '').replace('\n', '').encode('ascii','ignore').decode("utf-8").replace('Rated','').replace('  ','')

and with these chained replaces I get the following output:

I have been to this place for dinner tonight. Well I didn't found anything extraordinary there but indeed a meal worth the price. The number of barbeque item and other both were good. Food: 3.5/5

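I would like to know whether there is a way to perform similar replacements in a single call instead of chaining replace. For example, in this case: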

replace('\t', '').replace('\n', '').replace('  ','')

2 Answers:

Answer 0: (score: 1)

You can use .translate to delete \n and \t, then use replace for the runs of spaces:

>>> val.translate(None,'\n\t').replace('  ','')
"Rated I have been to this place for dinner tonight.Well I didn't found anything extraordinary there but indeed a meal worth the price. The number of barbeque item and other both were good.Food: 3.5/5"

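Note that the two-argument form translate(None, '\n\t') only works on Python 2 byte strings. A minimal sketch of a Python 3 equivalent, assuming val is a str; the sample text is a shortened, hypothetical stand-in for the string in the question:

```python
# Shortened stand-in for the question's text (hypothetical sample).
val = "\nRated\xa0\n\tI have been to this place.\n\nFood: 3.5/5"

# str.maketrans('', '', chars) builds a table that deletes every character in chars.
delete_table = str.maketrans('', '', '\n\t')
cleaned = val.translate(delete_table)

print(repr(cleaned))
```

The non-breaking space \xa0 is left in place here; it can be added to the third argument of str.maketrans if it should be deleted as well.
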
replace('  ','') is a problem for runs with an even number of spaces (they are removed entirely). You might consider a regular expression instead:

>>> re.sub(r'(\b  *\b)',' ',val.translate(None,'\n\t'))
"Rated I have been to this place for dinner tonight.Well I didn't found anything extraordinary there but indeed a meal worth the price. The number of barbeque item and other both were good.Food: 3.5/5"

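The pattern above only matches spaces that sit between two word characters. As an alternative sketch (not the answerer's method), a single re.sub with \s+ collapses every run of whitespace into one space, which avoids the even/odd problem. It assumes Python 3 and a str input, and the sample text is a shortened, hypothetical stand-in:

```python
import re

# Shortened stand-in for the question's text (hypothetical sample).
val = "\nRated\xa0\n           I have been here.\n        \nWell, it was fine.\n\nFood: 3.5/5"

# For str patterns in Python 3, \s matches Unicode whitespace, including \xa0.
collapsed = re.sub(r'\s+', ' ', val).strip()

print(collapsed)
```

Because each run is replaced by a single space rather than deleted, words on either side of a newline stay separated.
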
Answer 1: (score: 0)

I don't even use replace, but I still think this is the best way:

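import string

val = """\nRated\xa0\n           I have been to this place for dinner tonight.
        \nWell I didn't found anything extraordinary there but indeed a meal worth 
        the price. The number of barbeque item and other both were good.\n\nFood: 3.5/5"""

# split() + join() collapses every run of whitespace into a single space.
# The filter then keeps only ASCII letters and spaces, so digits and punctuation are dropped.
print(''.join([i for i in ' '.join(val.split()) if i in string.ascii_letters + ' ']))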
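
If the goal is to apply several literal replacements in one pass while keeping digits and punctuation, one possible sketch builds a single regular expression from a mapping. The helper name multi_replace, the mapping, and the shortened sample text are all hypothetical and not part of the answers above:

```python
import re

def multi_replace(text, mapping):
    """Apply every literal replacement in mapping in a single pass over text."""
    # Try longer keys first so a long key is not shadowed by a shorter one.
    keys = sorted(mapping, key=len, reverse=True)
    pattern = re.compile('|'.join(re.escape(k) for k in keys))
    # Look up the replacement for whichever key matched.
    return pattern.sub(lambda m: mapping[m.group(0)], text)

# Shortened stand-in for the question's text (hypothetical sample).
val = "\nRated\xa0\n\tGood meal.\n\nFood: 3.5/5"
mapping = {'\t': '', '\n': '', 'Rated': '', '\xa0': ''}

result = multi_replace(val, mapping)
print(result)
```

Since every key maps to an empty string here, the sentences end up joined without a space; mapping '\n' to ' ' instead would keep them apart.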