I have the following dataset, which I would like to tidy up.
Review Title : Very poor
Upvotes : 1
Downvotes : 0
Review Content :
Hank all time this device ... fews day speakar sound not clear output
Review Title : Don't waste your money
Upvotes : 1
Downvotes : 1
Review Content :
Don't buy this product , its not good .just a waste of money.it starts showing small defects from starting few months of use and then after one year after warranty is over its mother was not working .and u can .ever fix it
Sorry I didn't like this phone
I want to use Python to reshape this data into the following format.
Review Title : Very poor
Upvotes : 1
Downvotes : 0
Review Content : Hank all time this device ... fews day speakar sound not clear output
Review Title : Don't waste your money
Upvotes : 1
Downvotes : 1
Review Content : Don't buy this product , its not good .just a waste of money.it starts showing small defects from starting few months of use and then after one year after warranty is over its mother was not working .and u can .ever fix it Sorry I didn't like this phone
I want to move the text up so that it follows the colon, but I don't know how.
Answer 0 (score: 1)
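import re
text = '''your_text_here'''  # placeholder: paste the raw review text here
# Collapse the whitespace (including the newline) after the label into one space.
text = re.sub(r"Review Content :\s+", "Review Content : ", text)
# Put two newlines before each title so the reviews are visually separated.
text = re.sub(r"Review Title : ", "\n\nReview Title : ", text)
# Drop the newlines that the previous step added before the first title.
text = text.strip()
print(text)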
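The re library makes this kind of string manipulation easier:
The first sub replaces the run of whitespace characters (including the newline) after "Review Content :" with a single space, so the content ends up on the same line as the "Review Content" label.
The second sub adds two newlines before each "Review Title" label.
strip() removes whitespace from the beginning and end of the string, which removes the two newlines that the previous step added before the first "Review Title".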
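The regex approach only pulls up the first line of each review, so a review whose content spans several lines (like the second one in the question) keeps its extra lines separate. A line-by-line variant can handle that case. The following is only a sketch: the function name reshape and the FIELD pattern are made up for illustration, and it assumes that the four labels from the question are the only ones and that no line of review text itself starts with one of those labels followed by a colon.

```python
import re

# Hypothetical pattern: one of the four labels seen in the question,
# then a colon, then an optional value on the same line.
FIELD = re.compile(r"^(Review Title|Upvotes|Downvotes|Review Content)\s*:\s*(.*)$")


def reshape(text):
    """Return text with every field's value on the same line as its label."""
    records = []  # each entry is [label, list of value pieces]
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        match = FIELD.match(line)
        if match:
            label, value = match.groups()
            records.append([label, [value] if value else []])
        elif records:
            # Not a label line: treat it as a continuation of the last field.
            records[-1][1].append(line)
    return "\n".join(
        f"{label} : {' '.join(parts)}".rstrip() for label, parts in records
    )


sample = """Review Title : Don't waste your money
Upvotes : 1
Downvotes : 1
Review Content :
Don't buy this product , its not good
Sorry I didn't like this phone"""

print(reshape(sample))
```

Unlike the version above, this does not insert blank lines between reviews, which matches the output format shown in the question.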