Suppose I have a .txt file containing a large amount of data, for example movie reviews currently formatted like this:
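1 The Da Vinci Code book is just awesome. 1 this was the first clive cussler i've ever read, but even books like Relic, and Da Vinci code were more plausible than this. 1 i liked the Da Vinci Code a lot. 1 i liked the Da Vinci Code a lot. 1 I liked the Da Vinci Code but it ultimatly didn't seem to hold it's own. 1 that's not even an exaggeration ) and at midnight we went to Wal-Mart to buy the Da Vinci Code, which is amazing of course.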
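How can I change this file, or write its contents to a new file, so that after each sentence ends, the next one starts on a new line instead of continuing on the same line?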
Answer 0 (score: 1)
You can split the text on "." and then use string formatting to put each sentence on its own line:
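import re

# Read the whole file, split after each period, and drop empty fragments.
with open('filename.txt') as src:
    sentences = [s for s in re.split(r'\.\s*', src.read()) if s]
# Write one sentence per line; 'w' avoids duplicating lines on repeated runs.
with open('movie_listing.txt', 'w') as f:
    f.write(''.join('{}\n'.format(s) for s in sentences))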
Output (in movie_listing.txt):
1 The Da Vinci Code book is just awesome
1 this was the first clive cussler i've ever read, but even books like Relic, and Da Vinci code were more plausible than this
1 i liked the Da Vinci Code a lot
1 i liked the Da Vinci Code a lot
1 I liked the Da Vinci Code but it ultimatly didn't seem to hold it's own
1 that's not even an exaggeration ) and at midnight we went to Wal-Mart to buy the Da Vinci Code, which is amazing of course
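
Splitting on "." removes the periods, as the output above shows. If the periods should be kept, one variation is to split on the whitespace that follows each period instead. The following is a minimal sketch and not part of the original answer: the function names split_sentences and rewrite_one_per_line are hypothetical, and it assumes every sentence ends with a period followed by whitespace, so abbreviations such as "Dr. Smith" would be split incorrectly.

```python
import re


def split_sentences(text):
    """Return the sentences in text, each keeping its trailing period."""
    # The lookbehind (?<=\.) matches only whitespace preceded by a period,
    # so the period itself is not consumed by the split.
    parts = re.split(r'(?<=\.)\s+', text.strip())
    return [p for p in parts if p]


def rewrite_one_per_line(src_path, dst_path):
    """Write each sentence of src_path on its own line in dst_path."""
    with open(src_path, encoding='utf-8') as src:
        sentences = split_sentences(src.read())
    with open(dst_path, 'w', encoding='utf-8') as dst:
        dst.writelines(s + '\n' for s in sentences)
```

With the file names from the answer above, this could be called as rewrite_one_per_line('filename.txt', 'movie_listing.txt'). Because the split requires whitespace after the period, a number such as "3.5" stays in one piece.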