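I want to find a phrase: "delete this". I want to keep only the text between the two occurrences of the phrase and delete everything else.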
text.text.text.text
text.text.text.text
text.text.text.text
text.text.text.text
delete this
text.text.text.text
text.text.text.text
text.text.text.text
delete this
text.text.text.text
text.text.text.text
This is my code so far:
import urllib2
import unicodecsv as csv
import os
import sys
import io
import time
import datetime
import pandas as pd
from bs4 import BeautifulSoup
import sys
import re
def to_2d(l,n):
    return [l[i:i+n] for i in range(0, len(l), n)]
f = open('air.txt', 'r')
x = f.readlines()
filename=r'output.csv'
resultcsv = open(filename,"wb")
output = csv.writer(resultcsv, delimiter=';',quotechar = '"', quoting=csv.QUOTE_NONNUMERIC, encoding='latin-1')
maindatatable = to_2d(x, 4)
if 'delete this' in maindatatable.text:
    stop = 1
    break
print maindatatable
output.writerows(maindatatable)
resultcsv.close()
Answer 0 (score: 1)
You can use str.split:
with open('air.txt', 'r') as f:
    x = f.read()
# Keep only the pieces between the first and last 'delete this'
req_text = x.split('delete this')[1: -1]
data = []
for text in req_text:
    for line in text.strip().splitlines():
        data.append([line])
To write to the csv file, just open it and call writer.writerows:
import unicodecsv as csv  # same module as in the question (Python 2)
with open('output.csv', "wb") as f:
    output = csv.writer(f, delimiter=';',quotechar = '"', quoting=csv.QUOTE_NONNUMERIC, encoding='latin-1')
    output.writerows(data)
This saves the following to the file:
text.text.text.text
text.text.text.text
text.text.text.text
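On Python 3, where unicodecsv is not needed, the same idea can be sketched with the standard csv module. The function name rows_between, the sample string and the in-memory buffer are illustrative assumptions, not part of the answer above.

```python
import csv
import io

def rows_between(text, marker="delete this"):
    """Return one single-cell row per line found between the first and last marker."""
    # split() yields the pieces before, between and after the markers;
    # [1:-1] drops the first and the last piece.
    chunks = text.split(marker)[1:-1]
    return [[line] for chunk in chunks for line in chunk.strip().splitlines()]

sample = "a\nb\ndelete this\nc\nd\ndelete this\ne\n"
rows = rows_between(sample)

# An in-memory buffer keeps the sketch self-contained. For a real file,
# open('output.csv', 'w', newline='', encoding='latin-1') could be used instead.
buffer = io.StringIO()
writer = csv.writer(buffer, delimiter=';', quotechar='"', quoting=csv.QUOTE_NONNUMERIC)
writer.writerows(rows)
csv_text = buffer.getvalue()
```

Note that if the marker never appears, the slice is empty and no rows are written, so a caller may want to check for that case.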
Using delete instead of delete this as the marker:
req_text = x.split('delete')[1: -1]
data = []
for text in req_text:
    # Drop the rest of the marker line (everything up to the first newline)
    text = text.split('\n', 1)[1]
    for line in text.strip().splitlines():
        data.append([line])
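The extra split('\n', 1)[1] presumably exists because splitting on delete leaves the rest of the marker line (here " this") at the start of each chunk. A small Python 3 sketch on a made-up inline string shows the effect:

```python
sample = "a\ndelete this\nc\nd\ndelete this\ne\n"

data = []
for chunk in sample.split("delete")[1:-1]:
    # Each chunk starts with the remainder of the marker line,
    # e.g. " this\nc\nd\n"; discard everything up to the first newline.
    chunk = chunk.split("\n", 1)[1]
    for line in chunk.strip().splitlines():
        data.append([line])
```

Without that step, " this" would end up as the first row of the output.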
Answer 1 (score: 0)
Here is a basic structure with a toggle. It should work even if there are multiple pairs of delete this:
read = False
with open('data.txt') as txt:
    for line in txt:
        if line.strip() == 'delete this':
            read = not read  # flip the switch on every marker line
        elif read:
            print line,
With data.txt as:
text.text.text.text1
text.text.text.text2
text.text.text.text3
text.text.text.text4
delete this
text.text.text.text5
text.text.text.text6
text.text.text.text7
delete this
text.text.text.text8
text.text.text.text9
Output:
text.text.text.text5
text.text.text.text6
text.text.text.text7
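To illustrate the claim about multiple pairs, here is a Python 3 sketch of the same toggle that collects lines into a list instead of printing them. The function name lines_between and the sample input are assumptions made for the example.

```python
def lines_between(lines, marker="delete this"):
    """Collect the lines that fall inside each pair of marker lines."""
    keep = False
    kept = []
    for line in lines:
        if line.strip() == marker:
            keep = not keep  # flip on every marker line
        elif keep:
            kept.append(line.rstrip("\n"))
    return kept

# Two marker pairs: only the lines inside each pair are kept.
sample = ["1\n", "delete this\n", "2\n", "delete this\n",
          "3\n", "delete this\n", "4\n", "delete this\n", "5\n"]
result = lines_between(sample)
```

Because the switch simply flips, an unmatched final marker would cause everything after it to be kept, which may or may not be what is wanted.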
Answer 2 (score: 0)
For now I am going to assume that the delimiters are complete lines. Here is one way to achieve what you want:
import sys
delimiter = "delete this\n"
result = []
with open('air.txt', 'r') as inf:
    # Skip everything up to and including the opening delimiter
    for line in inf:
        if line == delimiter:
            break
    else:
        sys.exit("opening delimiter missing")
    # Collect lines until the closing delimiter
    for line in inf:
        if line != delimiter:
            result.append(line)
        else:
            break
    else:
        sys.exit("closing delimiter missing")
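The else clause attached to a for statement is executed only if no break statement was executed in the loop. This ensures that assorted odd end-of-file conditions do not mess up your logic.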
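The for/else behavior can be seen in isolation with a small sketch. The helper find_marker is hypothetical and only demonstrates when the else clause runs:

```python
def find_marker(lines, marker="delete this\n"):
    """Return the index of the first marker line, or None if it is absent."""
    for index, line in enumerate(lines):
        if line == marker:
            break        # found: the else clause below is skipped
    else:
        return None      # loop finished without break: marker missing
    return index

found = find_marker(["a\n", "delete this\n", "b\n"])
missing = find_marker(["a\n", "b\n"])
# A final line without a trailing newline does not equal the marker,
# so it is reported as missing rather than silently matched.
no_newline = find_marker(["a\n", "delete this"])
```

The last case is one of the end-of-file conditions worth keeping in mind: comparing against "delete this\n" will not match a marker on the last line of a file that lacks a trailing newline.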
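The with statement is a convenient way to make the file available, and it ensures the file is properly closed after use no matter what happens.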
The result list can be turned into a string with a simple construct:
output = "".join(result)