I have a file that looks like this:
<VirtualHost *:80>
ServerName Url1
DocumentRoot Url1Dir
</VirtualHost>
<VirtualHost *:80>
ServerName Url2
DocumentRoot Url2Dir
</VirtualHost>
<VirtualHost *:80>
ServerName REMOVE
</VirtualHost>
<VirtualHost *:80>
ServerName Url3
DocumentRoot Url3Dir
</VirtualHost>
I want to remove this block (it never changes):
<VirtualHost *:80>
ServerName REMOVE
</VirtualHost>
I tried to find the whole block with the code below, but it does not seem to work.
with open("out.txt", "wt") as fout:
    with open("in.txt", "rt") as fin:
        for line in fin:
            fout.write(line.replace("<VirtualHost *:80>\n ServerName REMOVE\n</VirtualHost>\n", ""))
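I searched for a solution to my problem but came up empty-handed, so any help is much appreciated.
Before you downvote, I would really like to hear the reason.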
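Answer 0 (score: 4)
The quickest way is to read the whole file into a string, perform the replacement, and then write the string to the desired file. Your loop cannot work because it tests one line at a time, and no single line ever contains the whole three-line block. For example: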
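#!/usr/bin/python
with open('in.txt', 'r') as f:
    text = f.read()
# The search string must match the file exactly, including the
# indentation before ServerName and any blank line after the block.
text = text.replace("<VirtualHost *:80>\n ServerName REMOVE\n</VirtualHost>\n\n", '')
with open('out.txt', 'w') as f:
    f.write(text)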
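If the exact whitespace in the file is uncertain, a regular expression can absorb it. The following is only a sketch under that assumption: `BLOCK_RE`, `strip_remove_block` and `rewrite` are names made up for illustration, and the pattern assumes the block contains nothing but the `ServerName REMOVE` line.

```python
import re

# Hypothetical pattern: \s* absorbs whatever newlines and indentation
# surround the ServerName line, so the exact spacing no longer matters.
BLOCK_RE = re.compile(
    r"<VirtualHost \*:80>\s*ServerName REMOVE\s*</VirtualHost>[ \t]*\n?"
)


def strip_remove_block(text):
    """Return text with every 'ServerName REMOVE' section deleted."""
    return BLOCK_RE.sub("", text)


def rewrite(src, dst):
    """Read src as one string, strip the block, and write the result to dst."""
    with open(src) as fin:
        text = fin.read()
    with open(dst, "w") as fout:
        fout.write(strip_remove_block(text))
```

Calling `rewrite("in.txt", "out.txt")` would then do the same job as the plain `str.replace` version, without depending on how the block is indented.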
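Answer 1 (score: 1)
Here is a finite-automaton solution that can be modified later during development. It may look complicated at first, but note that you can examine the code for each status value independently. You can draw the graph on paper (states as circles, transitions as arrows) to get an overview of what it does.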
status = 0   # init -- waiting for the VirtualHost section
lst = []     # lines of the VirtualHost section

with open("in.txt") as fin, open("out.txt", "w") as fout:
    for line in fin:

        #-----------------------------------------------------------
        # Waiting for the VirtualHost section, copying.
        if status == 0:
            if line.startswith("<VirtualHost"):
                # The section was found. Postpone the output.
                lst = [line]     # first line of the section
                status = 1
            else:
                # Copy the line to the output.
                fout.write(line)

        #-----------------------------------------------------------
        # Waiting for the end of the section, collecting.
        elif status == 1:
            if line.startswith("</VirtualHost"):
                # The end of the section was found, and the section
                # should not be ignored. Write it to the output.
                lst.append(line)           # collect the line
                fout.write(''.join(lst))   # write the section
                status = 0   # change the status to "outside the section"
                lst = []     # not necessary, but less error-prone for future modifications
            else:
                lst.append(line)   # collect the line
                if 'ServerName REMOVE' in line:   # Should this section be ignored?
                    status = 2     # special status for ignoring this section
                    lst = []       # not necessary

        #-----------------------------------------------------------
        # Waiting for the end of the section that should be ignored.
        elif status == 2:
            if line.startswith("</VirtualHost"):
                # The end of the section was found, but the section should be ignored.
                status = 0   # outside the section
                lst = []     # not necessary
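The same automaton can also be written as a generator that buffers each section and decides at the closing tag whether to emit it. This is a sketch of that variation, not part of the code above: `drop_sections` and its `marker` parameter are hypothetical names, and it strips each line before testing the tags so that indented tags are recognised too.

```python
def drop_sections(lines, marker="ServerName REMOVE"):
    """Yield lines, skipping every <VirtualHost> section that contains marker."""
    section = None  # None while outside a section, else the buffered lines
    for line in lines:
        stripped = line.strip()
        if section is None:
            if stripped.startswith("<VirtualHost"):
                section = [line]          # start buffering, postpone output
            else:
                yield line                # outside a section: copy through
        else:
            section.append(line)
            if stripped.startswith("</VirtualHost"):
                # End of section: emit it only if the marker never appeared.
                if not any(marker in s for s in section):
                    yield from section
                section = None
    if section is not None:
        # Unterminated section at end of input: pass it through unchanged.
        yield from section
```

Because a file object iterates over its lines, `fout.writelines(drop_sections(fin))` inside the two `open` calls would process the file without loading it all into memory.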
Answer 2 (score: 1)
While the answers above are pragmatic, they are fragile and inflexible. Here is something less fragile:
import re

def remove_entry(servername, filename):
    """Parse the file, look for the entry pattern and return the new content.

    :param str servername: The server name to look for
    :param str filename: Path of the file to parse
    :return: The file content without the removed entry
    :rtype: str
    """
    with open(filename) as f:
        lines = f.readlines()

    starttag_line = None
    pattern_found = False
    for line, content in enumerate(lines):
        if '<VirtualHost ' in content:
            starttag_line = line
        # look for the entry; re.escape keeps dots in a host name literal
        if re.search(r'ServerName\s+' + re.escape(servername), content, re.I):
            pattern_found = True
        # at the next vhost end tag, remove the vhost entry
        if pattern_found and '</VirtualHost>' in content:
            del lines[starttag_line:line + 1]
            # Stop here: the list was modified while iterating, and
            # continuing would also delete the following section.
            break
    return "".join(lines)

filename = '/tmp/file.conf'
# new file content
print(remove_entry('remove', filename))
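For comparison, the same idea can be expressed on the whole file content with `re.sub` and a callback, which avoids modifying a list while iterating over it. This is only a sketch: `VHOST_RE`, `remove_vhost` and `keep_or_drop` are illustrative names, and it assumes sections are never nested and that each `ServerName` directive sits on its own line.

```python
import re

# Matches one whole section, from the opening tag through the closing tag
# and the newline after it. re.S lets "." span lines; the lazy ".*?" keeps
# a match from running on into the next section.
VHOST_RE = re.compile(
    r"[ \t]*<VirtualHost\b[^>]*>.*?</VirtualHost>[ \t]*\n?", re.S
)


def remove_vhost(text, servername):
    """Return text without the sections whose ServerName equals servername."""
    name_re = re.compile(
        r"^\s*ServerName\s+" + re.escape(servername) + r"\s*$",
        re.I | re.M,
    )

    def keep_or_drop(match):
        section = match.group(0)
        # Replace a matching section with nothing; keep all others as-is.
        return "" if name_re.search(section) else section

    return VHOST_RE.sub(keep_or_drop, text)
```

Unlike the line-based version, this removes every matching section rather than only the first, and the name must match the whole directive value, so `remove` would not also hit a host such as `removed.example.com`.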