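I wrote a function that extracts the string between two delimiters. In some files, however, these delimiters occur several times, so I want to extract every occurrence. As written, my function only extracts the first one it finds and then returns.
How can I fix this?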
def extraction_error_CF(file):
    with open(file, 'r') as f:
        text = f.read()
    start = text.find('Error validating')  # 1st delimiter
    end = text.find('</SPAN><BR>', start)  # 2nd delimiter
    if start != -1 and end != -1:  # If these two delimiters are present...
        return text[start:end]
    else:
        return ""
Answer 0 (score: 0)
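For HTML/XML you should really use a robust module such as BeautifulSoup, but if you only want the content between two delimiters, you can keep the same function and append each result to a list (for example), which you can then print.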
def extraction_error_CF(file):
    with open(file, 'r') as f:
        text = f.read()

    # Patterns (str.find is case-sensitive, so the case must match
    # the file; the question's file uses upper-case tags)
    first = "Error validating"
    second = "</SPAN><BR>"

    # For all the matches
    results = []

    # Iterate over the whole file; always look for the closing
    # pattern after the opening one
    start = text.find(first)
    end = text.find(second, start + len(first))
    while start != -1 and end != -1:
        # Add everything between the patterns,
        # but not including the patterns
        results.append(text[start + len(first):end])
        # Continue searching after the text that already passed
        start = text.find(first, end + len(second))
        end = text.find(second, start + len(first))

    # Return the content of the list as a string
    if results:
        return "".join(results)
    else:
        return None

print(extraction_error_CF("test"))
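The parser-based route mentioned at the top of this answer could look roughly like the sketch below. It is only an illustration: BeautifulSoup is a third-party package, so the sketch swaps in the standard library's html.parser instead, and it assumes each error message sits in a single text node that contains "Error validating". The names ErrorTextCollector and extract_errors_html, and the sample markup, are made up for this example.

```python
from html.parser import HTMLParser


class ErrorTextCollector(HTMLParser):
    """Hypothetical helper: collects text nodes that contain a marker string."""

    def __init__(self, marker="Error validating"):
        super().__init__()
        self.marker = marker
        self.errors = []

    def handle_data(self, data):
        # Called for the text found between tags
        text = data.strip()
        if self.marker in text:
            self.errors.append(text)


def extract_errors_html(html_text, marker="Error validating"):
    """Return every text node in html_text that contains marker."""
    parser = ErrorTextCollector(marker)
    parser.feed(html_text)
    parser.close()
    return parser.errors


# Assumed sample markup, loosely modelled on the delimiters in the question
sample = (
    "<SPAN>Error validating field A</SPAN><BR>"
    "<SPAN>Upload finished</SPAN><BR>"
    "<SPAN>Error validating field B</SPAN><BR>"
)

for message in extract_errors_html(sample):
    print(message)
```

Because the parser deals with the tags itself, this version does not depend on whether the file writes </SPAN> or </span>. If a message contains nested tags, it would arrive as several text nodes and the sketch would need adapting.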
Answer 1 (score: 0)
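import re

def extraction_error_CF(file):  # Get error from CF upload
    with open(file, 'r') as f:
        text = f.read()
    # re.findall returns a list of every captured group (an empty list
    # if nothing matches), never -1, so no comparison with -1 is needed.
    # (.*?) is non-greedy, so each match stops at the nearest closing
    # delimiter; re.DOTALL lets a match span line breaks.
    return re.findall('Error validating(.*?)</SPAN><BR>', text, re.DOTALL)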
This is what I ended up doing. Thanks, everyone!
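One detail worth checking with the regex approach is the difference between a greedy (.*) and a non-greedy (.*?) capture when both delimiters occur several times on the same line. The sketch below runs both patterns on a small made-up string; the sample text is an assumption, not taken from a real file.

```python
import re

# Assumed sample: two error messages on a single line
sample = "Error validating A</SPAN><BR>x Error validating B</SPAN><BR>"

# Greedy: (.*) runs to the LAST closing delimiter, merging both messages
greedy = re.findall(r"Error validating(.*)</SPAN><BR>", sample)

# Non-greedy: (.*?) stops at the NEAREST closing delimiter
lazy = re.findall(r"Error validating(.*?)</SPAN><BR>", sample)

print(greedy)  # one merged match
print(lazy)    # [' A', ' B']
```

By default "." does not match a newline, so a message that spans several lines is only captured if re.DOTALL is passed as the flags argument.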