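I'm new to Python and need some guidance.
I have a text file containing the output of multiple simulations, and I need to extract specific values from each chunk. See the sample below: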
**********************************************
SIMULATION NUMBER = 1 SEED NUMBER: 1430403561
INTERVAL 1, NUMBER OF STORMS 0
INTERVAL 2, NUMBER OF STORMS 1
STORM RESPONSES
1 544.95
INTERVAL 3, NUMBER OF STORMS 0
INTERVAL 4, NUMBER OF STORMS 0
INTERVAL 5, NUMBER OF STORMS 0
INTERVAL 6, NUMBER OF STORMS 1
STORM RESPONSES
1 526.68
INTERVAL 7, NUMBER OF STORMS 0
INTERVAL 8, NUMBER OF STORMS 0
INTERVAL 9, NUMBER OF STORMS 0
INTERVAL 10, NUMBER OF STORMS 0
INTERVAL 11, NUMBER OF STORMS 0
INTERVAL 12, NUMBER OF STORMS 1
STORM RESPONSES
1 518.77
INTERVAL 13, NUMBER OF STORMS 0
INTERVAL 14, NUMBER OF STORMS 0
INTERVAL 15, NUMBER OF STORMS 0
INTERVAL 16, NUMBER OF STORMS 0
INTERVAL 17, NUMBER OF STORMS 0
INTERVAL 18, NUMBER OF STORMS 0
INTERVAL 19, NUMBER OF STORMS 1
STORM RESPONSES
1 614.23
**********************************************
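The information I need lies between each pair of asterisk lines; the text between them represents a single "chunk", or simulation run, that needs to be searched.
Essentially, I need to search each chunk for "INTERVAL" values less than or equal to 30 with "NUMBER OF STORMS" greater than 0, where the associated "STORM RESPONSES" values listed under that "INTERVAL" are greater than 648.
I need a summary output table whose rows state whether the query is TRUE or FALSE for each simulation chunk (this particular file has 1000 simulations).
Any help is greatly appreciated. I'm sure I could work this out in Excel, but I feel I could solve it in Python (and more concisely).
Here is what I have done so far: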
import os
import sys
f = open(r'D:\log.txt')
chunks = []  # each chunk is a section of text that is what is between *** lines
tmp_text = ''
for line in f:
    if line.strip() == '******...***':
        if tmp_text != '':  # I don't know if file starts with *** or not
            chunks.append(tmp_text)
            tmp_text = ''
    else:
        tmp_text += line
if tmp_text != '':
    chunks.append(tmp_text)  # in case the file does not end in ****
f.close()
# chunks will be in the order that you expect them.
for chunk in chunks:
    for line in chunk :
        if "INTERVAL " + x<=30 + ", NUMBER OF STORMS " + x<=3 or "INTERVAL " + x<=30 + ", NUMBER OF STORMS " + x<=3
I'm stumped on how to extract the values under "STORM RESPONSES" that are below 648. Also, will the "if" statement I added after "for line in chunk" work?
import os
import sys
f = open(r'D:\LBI_Easement_Issues\log.txt')
chunks = []  # each chunk is a section of text that is what is between *** lines
interval = 1
numberstorms = 1
tmp_text = ''
for line in f:
    if line.strip() == '******...***':
        if tmp_text != '':  # I don't know if file starts with *** or not
            chunks.append(tmp_text)
            tmp_text = ''
    else:
        tmp_text += line
if tmp_text != '':
    chunks.append(tmp_text)  # in case the file does not end in ****
f.close()
# chunks will be in the order that you expect them.
for chunk in chunks:
    for line in chunk :
        print(line)
        query = "True" if "INTERVAL " + str(interval) + ", NUMBER OF STORMS " + str(numberstorms) or "INTERVAL " + str(interval) + ", NUMBER OF STORMS " + str(numberstorms) else "False"
        print(query)
print("Complete")
Answer 0 (score: 0)
Assuming you can work with the text between the asterisk lines, this might help...
chunks = []  # each chunk is the text between two *** lines
tmp_text = ''
with open('filename.csv') as f:
    for line in f:
        stripped = line.strip()
        # a separator is a non-empty line made up only of asterisks
        if stripped and set(stripped) == {'*'}:
            if tmp_text != '':  # the file may or may not start with ***
                chunks.append(tmp_text)
                tmp_text = ''
        else:
            tmp_text += line
if tmp_text != '':
    chunks.append(tmp_text)  # in case the file does not end in ***
# chunks are in the same order as in the file
for chunk in chunks:
    # placeholder: replace with your own function that parses one chunk
    call_your_function_to_parse_chunk_of_text_here(chunk)
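For the per-chunk function, something along these lines might work. It is only a sketch: it assumes every chunk follows the format of the sample in the question (an "INTERVAL n, NUMBER OF STORMS m" line, optionally followed by "STORM RESPONSES" and one "index value" line per storm), and it reads the query as "at least one storm response greater than 648 in an interval numbered 30 or lower". The names chunk_matches, MAX_INTERVAL and THRESHOLD are made up for illustration.

```python
import re

# Line formats assumed from the sample in the question.
INTERVAL_RE = re.compile(r"INTERVAL\s+(\d+),\s*NUMBER OF STORMS\s+(\d+)")
RESPONSE_RE = re.compile(r"^\s*(\d+)\s+([0-9]+(?:\.[0-9]+)?)\s*$")

MAX_INTERVAL = 30   # hypothetical name: highest interval number to consider
THRESHOLD = 648.0   # hypothetical name: storm response cut-off


def chunk_matches(chunk, max_interval=MAX_INTERVAL, threshold=THRESHOLD):
    """Return True if any storm response above `threshold` occurs
    in an interval numbered `max_interval` or lower."""
    interval = None
    storms = 0
    for line in chunk.splitlines():
        header = INTERVAL_RE.search(line)
        if header:
            # Remember which interval the following response lines belong to.
            interval = int(header.group(1))
            storms = int(header.group(2))
            continue
        response = RESPONSE_RE.match(line)
        if response and interval is not None and storms > 0:
            value = float(response.group(2))
            if interval <= max_interval and value > threshold:
                return True
    return False


# Shortened copy of the sample chunk from the question.
sample = """SIMULATION NUMBER = 1 SEED NUMBER: 1430403561
INTERVAL 1, NUMBER OF STORMS 0
INTERVAL 2, NUMBER OF STORMS 1
STORM RESPONSES
1 544.95
INTERVAL 19, NUMBER OF STORMS 1
STORM RESPONSES
1 614.23
"""

print(chunk_matches(sample))                 # False: no response exceeds 648
print(chunk_matches(sample, threshold=600))  # True: 614.23 in interval 19
```

With the chunks list built above, [chunk_matches(chunk) for chunk in chunks] would give one TRUE/FALSE value per simulation, which you can then print or write out as the summary table. If you actually want responses below 648, flip the comparison on value.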