I have a txt file that I am reading, in the following format:
Event B 0 40
Event B 0 75
Event B 1 30
Event A
Event B 1 50
Event B 1 70
Event A
Event A
Event B 2 40
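I am trying to write the following logic:
For each Event A: print columns 1 and 2 of the first Event B since the last Event A
So the output would look like this: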
Event B 0 40
Event B 0 75
Event B 1 30
Event A 0 40
Event B 1 50
Event B 1 70
Event A 1 50
Event A N/A N/A
Event B 2 40
etc...
I can read the file in OK as a list:
with open(event_file) as schedule:
    schedule = schedule.readlines()
for i in range(0, len(schedule)):
    if schedule[i][0] == 'Event A':
        if schedule[i-X][0] == 'Event A':
            print(schedule[i-X+1])  # Where X is how many lines before Event A the last one was... but I really dont know how to determine this.. Nor do I know if any of this is the right way to go about it.
I hope I'm making sense.
Answer 0 (score: 3)
You just need to remember the first Event B seen since the last Event A:
txt = """Event B , 0 , 40
Event B , 0 , 75
Event B , 1 , 30
Event A
Event B , 1 , 50
Event B , 1 , 70
Event A
Event A
Event B , 2 , 40
"""
# split your data into rows of stripped columns, skipping blank lines:
data = [[k.strip() for k in row.split(",")] for row in txt.splitlines() if row.strip()]

rv = []
b = None
for d in data:
    if d[0] == "Event A":
        # either add the remembered B or N/A's
        if b:
            rv.append([d[0], b[1], b[2]])
        else:
            rv.append([d[0], "N/A", "N/A"])
        b = None  # delete remembered b
        continue
    elif b is None:  # remember first b
        b = d
    rv.append(d)  # blank rows were filtered out above, so every row is kept

print(rv)  # print results
Output:
[['Event B', '0', '40'],
['Event B', '0', '75'],
['Event B', '1', '30'],
['Event A', '0', '40'],
['Event B', '1', '50'],
['Event B', '1', '70'],
['Event A', '1', '50'],
['Event A', 'N/A', 'N/A'],
['Event B', '2', '40']]
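If the goal is the line-oriented output shown in the question rather than a nested list, the rows can be joined back into strings. This is a minimal sketch, assuming `rv` has the shape printed above; the helper name `format_rows` is hypothetical and only a few sample rows are hard-coded here.

```python
def format_rows(rows):
    """Join each row's columns with single spaces, one string per row."""
    return [" ".join(row) for row in rows]


# A few rows in the same shape as the rv list printed above.
rv = [
    ["Event B", "0", "40"],
    ["Event A", "0", "40"],
    ["Event A", "N/A", "N/A"],
]

for line in format_rows(rv):
    print(line)
```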
Answer 1 (score: 1)
You can use regular expressions to extract the data from each line and implement the logic. Here is another approach:
import re

# Rows with both columns present
regex1 = r"^Event\s+(A|B)\s+(\w+)\s+(\w+)\s*$"
# Rows with no columns (trailing whitespace is optional)
regex2 = r"^Event\s+(A|B)\s*$"

with open(event_file) as schedule:
    schedule = schedule.readlines()

last_B = ()  # first Event B seen since the last Event A
for string in schedule:
    string_search = re.search(regex1, string)
    if string_search:
        event = string_search.group(1)
        if event == 'B':
            column1 = string_search.group(2)
            column2 = string_search.group(3)
            print((event, column1, column2))
            if len(last_B) == 0:
                last_B = (event, column1, column2)
        continue
    string_search = re.search(regex2, string)
    if string_search:
        event = string_search.group(1)
        if event == 'A' and len(last_B) == 3:
            A = (event, last_B[1], last_B[2])
            last_B = ()
        else:
            A = (event, 'N/A', 'N/A')
        print(A)
        continue
Output:
('B', '0', '40')
('B', '0', '75')
('B', '1', '30')
('A', '0', '40')
('B', '1', '50')
('B', '1', '70')
('A', '1', '50')
('A', 'N/A', 'N/A')
('B', '2', '40')
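As a variation, the two kinds of row can probably be handled by one pattern in which the columns are an optional group. This is only a sketch: the names `ROW` and `parse_row` are hypothetical, and it assumes the columns are plain word characters as in the sample data.

```python
import re

# One pattern for both row kinds: the two columns form an optional group.
ROW = re.compile(r"^Event\s+(A|B)(?:\s+(\w+)\s+(\w+))?\s*$")


def parse_row(line):
    """Return (event, col1, col2), with None for absent columns, or None if the line does not match."""
    m = ROW.match(line)
    return m.groups() if m else None


print(parse_row("Event B 0 40\n"))
print(parse_row("Event A\n"))
```

A row without columns then comes back with `None` in the column positions, which the caller can replace with the remembered Event B values or with 'N/A'.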
Answer 2 (score: 0)
This is a very rough attempt. It may be off the mark, but I hope it helps.
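# Assumes schedule is a list of already-split rows,
# e.g. ['Event B', '0', '40'] or ['Event A'].
last_a = 0  # index of the previous Event A; must be set once, outside the loop
for i in range(0, len(schedule)):
    if schedule[i][0] == 'Event A':
        res = ('N/A', 'N/A')
        # find the first Event B between the previous Event A and this one
        for j in range(last_a, i):
            if schedule[j][0] == 'Event B':
                res = (schedule[j][1], schedule[j][2])
                break
        print('Event A', res[0], res[1])
        last_a = i
    else:
        print('Event B', schedule[i][1], schedule[i][2])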
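Answer 3 (score: 0)
I highly doubt your code works as it stands, because you never parse/split the lines, so schedule[i][0] always points to the first character of the line, not to the whole Event A substring.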
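A quick check makes the point concrete. This sketch uses one hard-coded line in the question's format rather than a real file:

```python
line = "Event A\n"  # what readlines() yields for an 'Event A' row

first_char = line[0]  # indexing a string gives a single character
print(first_char == "Event A")  # False: first_char is just "E"

# Comparing a prefix (or splitting the line first) behaves as intended.
print(line.startswith("Event A"))  # True
```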
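Either way, one way to do what you want is to cache the columns of the first Event B seen since the last Event A and append them to the next Event A, then empty the cache, rinse and repeat, e.g.: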
empty_cache = " N/A N/A"  # use this as cache when no previous Event B available
with open(event_file) as f:  # open your file
    cache = empty_cache  # begin with an empty cache
    for line in f:  # loop the event file line by line
        line = line.rstrip()  # clear out the whitespace at the line's end
        if line[:7] == "Event B" and cache == empty_cache:  # first Event B encountered
            cache = line[7:]
        elif line[:7] == "Event A":  # Event A encountered
            line = line[:7] + cache  # update the Event A with the cache
            cache = empty_cache  # empty out the cache
        print(line)  # print the current line
There is no need to parse/split the event lines at all, provided your data is exactly as you presented it.
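If the file might not match the sample exactly (tabs, repeated spaces, blank lines), splitting on whitespace is a more forgiving alternative to fixed-width slicing. This is a minimal sketch, not a drop-in replacement; the function name `annotate` and the `placeholder` parameter are hypothetical.

```python
def annotate(lines, placeholder=("N/A", "N/A")):
    """Yield each event line; every Event A gets the columns of the
    first Event B seen since the previous Event A, or the placeholder."""
    cached = None
    for line in lines:
        parts = line.split()  # tolerant of tabs, repeated spaces, trailing newline
        if len(parts) < 2 or parts[0] != "Event":
            continue  # skip blank or unrecognised lines
        kind, cols = parts[1], parts[2:]
        if kind == "B" and cached is None and len(cols) >= 2:
            cached = cols[:2]  # remember only the first Event B
        elif kind == "A":
            cols = list(cached or placeholder)
            cached = None  # reset for the next Event A
        yield " ".join(["Event", kind, *cols])


sample = ["Event B 0 40", "Event B 1 30", "Event A", "Event A", "Event B 2 40"]
for out in annotate(sample):
    print(out)
```

Because it is a generator, it can also be fed an open file object directly, so the file is processed line by line without being loaded into a list first.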