I'm new to Python and need help modifying my script. I'm really stuck. Any input would be greatly appreciated. TIA!
The text in original_file looks like this:
<Time>
1159
</Time>
<Date>
03042016
</Date>
<Time>
1300
</Time>
<Date>
03052016
</Date>
...
My script:
with open("original_file.txt", "r") as input_file, \
        open("result_file.txt", "w") as output_file:
    input_file.seek(0)
    copy = False
    for line in input_file:
        if line.strip() == "<Time>" or line.strip() == "<Date>":
            copy = True
        elif line.strip() == "</Time>" or line.strip() == "</Date>":
            copy = False
        elif copy:
            output_file.write(line)
My script works, but the output looks like this:
1159
03042016
1300
03052016
The output I want:
1159, 03042016
1300, 03052016
Answer 0 (score: 0)
Keeping the structure of your code, it should look something like this:
with open("original_file.txt", "r") as input_file, \
        open("result_file.txt", "w") as output_file:
    values = []
    for line in input_file:
        text = line.strip()
        if text == "<Time>" or text == "<Date>":
            pass  # opening tags carry no data
        elif text == "</Time>":
            pass  # the record is not complete until </Date>
        elif text == "</Date>":
            # One record is complete: write it and start a new one.
            output_file.write(", ".join(values) + "\n")
            values = []
        elif text:
            # Anything else that is not blank is a value to keep.
            values.append(text)
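The grouping idea (collect values until a closing </Date>, then emit one line) can be checked against the sample data without touching any files. This is only a minimal sketch: `pair_records` is a hypothetical helper name, not part of the original answer, and it assumes every record ends with </Date> as in the sample.

```python
def pair_records(lines):
    """Yield one "time, date" string per <Time>/<Date> pair (hypothetical helper)."""
    values = []
    for line in lines:
        text = line.strip()
        if text in ("<Time>", "<Date>", "</Time>"):
            continue  # these tags carry no data
        if text == "</Date>":
            # </Date> closes a record: emit it and start collecting the next one
            yield ", ".join(values)
            values = []
        elif text:
            values.append(text)


sample = """<Time>
1159
</Time>
<Date>
03042016
</Date>
<Time>
1300
</Time>
<Date>
03052016
</Date>
"""

for record in pair_records(sample.splitlines()):
    print(record)
# prints:
# 1159, 03042016
# 1300, 03052016
```

Because the helper accepts any iterable of lines, an open file object could be passed in place of `sample.splitlines()`.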
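However, you could use a library such as BeautifulSoup,
which parses HTML and XML. It would greatly reduce the work involved.
With this library, your code would end up looking like this: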
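from bs4 import BeautifulSoup

with open("original_file.txt") as input_file:
    # html.parser lowercases tag names, so search for "time" and "date".
    soup = BeautifulSoup(input_file, "html.parser")

with open("result_file.txt", "w") as output_file:
    results = zip(soup.find_all("time"), soup.find_all("date"))
    for time, date in results:
        # get_text(strip=True) returns the tag's text without surrounding whitespace.
        output_file.write(f"{time.get_text(strip=True)}, {date.get_text(strip=True)}\n")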
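If installing a third-party library is not an option, the same pairing could also be done with the standard-library `re` module. This is a swapped-in technique, not what the answer above proposes, and only a rough sketch: `RECORD` and `extract_records` are hypothetical names, and the pattern assumes each <Time> block is immediately followed by its <Date> block, as in the sample.

```python
import re

# Matches one <Time>...</Time> block followed by one <Date>...</Date> block.
# Assumption: every Time is immediately followed by its Date, as in the sample.
RECORD = re.compile(
    r"<Time>\s*(.*?)\s*</Time>\s*<Date>\s*(.*?)\s*</Date>",
    re.DOTALL,
)


def extract_records(text):
    """Return a list of "time, date" strings found in text (hypothetical helper)."""
    return [f"{time}, {date}" for time, date in RECORD.findall(text)]


sample = (
    "<Time>\n1159\n</Time>\n<Date>\n03042016\n</Date>\n"
    "<Time>\n1300\n</Time>\n<Date>\n03052016\n</Date>\n"
)
print("\n".join(extract_records(sample)))
# prints:
# 1159, 03042016
# 1300, 03052016
```

For a real file, the text would come from `input_file.read()` and the joined result would go to `output_file.write(...)`. A regular expression is fine for a flat, regular layout like this one; for nested or irregular markup a real parser is the safer choice.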