The text in my file looks like this:
text1 5,000 6,000
text2 2,000 3,000
text3
5,000 3,000
text4 1,000 2000
text5
7,000 1,000
text6 2,000 1,000
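Is there a way to clean this up in Python so that, when a text line is missing its numbers, the numbers on the following line are moved up onto that line: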
text1 5,000 6,000
text2 2,000 3,000
text3 5,000 3,000
text4 1,000 2000
text5 7,000 1,000
text6 2,000 1,000
Thanks!
Answer 0 (score: 3)
Assuming every line is supposed to contain exactly three "words", you can use
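# A single shared generator of whitespace-separated tokens from the whole file
tokens = (x for line in open("file") for x in line.split())
# zip pulls three consecutive tokens from the same generator on each step
for t in zip(tokens, tokens, tokens):
    print(" ".join(t))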
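To show why zipping one generator with itself regroups the tokens, here is a minimal Python 3 sketch that runs on an in-memory list instead of a file. The function name `regroup_in_threes` and the sample list are my own, chosen for illustration; the sketch still assumes exactly three tokens per logical row.

```python
def regroup_in_threes(lines):
    """Yield rows built from three consecutive whitespace-separated tokens."""
    # All three zip arguments are the SAME generator object, so each zip step
    # consumes three tokens in a row rather than reading three copies in parallel.
    tokens = (tok for line in lines for tok in line.split())
    for triple in zip(tokens, tokens, tokens):
        yield " ".join(triple)


# Hypothetical sample mirroring the layout in the question.
sample = [
    "text1 5,000 6,000",
    "text3",
    "5,000 3,000",
    "text4 1,000 2000",
]
rows = list(regroup_in_threes(sample))
print("\n".join(rows))
```

If the total number of tokens is not a multiple of three, `zip` silently drops the leftover ones, so this only suits input that really has three tokens per row.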
Edit: Since the precondition above evidently does not hold, here is an implementation that actually looks at the data:
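from itertools import groupby
tokens = (x for line in open("file") for x in line.split())
# Group consecutive tokens by whether they start with a digit
for key, it in groupby(tokens, lambda x: x[0].isdigit()):
    if key:
        # Numbers: finish the current output line
        print(" ".join(it))
    else:
        # Labels: one per line, leaving the last one open for its numbers
        print("\n".join(it), end=" ")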
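The same grouping idea can be sketched as a Python 3 function that returns the rows instead of printing them, which makes it easier to check. The names `starts_with_digit` and `regroup` are my own, and the rule "a token is a number if its first character is a digit" is the assumption carried over from the snippet above.

```python
from itertools import groupby


def starts_with_digit(token):
    # Tokens come from str.split(), so they are never empty.
    return token[0].isdigit()


def regroup(lines):
    """Attach each run of numeric tokens to the label that precedes it."""
    tokens = (tok for line in lines for tok in line.split())
    rows = []
    for is_number, group in groupby(tokens, key=starts_with_digit):
        group = list(group)
        if is_number and rows:
            # A run of numbers belongs to the most recent label row.
            rows[-1] += " " + " ".join(group)
        elif is_number:
            # Numbers before any label: keep them as a row of their own.
            rows.append(" ".join(group))
        else:
            # Every label token starts its own row.
            rows.extend(group)
    return rows


sample = [
    "text1 5,000 6,000",
    "text2 2,000 3,000",
    "text3",
    "5,000 3,000",
    "text4 1,000 2000",
]
print("\n".join(regroup(sample)))
```

Unlike the three-token version, this does not care how many numbers follow a label, but a label that itself starts with a digit would be misread as a number.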
Answer 1 (score: 1)
Assuming a logical line is "continued" on lines that begin with whitespace (and may contain any number of records), you can use:
>>> collapse_space = lambda s: " ".join(s.split())
>>>
>>> logical_lines = []
>>> for line in open("text"):
...     if line[0].isspace():
...         logical_lines[-1] += line  # append the continuation to the last logical line
...     else:
...         logical_lines.append(line)  # start a new logical line
...
>>> l = map(collapse_space, logical_lines)
>>>
>>> print("\n".join(l))
text1 5,000 6,000
text2 2,000 3,000
text3 5,000 3,000
text4 1,000 2000
text5 7,000 1,000
text6 2,000 1,000
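For reuse outside the interactive prompt, the continuation-line approach could be wrapped in a function along these lines. This is only a sketch in Python 3: `join_continuations` is a name I made up, and it relies on the same assumption that continuation lines begin with whitespace. It also tolerates a leading continuation line, which the session above would reject with an IndexError.

```python
def join_continuations(lines):
    """Merge lines that start with whitespace into the preceding line."""
    logical = []
    for line in lines:
        # line[:1] is "" for an empty string, so this never raises IndexError.
        if line[:1].isspace() and logical:
            # Continuation: append to the last logical line.
            logical[-1] += " " + line
        else:
            # Anything else starts a new logical line.
            logical.append(line)
    # Collapse every run of whitespace (including newlines) to one space.
    return [" ".join(row.split()) for row in logical]


sample = [
    "text1 5,000 6,000\n",
    "text3\n",
    "      5,000 3,000\n",
    "text4 1,000 2000\n",
]
print("\n".join(join_continuations(sample)))
```

Passing an open file object instead of the list works the same way, since the function only iterates over lines.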