I have a file that looks like this (i.e. a random mix of groups of 2 or 3 consecutive lines):
String A
String B
String C
<Blank Row>
String D
String E
<Blank Row>
String F
String G
String H
<Blank Row>
String I
String J
String K
<Blank Row>
String L
String M
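I want the output file to drop the middle line whenever there are 3 consecutive lines and transpose the remaining 2 lines. If there are only 2 lines, they should simply be transposed. The final result should look like this: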
String A,String C
String D,String E
String F,String H
String I,String K
String L,String M
Any pointers on how to accomplish this?
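Answer 0 (score: 1)
You can use groupby and count from the itertools module, together with a list comprehension.
This answer is a bit hacky, but it gets the job done. See the comments to better understand the logic behind it.
I assume your input is the one you provided, stored in a file named my_input_file, and that your output file is named output_file: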
from itertools import groupby, count

# Read the file and split each line on the space between "Value" and its number.
# Empty strings '' (the blank rows) are left unsplit.
with open("my_input_file", 'r') as f:
    data = (k.split() if k != '' else k for k in f.read().splitlines())

# Group the split fields (lists) together; the blank rows (plain strings)
# act as separators between the groups.
sub = [list(v) for _, v in groupby(data, lambda x: isinstance(x, list))]

final = []
for elm in sub:
    # Only handle groups that contain more than one element
    if len(elm) > 1:
        # Convert the number of each value to an int for further calculations
        # (this assumes lines such as "Value 3")
        dd = map(lambda x: [x[0], int(x[1])], elm)
        # Group the consecutive numbers of elm
        for _, v in groupby(dd, lambda x, y=count(): x[1] - next(y)):
            bb = list(v)
            # If there is a run of consecutive numbers, convert the items back
            # to strings and append the first and the last one to the final list
            if len(bb) > 1:
                final.append(' '.join(map(str, bb[0])) + ',' + ' '.join(map(str, bb[-1])))
            # If there are no consecutive numbers, append the element itself
            else:
                final.append(" ".join(map(str, bb[0])))

# Create the output file ('a' appends, so rerunning adds to an existing file)
with open("output_file", 'a') as f:
    for k in final:
        f.write(k + '\n')
This code will write a file with the following content:
Value 1,Value 3
Value 4,Value 5
Value 6,Value 8
Value 9,Value 11
Value 12,Value 13
Please test this code and leave your feedback if you have any questions, or report any bugs you find.
Edit
Following your last edit, if your input file is:
What Test 
Makes No Sense

is This 
My name 

Is Sample 123
Your Name 
is ABC 2134

What is you 
technical question don't know
name?
The trick is simple. You can use just groupby from the itertools module:
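from itertools import groupby

with open("my_input_file", 'r') as f:
    data = f.read().splitlines()

# Group consecutive lines by whether they are non-empty; each group is a list
final = [list(v) for _, v in groupby(data, lambda x: x != '')]

with open("output_file", 'a') as f:
    for k in final:
        # Skip the groups made of a blank separator line
        if k != ['']:
            # Keep only the first and the last line of each group
            f.write(k[0] + ',' + k[-1] + '\n')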
And your output file will be:
What Test ,is This
My name ,Is Sample 123
Your Name ,is ABC 2134
What is you ,name?
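The same idea can be sketched on in-memory lines, which makes it easier to try without files. This is only a sketch under assumptions: the function name first_and_last is made up for illustration, and it strips each line first, so whitespace-only rows count as blank and the stray space before the comma seen in the output above goes away.

```python
from itertools import groupby

def first_and_last(lines):
    """Join the first and last line of each blank-line-separated group."""
    # Strip surrounding whitespace so whitespace-only rows count as blank
    cleaned = (line.strip() for line in lines)
    result = []
    # key=bool is True for non-empty lines and False for blank ones
    for has_text, group in groupby(cleaned, key=bool):
        if has_text:
            rows = list(group)
            result.append(rows[0] + "," + rows[-1])
    return result

sample = ["What Test ", "Makes No Sense", "is This ", "", "My name ", "Is Sample 123"]
print(first_and_last(sample))
# prints ['What Test,is This', 'My name,Is Sample 123']
```

Note that a group with a single line would be written as that line repeated on both sides of the comma; whether that is desirable depends on your data.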
Answer 1 (score: 0)
To transpose: you know that every line ends with a newline, so you can strip it out:
with open("PATH TO FILE.txt", "r") as file:
    text = file.read()
# str.replace returns a new string, so the result has to be assigned
text = text.replace("\n", "")
Then account for the lines that contain only whitespace, i.e. identify them. So far:
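with open("PATH TO FILE.txt", "r") as file:
    for line in file:
        if not line.strip():
            # A whitespace-only line marks the boundary between groups
            continue
        # Drop the trailing newline from a content line
        line = line.replace("\n", "")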
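You can then keep a count, or use a while loop, so that you keep counting until you reach a line that contains only whitespace, putting each line into a list or similar as you go. If you counted 3, grab the first and the third; otherwise grab both. Remember to reset the count.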
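The counting approach described above could look roughly like this. It is a sketch, not a definitive implementation: the function name collapse_groups is hypothetical, it works on a list of lines rather than a file, and it uses the length of the collected list in place of a separate counter.

```python
def collapse_groups(lines):
    """Collect lines until a blank one, then keep the first and last of the group."""
    result = []
    group = []  # lines collected since the last blank line
    for line in lines:
        if line.strip():
            group.append(line.strip())
            continue
        # Whitespace-only line: emit the finished group and reset it
        if group:
            result.append(group[0] + "," + group[-1])
            group = []
    # The input may not end with a blank line, so flush the final group
    if group:
        result.append(group[0] + "," + group[-1])
    return result

rows = ["String A", "String B", "String C", " ", "String D", "String E"]
print(collapse_groups(rows))
# prints ['String A,String C', 'String D,String E']
```

Taking group[0] and group[-1] covers both cases at once: with 3 lines it grabs the first and the third, and with 2 lines it grabs both.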