Question

Consider the following excerpt from a text file:

Distance,Velocity,Time
(m),(m/s),(s)
1,1,1
2,1,2
3,1,3

I would like to convert it into this:

Distance(m),Velocity(m/s),Time(s)
1,1,1
2,1,2
3,1,3

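In other words, I want to concatenate the rows that contain text, and I want them joined column by column.

I was originally working with a text file generated by a piece of software. I have successfully converted it to CSV format, with numeric-only columns and their headers. However, each column has multiple header rows, and I need all the information in every header row, because the column properties vary from file to file. How can I do this in a smart way in Python?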

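Edit: Thank you for your suggestions, they helped me a lot. I used Daweo's solution and added a dynamic number of rows, because the number of header rows can vary from 2 to 7 depending on the generated output. Here is the snippet I ended up with.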

import re

# Get column headers
a = 0
header_rows = 0
numlines = 0
with open(full, "r") as infile:  # full is the path to the input file
    Lines = ""

    for line in infile:
        # collapse runs of spaces, then turn tabs into commas
        g = re.sub(' +', ' ', line)
        y = re.sub('\t', ',', g)
        numlines += 1
        # note: the 'ANSI' codec name is generally only recognized on Windows
        if len(line.encode('ANSI')) > 250:
            # finds header start row
            a += 1
        if a > 0:
            # finds header end row
            if "---" in line:
                header_rows = numlines - (numlines - a + 1)
                break
            else:
                # Lines is my headers string
                Lines = Lines + "%s" % (y) + ' '

# Create concatenated column headers
rows = [i.split(',') for i in Lines.rstrip().split('\n')]
# transpose so that each entry holds the header parts of one column
cols = [list(c) for c in zip(*rows)]
newcolz = [''.join(c) for c in cols]
print(newcolz)

Answer 1

I would do it the following way:

txt = "Distance,Velocity,Time\n(m),(m/s),(s)\n1,1,1\n2,1,2\n3,1,3\n"
rows = [i.split(',') for i in txt.rstrip().split('\n')]
cols = [list(c) for c in zip(*rows)]
newcols = [[i[0]+i[1],*i[2:]] for i in cols]
newrows = [','.join(i) for i in zip(*newcols)]
newtxt = '\n'.join(newrows)
print(newtxt)

Output:

Distance(m),Velocity(m/s),Time(s)
1,1,1
2,1,2
3,1,3

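The key is using zip to transpose the data, so that I can work with columns rather than rows. [[i[0]+i[1],*i[2:]] for i in cols] does the actual concatenation, so if the header spans 3 rows you could use [[i[0]+i[1]+i[2],*i[3:]] for i in cols], and so on.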
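Since the number of header rows can vary, the same transpose idea can be generalized instead of writing out i[0]+i[1]+... by hand. The following is only a minimal sketch and not part of the original answer: the function name merge_header_rows and its n_header and sep parameters are made up for illustration, and it assumes every row has the same number of fields.

```python
def merge_header_rows(text, n_header=2, sep=","):
    """Join the first n_header rows of text column by column (hypothetical helper)."""
    rows = [line.split(sep) for line in text.rstrip().split("\n")]
    header_rows, data_rows = rows[:n_header], rows[n_header:]
    # zip(*header_rows) transposes the header rows into columns,
    # so each tuple holds all header fragments of one column.
    merged = ["".join(parts) for parts in zip(*header_rows)]
    return "\n".join(sep.join(row) for row in [merged, *data_rows])


txt = "Distance,Velocity,Time\n(m),(m/s),(s)\n1,1,1\n2,1,2\n3,1,3\n"
print(merge_header_rows(txt, n_header=2))
```

With n_header=2 this should print the same text as the snippet above; passing a larger n_header merges more header rows without changing the code.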

Answer 2

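I'm not aware of anything built in that does this, so you can simply write a custom function. In the example below, the function takes the two header strings and a separator that defaults to a comma.

It splits each string into a list, then pairs up the two lists using a list comprehension with zip, and joins each pair.

Finally, it joins the merged headers back together using the separator.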

def concat_headers(header1, header2, separator=","):
    # split both header lines into their per-column parts
    headers1 = header1.split(separator)
    headers2 = header2.split(separator)
    # pair the parts column by column and join each pair
    consolidated_headers = ["".join(values) for values in zip(headers1, headers2)]
    return separator.join(consolidated_headers)


data = """Distance,Velocity,Time\n(m),(m/s),(s)\n1,1,1\n2,1,2\n3,1,3\n"""
header1, header2, *lines = data.splitlines()
consolidated_headers = concat_headers(header1, header2)
print(consolidated_headers)
print("\n".join(lines))

Output:

Distance(m),Velocity(m/s),Time(s)
1,1,1
2,1,2
3,1,3

Answer 3

You don't really need a function for this, because you can do it with the csv module:

import csv

data_filename = 'position_data.csv'
new_filename = 'new_position_data.csv'

with open(data_filename, 'r', newline='') as inp, \
     open(new_filename, 'w', newline='') as outp:
    reader, writer = csv.reader(inp), csv.writer(outp)
    row1, row2 = next(reader), next(reader)
    new_header = [a+b for a,b in zip(row1, row2)]
    writer.writerow(new_header)
    # Copy the rest of the input file.
    for row in reader:
        writer.writerow(row)
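If the number of header rows is not always two, the same csv-based approach could be adapted by reading a configurable number of rows first. This is a sketch under assumptions rather than part of the original answer: merge_csv_headers and n_header are hypothetical names, and in-memory io.StringIO streams stand in for the real files so the example is self-contained.

```python
import csv
import io
from itertools import islice


def merge_csv_headers(inp, outp, n_header=2):
    """Copy CSV data from inp to outp, merging the first n_header rows (hypothetical helper)."""
    reader, writer = csv.reader(inp), csv.writer(outp)
    # islice takes at most the first n_header rows from the reader.
    header_rows = list(islice(reader, n_header))
    # Transpose the header rows and join the fragments of each column.
    writer.writerow(["".join(parts) for parts in zip(*header_rows)])
    # The reader is now positioned at the first data row; copy the rest unchanged.
    writer.writerows(reader)


# In-memory streams stand in for real files here.
source = io.StringIO("Distance,Velocity,Time\n(m),(m/s),(s)\n1,1,1\n2,1,2\n", newline="")
target = io.StringIO(newline="")
merge_csv_headers(source, target)
print(target.getvalue())
```

With real files, the two streams would be opened with newline='' exactly as in the snippet above.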

Is there a function that merges two header rows into one?
