Python script to transpose data according to specific criteria

Time: 2017-05-18 17:06:26

Tags: python python-2.7 python-3.x

I have a file containing data in the following format:

abc 123 456
cde 45 32
efg 322 654
abc 445 856
cde 65 21
efg 147 384
abc 815 078
efg 843 286
and so on. How can I use Python to convert it into the following format:

abc 123 456 cde 45 32 efg 322 654
abc 445 856 cde 65 21 efg 147 384
abc 815 078 efg 843 286
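Also, if cde is missing after abc, spaces should be inserted in its place, because this is a fixed-width file. This is what I have tried so far:

with open('abc.txt') as file:
    for line in file.readlines():
        while line[:3] == 'abc':
            lines.replace('\n', '')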

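I am new to Python and would really appreciate help with this!

1 Answer:

Answer 0: (score: 0)

@Ritesh1111, even if you are new, you should start by trying things yourself. Try the code below. It makes a few assumptions and may not be 100% accurate, but you can start from here: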

# Assumed order of the keys within each output row; 'abc' starts a new row.
seq = ['abc', 'cde', 'efg']
PAD = "   "  # placeholder written when a key is missing from a row

with open('abc.txt') as rd:
    data = rd.readlines()

full_detail = dict((key, []) for key in seq)

def get_next_seq_pos(last_seq_position, seq):
    # Advance to the next key, wrapping around after the last one.
    return last_seq_position + 1 if last_seq_position < len(seq) - 1 else 0

# Start just "before" the first key so the first line needs no special case.
last_seq_position = len(seq) - 1

for line in data:
    cline = line.split()
    if not cline:
        continue  # skip blank lines
    data_seq_position = seq.index(cline[0])
    next_seq_position = get_next_seq_pos(last_seq_position, seq)
    # Pad every key that was skipped between the previous line and this one.
    while next_seq_position != data_seq_position:
        full_detail[seq[next_seq_position]].append(PAD)
        next_seq_position = get_next_seq_pos(next_seq_position, seq)
    full_detail[seq[data_seq_position]].append(" ".join(cline[1:]))
    last_seq_position = data_seq_position

# Pad the final row if the file ends before the last key.
data_length = len(full_detail[seq[0]])
for key in seq:
    full_detail[key].extend([PAD] * (data_length - len(full_detail[key])))

# print(...) with a single argument works on both Python 2.7 and 3.x.
for i in range(data_length):
    print(" ".join("%s %s" % (key, full_detail[key][i]) for key in seq))

Output:

abc 123 456 cde 45 32 efg 322 654
abc 445 856 cde 65 21 efg 147 384
abc 815 078 cde     efg 843 286
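
If the output really has to be fixed width, one possible variation is to pad every field to the same width and leave a missing record completely blank. The sketch below is only an illustration under some assumptions: the function name `transpose_records`, the `keys` order, and the value width of 7 characters are all made up for this example and are not taken from the question.

```python
def transpose_records(lines, keys=("abc", "cde", "efg"), width=7):
    """Group consecutive records into rows with one fixed-width field per key.

    keys  -- assumed order of the record labels within a row
    width -- assumed width of the value part of each field
    """
    rows = []              # each row maps label -> value string
    last_pos = len(keys)   # forces the first record to open a new row
    for line in lines:
        parts = line.split()
        if not parts:
            continue  # ignore blank lines
        pos = keys.index(parts[0])  # raises ValueError for an unknown label
        if pos <= last_pos:
            rows.append({})  # label order restarted, so begin a new row
        rows[-1][parts[0]] = " ".join(parts[1:])
        last_pos = pos

    field_width = max(len(k) for k in keys) + 1 + width
    out = []
    for row in rows:
        fields = []
        for key in keys:
            # A missing record becomes an all-blank field of the same width.
            text = "%s %s" % (key, row[key]) if key in row else ""
            fields.append(text.ljust(field_width))
        out.append(" ".join(fields).rstrip())
    return out


sample = ["abc 123 456", "cde 45 32", "efg 322 654",
          "abc 815 078", "efg 843 286"]
for row in transpose_records(sample):
    print(row)
```

Because it takes a list of lines and returns a list of strings, it can be tried without a file; for the real data you could pass `open('abc.txt').readlines()` instead of `sample`. Unlike the code above, this version drops the `cde` label when the record is missing, so adjust that if the label has to stay.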