Python: parse a .tsv with non-unique headers and restructure it as JSON

Time: 2018-08-01 06:07:25

Tags: python json csv parsing

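I want to parse a .tsv file, which is not a problem in itself. The problem is that it has non-unique headers, as in the example below, and I want to restructure it into a .json file.

Input format (tsv):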



    time / s    O2 @ Pos. 1 (1/1)   sign. ratio warning time / s    O2 @ Pos. 2 (2/1)   sign. ratio warning time / s    O2 @ Pos. 3 (3/1)   sign. ratio warning time / s    O2 @ Pos. 4 (4/1)   sign. ratio warning time / s    O2 @ Pos. 5 (5/1)   sign. ratio warning time / s    O2 @ Pos. 6 (6/1)   sign. ratio warning time / s    O2 @ Pos. 7 (7/1)   sign. ratio warning time / s    O2 @ Pos. 8 (8/1)   sign. ratio warning time / s    O2 @ Pos. 9 (9/1)   sign. ratio warning time / s    O2 @ Pos. 10 (10/1) sign. ratio warning time / s    O2 @ Pos. 11 (11/1) sign. ratio warning time / s    O2 @ Pos. 12 (12/1) sign. ratio warning
    37  7,6 10,4:1      75  5,1 11,1:1      114 9,0 7,9:1       153 16,5    9,1:1       191 12,6    10,5:1      229 27,4    13,2:1      267 85,8    12,9:1      313 10,0    10,5:1      351 83,5    8,4:1       390 -100,0  0,0:1   Sig.:background and intensity low!  428 34,6    11,1:1      467 89,6    12,4:1  
    521 9,1 10,0:1      560 8,3 10,9:1      601 8,9 9,2:1       641 16,0    9,6:1       682 12,8    10,8:1      720 27,4    13,3:1      760 85,8    13,1:1      807 9,3 10,9:1      846 79,5    9,0:1       887 13,0    9,1:1       926 36,5    11,1:1      965 87,6    12,7:1  


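So the header consists of 12 repetitions of this quadruple (only the position number in the second column changes):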



    time / s    O2 @ Pos. 1 (1/1)   sign. ratio warning
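
Since the number of quadruples is not known in advance, one way to find the groups is to look for every column whose header equals the first header name. The following is a minimal sketch, assuming the file is tab-delimited and that every group starts with a `time / s` column; the helper name `group_starts` and the shortened two-group sample are hypothetical and only serve as illustration.

```python
import csv
import io

def group_starts(header, first_name="time / s"):
    """Return the column indices at which a new group of columns begins."""
    return [i for i, name in enumerate(header) if name.strip() == first_name]

# Shortened sample header with two groups instead of twelve (illustration only).
sample = (
    "time / s\tO2 @ Pos. 1 (1/1)\tsign. ratio\twarning\t"
    "time / s\tO2 @ Pos. 2 (2/1)\tsign. ratio\twarning\n"
)

header = next(csv.reader(io.StringIO(sample), delimiter="\t"))
starts = group_starts(header)
print(starts)       # [0, 4]
print(len(starts))  # 2
```

The distance between two consecutive start indices gives the group width, which is 4 in this example.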


Output format (json):

Note that only the first three quadruples are shown, to keep it short.



    {
        "0 row": {
            "0 quadrupel": {
                "time / s": "37",
                "O2 @ Pos. 1 (1/1)": "7,6",
                "sign. ratio": "10,4:1",
                "warning": ""
            },
            "1 quadrupel": {
                "time / s": "75",
                "O2 @ Pos. 2 (2/1)": "5,1",
                "sign. ratio": "11,1:1",
                "warning": ""
            },
            "2 quadrupel": {
                "time / s": "114",
                "O2 @ Pos. 3 (3/1)": "9,0",
                "sign. ratio": "7,9:1",
                "warning": ""
            },
            and so on for the next quadrupel
        },
        "1 row": {
            "0 quadrupel": {
                "time / s": "521",
                "O2 @ Pos. 1 (1/1)": "9,1",
                "sign. ratio": "10,0:1",
                "warning": ""
            },
            "1 quadrupel": {
                "time / s": "560",
                "O2 @ Pos. 2 (2/1)": "8,3",
                "sign. ratio": "10,9:1",
                "warning": ""
            },
            "2 quadrupel": {
                "time / s": "601",
                "O2 @ Pos. 3 (3/1)": "8,9",
                "sign. ratio": "9,2:1",
                "warning": ""
            }
            and so on for the next quadrupel
        }
    }

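I need a robust algorithm that solves the problem described above. The number of quadruples is not fixed; it happens to be 12 in the example above. The structure, however, is always the same.

Thanks a lot, I would be glad for any help.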
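
One possible approach, given only as a sketch under stated assumptions: read the file with `csv.reader` instead of `csv.DictReader` (which would overwrite repeated header names), then cut the header and every data row into slices of four columns. It assumes a tab delimiter and a constant group width of 4, as the question states that the structure is always the same. The function name `tsv_to_nested` and the two-group sample data are hypothetical.

```python
import csv
import io
import json

GROUP_SIZE = 4  # assumption: time, O2 value, sign. ratio, warning

def tsv_to_nested(lines, group_size=GROUP_SIZE):
    """Turn tab-separated lines with repeated header groups into nested dicts."""
    reader = csv.reader(lines, delimiter="\t")
    header = [h.strip() for h in next(reader)]
    # A trailing tab in the header line yields empty names; drop them.
    while header and not header[-1]:
        header.pop()

    result = {}
    # Skip rows that contain nothing but whitespace.
    data_rows = (r for r in reader if any(c.strip() for c in r))
    for row_idx, row in enumerate(data_rows):
        # Pad short rows so that missing trailing warnings become "".
        row = row + [""] * (len(header) - len(row))
        groups = {}
        for g, start in enumerate(range(0, len(header), group_size)):
            names = header[start:start + group_size]
            values = [c.strip() for c in row[start:start + group_size]]
            # Names are unique within one group, so a dict is safe here.
            groups[f"{g} quadrupel"] = dict(zip(names, values))
        result[f"{row_idx} row"] = groups
    return result

# Shortened sample with two groups per row (illustration only).
sample = (
    "time / s\tO2 @ Pos. 1 (1/1)\tsign. ratio\twarning\t"
    "time / s\tO2 @ Pos. 2 (2/1)\tsign. ratio\twarning\n"
    "37\t7,6\t10,4:1\t\t75\t5,1\t11,1:1\t\n"
    "521\t9,1\t10,0:1\t\t560\t8,3\t10,9:1\t\n"
)

data = tsv_to_nested(io.StringIO(sample))
print(json.dumps(data, indent=4, ensure_ascii=False))
```

For a real file, the `io.StringIO` object would be replaced by a file handle opened with `newline=""`, and `json.dump` could write the result straight to disk. The values are kept as strings, as in the desired output; converting the decimal commas to numbers would be a separate step.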

matthias

0 answers:

No answers