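Importing data where each value carries its column label

Time: 2019-05-17 12:51:21

Tags: python import

My text file has no header row. Each value on a line carries a label indicating which column it belongs to. I want to use these labels as column names and place each value under the matching column.

I want to import the following from a text file (note that the order of the columns is not the same on every line, and not every line contains every column):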

Column1=variable11&Column2=variable12&Column3=variable13&Column4=variable14
Column2=variable22&Column1=variable12&Column3=variable23
Column1=variable13&Column3=variable33&Column2=variable32&Column4=variable34&Column5=variable35

I would like the result to be a table like this:

Column1         Column2         Column3         Column4         Column5
variable11      variable12      variable13      variable14  
variable21      variable22      variable23      
variable31      variable32      variable33      variable34      variable35

1 Answer:

Answer 0 (score: 0)

You can use a Pandas DataFrame for this:

import pandas as pd

a = '''Column1=variable11&Column2=variable12&Column3=variable13&Column4=variable14
Column2=variable22&Column1=variable12&Column3=variable23
Column1=variable13&Column3=variable33&Column2=variable32&Column4=variable34&Column5=variable35'''

result = []

for line in a.split('\n'):
    # Build one dict per line, mapping column label -> value
    dict_line = {}
    for chunk in line.split('&'):
        col, var = chunk.split('=')
        dict_line[col] = var
    result.append(dict_line)

# Keys become column names; labels missing from a line become NaN
df = pd.DataFrame(result)
print(df)

This prints the following DataFrame:

    Column1     Column2     Column3     Column4     Column5
0   variable11  variable12  variable13  variable14  NaN
1   variable12  variable22  variable23  NaN         NaN
2   variable13  variable32  variable33  variable34  variable35

Empty cells in this DataFrame are filled with NaN.
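
Since the question is about reading from a text file, here is a minimal sketch of the same idea applied to a file-like object. It is only one possible approach, not part of the original answer. It assumes each line is shaped like a URL query string, so the standard library's urllib.parse.parse_qsl can do the splitting; note that parse_qsl also decodes percent-escapes and turns '+' into a space, which may or may not be wanted for your data. The function name read_labeled_lines is made up for illustration, and io.StringIO stands in for the real file.

```python
import io
from urllib.parse import parse_qsl

import pandas as pd


def read_labeled_lines(handle):
    """Build a DataFrame from lines of '&'-separated 'label=value' pairs.

    read_labeled_lines is a hypothetical helper name used for this sketch.
    """
    rows = []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # parse_qsl returns a list of (label, value) tuples for one line
        rows.append(dict(parse_qsl(line, keep_blank_values=True)))
    return pd.DataFrame(rows)


# io.StringIO stands in for an opened text file (assumption for this sketch)
sample = io.StringIO(
    "Column1=variable11&Column2=variable12&Column3=variable13&Column4=variable14\n"
    "Column2=variable22&Column1=variable12&Column3=variable23\n"
    "Column1=variable13&Column3=variable33&Column2=variable32"
    "&Column4=variable34&Column5=variable35\n"
)

df = read_labeled_lines(sample)
# Sort labels alphabetically so the column order does not depend on the input
df = df.reindex(columns=sorted(df.columns))
# Optional: show missing cells as empty strings instead of NaN
blank = df.fillna("")
print(blank)
```

With a real file you would pass the object returned by open(...) instead of the StringIO. The alphabetical sort is a plain string sort, so a label such as Column10 would land before Column2; adjust the ordering if your labels need numeric sorting.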