Split and clean a list of lists in pandas and convert it to a DataFrame

Time: 2020-07-06 10:58:57

Tags: python-3.x pandas list

I have a list of lists like this:

[['id:ZC0000218734', 'version: forth', 'date:2020-07-06'], ['v1:\n                            undefined', 'v2: undefined'], ['type:park', 'address:zhejiang...'], ['type:park', 'address:zhejiang...']]

How can I remove the duplicate sublist (i.e. ['type:park', 'address:zhejiang...']), split each item on :, strip \n and whitespace, and then convert the result to a DataFrame?

The expected result looks like this:

             id version      date  ...         v2       type      address
0  ZC0000218734   forth  2020/7/6  ...  undefined  undefined  zhejiang...

Thanks.

2 answers:

Answer 0: (score: 3)

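The following builds one dictionary from every key:value item and prints the resulting one-row DataFrame:

import pandas as pd

lst = [['id:ZC0000218734', 'version: forth', 'date:2020-07-06'], ['v1:\n                            undefined', 'v2: undefined'], ['type:park', 'address:zhejiang...'], ['type:park', 'address:zhejiang...']]

# Split each item on ':'; the key is the left part, the stripped right part is the value.
# Identical keys overwrite each other, so the duplicate sublist collapses automatically.
d = {v.split(':')[0]: v.split(':')[1].strip() for l in lst for v in l}

# A list holding one dict yields a DataFrame with a single row
df = pd.DataFrame([d])
print(df)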
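
One caveat with indexing the result of split(':') is that a value which itself contains a colon (a time such as 10:58:57, say) would be cut short. A minimal sketch of a variant, assuming such values may occur, uses str.partition to split on the first colon only. The helper name parse_item and the extra time item are hypothetical and not part of the question:

```python
import pandas as pd

def parse_item(item):
    """Split 'key:value' on the first colon only and strip both parts."""
    key, _, value = item.partition(':')
    return key.strip(), value.strip()

lst = [['id:ZC0000218734', 'version: forth', 'date:2020-07-06'],
       ['v1:\n                            undefined', 'v2: undefined'],
       ['type:park', 'address:zhejiang...'],
       ['type:park', 'address:zhejiang...'],
       ['time:10:58:57']]  # hypothetical item whose value contains colons

# Repeated keys overwrite each other, so the duplicate sublist collapses.
d = dict(parse_item(v) for l in lst for v in l)
df = pd.DataFrame([d])
```

With the data from the question alone (without the hypothetical time item) this should give the same columns as the dictionary comprehension above.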

Answer 1: (score: 2)

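I try to avoid calling split twice by using map with str.strip, then build the dictionary from a nested list comprehension, and finally pass it to the DataFrame constructor: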

import pandas as pd

L = [['id:ZC0000218734', 'version: forth', 'date:2020-07-06'], ['v1:\n                            undefined', 'v2: undefined'], ['type:park', 'address:zhejiang...'], ['type:park', 'address:zhejiang...']]

# Split each item once and strip both parts; dict() consumes the resulting (key, value) pairs
out = dict([map(str.strip, y.split(':')) for x in L for y in x])

df = pd.DataFrame([out])
print(df)
             id version        date         v1         v2  type      address
0  ZC0000218734   forth  2020-07-06  undefined  undefined  park  zhejiang...
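
Both answers rely on the dictionary to absorb the repeated sublist, since identical keys overwrite each other. If the duplicates should be dropped explicitly first (for example to inspect the cleaned list), one possible sketch follows. The name unique is hypothetical, and split(':', 1) is used here as a small precaution against colons inside values:

```python
import pandas as pd

L = [['id:ZC0000218734', 'version: forth', 'date:2020-07-06'],
     ['v1:\n                            undefined', 'v2: undefined'],
     ['type:park', 'address:zhejiang...'],
     ['type:park', 'address:zhejiang...']]

# Lists are unhashable, so each sublist is converted to a tuple;
# dict.fromkeys keeps the first occurrence of each and preserves order.
unique = [list(t) for t in dict.fromkeys(map(tuple, L))]

# Split on the first ':' only and strip whitespace/newlines from both parts.
out = dict(map(str.strip, y.split(':', 1)) for x in unique for y in x)
df = pd.DataFrame([out])
```

This treats two sublists as duplicates only when they match exactly, including whitespace.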