Question

I want to process a column whose values are lists. For example:

Input:

col-A

[{'name':'1','age':'12'}, {'name':'2','age':'12'}]

[{'name':'3','age':'18'}, {'name':'7','age':'15'}]

....

Output:

col-A

[{'1-age':'12'}, {'2-age':'12'}]

[{'3-age':'18'}, {'7-age':'15'}]

....

My code is:

import copy

def deal(dict_col, prefix_key):
    key_value = dict_col[prefix_key]+'-'
    dict_col.pop(prefix_key, None)
    items = copy.deepcopy(dict_col)
    for key, value in items.items():
        dict_col[key_value+key] = dict_col.pop(key)
    return dict_col

prefix = "name"
[[deal(sub_item, prefix) for sub_item in item] for item in df['col-A']]

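Some items end up being processed more than once. Is that because the value returned by deal replaces the original item in place?

For example:

For the deal method, we have

Input:

{'name':'1','age':'12'}

Output:

{'1-age':'12'}

The next input may then be {'1-age':'12'}, which no longer has a name or age key.

How can I solve this problem?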

Answer 1

I believe you need the .get method, which returns a default value when the key does not exist in the dictionary:

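import copy

def deal(dict_col, prefix_key):
    # Fall back to 'not_exist' when prefix_key is missing
    key_value = dict_col.get(prefix_key, 'not_exist')+'-'
    dict_col.pop(prefix_key, None)
    # Iterate over a copy, because dict_col is modified inside the loop
    items = copy.deepcopy(dict_col)
    for key, value in items.items():
        dict_col[key_value+key] = dict_col.pop(key)
    return dict_col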
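
The repeated processing described in the question comes from deal modifying each dictionary in place, so any second pass sees the already renamed keys. With .get the second pass no longer raises an error, but it quietly produces keys such as 'not_exist-1-age'. A minimal sketch of a variant that builds a new dictionary instead (deal_copy is a hypothetical name, and it assumes the prefix key is present with a string value):

```python
def deal_copy(dict_col, prefix_key):
    """Return a new dict whose keys are prefixed with the value of prefix_key.

    The input dict is left untouched, so calling this twice on the same
    input gives the same result.
    """
    # Assumption: prefix_key is present and its value is a string.
    key_value = dict_col[prefix_key] + '-'
    return {key_value + key: value
            for key, value in dict_col.items()
            if key != prefix_key}


item = {'name': '1', 'age': '12'}
first = deal_copy(item, 'name')
second = deal_copy(item, 'name')
print(first)  # {'1-age': '12'}
print(item)   # {'name': '1', 'age': '12'}
```

Because nothing is mutated, the original column data stays available and rerunning the transformation is harmless.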

Answer 2

You can use the pandas apply method here. Some code:

import pandas as pd

d = {'col-A' : [[{'name' : '1', 'age': '12'}, {'name' : '2', 'age': '12'}],[{'name' : '3', 'age': '18'},{'name' : '7', 'age': '15'}]]}

df = pd.DataFrame(d)

def deal(row, prefix):
    out_list = []
    for sub_dict in row:
        out_dict = {}
        out_str = sub_dict.get(prefix) + '-'
        for k, v in sub_dict.items():
            # Skip the prefix key itself so that only e.g. '1-age' remains
            if k != prefix:
                out_dict[out_str + k] = v
        out_list.append(out_dict)
    return out_list

prefix = 'name'
df['col-A'] = df['col-A'].apply(lambda x: deal(x, prefix))

print(df)

If you like, you can condense part of the code into a single line:

def deal(row, prefix):
    out_list = []
    for sub_dict in row:
        out_dict = dict((sub_dict[prefix] + '-' + k , sub_dict[k]) for k in sub_dict.keys() if k != prefix)
        out_list.append(out_dict)
    return out_list
prefix = 'name'
df['col-A'] = df['col-A'].apply(lambda x : deal(x, prefix))

Just for fun, you can even reduce it to a single line (not recommended, because it is hard to read):

prefix = "name"
df['col-A'] = df['col-A'].apply(lambda row : [dict((sub_dict[prefix] + '-' + k , sub_dict[k]) for k in sub_dict.keys() if k != prefix) for sub_dict in row])
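
Indexing sub_dict[prefix] directly raises KeyError for any dictionary that lacks the prefix key. A rough sketch that combines the non-mutating approach with the .get idea from Answer 1, assuming such dictionaries should simply be copied through unchanged (that policy and the name deal_row are assumptions, not part of the original answers):

```python
def deal_row(row, prefix):
    """Prefix the keys of every dict in row; return a new list of new dicts."""
    out_list = []
    for sub_dict in row:
        name = sub_dict.get(prefix)
        if name is None:
            # Assumed policy: no prefix key, so keep a shallow copy as-is.
            out_list.append(dict(sub_dict))
            continue
        out_list.append({name + '-' + k: v
                         for k, v in sub_dict.items() if k != prefix})
    return out_list


row = [{'name': '1', 'age': '12'}, {'age': '15'}]
result = deal_row(row, 'name')
print(result)  # [{'1-age': '12'}, {'age': '15'}]
```

It can be passed to apply in the same way, for example df['col-A'].apply(lambda x: deal_row(x, prefix)).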

How do I process a column in a pandas DataFrame?
