Question

Currently, my data is in the following format:

mylist = [('ab', {'a': ['apple1'], 'b': ['ball1']}), ('cd', {'a': ['apple2'], 'b': ['ball2']})]

i.e. List[Tuple[Any, Dict{'key': List}]]

The goal is to create a pandas DataFrame of the following form:

start  a       b
ab     apple1  ball1
cd     apple2  ball2

I tried to do it as follows:

df = pd.DataFrame(columns=['start', 'a', 'b'])
for start, details in mylist:
    df = df.append({'start' : start}, ignore_index= True)
    df = df.append({'a' : details['a']} , ignore_index= True)
    df = df.append({'b': details['b']}, ignore_index=True)

I am trying to find an optimized way to do this.

Answer 1

`pd.DataFrame.from_dict`

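Pandas works well with a dictionary or with a list of dictionaries. You have something in between. In this case, converting to a dictionary is straightforward: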

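import pandas as pd

L = [('ab', {'a': ['apple1'], 'b': ['ball1']}),
     ('cd', {'a': ['apple2'], 'b': ['ball2']})]

# dict(L) maps each label to its inner dict; orient='index' turns the labels into rows
res = pd.DataFrame.from_dict(dict(L), orient='index')
# each cell holds a one-element list, so take its first element
res = res.apply(lambda x: x.str[0])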

print(res)

         a      b
ab  apple1  ball1
cd  apple2  ball2
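
If `start` should be a regular column, as in the layout the question asks for, the index can be named and then reset. The following is only a sketch under the same assumptions as above (every list holds exactly one element); `to_frame` is a hypothetical helper name used for illustration.

```python
import pandas as pd

L = [('ab', {'a': ['apple1'], 'b': ['ball1']}),
     ('cd', {'a': ['apple2'], 'b': ['ball2']})]


def to_frame(pairs):
    """Build a frame with 'start' as a column from (label, dict-of-lists) pairs."""
    res = pd.DataFrame.from_dict(dict(pairs), orient='index')
    # Each cell holds a one-element list; keep its first element.
    res = res.apply(lambda col: col.str[0])
    # Name the index 'start' and move it into a regular column.
    return res.rename_axis('start').reset_index()


df = to_frame(L)
print(df)
```

Note that `dict(L)` keeps only the last entry when the same first element appears more than once, so this route fits data whose labels are unique.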

Answer 2

Like this:

import pandas as pd

form = [('ab', {'a': ['apple1'], 'b': ['ball1']}), ('cd', {'a': ['apple2'], 'b': ['ball2']})]

# separate 'start' from rest of data - inverse zip
start, data = zip(*form)

# create dataframe
df = pd.DataFrame(list(data))

# remove data from lists in each cell
# (pandas 2.1+ deprecates applymap in favor of DataFrame.map)
df = df.applymap(lambda l: l[0])

# put the labels in front as the 'start' column
df.insert(loc=0, column='start', value=start)

print(df)
  start       a      b
0    ab  apple1  ball1
1    cd  apple2  ball2

Or, if you want `start` to be the index of the DataFrame:

# separate 'start' from rest of data - inverse zip
index, data = zip(*form)

# create dataframe
df = pd.DataFrame(list(data), index=index)
df.index.name = 'start' 

# remove data from lists in each cell
df = df.applymap(lambda l: l[0])

print(df)
            a      b
start
ab     apple1  ball1
cd     apple2  ball2
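
As a variant that is not part of the approach above, the lists can be unwrapped in plain Python before the DataFrame is created, which avoids the cell-wise pass afterwards. This is a minimal sketch that assumes every list contains exactly one element and that the columns are `a` and `b` as in the question; `records` is just an illustrative name.

```python
import pandas as pd

form = [('ab', {'a': ['apple1'], 'b': ['ball1']}),
        ('cd', {'a': ['apple2'], 'b': ['ball2']})]

# One flat record per tuple: the label plus the first element of each list.
records = [{'start': start, **{key: values[0] for key, values in details.items()}}
           for start, details in form]

# Passing columns fixes the column order explicitly.
df = pd.DataFrame(records, columns=['start', 'a', 'b'])
print(df)
```

Unlike the dictionary-based route, this keeps every tuple as its own row even when the same `start` label appears more than once.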

Pandas: list of tuples with a value and a dictionary to DataFrame
