I have a pandas column in which each cell contains a list of dictionaries holding the color attributes of a photo, for example:
[{'color': 'black', 'confidence': 1.0}, {'color': 'brown', 'confidence': 0.72}, {'color': 'gray', 'confidence': 0.62}, {'color': 'other', 'confidence': 0.52}, {'color': 'red', 'confidence': 0.01}, {'color': 'blond', 'confidence': 0.01}, {'color': 'white', 'confidence': 0.0}]
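I would like to split this column of dictionary lists into multiple new pandas columns. For example, I want a column named "black" with the value 1.0, a column named "brown" with the value 0.72, and so on.
I am struggling to do this and would appreciate any hints. Thanks!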
Answer 0 (score: 1)
import pandas as pd

a = [{'color': 'black', 'confidence': 1.0}, {'color': 'brown', 'confidence': 0.72}, {'color': 'gray', 'confidence': 0.62}, {'color': 'other', 'confidence': 0.52}, {'color': 'red', 'confidence': 0.01}, {'color': 'blond', 'confidence': 0.01}, {'color': 'white', 'confidence': 0.0}]
# Collect the colors and confidences into two parallel lists
c = []
co = []
for d in a:
    c.append(d['color'])
    co.append(d['confidence'])
df = pd.DataFrame()
df['color'] = c
df['confidence'] = co
df = df.transpose()
# Use the first row (the colors) as the column header
df.columns = df.iloc[0]
df = df[1:]
Output:
df
Out[159]:
color black brown gray other red blond white
confidence 1 0.72 0.62 0.52 0.01 0.01 0
If this answer works for you, please accept and upvote it. Otherwise, leave a comment describing the problem and I would be happy to help.
Answer 1 (score: 1)
Let's try this:
pd.DataFrame(df['col'].tolist()).set_index('color').T
Output:
color black brown gray other red blond white
confidence 1.0 0.72 0.62 0.52 0.01 0.01 0.0
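Note that the one-liner above appears to expect df['col'] to hold one dictionary per row. If each cell instead holds a whole list of dictionaries, as in the question, one possible adaptation is to explode the lists first and then pivot. This is only a sketch: the sample frame and the column name col are placeholders, and it assumes every cell is a non-empty list with no color repeated within a cell.

```python
import pandas as pd

# Hypothetical sample data: each cell of "col" holds a list of dicts
df = pd.DataFrame({
    "col": [
        [{"color": "black", "confidence": 1.0},
         {"color": "brown", "confidence": 0.72}],
        [{"color": "black", "confidence": 0.8}],
    ]
})

# One dict per row; the original row label is repeated for each dict
exploded = df["col"].explode()

# Turn the dicts into "color" and "confidence" columns, keeping the row labels
flat = pd.DataFrame(exploded.tolist(), index=exploded.index)

# One column per color, one row per original row; missing colors become NaN
wide = flat.pivot(columns="color", values="confidence")
print(wide)
```

Colors that are absent from a given row show up as NaN in that row.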
Answer 2 (score: 1)
Thanks. This worked for me. I was inspired by Tejas's answer:
from ast import literal_eval
df["black"]=""
df["brown"]=""
df["gray"]=""
df["other"]=""
df["red"]=""
df["blond"]=""
df["white"]=""
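for k, v in df.iterrows():
    # Each cell is stored as a string, so parse it into a list of dicts
    res = literal_eval(v["Color_list"])
    for d in res:
        # Use .at rather than chained indexing so the assignment reaches df
        df.at[k, d["color"]] = d["confidence"]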
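A loop-free variant of the same idea might look like the sketch below. It assumes, as the code above does, that Color_list stores each list as a string; the sample data is made up for illustration. Unlike the loop version, colors missing from a row end up as NaN rather than an empty string, and the color columns do not need to be declared in advance.

```python
from ast import literal_eval

import pandas as pd

# Hypothetical sample data: the lists are stored as strings
df = pd.DataFrame({
    "Color_list": [
        "[{'color': 'black', 'confidence': 1.0}, {'color': 'brown', 'confidence': 0.72}]",
        "[{'color': 'black', 'confidence': 0.8}]",
    ]
})

# Parse each string into a real list of dicts
parsed = df["Color_list"].apply(literal_eval)

# Build one {color: confidence} mapping per row, then let pandas align the columns
colors = pd.DataFrame(
    [{d["color"]: d["confidence"] for d in cell} for cell in parsed],
    index=df.index,
)

# Attach the new columns to the original frame
result = df.join(colors)
print(result)
```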
Answer 3 (score: 0)
You can use apply with a custom function that returns a Series to do this:
Data
import pandas as pd
import numpy as np

np.random.seed(0)
df = pd.DataFrame(
    {
        "A": ["a", "b"],
        "B": [
            [
                {"color": "black", "confidence": 1.0},
                {"color": "brown", "confidence": 0.72},
                {"color": "gray", "confidence": 0.62},
                {"color": "other", "confidence": 0.52},
                {"color": "red", "confidence": 0.01},
                {"color": "blond", "confidence": 0.01},
                {"color": "white", "confidence": 0.0},
            ],
            [
                {"color": "black", "confidence": 0.8},
                {"color": "brown", "confidence": 0.5},
                {"color": "gray", "confidence": 0.4},
                {"color": "other", "confidence": 0.32},
                {"color": "red", "confidence": 0.11},
            ],
        ],
    }
)
print(df)
A B
0 a [{'color': 'black', 'confidence': 1.0}, {'colo...
1 b [{'color': 'black', 'confidence': 0.8}, {'colo...
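Approach
Since each cell is a list of dictionaries, we need to turn each cell into its own Series, whose index is "color" and whose values are "confidence". apply then takes care of gluing these Series objects together and returns a new DataFrame: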
def clean_cell(records, index, values):
    return (pd.DataFrame(records)
            .set_index(index)
            .rename_axis(None)
            [values])
record_df = df["B"].apply(clean_cell, args=("color", "confidence"))
print(record_df)
black brown gray other red blond white
0 1.0 0.72 0.62 0.52 0.01 0.01 0.0
1 0.8 0.50 0.40 0.32 0.11 NaN NaN
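If the goal is to have these columns alongside the rest of the original frame, one option is to join the result back on the index. The sketch below repeats clean_cell so that it runs on its own and uses a shortened, made-up version of the data; the name combined is introduced here purely for illustration.

```python
import pandas as pd

# Shortened, hypothetical version of the data above
df = pd.DataFrame(
    {
        "A": ["a", "b"],
        "B": [
            [
                {"color": "black", "confidence": 1.0},
                {"color": "red", "confidence": 0.01},
            ],
            [
                {"color": "black", "confidence": 0.8},
            ],
        ],
    }
)


def clean_cell(records, index, values):
    # Same helper as above: a Series of `values` indexed by `index`
    return pd.DataFrame(records).set_index(index).rename_axis(None)[values]


record_df = df["B"].apply(clean_cell, args=("color", "confidence"))

# Drop the raw list column and attach the expanded color columns by row label
combined = df.drop(columns="B").join(record_df)
print(combined)
```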