Convert a pandas DataFrame to a dictionary using repeated cell values as keys

Time: 2019-10-28 09:32:11

Tags: python pandas dataframe dictionary

I have a DataFrame like this:

Sol Col1    v1  Col2   v2    Col3  v3   Col4    v4  
1   Y_1_1   0   Y_1_2   1   Y_1_3   0   Y_1_4   0       
2   Y_1_1   0   Y_1_2   1   Y_1_3   0   Y_1_4   0       
3   Y_1_1   0   Y_1_2   0   Y_1_3   0   Y_1_4   0       
4   Y_1_1   0   Y_1_2   0   Y_1_3   1   Y_1_4   0   
5   Y_1_1   0   Y_1_2   0   Y_1_3   1   Y_1_4   0   
6   Y_1_1   0   Y_1_2   0   Y_1_3   0   Y_1_4   0       
7   Y_1_1   0   Y_1_2   1   Y_1_3   0   Y_1_4   0       
8   Y_1_1   0   Y_1_2   0   Y_1_3   0   Y_1_4   1       
9   Y_1_1   0   Y_1_2   1   Y_1_3   0   Y_1_4   0
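
For anyone who wants to reproduce this, the frame above can be rebuilt roughly as follows. This is only a sketch: the `values` list and the loop are my own way of typing in the table, and it assumes `Sol` is a regular column rather than the index.

```python
import pandas as pd

# Values of v1..v4 for Sol 1..9, copied from the table above.
values = [
    (0, 1, 0, 0),
    (0, 1, 0, 0),
    (0, 0, 0, 0),
    (0, 0, 1, 0),
    (0, 0, 1, 0),
    (0, 0, 0, 0),
    (0, 1, 0, 0),
    (0, 0, 0, 1),
    (0, 1, 0, 0),
]

rows = []
for sol, vs in enumerate(values, start=1):
    row = {"Sol": sol}
    for i, v in enumerate(vs, start=1):
        row[f"Col{i}"] = f"Y_1_{i}"  # name column
        row[f"v{i}"] = v             # value column
    rows.append(row)

# Column order follows the dict insertion order: Sol, Col1, v1, Col2, v2, ...
df = pd.DataFrame(rows)
print(df)
```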

I would like to convert it into a dictionary like this:

dic = {1: {'Y_1_1': 0, 'Y_1_2': 1, 'Y_1_3': 0, 'Y_1_4': 0},
       2: {'Y_1_1': 0, 'Y_1_2': 1, 'Y_1_3': 0, 'Y_1_4': 0},
       ...}

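I was wondering whether I should replace the headers of the columns v1, v2, etc. with the str values (Y_1_1, Y_1_2, ...) and then simply drop the columns that hold the variable names (Col1, Col2, ...).

I found some examples of converting a DataFrame to a dictionary, but if I am not mistaken, none of them solves my problem.

Is there a Pythonic way to do this conversion?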

2 Answers:

Answer 0: (score: 2)

If the values in the columns Col1 ... ColN are the same in every row, you can use:

# Use the `Sol` column as the index
df = df.set_index('Sol')

# Take the first row and shift it by one position, so that each `v*` label
# maps to the name stored in the `Col*` column just before it
d = df.iloc[0].shift().to_dict()

# Select the value columns (`v1`, `v2`, ...) by position, rename them
# and convert to a dict keyed by the index
out = df.iloc[:, 1::2].rename(columns=d).to_dict('index')
print(out)

{1: {'Y_1_1': 0, 'Y_1_2': 1, 'Y_1_3': 0, 'Y_1_4': 0},
 2: {'Y_1_1': 0, 'Y_1_2': 1, 'Y_1_3': 0, 'Y_1_4': 0}, 
 3: {'Y_1_1': 0, 'Y_1_2': 0, 'Y_1_3': 0, 'Y_1_4': 0}, 
 4: {'Y_1_1': 0, 'Y_1_2': 0, 'Y_1_3': 1, 'Y_1_4': 0}, 
 5: {'Y_1_1': 0, 'Y_1_2': 0, 'Y_1_3': 1, 'Y_1_4': 0}, 
 6: {'Y_1_1': 0, 'Y_1_2': 0, 'Y_1_3': 0, 'Y_1_4': 0}, 
 7: {'Y_1_1': 0, 'Y_1_2': 1, 'Y_1_3': 0, 'Y_1_4': 0}, 
 8: {'Y_1_1': 0, 'Y_1_2': 0, 'Y_1_3': 0, 'Y_1_4': 1}, 
 9: {'Y_1_1': 0, 'Y_1_2': 1, 'Y_1_3': 0, 'Y_1_4': 0}}
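
To make the `shift()` step less magic, here is a small sketch on a cut-down, two-row version of the frame (the shortened data is my own, not from the question). It prints the intermediate mapping and also builds the same mapping explicitly with `zip`, which some may find easier to read.

```python
import pandas as pd

# Cut-down frame with only two name/value pairs, for illustration.
df = pd.DataFrame({
    "Sol": [1, 2],
    "Col1": ["Y_1_1", "Y_1_1"], "v1": [0, 0],
    "Col2": ["Y_1_2", "Y_1_2"], "v2": [1, 0],
}).set_index("Sol")

# First row, shifted down by one position: each v* label now holds the
# name that sat in the Col* column immediately to its left.
d = df.iloc[0].shift().to_dict()
print(d)
# The 'Col1' and 'Col2' keys carry leftover values; rename() ignores them
# because only the v* columns are selected below.

# The same v* -> name mapping, built explicitly instead of via shift().
explicit = dict(zip(df.columns[1::2], df.iloc[0, ::2]))
print(explicit)

out = df.iloc[:, 1::2].rename(columns=d).to_dict("index")
print(out)
```

Both mappings only look at the first row, so they are valid only under the stated assumption that the names are identical in every row.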

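If the values in the columns Col1 ... ColN may differ between rows, use a dictionary comprehension that zips each row's names with its values: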

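# Start from the original df, with `Sol` still a regular column.
# Per row: even positions hold the names, odd positions hold the values
d = {k: dict(zip(list(v.values())[::2], list(v.values())[1::2]))
     for k, v in df.set_index('Sol').to_dict('index').items()}
print(d)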

{1: {'Y_1_1': 0, 'Y_1_2': 1, 'Y_1_3': 0, 'Y_1_4': 0}, 
 2: {'Y_1_1': 0, 'Y_1_2': 1, 'Y_1_3': 0, 'Y_1_4': 0},
 3: {'Y_1_1': 0, 'Y_1_2': 0, 'Y_1_3': 0, 'Y_1_4': 0}, 
 4: {'Y_1_1': 0, 'Y_1_2': 0, 'Y_1_3': 1, 'Y_1_4': 0}, 
 5: {'Y_1_1': 0, 'Y_1_2': 0, 'Y_1_3': 1, 'Y_1_4': 0}, 
 6: {'Y_1_1': 0, 'Y_1_2': 0, 'Y_1_3': 0, 'Y_1_4': 0}, 
 7: {'Y_1_1': 0, 'Y_1_2': 1, 'Y_1_3': 0, 'Y_1_4': 0}, 
 8: {'Y_1_1': 0, 'Y_1_2': 0, 'Y_1_3': 0, 'Y_1_4': 1}, 
 9: {'Y_1_1': 0, 'Y_1_2': 1, 'Y_1_3': 0, 'Y_1_4': 0}}
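
The sample data does not really exercise this second variant, since its names never change. A minimal sketch with a made-up frame whose names differ per row (the `Y_2_*` labels are hypothetical, chosen only for illustration) shows the difference:

```python
import pandas as pd

# Hypothetical frame in which the names in Col1/Col2 differ between rows.
df = pd.DataFrame({
    "Sol": [1, 2],
    "Col1": ["Y_1_1", "Y_2_1"], "v1": [0, 1],
    "Col2": ["Y_1_2", "Y_2_2"], "v2": [1, 0],
})

# Each row dict keeps the column order, so names sit at even positions
# and their values at odd positions.
d = {
    sol: dict(zip(list(row.values())[::2], list(row.values())[1::2]))
    for sol, row in df.set_index("Sol").to_dict("index").items()
}
print(d)
```

This relies on dicts preserving insertion order, which is guaranteed from Python 3.7 on.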

Answer 1: (score: 0)

I can offer you a version that has the values as keys, using the following command:

df.drop(['Sol'], axis=1).transpose().reset_index(drop=True).to_dict()

This results in:

{0: {0: 'Y_1_1', 1: 0, 2: 'Y_1_2', 3: 1, 4: 'Y_1_3', 5: 0, 6: 'Y_1_4', 7: 0},
 1: {0: 'Y_1_1', 1: 0, 2: 'Y_1_2', 3: 1, 4: 'Y_1_3', 5: 0, 6: 'Y_1_4', 7: 0},
 2: {0: 'Y_1_1', 1: 0, 2: 'Y_1_2', 3: 0, 4: 'Y_1_3', 5: 0, 6: 'Y_1_4', 7: 0}, ...

Does this work for you?
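
Note that the output above is keyed by row position and still interleaves names and values. If that form is the starting point, one possible follow-up step (a sketch of my own on a cut-down frame; the names `raw`, `sols` and `result` are just illustrative) pairs the entries up and re-keys them by `Sol`:

```python
import pandas as pd

# Cut-down frame with only two name/value pairs, for illustration.
df = pd.DataFrame({
    "Sol": [1, 2],
    "Col1": ["Y_1_1", "Y_1_1"], "v1": [0, 0],
    "Col2": ["Y_1_2", "Y_1_2"], "v2": [1, 0],
})

# Positional dict: {row position: {0: name, 1: value, 2: name, 3: value}}
raw = df.drop(["Sol"], axis=1).transpose().reset_index(drop=True).to_dict()

# Pair entry i (name) with entry i + 1 (value) and key each row by its Sol.
sols = df["Sol"].tolist()
result = {
    sol: {inner[i]: inner[i + 1] for i in range(0, len(inner), 2)}
    for sol, inner in zip(sols, raw.values())
}
print(result)
```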