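I have a DataFrame df with three columns: A, B, and C. I want to build a dictionary in which column A supplies the keys and the matching values from columns B and C form the value for each key.
I tried the following: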
def cellDict():
    df_set_index('A')['B','C']
    x= df.set_index('A')['B']
    y= df.set_index('A')['C']
    z= zip(x,y)
def getCellDetails():
    try:
        cellDB_DF= pd.read_excel('cell_DB.xlsx')
        cellLatDB= cellDB_DF['Latitude'].to_dict()
        cellLongDB= cellDB_DF['Longitude'].to_dict()
        cellDict= cellDF.set_index('Cell_ID')['Latitude']['Longitude'].to_dict()
        print cellDict
    except Exception as e:
        print e.message
The expected result is something like:
df{cellID}=('latitude','longitude')
Answer 0 (score: 2)
# Sample data.
>>> df = pd.DataFrame({'A': [1, 2, 3], 'B': [100, 200, 300], 'C': [400, 500, 600]})
>>> df
   A    B    C
0  1  100  400
1  2  200  500
2  3  300  600
Then use a dictionary comprehension:
>>> {key: (a, b) for key, a, b in df.values}
{1: (100, 400), 2: (200, 500), 3: (300, 600)}
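One caveat with iterating over df.values: when the columns have different numeric dtypes, the array is upcast to a common dtype, so integer keys can come back as floats. The sketch below assumes a made-up frame in which B holds floats, and uses itertuples as one possible way to keep each column's own type; it is an illustration, not part of the original answer.

```python
import pandas as pd

# Hypothetical data: column B is float, columns A and C are integers.
mixed = pd.DataFrame({'A': [1, 2, 3], 'B': [0.5, 1.5, 2.5], 'C': [400, 500, 600]})

# df.values upcasts every column to float64, so the keys become floats.
via_values = {key: (b, c) for key, b, c in mixed.values}

# itertuples(index=False, name=None) yields plain tuples, one per row,
# with each element taken from its own column.
via_itertuples = {key: (b, c) for key, b, c in mixed.itertuples(index=False, name=None)}

print(via_values)
print(via_itertuples)
```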
As @piRSquared suggested, you can also transpose the DataFrame and then call to_dict, passing 'list' as the orient argument.
df.set_index('A').T.to_dict('list')
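Note that this variant returns lists rather than tuples as the dictionary values. A minimal sketch on the sample data, with an extra conversion step (my addition, not part of the original suggestion) in case tuples are required:

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [100, 200, 300], 'C': [400, 500, 600]})

# After the transpose, each value of A is a column label, so
# to_dict(orient='list') maps it to a list holding that row's B and C values.
as_lists = df.set_index('A').T.to_dict(orient='list')

# Optional: convert each list to a tuple to match the requested output shape.
as_tuples = {key: tuple(values) for key, values in as_lists.items()}

print(as_lists)
print(as_tuples)
```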
His other suggestion gives a very efficient solution:
dict(zip(df.A, zip(df.B, df.C)))
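Applied to the column names from the question, the same idea might look like the sketch below. The small in-memory frame is a hypothetical stand-in for the contents of cell_DB.xlsx, and its values are invented for illustration.

```python
import pandas as pd

# Hypothetical stand-in for pd.read_excel('cell_DB.xlsx'); the rows are made up.
cell_db = pd.DataFrame({
    'Cell_ID': ['c1', 'c2', 'c3'],
    'Latitude': [10.0, 20.0, 30.0],
    'Longitude': [-1.0, -2.0, -3.0],
})

# Pair each Cell_ID with a (latitude, longitude) tuple.
cell_dict = dict(zip(cell_db['Cell_ID'],
                     zip(cell_db['Latitude'], cell_db['Longitude'])))

print(cell_dict)
```

Selecting the columns with bracket notation rather than attribute access also works when a column name is not a valid Python identifier.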
Timings (Python 3.7 and pandas 0.24.2)
# Set-up 10k row dataframe.
df = pd.DataFrame({'A': range(10000), 'B': range(10000), 'C': range(10000)})
# Method 1
%timeit -n 10 {key: (a, b) for key, a, b in df.values}
# 14.8 ms ± 3.62 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
# Method 2
%timeit -n 10 df.set_index('A').T.to_dict('list')
# 520 ms ± 41.5 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
# Method 3
%timeit -n 10 dict(zip(df.A, zip(df.B, df.C)))
# 7.7 ms ± 3.32 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
# Method 4
%timeit -n 10 {k: (a, b) for k, a, b in zip(*map(df.get, df))}
# 9.61 ms ± 3.81 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
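Before comparing speeds, it can be worth confirming that the four methods agree. The following is a small sanity check, assuming a 1,000-row frame of the same shape; Method 2 produces lists, so its values are converted to tuples for the comparison.

```python
import pandas as pd

# Smaller frame than the timing run, same structure.
df = pd.DataFrame({'A': range(1000), 'B': range(1000), 'C': range(1000)})

method_1 = {key: (a, b) for key, a, b in df.values}
# to_dict(orient='list') yields lists, so convert them to tuples first.
method_2 = {k: tuple(v) for k, v in df.set_index('A').T.to_dict(orient='list').items()}
method_3 = dict(zip(df.A, zip(df.B, df.C)))
method_4 = {k: (a, b) for k, a, b in zip(*map(df.get, df))}

all_equal = method_1 == method_2 == method_3 == method_4
print(all_equal)
```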