Say I have two dictionaries of the following form:
{'A':[1,2,3,4,5,6,7],
'B':[12,13,14,15,16,17,18]} - Belongs to category "M"
{'A':[8,9,10,11,12,13,14],
'B':[18,19,20,21,22,23,24]} - Belongs to category "P"
Now, the resulting DataFrame should be:
Name . Value . Category
A . 1 . M
A . 8 . P
A . 10 . P
B . 12 . M
And so on. How can this be achieved?
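Answer 0 (score: 4)
Here is a more pandas-native approach than the one suggested by user3483203. It avoids unnecessary Python-level iteration, is faster (for sufficiently large datasets), and is more idiomatic.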
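import pandas as pd

m = {'A':[1,2,3,4,5,6,7],
'B':[12,13,14,15,16,17,18]}
p = {'A':[8,9,10,11,12,13,14],
'B':[18,19,20,21,22,23,24]}
# Melt each wide frame into long form: one row per (column name, value) pair.
p_df = pd.DataFrame(p).melt(value_name='value')
m_df = pd.DataFrame(m).melt(value_name='value')
# Tag every row with the category of the dictionary it came from.
p_df['category'] = 'P'
m_df['category'] = 'M'
# Stack the two long frames and renumber the index from 0.
result = pd.concat([m_df, p_df], ignore_index=True)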
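The melt call above only sets value_name, so the name column keeps melt's default label, 'variable'. As a minimal sketch, assuming the Name / Value / Category headers from the question are wanted, both labels can be set explicitly and the category attached with assign. The helper name to_long is hypothetical and not part of the original answer.

```python
import pandas as pd

m = {'A': [1, 2, 3, 4, 5, 6, 7], 'B': [12, 13, 14, 15, 16, 17, 18]}
p = {'A': [8, 9, 10, 11, 12, 13, 14], 'B': [18, 19, 20, 21, 22, 23, 24]}


def to_long(dct, category):
    """Hypothetical helper: melt one dict into long form and tag its category."""
    return (
        pd.DataFrame(dct)
        .melt(var_name='Name', value_name='Value')  # explicit column labels
        .assign(Category=category)                  # scalar is broadcast to every row
    )


result = pd.concat([to_long(m, 'M'), to_long(p, 'P')], ignore_index=True)
print(result.head())
```

Wrapping the steps in one function keeps the per-dictionary logic in a single place, which may be convenient if more categories are added later.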
m = {'A': list(range(0, 100_000)), 'B': list(range(100_000, 200_000))}
p = {'A': list(range(200_000, 300_000)), 'B': list(range(300_000, 400_000))}
Here are the timings for the melt approach:
%%timeit
p_df = pd.DataFrame(p).melt(value_name='value')
m_df = pd.DataFrame(m).melt(value_name='value')
p_df['category'] = 'P'
m_df['category'] = 'M'
result = pd.concat([m_df, p_df], ignore_index=True)
120 ms ± 3.16 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
%%timeit
categories = ['M', 'P']
dcts = [m, p]
dfs = [
pd.DataFrame([[k, el, cat] for k, v in dct.items() for el in v])
for dct, cat in zip(dcts, categories)
]
cols = {'columns': {0: 'Name', 1: 'Value', 2: 'Category'}}
result = pd.concat(dfs).reset_index(drop=True).rename(**cols)
207 ms ± 8.9 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
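The figures above come from IPython's %%timeit magic. Outside IPython, a rough comparison could be sketched with the standard-library timeit module. The function names with_melt and with_comprehension, and the smaller input size n, are assumptions made here to keep the example quick; absolute numbers will differ by machine and pandas version.

```python
import timeit

import pandas as pd

n = 10_000  # assumed size, smaller than the 100_000 used in the timings above
m = {'A': list(range(0, n)), 'B': list(range(n, 2 * n))}
p = {'A': list(range(2 * n, 3 * n)), 'B': list(range(3 * n, 4 * n))}


def with_melt():
    # Approach from this answer: melt each frame, tag it, then concatenate.
    p_df = pd.DataFrame(p).melt(value_name='value')
    m_df = pd.DataFrame(m).melt(value_name='value')
    p_df['category'] = 'P'
    m_df['category'] = 'M'
    return pd.concat([m_df, p_df], ignore_index=True)


def with_comprehension():
    # Approach from the other answer: build rows in a list comprehension.
    dfs = [
        pd.DataFrame([[k, el, cat] for k, v in dct.items() for el in v])
        for dct, cat in zip([m, p], ['M', 'P'])
    ]
    cols = {'columns': {0: 'Name', 1: 'Value', 2: 'Category'}}
    return pd.concat(dfs).reset_index(drop=True).rename(**cols)


calls = 5
# Take the best of 3 repeats and convert to seconds per call.
melt_time = min(timeit.repeat(with_melt, number=calls, repeat=3)) / calls
loop_time = min(timeit.repeat(with_comprehension, number=calls, repeat=3)) / calls
print(f"melt:          {melt_time * 1e3:.2f} ms per call")
print(f"comprehension: {loop_time * 1e3:.2f} ms per call")
```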
Answer 1 (score: 2)
Setup
import pandas as pd

d1 = {'A': [1, 2, 3, 4, 5, 6, 7], 'B': [12, 13, 14, 15, 16, 17, 18]}
d2 = {'A': [8, 9, 10, 11, 12, 13, 14], 'B': [18, 19, 20, 21, 22, 23, 24]}
categories = ['M', 'P']
dcts = [d1, d2]
Assuming you know which category goes with which dictionary, you can restructure the dictionaries and use concat:
dfs = [
pd.DataFrame([[k, el, cat] for k, v in dct.items() for el in v])
for dct, cat in zip(dcts, categories)
]
cols = {'columns': {0: 'Name', 1: 'Value', 2: 'Category'}}
pd.concat(dfs).reset_index(drop=True).rename(**cols)
Name Value Category
0 A 1 M
1 A 2 M
2 A 3 M
3 A 4 M
4 A 5 M
5 A 6 M
6 A 7 M
7 B 12 M
8 B 13 M
9 B 14 M
10 B 15 M
11 B 16 M
12 B 17 M
13 B 18 M
14 A 8 P
15 A 9 P
16 A 10 P
17 A 11 P
18 A 12 P
19 A 13 P
20 A 14 P
21 B 18 P
22 B 19 P
23 B 20 P
24 B 21 P
25 B 22 P
26 B 23 P
27 B 24 P
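The output above is ordered by category first (all M rows, then all P rows), while the sample in the question appears to group rows by Name. As a minimal sketch, assuming that Name-first ordering is what is wanted, the rows can be built in one pass and then sorted. The variable names rows and ordered are illustrative only.

```python
import pandas as pd

d1 = {'A': [1, 2, 3, 4, 5, 6, 7], 'B': [12, 13, 14, 15, 16, 17, 18]}
d2 = {'A': [8, 9, 10, 11, 12, 13, 14], 'B': [18, 19, 20, 21, 22, 23, 24]}

# One (Name, Value, Category) tuple per element of every list.
rows = [
    (name, value, cat)
    for dct, cat in zip([d1, d2], ['M', 'P'])
    for name, values in dct.items()
    for value in values
]
df = pd.DataFrame(rows, columns=['Name', 'Value', 'Category'])

# Sort by Name, then Value; Category is included as a tie-breaker so the
# two rows with B == 18 come out in a deterministic order.
ordered = df.sort_values(['Name', 'Value', 'Category']).reset_index(drop=True)
print(ordered.head(10))
```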