Here is a simplified example:
val = [10,23,45,31,78,43,1,67,82]
indx = [1,4,5,8]
indx2 = [3,6,7]
indx3 = [0,2]
samp = {}
samp[0] = indx
samp[1] = indx2
samp[2] = indx3
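Say I have a dictionary (samp) with three groups: group 0, group 1, and group 2. The dictionary holds the indices of the values in the vector val.
I want to extract all the values in val according to the groups given in the dictionary by building a 9 x 2 matrix, with the value and its group in two columns, in index order, so that it looks like this: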
val group
10 2
23 0
45 2
31 1
78 0
43 0
1 1
67 1
82 0
How can I do this?
Answer 0 (score: 1)
One way to get it:
[(j, next(k for k,v in samp.items() if i in v)) for i,j in enumerate(val)]
Output:
[(10, 2),
(23, 0),
(45, 2),
(31, 1),
(78, 0),
(43, 0),
(1, 1),
(67, 1),
(82, 0)]
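One caveat with the one-liner above: `next(...)` without a default raises StopIteration if some index of val is not listed in any group. A minimal sketch of the same lookup written as an explicit helper, assuming you would rather get a placeholder for unassigned indices (the name `find_group` and its `default` parameter are made up for illustration):

```python
val = [10, 23, 45, 31, 78, 43, 1, 67, 82]
samp = {0: [1, 4, 5, 8], 1: [3, 6, 7], 2: [0, 2]}

def find_group(index, groups, default=None):
    """Return the key of the first group whose index list contains `index`.

    `default` is returned when the index is not listed in any group
    (hypothetical behaviour, chosen here for illustration).
    """
    for key, members in groups.items():
        if index in members:
            return key
    return default

# Pair each value with the group that owns its position in val.
pairs = [(value, find_group(i, samp)) for i, value in enumerate(val)]
print(pairs)
```

This should give the same pairs as the comprehension for the data in the question; it only differs when an index is missing from every group.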
Answer 1 (score: 0)
Use a dictionary comprehension to invert the key-value pairs of the dictionary, then use map:
import pandas as pd
df = pd.DataFrame(val,columns=['val'])
d = {value1:key for key,value in samp.items() for value1 in value}
df['group'] = df.index.map(d)
print(df)
val group
0 10 2
1 23 0
2 45 2
3 31 1
4 78 0
5 43 0
6 1 1
7 67 1
8 82 0
What if the values are NumPy arrays?
Here d, the inverted dictionary, maps each index of val to its group:
print(d)
{1: 0, 4: 0, 5: 0, 8: 0, 3: 1, 6: 1, 7: 1, 0: 2, 2: 2}
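If val and the index lists are NumPy arrays rather than plain lists, one option is to skip the dictionary inversion and assign the group labels directly with integer-array indexing. This is only a sketch under that assumption; the `-1` sentinel for positions that belong to no group is my own choice, not something from the question:

```python
import numpy as np

val = np.array([10, 23, 45, 31, 78, 43, 1, 67, 82])
samp = {
    0: np.array([1, 4, 5, 8]),
    1: np.array([3, 6, 7]),
    2: np.array([0, 2]),
}

# Start with -1 everywhere (assumed sentinel for "not in any group").
group = np.full(val.shape[0], -1, dtype=int)

# For each group, write its key into the positions it lists.
for key, idx in samp.items():
    group[idx] = key

# Stack values and labels side by side: one row per element of val.
result = np.column_stack((val, group))
print(result)
```

The loop runs once per group rather than once per value, so it stays cheap even when val is long.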
Answer 2 (score: 0)
Here is a solution that does not use pandas; it outputs a (9, 2) NumPy array:
import numpy as np
val = [10,23,45,31,78,43,1,67,82]
indx = [1,4,5,8]
indx2 = [3,6,7]
indx3 = [0,2]
indices = [indx, indx2, indx3]
def get_group(x):
    # return the position of the first index list that contains x
    for i,indx_arr in enumerate(indices):
        if x in indx_arr:
            return i
pairs = [(v,get_group(i)) for i,v in enumerate(val)]
np.asarray(pairs)
array([[10,  2],
       [23,  0],
       [45,  2],
       [31,  1],
       [78,  0],
       [43,  0],
       [ 1,  1],
       [67,  1],
       [82,  0]])
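Note that get_group scans the index lists again for every element of val. If that becomes slow, a possible variation is to build an index-to-group lookup once and reuse it, in the spirit of the inverted dictionary from the pandas answer. A rough sketch in plain Python (the name `lookup` is just for illustration):

```python
val = [10, 23, 45, 31, 78, 43, 1, 67, 82]
indices = [[1, 4, 5, 8], [3, 6, 7], [0, 2]]

# Map every listed index to the position of the list it came from.
lookup = {i: group for group, idxs in enumerate(indices) for i in idxs}

# One dictionary lookup per value instead of a scan over all lists.
# Note: lookup[i] raises KeyError if an index is in none of the lists.
pairs = [(v, lookup[i]) for i, v in enumerate(val)]
print(pairs)
```

Passing `pairs` to np.asarray as above should then produce the same array.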