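I want to normalize all the values in the dictionary data and store them in another dictionary with the same keys, where the values for each key are stored in a 1D array. This is what I did: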
>>> data = {1: [0.6065306597126334], 2: [0.6065306597126334, 0.6065306597126334, 0.1353352832366127], 3: [0.6065306597126334, 0.6065306597126334, 0.1353352832366127], 4: [0.6065306597126334, 0.6065306597126334]}
>>> norm = {k: [v / sum(vals) for v in vals] for k, vals in data.items()}
>>> norm
{1: [1], 2: [0.4498162176582741, 0.4498162176582741, 0.10036756468345168], 3: [0.4498162176582741, 0.4498162176582741, 0.10036756468345168], 4: [0.5, 0.5]}
Now suppose one of the keys of the dictionary data contains only zero values, for example the first key, 1:
>>> data = {1: [0.0], 2: [0.6065306597126334, 0.6065306597126334, 0.1353352832366127], 3: [0.6065306597126334, 0.6065306597126334, 0.1353352832366127], 4: [0.6065306597126334, 0.6065306597126334]}
Normalizing this dictionary then gives [nan] for that key, because of the division by zero:
>>> norm = {k: [v / sum(vals) for v in vals] for k, vals in data.items()}
__main__:1: RuntimeWarning: invalid value encountered in double_scalars
>>> norm
{1: [nan], 2: [0.4498162176582741, 0.4498162176582741, 0.10036756468345168], 3: [0.4498162176582741, 0.4498162176582741, 0.10036756468345168], 4: [0.5, 0.5]}
So I added an if statement to work around this, but now I cannot store the values of each key as a 1D array.
Code:
>>> norm = {}
>>> for k, vals in data.items():
...     values = []
...     if sum(vals) == 0:
...         values.append(list(vals))
...     else:
...         for v in vals:
...             values.append(list([v/sum(vals)]))
...     norm[k]=values
...
>>> norm
{1: [[1.0]], 2: [[0.4498162176582741], [0.4498162176582741], [0.10036756468345168]], 3: [[0.4498162176582741], [0.4498162176582741], [0.10036756468345168]], 4: [[0.5], [0.5]]}
I would like the norm dictionary to look like this:
norm = {1: [1.0], 2: [0.4498162176582741, 0.4498162176582741, 0.10036756468345168], 3: [0.4498162176582741, 0.4498162176582741, 0.10036756468345168], 4: [0.5, 0.5]}
Also, when one of the keys of the dictionary data contains only zero values, is there a better way to normalize it? I don't think my solution is efficient.
P.S.: I tried norm[k] = np.array(values) instead of norm[k] = values at the end of the for loop, but the result was not what I wanted.
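Answer 0 (score: 1)
The append method adds a single element to a list, and that element can itself be a list, which is why you currently end up with lists inside a list. Ideally you should use extend, which concatenates the first list with another list.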
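To make the difference concrete, here is a minimal sketch using plain Python lists. The names nested and flat are only illustrative and do not come from the question.

```python
# Minimal illustration of append vs. extend on plain Python lists.
nested = []
nested.append([0.5, 0.5])  # the whole list is added as ONE element

flat = []
flat.extend([0.5, 0.5])    # each item of the list is added individually

print(nested)  # [[0.5, 0.5]]
print(flat)    # [0.5, 0.5]
```

In the loop from the question, appending list([v/sum(vals)]) is what produces the extra level of nesting.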
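Answer 1 (score: 1)
As the other answer says, extend can be used to solve your problem. If you really want to use append, you can append the first element of the list instead.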
norm = {}
for k, vals in data.items():
    values = []
    if sum(vals) == 0:
        values.append(vals[0])
    else:
        for v in vals:
            values.append([v / sum(vals)][0])
    norm[k] = values
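For examples of append versus extend, see "difference between append vs extend list methods in python".
As for optimization: the for loop cannot be removed entirely, but you can simplify the solution while keeping it readable: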
norm = {}
for k, vals in data.items():
    if sum(vals) == 0:
        norm[k] = vals
    else:
        norm[k] = [x / sum(vals) for x in vals]
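A possible variation on the loop above, offered only as a sketch: it computes the sum once per key and copies zero-sum lists so the result does not share list objects with data. The function name normalize and the choice to leave zero-sum lists unchanged are assumptions, not something the question prescribes.

```python
def normalize(data):
    """Return a new dict in which each list is scaled to sum to 1.

    Lists whose sum is 0 are copied unchanged (an assumed fallback;
    substitute whatever value suits your data).
    """
    norm = {}
    for key, vals in data.items():
        total = sum(vals)  # computed once per key
        if total == 0:
            norm[key] = list(vals)  # copy, so norm does not alias data
        else:
            norm[key] = [v / total for v in vals]
    return norm


data = {1: [0.0], 4: [0.25, 0.25]}
norm = normalize(data)
print(norm)  # {1: [0.0], 4: [0.5, 0.5]}
```

If the zero-sum case should yield 1.0 as in the desired output from the question, only the branch for total == 0 needs to change.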
Answer 2 (score: 0)
Your dict/list comprehension fails when sum(vals) == 0:
>>> data = {1: [0.0], 2: [0.6065306597126334, 0.6065306597126334, 0.1353352832366127], 3: [0.6065306597126334, 0.6065306597126334, 0.1353352832366127], 4: [0.6065306597126334, 0.6065306597126334]}
>>> {k: [v / sum(vals) for v in vals] for k, vals in data.items()}
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 1, in <dictcomp>
  File "<stdin>", line 1, in <listcomp>
ZeroDivisionError: float division by zero
You can introduce a ternary expression to handle this case:
>>> {k: [v / sum(vals) if sum(vals)!=0 else 1.0 for v in vals] for k, vals in data.items()}
{1: [1.0], 2: [0.4498162176582741, 0.4498162176582741, 0.10036756468345168], 3: [0.4498162176582741, 0.4498162176582741, 0.10036756468345168], 4: [0.5, 0.5]}
If you want to avoid evaluating sum(vals) multiple times:
>>> {k: [v / s if s!=0 else 1.0 for v in vals] for k,vals,s in ((k, vals, sum(vals)) for k, vals in data.items())}
{1: [1.0], 2: [0.4498162176582741, 0.4498162176582741, 0.10036756468345168], 3: [0.4498162176582741, 0.4498162176582741, 0.10036756468345168], 4: [0.5, 0.5]}
((k, vals, sum(vals)) for k, vals in data.items()) is a generator that yields k, vals and sum(vals) for each item.
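To see what that inner generator produces, it can be consumed on its own. This is a small sketch with a shortened data dictionary; the name triples is illustrative only.

```python
# Shortened sample data, just to inspect the generator's output.
data = {1: [0.0], 4: [0.5, 0.5]}

# Lazily yields one (key, values, sum) tuple per dictionary item.
triples = ((k, vals, sum(vals)) for k, vals in data.items())

first = next(triples)   # pull the first tuple
rest = list(triples)    # consume whatever remains

print(first)  # (1, [0.0], 0.0)
print(rest)   # [(4, [0.5, 0.5], 1.0)]
```

Because each sum is computed when its tuple is produced, the outer comprehension can reuse it as s for every element of vals.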