Question

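I am unable to combine the values from 10 columns into a single set. I want to use a set because every column contains duplicate values, and I want a list of all the values (medical codes) with no value repeated. I was able to create the initial set from the first column, but when I try to add the other columns I get an "unhashable type" error.

Here is my code: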

data_sorted = data.fillna(0).sort_values(['PAT_ID', 'VISIT_NO'])
set_ICD1 = set(data_sorted['ICD_1'].unique())
print(len(set_ICD1))
set_ICD = set_ICD1.add(data_sorted['ICD_2'].unique())

print(len(set_ICD))

Here is the error I get:

11586 # (not part of the error this is the length of the initial set)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-7-e3966ec54661> in <module>()
  1 set_ICD1 = set(data_sorted['ICD_1'].unique())
  2 print(len(set_ICD1))
----> 3 set_ICD = set_ICD1.add(data_sorted['ICD_2'].unique())
  4 
  5 print(len(set_ICD))

TypeError: unhashable type: 'numpy.ndarray'

Any suggestions or tips on how to solve this would be greatly appreciated!

Answer 1

If you want to add multiple elements to a set at once, you need to use the update method instead of add:

set_ICD1.update(data_sorted['ICD_2'])
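
The difference matters because add treats its argument as a single element and therefore has to hash it, while update iterates over its argument and adds each item. Both methods also modify the set in place and return None, so assigning the result to a new name (as in set_ICD = set_ICD1.add(...)) would not give you a set. A minimal sketch, using plain lists with made-up codes to stand in for the columns:

```python
# Hypothetical sample values standing in for two ICD columns.
icd_1 = ["A01", "B02", "A01"]
icd_2 = ["B02", "C03"]

codes = set(icd_1)  # duplicates collapse: {"A01", "B02"}

# add() tries to store its argument as ONE element, so it must be hashable.
try:
    codes.add(icd_2)  # a list, like a NumPy array, is unhashable
    add_failed = False
except TypeError:
    add_failed = True

# update() iterates over its argument and adds each item individually.
result = codes.update(icd_2)

print(add_failed)     # True
print(result)         # None, because update() mutates the set in place
print(sorted(codes))  # ['A01', 'B02', 'C03']
```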

If it is a NumPy array, you should use ravel() (which flattens it if it is n-dimensional) and tolist() (for performance):

set_ICD1.update(data_sorted['ICD_2'].ravel().tolist())
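
For the ten-column case in the question, the same update call can be repeated in a loop. The following is only a sketch: the helper name collect_unique and the sample data are hypothetical, and only ICD_1 and ICD_2 are column names that actually appear in the question. The helper accepts any iterables, so it should work with pandas columns as well as with plain lists.

```python
def collect_unique(columns):
    """Return one set holding every distinct value found in the given columns."""
    unique_values = set()
    for column in columns:
        # update() adds each value of the column; duplicates are ignored.
        unique_values.update(column)
    return unique_values


# Made-up sample data; ICD_3 is an assumed column name for illustration.
sample = {
    "ICD_1": ["A01", "B02", "A01"],
    "ICD_2": ["B02", "C03", "C03"],
    "ICD_3": ["D04", "A01", "B02"],
}

all_codes = collect_unique(sample.values())
print(len(all_codes))  # 4
```

With the DataFrame from the question, the call would presumably look like collect_unique(data_sorted[name] for name in icd_columns), where icd_columns is a list of your ten column names. Because the question's code fills missing values with 0 via fillna(0), that placeholder would end up in the set too; all_codes.discard(0) removes it if it is not wanted.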
