Aligning numpy arrays by array key

Time: 2015-07-28 15:22:13

Tags: python arrays numpy nlp tuples

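I would like to know how to align two numpy arrays by the "key" of each array value, for use in NLP. (I realize this would probably be easier with NLTK, but I am trying not to use it for this implementation.)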

For example, suppose I have two arrays defined as:

array1 = [['dolor' 5] ['sit' 3] ['amet' 1]]
array2 = [['scripsit' 10] ['sit' 1] ['amet' 1]]

I would like the output arrays to look like this:

array1 = [['scripsit' 0] ['dolor' 5] ['sit' 3] ['amet' 1]]
array2 = [['scripsit' 10] ['dolor' 0] ['sit' 1] ['amet' 1]]

Is this possible?

1 Answer:

Answer 0: (score: 2)

First, get the unique keys across both arrays, then build a dict from each array and use a list comprehension to create the desired output:

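>>> import numpy as np
>>> # collect the keys of both arrays (assumes equal lengths) and deduplicate them
>>> all_keys=np.unique(np.array((array1,array2)).T[0])
>>> # map each key to its count
>>> dict1=dict(array1)
>>> dict2=dict(array2)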

>>> array1=np.array([[i,dict1.get(i,0)] for i in all_keys])
>>> array1
array([['amet', '1'],
       ['dolor', '5'],
       ['scripsit', '0'],
       ['sit', '3']], 
      dtype='|S8')
>>> array2=np.array([[i,dict2.get(i,0)] for i in all_keys])
>>> array2
array([['amet', '1'],
       ['dolor', '0'],
       ['scripsit', '10'],
       ['sit', '1']], 
      dtype='|S8')

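Note: This approach produces new arrays whose rows share the same key order. Because np.unique returns its result sorted, that order is alphabetical rather than the order shown in the question, and the counts are stored as strings because each array holds a single dtype.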
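
If the counts need to stay numeric, one possible variation is to store each row in a structured array. The following is only a sketch, not part of the original answer: the helper name align_by_key, the field names "key" and "count", and the 16-character key width are all assumptions made for illustration, and it targets Python 3.

```python
import numpy as np

def align_by_key(pairs1, pairs2, fill=0):
    """Hypothetical helper: align two (key, count) sequences on a shared key set."""
    dict1 = dict(pairs1)
    dict2 = dict(pairs2)
    # Union of keys from both inputs, sorted so both outputs share one order.
    all_keys = sorted(set(dict1) | set(dict2))
    # Assumed layout: keys up to 16 characters, counts as 64-bit integers.
    dtype = [("key", "U16"), ("count", "i8")]
    aligned1 = np.array([(k, dict1.get(k, fill)) for k in all_keys], dtype=dtype)
    aligned2 = np.array([(k, dict2.get(k, fill)) for k in all_keys], dtype=dtype)
    return aligned1, aligned2

pairs1 = [("dolor", 5), ("sit", 3), ("amet", 1)]
pairs2 = [("scripsit", 10), ("sit", 1), ("amet", 1)]
aligned1, aligned2 = align_by_key(pairs1, pairs2)

print(aligned1["key"].tolist())    # ['amet', 'dolor', 'scripsit', 'sit']
print(aligned1["count"].tolist())  # [1, 5, 0, 3]
print(aligned2["count"].tolist())  # [1, 0, 10, 1]
```

Because the count field is an integer column, aligned1["count"] and aligned2["count"] can be compared or combined arithmetically without converting from strings first. Keys longer than the assumed 16 characters would be truncated, so the width should be adjusted to the data.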