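Changing elements of a list from a lookup table

Time: 2019-02-26 10:27:08

Tags: python

I have a list like y_train = [1, 1, 1, 1, 3, 3, 3, 4, 4, 5, 6, 6, 6]. I want to change the values of certain elements: for example, every 1 becomes 0, every 3 becomes 1, every 4 becomes 2, and so on. It is also important that values that have already been changed are not overwritten. At the moment I use for together with enumerate:
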
for n, i in enumerate(y_train):
    if i == 1:
        y_train[n] = 0
    elif i == 3:
        y_train[n] = 1
    elif i == 4:
        y_train[n] = 2
    elif i == 5:
        y_train[n] = 3
    elif i == 6:
        y_train[n] = 4
    else:
        y_train[n] = 5

But I need a tidier, more Pythonic way to do this, with syntax along the lines of for each element in y_train lookup [1 3 4] change with [0 1 2].

4 answers:

Answer 0: (score: 3)

I think you are looking for a dict. It is a perfect fit for representing a lookup table.

In [1]: lookup_table = {1: 0, 3: 1, 4: 2}

In [2]: y_train = [1, 1, 1, 1, 3, 3, 3, 4, 4, 5, 6, 6, 6]

In [3]: new_y_train = [lookup_table.get(x, x) for x in y_train]

In [4]: new_y_train
Out[4]: [0, 0, 0, 0, 1, 1, 1, 2, 2, 5, 6, 6, 6]

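Here I use the get method to supply the original value as a fallback when the lookup table has no entry for it, but you may not need that if you are confident the lookup table is exhaustive.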
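
To reproduce the else branch of the loop in the question (anything not listed becomes 5), the fallback passed to get can be a constant instead of x. This is a minimal sketch, assuming the full mapping from the question; the name default_label and the extra value 9 at the end of the list are made up for illustration:

```python
lookup_table = {1: 0, 3: 1, 4: 2, 5: 3, 6: 4}
default_label = 5  # hypothetical name for the question's else-branch value

# The trailing 9 is an illustrative value that is not in the lookup table.
y_train = [1, 1, 1, 1, 3, 3, 3, 4, 4, 5, 6, 6, 6, 9]

# Each element is looked up exactly once against its original value,
# so an already-translated value is never translated a second time.
# Slice assignment updates the existing list object in place.
y_train[:] = [lookup_table.get(x, default_label) for x in y_train]

print(y_train)  # [0, 0, 0, 0, 1, 1, 1, 2, 2, 3, 4, 4, 4, 5]
```

If a missing key should be treated as an error rather than silently mapped, indexing with lookup_table[x] raises KeyError instead.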

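Answer 1: (score: 1)

You can build the lookup dictionary from the values you have. If, as I suspect, you want to convert all of the values, you only need a dict comprehension over the unique items of y_train (obtained with set) to get a mapping for every value present.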

y_train =[ 1, 1, 1, 1, 3, 3, 3, 4, 4, 5, 6, 6, 6]
lookup = {val:i for i, val in enumerate(sorted(set(y_train)))}
#Output: {1: 0, 3: 1, 4: 2, 5: 3, 6: 4}

y_train = [lookup[y] for y in y_train]
#Output: [0, 0, 0, 0, 1, 1, 1, 2, 2, 3, 4, 4, 4]
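
If the original values need to be recovered later, the same dictionary can be inverted. This is only a sketch: the helper name build_label_maps is hypothetical and not part of the answer.

```python
def build_label_maps(values):
    """Return (forward, inverse) dicts mapping sorted unique values to 0..n-1 and back."""
    forward = {val: i for i, val in enumerate(sorted(set(values)))}
    inverse = {i: val for val, i in forward.items()}
    return forward, inverse


y_train = [1, 1, 1, 1, 3, 3, 3, 4, 4, 5, 6, 6, 6]
forward, inverse = build_label_maps(y_train)

encoded = [forward[y] for y in y_train]   # new labels
decoded = [inverse[e] for e in encoded]   # back to the original values

print(encoded)             # [0, 0, 0, 0, 1, 1, 1, 2, 2, 3, 4, 4, 4]
print(decoded == y_train)  # True
```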

Answer 2: (score: 0)

I think this satisfies your requirements.

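y_train = [1, 1, 1, 1, 3, 3, 3, 4, 4, 5, 6, 6, 6]
# Sort the unique values: set iteration order is not guaranteed,
# so list(set(...)) alone could assign indices in arbitrary order.
l = sorted(set(y_train))
y = list()
for i in y_train:
    if i in l:
        # The position in the sorted unique list is the new label.
        y.append(l.index(i))
    else:
        y.append(5)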

Answer 3: (score: 0)

You should use numpy ...

import numpy as np

if __name__ == '__main__':
    data = np.array([
        [1, 'a'],
        [1, 'b'],
        [1, 'c'],
        [2, 'a'],
        [2, 'b'],
        [2, 'c'],
        [3, 'a'],
        [3, 'b'],
        [3, 'c']
    ])
    print(data)
    # [['1' 'a']
    #  ['1' 'b']
    #  ['1' 'c']
    #  ['2' 'a']
    #  ['2' 'b']
    #  ['2' 'c']
    #  ['3' 'a']
    #  ['3' 'b']
    #  ['3' 'c']]

    col_to_change = data[:, 0].astype('int64')
    conditions = [
        (col_to_change == 1),
        (col_to_change == 2),
        (col_to_change == 3)
    ]
    to_ = [10, 20, 30]

    final_col = np.select(conditions, to_, default='')
    print(final_col)
    # ['10' '10' '10' '20' '20' '20' '30' '30' '30']
    data[:, 0] = final_col
    print(data)
    # [['10' 'a']
    #  ['10' 'b']
    #  ['10' 'c']
    #  ['20' 'a']
    #  ['20' 'b']
    #  ['20' 'c']
    #  ['30' 'a']
    #  ['30' 'b']
    #  ['30' 'c']]
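
For the list in the question itself, one possible numpy route (a sketch that is not part of the answer above, and that assumes the goal is to relabel the sorted unique values as 0, 1, 2, ...) is np.unique with return_inverse=True:

```python
import numpy as np

y_train = np.array([1, 1, 1, 1, 3, 3, 3, 4, 4, 5, 6, 6, 6])

# uniques holds the sorted distinct values; inverse holds, for each element
# of y_train, the index of its value within uniques.
uniques, inverse = np.unique(y_train, return_inverse=True)

new_y_train = inverse.tolist()
print(uniques.tolist())  # [1, 3, 4, 5, 6]
print(new_y_train)       # [0, 0, 0, 0, 1, 1, 1, 2, 2, 3, 4, 4, 4]
```

Indexing uniques with the new labels (uniques[inverse]) gives back the original array.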