How do I replace elements in a pandas DataFrame column using an ordered list?

Date: 2018-09-06 23:40:12

Tags: python pandas list dataframe append

Let's say I have this pandas DataFrame:

index  a        b
1    'pika'   'dog'
2    'halo'   'cat'
3    'polo'   'dog'
4    'boat'   'man'
5    'moan'   'tan'
6    'nope'   'dog'

And I have a list like this:

colors = ['black' , 'green', 'yellow']

How can I replace every 'dog' in column b with the elements of the colors list, in the same order?

Basically, I want it to look like this:

index  a        b
1    'pika'  'black'
2    'halo'   'cat'
3    'polo'  'green'
4    'boat'   'man'
5    'moan'   'tan'
6    'nope'  'yellow'

4 answers:

Answer 0 (score: 3)

Use pd.DataFrame.loc with Boolean indexing:

df.loc[df['b'].eq('dog'), 'b'] = colors

print(df)

   index     a       b
0      1  pika   black
1      2  halo     cat
2      3  polo   green
3      4  boat     man
4      5  moan     tan
5      6  nope  yellow
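For reference, here is a self-contained sketch of the same assignment. The explicit length check is an addition for illustration, not part of the answer above; it assumes you want a clear error when the number of 'dog' rows and the length of colors differ, since the list is matched to the selected rows one-to-one.

```python
import pandas as pd

df = pd.DataFrame({'a': ['pika', 'halo', 'polo', 'boat', 'moan', 'nope'],
                   'b': ['dog', 'cat', 'dog', 'man', 'tan', 'dog']})
colors = ['black', 'green', 'yellow']

# Boolean mask that is True on the rows where column b equals 'dog'
mask = df['b'].eq('dog')

# Illustrative guard (an assumption, not from the answer): the list needs
# exactly one entry per selected row.
if int(mask.sum()) != len(colors):
    raise ValueError("need exactly one color per 'dog' row")

# Values are assigned to the selected rows in order, top to bottom
df.loc[mask, 'b'] = colors
print(df)
```

If the counts can differ, the later answers show ways to repeat the list.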

Answer 1 (score: 0)

Another way, using numpy put:

import pandas as pd
import numpy as np

df = pd.DataFrame({'a': ['pika', 'halo', 'polo', 'boat', 'moan', 'nope'],
                   'b': ['dog', 'cat', 'dog', 'man', 'tan', 'dog']})
colors = ['black' , 'green', 'yellow']

df

    a       b
0   pika    dog
1   halo    cat
2   polo    dog
3   boat    man
4   moan    tan
5   nope    dog


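# np.put needs a NumPy array (recent pandas has no Series.put), so edit a copy
# of the column's values and assign it back. mode='wrap' only affects
# out-of-range indices; np.put repeats colors by itself if it is too short.
values = df['b'].to_numpy(copy=True)
np.put(values, np.flatnonzero(values == 'dog'), colors)
df['b'] = values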

df

    a       b
0   pika    black
1   halo    cat
2   polo    green
3   boat    man
4   moan    tan
5   nope    yellow
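As a small illustration of how np.put treats a replacement list that is shorter than the number of targets, the sketch below works on a plain NumPy array rather than a DataFrame. The array contents are made up for this example. Per the NumPy documentation, the values are repeated as necessary, while the mode argument only controls what happens to out-of-range indices.

```python
import numpy as np

# Hypothetical data: four 'dog' entries but only two replacement values
values = np.array(['dog', 'cat', 'dog', 'dog', 'dog'], dtype=object)
colors = ['black', 'green']

# Positions of the entries to overwrite
targets = np.flatnonzero(values == 'dog')

# np.put modifies the array in place and cycles through colors as needed
np.put(values, targets, colors)
print(values.tolist())
```

The replacements land in positional order, so the third 'dog' starts over at the first color.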

Answer 2 (score: 0)

Using itertools.cycle, apply, and a lambda:

In [100]: import itertools as it

In [101]: colors_gen = it.cycle(colors)

In [102]: df1['c'] = df1['b'].apply(lambda x: next(colors_gen) if x == 'dog' else x)

In [103]: df1
Out[103]:
      a    b       c
0  pika  dog   black
1  halo  cat     cat
2  polo  dog   green
3  boat  man     man
4  moan  tan     tan
5  nope  dog  yellow

This also works for larger DataFrames, because the cycle starts over once the list is exhausted:

In [104]: df2 = pd.DataFrame({'a': ['pika', 'halo', 'polo', 'boat','moan','nope','etc','etc'], 'b':['dog','cat','dog','man','tan','dog','dog','dog']})

In [106]: df2['c'] = df2['b'].apply(lambda x: next(colors_gen) if x == 'dog' else x)

In [107]: df2
Out[107]:
      a    b       c
0  pika  dog   black
1  halo  cat     cat
2  polo  dog   green
3  boat  man     man
4  moan  tan     tan
5  nope  dog  yellow
6   etc  dog   black
7   etc  dog   green
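One caveat with the lambda approach is that colors_gen keeps its position between calls, so the result depends on how many values were already consumed. A possible variant, sketched here as an alternative rather than as this answer's method, builds exactly as many colors as needed with itertools.islice and assigns them through a Boolean mask. The names mask and n are introduced only for this example.

```python
import itertools as it
import pandas as pd

df2 = pd.DataFrame({'a': ['pika', 'halo', 'polo', 'boat', 'moan', 'nope', 'etc', 'etc'],
                    'b': ['dog', 'cat', 'dog', 'man', 'tan', 'dog', 'dog', 'dog']})
colors = ['black', 'green', 'yellow']

mask = df2['b'] == 'dog'
n = int(mask.sum())  # number of rows to fill

# Start from a copy of b, then overwrite only the 'dog' rows
df2['c'] = df2['b']
if n:
    # A fresh cycle each time, truncated to exactly n items
    df2.loc[mask, 'c'] = list(it.islice(it.cycle(colors), n))
print(df2)
```

Because a new cycle is created on every run, repeating the cell always starts again from the first color.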

Answer 3 (score: 0)

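You can count the matching rows, then repeat the list until it is long enough and trim it to that count:

n = int((df.b == 'dog').sum())  # use "'dog'" instead if the stored strings include the quotes

# Repeat colors enough times to cover all n matches, then trim to n items
df.loc[df.b == 'dog', 'b'] = (colors * (n // len(colors) + 1))[:n]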