Assigning values to a new column ['E'] based on the values of column ['A'] in a DataFrame

Time: 2017-03-21 01:11:48

Tags: python pandas dataframe

In the example below, I am trying to generate a column 'E' that is assigned 1 or 2 based on a conditional statement on column A.

I have tried various options, but I keep getting a slicing error. (Should the value be assigned to the new column 'E' like this?)

df2 = df.loc[df['A'] == 'foo']['E'] = 1

import pandas as pd
import numpy as np
df = pd.DataFrame({'A': 'foo bar foo bar foo bar foo foo'.split(),
                   'B': 'one one two three two two one three'.split(),
                   'C': np.arange(8), 'D': np.arange(8) * 2})
print(df)
#      A      B  C   D
# 0  foo    one  0   0
# 1  bar    one  1   2
# 2  foo    two  2   4
# 3  bar  three  3   6
# 4  foo    two  4   8
# 5  bar    two  5  10
# 6  foo    one  6  12
# 7  foo  three  7  14

print('Filter the content')
df2 = df.loc[df['A'] == 'foo']
print(df2)

# Desired output:
#      A      B  C   D   E
# 0  foo    one  0   0   1
# 2  foo    two  2   4   1
# 4  foo    two  4   8   1
# 6  foo    one  6  12   1
# 7  foo  three  7  14   1

df3 = df.loc[df['A'] == 'bar']
print(df3)

# Desired output:
#      A      B  C   D   E
# 1  bar    one  1   2   2
# 3  bar  three  3   6   2
# 5  bar    two  5  10   2

# Combine df2 and df3 back into df and print df (desired output)
print(df)
#      A      B  C   D   E
# 0  foo    one  0   0   1
# 1  bar    one  1   2   2 
# 2  foo    two  2   4   1
# 3  bar  three  3   6   2
# 4  foo    two  4   8   1
# 5  bar    two  5  10   2
# 6  foo    one  6  12   1
# 7  foo  three  7  14   1

3 Answers:

Answer 0 (score: 3)

How about this?

df['E'] = np.where(df['A'] == 'foo', 1, 2)
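As a minimal sketch, here is that one-liner applied to a cut-down version of the question's frame (keeping only column A is an assumption made for brevity). np.where(condition, x, y) picks x where the condition is True and y everywhere else, so every row that is not 'foo' gets 2.

```python
import numpy as np
import pandas as pd

# Cut-down frame: only column A from the question is kept.
df = pd.DataFrame({'A': 'foo bar foo bar foo bar foo foo'.split()})

# 1 where A == 'foo', 2 for every other value of A.
df['E'] = np.where(df['A'] == 'foo', 1, 2)
print(df)
```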

Answer 1 (score: 1)

This is what I think you are trying to do: create a column E in the DataFrame that is 1 if A == foo and 2 if A != foo.

df = pd.DataFrame({'A': 'foo bar foo bar foo bar foo foo'.split(),
                   'B': 'one one two three two two one three'.split(),
                   'C': np.arange(8), 'D': np.arange(8) * 2})
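# Start with E = 2 everywhere, then overwrite the 'foo' rows with 1.
df['E'] = np.ones(df.shape[0]) * 2
df.loc[df.A == 'foo', 'E'] = 1
# np.ones yields floats, so cast E back to integers.
df.E = df.E.astype(int)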
print(df)

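Note: your proposed solution, df2 = df.loc[df['A'] == 'foo']['E'] = 1, uses chained indexing instead of taking full advantage of loc. To select the rows of df that satisfy the condition together with column E in a single step, use df.loc[df['A'] == 'foo', 'E'].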
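To make the difference concrete, here is a minimal sketch on a hypothetical three-row frame (the data is made up for illustration). The chained form appears only as a comment: it assigns into a temporary object, so df is left unchanged, and in many pandas versions it triggers a SettingWithCopyWarning. A single .loc call with a row mask and a column label writes into df itself.

```python
import pandas as pd

# Hypothetical sample data, not the frame from the question.
df = pd.DataFrame({'A': ['foo', 'bar', 'foo'], 'C': [0, 1, 2]})

# Chained indexing (avoid): the assignment lands on a temporary copy.
# df.loc[df['A'] == 'foo']['E'] = 1

# Single .loc call: row mask first, column label second.
df.loc[df['A'] == 'foo', 'E'] = 1
df.loc[df['A'] != 'foo', 'E'] = 2

# E was created with missing values, so it is float; cast it to int.
df['E'] = df['E'].astype(int)
print(df)
```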

Note II: if you have multiple conditions, you can also use .replace() and pass in a dictionary. In this case it would map foo to 1, bar to 2, and so on.
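A dictionary lookup along those lines might look like the sketch below. Note that it uses Series.map rather than .replace(), which is a substitution on my part: both accept a dictionary, but map builds the new column directly from the lookup. The mapping dictionary is a hypothetical name, and any value of A missing from it would come out as NaN.

```python
import pandas as pd

df = pd.DataFrame({'A': 'foo bar foo bar foo bar foo foo'.split()})

# Hypothetical lookup table: one entry per distinct value of A.
mapping = {'foo': 1, 'bar': 2}

# Look up each value of A in the dictionary to build E.
df['E'] = df['A'].map(mapping)
print(df)
```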

Answer 2 (score: 0)

For brevity (in characters)

df.assign(E=df.A.ne('foo')+1)

     A      B  C   D  E
0  foo    one  0   0  1
1  bar    one  1   2  2
2  foo    two  2   4  1
3  bar  three  3   6  2
4  foo    two  4   8  1
5  bar    two  5  10  2
6  foo    one  6  12  1
7  foo  three  7  14  1

For brevity (in time)

df.assign(E=(df.A.values != 'foo') + 1)

     A      B  C   D  E
0  foo    one  0   0  1
1  bar    one  1   2  2
2  foo    two  2   4  1
3  bar  three  3   6  2
4  foo    two  4   8  1
5  bar    two  5  10  2
6  foo    one  6  12  1
7  foo  three  7  14  1