KeyError in pandas when trying to change boolean column values to strings with the .loc method

Asked: 2018-09-06 04:55:37

Tags: python-2.7 pandas jupyter-notebook data-analysis

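I am running a conditional loop to fill a column in a DataFrame (tdf) based on the values of the 'alone' column. If the value is 0, the string 'alone' should go into the 'alone' column; otherwise it should be 'with family'. I am using this code: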

tdf['alone'].loc[['alone'] >0]= 'with family'
tdf['alone'].loc[['alone'] ==0] = 'alone'

After running the lines above, I get the following error:

KeyError: 'cannot use a single bool to index into setitem'

I looked at this same question and gathered that I need to supply a row_indexer, as in tdf['alone'].loc[[row_indexer,['alone']] = 'alone', but I am not sure how to obtain the values for row_indexer.

3 answers:

Answer 0 (score: 3)

pandas.Series.clip

Clip the values to just 0 and 1, then use them to index into an array of labels:

tdf.assign(alone=np.array(['alone', 'with family'])[tdf.alone.clip(0, 1)])

         alone  col
0  with family    1
1  with family    1
2  with family    9
3        alone    4
4  with family    2
5        alone    3
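
The clip approach can be unpacked step by step. This is a minimal sketch that assumes the sample frame used in this answer; the intermediate names labels and idx are introduced here only for illustration. Note that clip(0, 1) also sends any negative value to 0, so such rows would be labelled 'alone'.

```python
import numpy as np
import pandas as pd

# Sample frame used in this answer.
tdf = pd.DataFrame({'alone': [4, 4, 5, 0, 5, 0],
                    'col': [1, 1, 9, 4, 2, 3]})

# Position 0 holds the label for "no family", position 1 the label for "family".
labels = np.array(['alone', 'with family'])

# Clip every count into {0, 1} so it can serve as a position in `labels`.
idx = tdf['alone'].clip(0, 1).to_numpy()

# assign returns a new frame; the original tdf is left untouched.
result = tdf.assign(alone=labels[idx])
print(result)
```

Because assign returns a new DataFrame, bind the result to a name (or back to tdf) if the change should persist.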

pandas.Series.map

tdf.assign(alone=tdf.alone.map(lambda x: 'with family' if x else 'alone'))

         alone  col
0  with family    1
1  with family    1
2  with family    9
3        alone    4
4  with family    2
5        alone    3

map, version 2

tdf.assign(alone=tdf.alone.map(lambda x: {0: 'alone'}.get(x, 'with family')))

         alone  col
0  with family    1
1  with family    1
2  with family    9
3        alone    4
4  with family    2
5        alone    3
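
The two map variants should agree on this data: the first relies on the truthiness of each value (0 is falsy), the second on a dictionary lookup with a default. A small sketch to check this, assuming the same sample frame; the names by_truthiness and by_lookup are made up for the illustration.

```python
import pandas as pd

tdf = pd.DataFrame({'alone': [4, 4, 5, 0, 5, 0],
                    'col': [1, 1, 9, 4, 2, 3]})

# Variant 1: any non-zero value is truthy, so it maps to 'with family'.
by_truthiness = tdf['alone'].map(lambda x: 'with family' if x else 'alone')

# Variant 2: only the key 0 is present; every other value falls back to the default.
by_lookup = tdf['alone'].map(lambda x: {0: 'alone'}.get(x, 'with family'))

print(by_truthiness.equals(by_lookup))
```

Both variants treat negative values like positive ones, since only an exact 0 maps to 'alone'.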

Setup

Borrowed from @jezrael:

tdf = pd.DataFrame({'alone':[4,4,5,0,5,0],
                   'col':[1,1,9,4,2,3]})

Answer 1 (score: 2)

You need boolean indexing with loc and a boolean mask: compare the DataFrame column with the value 0, not the one-item list ['alone'].

tdf.loc[tdf['alone'] > 0, 'alone'] = 'with family'
tdf.loc[tdf['alone'] ==0, 'alone'] = 'alone'

If negative values are not possible, use numpy.where:

tdf['alone'] = np.where(tdf['alone'] == 0,  'alone', 'with family')

Sample:

tdf = pd.DataFrame({'alone':[4,4,5,0,5,0],
                   'col':[1,1,9,4,2,3]})

print (tdf)
   alone  col
0      4    1
1      4    1
2      5    9
3      0    4
4      5    2
5      0    3

tdf['alone'] = np.where(tdf['alone'] == 0,  'alone', 'with family')
print (tdf)

         alone  col
0  with family    1
1  with family    1
2  with family    9
3        alone    4
4  with family    2
5        alone    3
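
The caveat about negative values matters because the two forms treat them differently. A minimal sketch on hypothetical data (the frame demo and its -2 entry are not from the question): numpy.where labels everything that is not 0 as 'with family', while the two loc assignments leave a negative value untouched. The cast to object is a cautious assumption, since recent pandas versions warn about (and may reject) writing strings into an integer column.

```python
import numpy as np
import pandas as pd

# Hypothetical data containing a negative value.
demo = pd.DataFrame({'alone': [-2, 0, 3]})

# numpy.where: anything other than 0 becomes 'with family'.
via_where = np.where(demo['alone'] == 0, 'alone', 'with family')

# Two masks, computed before any string is written into the column.
positive = demo['alone'] > 0
zero = demo['alone'] == 0

via_loc = demo.copy()
via_loc['alone'] = via_loc['alone'].astype(object)  # allow mixed int/str values
via_loc.loc[positive, 'alone'] = 'with family'
via_loc.loc[zero, 'alone'] = 'alone'

print(via_where.tolist())
print(via_loc['alone'].tolist())
```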

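The approach in the question is also problematic even with a correct boolean mask, because it uses chained assignment: tdf['alone'] can return a copy, in which case .loc updates that copy and you will not see the change in tdf: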

#added boolean mask tdf['alone'] > 0
tdf['alone'].loc[tdf['alone'] > 0 ]= 'with family'
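
To avoid the chained form, the row mask and the column label can be passed to a single .loc call on the DataFrame itself. The sketch below assumes the sample frame from this answer. Two details are choices made for this illustration rather than part of the original answer: both masks are computed before assigning, so the newly written strings are never compared with 0, and the column is cast to object first because recent pandas versions warn about (and may reject) writing strings into an integer column. Whether the chained form propagates to tdf depends on the pandas version and its copy-on-write setting.

```python
import pandas as pd

tdf = pd.DataFrame({'alone': [4, 4, 5, 0, 5, 0],
                    'col': [1, 1, 9, 4, 2, 3]})

# Build the masks while the column is still purely numeric.
positive = tdf['alone'] > 0
zero = tdf['alone'] == 0

# Assumption for this sketch: make room for strings explicitly.
tdf['alone'] = tdf['alone'].astype(object)

# One .loc call per update: rows selected by the mask, column by its label.
tdf.loc[positive, 'alone'] = 'with family'
tdf.loc[zero, 'alone'] = 'alone'

print(tdf)
```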

Answer 2 (score: 0)

[['alone'] > 0] compares the Python list ['alone'] with the integer 0. Use the following instead:

tdf.loc[tdf['alone'] > 0, 'alone'] = 'with family'
tdf.loc[tdf['alone'] == 0, 'alone'] = 'alone'
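
The difference between the two comparisons can be checked directly. A small sketch, assuming the sample frame from the other answers; the names bad and mask are illustrative. On Python 3 the list-versus-integer comparison raises TypeError. On Python 2 (the version tagged in the question) it evaluates to a single bool, which is consistent with the "single bool" wording of the error message.

```python
import pandas as pd

tdf = pd.DataFrame({'alone': [4, 4, 5, 0, 5, 0],
                    'col': [1, 1, 9, 4, 2, 3]})

# A plain list compared with an int never yields a per-row mask.
try:
    bad = ['alone'] > 0   # Python 2: a single bool
except TypeError:
    bad = None            # Python 3: ordering list vs int is not supported

# Comparing the column itself yields one boolean per row.
mask = tdf['alone'] > 0

print(bad)
print(mask.tolist())
```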