Question

I want to set a value in certain columns of the first n rows of a pandas DataFrame.

>>> example = pd.DataFrame({'number':range(10),'name':list('aaabbbcccc')},index=range(20,0,-2)) # nontrivial index
>>> example
   name  number
20    a       0
18    a       1
16    a       2
14    b       3
12    b       4
10    b       5
8     c       6
6     c       7
4     c       8
2     c       9

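I want to set "number" to 19 for the first, say, 5 rows. What I really want is to set the lowest values of "number" to that value, so I sort first. If my index were the trivial one (0, 1, 2, ...), I could do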

example.loc[:5-1,'number'] = 19 # -1 for inclusive indexing
# or (note: .ix was deprecated in pandas 0.20 and removed in 1.0)
example.ix[:5-1,'number'] = 19

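But since it is not, the slice is interpreted by label and produces the following artifact (every row up to and including index label 4 is selected):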

>>> example
   name  number
20    a      19
18    a      19
16    a      19
14    b      19
12    b      19
10    b      19
8     c      19
6     c      19
4     c      19
2     c       9

Using .iloc[] would be nice, except that it does not accept column names.

example.iloc[:5]['number'] = 19

works, but raises a SettingWithCopyWarning.

My current solution is:

>>> example.sort_values('number',inplace=True)
>>> example.reset_index(drop=True,inplace=True)
>>> example.loc[:5-1,'number'] = 19  # .loc in place of .ix, which was removed in pandas 1.0
>>> example
  name  number
0    a      19
1    a      19
2    a      19
3    b      19
4    b      19
5    b       5
6    c       6
7    c       7
8    c       8
9    c       9

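Since I have to repeat this for several columns, I have to do it several times and reset the index each time, which also costs me my original index (but never mind).

Does anyone have a better solution?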

Answer 1

I would use .iloc, because .loc can give unexpected results if some index labels are repeated.

example.iloc[:5, example.columns.get_loc('number')] = 19
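To make the one-liner above easier to try, here is a minimal self-contained sketch that rebuilds the DataFrame from the question and applies the same `.iloc` plus `get_loc` idea. The second half uses `Index.get_indexer` to cover several columns at once; the `score` column is hypothetical, added only to have a second numeric column to write to.

```python
import pandas as pd

example = pd.DataFrame(
    {"number": range(10), "name": list("aaabbbcccc")},
    index=range(20, 0, -2),  # same non-trivial index as in the question
)

# get_loc turns the column name into its integer position,
# so both axes passed to .iloc are purely positional.
number_pos = example.columns.get_loc("number")
example.iloc[:5, number_pos] = 19

print(example["number"].tolist())  # [19, 19, 19, 19, 19, 5, 6, 7, 8, 9]
print(example.index.tolist()[:5])  # [20, 18, 16, 14, 12] (index is untouched)

# Several columns at once: get_indexer returns all positions in one call.
# "score" is a hypothetical extra column, used only for illustration.
example["score"] = 0
col_positions = example.columns.get_indexer(["number", "score"])
example.iloc[:3, col_positions] = -1

print(example["score"].tolist())  # [-1, -1, -1, 0, 0, 0, 0, 0, 0, 0]
```

Because nothing is sorted in place or reset here, the original index survives, which avoids the cost mentioned at the end of the question.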

Answer 2

example.loc[example.index[:5], 'number'] = 19
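The sketch below runs this label-based variant on the DataFrame from the question, then illustrates the caveat raised in Answer 1 about repeated index labels. The small `dup` frame is an assumption made up for the illustration, not data from the question.

```python
import pandas as pd

example = pd.DataFrame(
    {"number": range(10), "name": list("aaabbbcccc")},
    index=range(20, 0, -2),
)

# With unique labels, the first five labels identify exactly the first five rows.
example.loc[example.index[:5], "number"] = 19
print(example["number"].tolist())  # [19, 19, 19, 19, 19, 5, 6, 7, 8, 9]

# Hypothetical frame with a repeated label (20 appears twice).
dup = pd.DataFrame({"number": [0, 1, 2, 3]}, index=[20, 20, 18, 16])

by_label = dup.copy()
# index[:1] is the label 20, and .loc matches every row carrying that label.
by_label.loc[by_label.index[:1], "number"] = 19
print(by_label["number"].tolist())  # [19, 19, 2, 3]

by_position = dup.copy()
# .iloc touches only the first row, regardless of labels.
by_position.iloc[:1, by_position.columns.get_loc("number")] = 19
print(by_position["number"].tolist())  # [19, 1, 2, 3]
```

So the `.loc` form is convenient when the index is known to be unique, while the `.iloc` form is the safer choice when it might not be.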
