How do I update a specific DataFrame slice of a column with new values?

Asked: 2017-04-24 02:16:35

Tags: python pandas numpy

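I have a DataFrame with a column 'pred' that is empty, and I want to update it with some specific values. They were originally in a numpy array, but I put them into a Series called "this":

print(type(predictions))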

print(predictions)
['collection2' 'collection2' 'collection2' 'collection1' 'collection2'
 'collection1']

this = pd.Series(predictions, index=test_indices)

print(type(data))
<class 'pandas.core.frame.DataFrame'>

print(data.shape)
(35, 4)

print(data.iloc[test_indices])
     class         pred                                          text  \
223  collection2   []  Fellow-Citizens of the Senate and House of Rep...   
20   collection1   []  The period for a new election of a citizen to ...   
12   collection1   []  Fellow Citizens of the Senate and of the House...   
13   collection1   []  Whereas combinations to defeat the execution o...   
212  collection2   []  MR. PRESIDENT AND FELLOW-CITIZENS OF NEW-YORK:...   
230  collection2   []  Fellow-Countrymen:\nAt this second appearing t...   

                                                 title  
223                               First Annual Message  
20                                    Farewell Address  
12                    Fifth Annual Message to Congress  
13   Proclamation against Opposition to Execution o...  
212                               Cooper Union Address  
230                           Second Inaugural Address 

print(type(this))
<class 'pandas.core.series.Series'>

print(this.shape)
(6,)

print(this)
0    collection2
1    collection1
2    collection1
3    collection1
4    collection2
5    collection2

I thought I could do this:

data.iloc[test_indices, [4]] = this

But the result is:

IndexError: positional indexers are out-of-bounds

data.ix[test_indices, ['pred']] = this
KeyError: '[0] not in index'
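
A likely reading of the two errors, for anyone who runs into the same thing: data has four columns, so the valid column positions for .iloc are 0 to 3 and position 4 is out of bounds, while .ix looked up 0 as a row label, and the row labels here are values such as 223 and 20. The sketch below uses a made-up three-row frame (not the asker's data) and assumes test_indices holds row positions. It looks up the position of 'pred' with data.columns.get_loc and assigns the NumPy array positionally.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the question's frame: 4 columns, non-sequential row labels.
data = pd.DataFrame(
    {
        "class": ["collection2", "collection1", "collection1"],
        "pred": [None, None, None],
        "text": ["a", "b", "c"],
        "title": ["t1", "t2", "t3"],
    },
    index=[223, 20, 12],
)
test_indices = [0, 2]  # assumed to be row positions, not labels
predictions = np.array(["collection2", "collection1"])

# Four columns occupy positions 0..3, so look the position up by name
# instead of hard-coding it.
pred_pos = data.columns.get_loc("pred")

# Purely positional assignment; the array is written in the given row order.
data.iloc[test_indices, pred_pos] = predictions
print(data["pred"])
```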

2 Answers:

Answer 0 (score: 1)

Try:

data.loc[data.index[test_indices], 'pred'] = this
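
One caveat worth checking with this approach: when a Series is assigned through .loc, pandas aligns it on its index. The Series printed in the question is indexed 0 to 5, which would not match row labels such as 223 and 20, and unmatched rows end up as NaN. The sketch below uses a made-up four-row frame and assumes test_indices holds row positions. It shows the misaligned case and two ways around it: assigning the raw array, or building the Series on the same row labels.

```python
import numpy as np
import pandas as pd

# Toy frame with non-sequential row labels, as in the question.
data = pd.DataFrame({"pred": [None, None, None, None]}, index=[223, 20, 12, 13])
test_indices = [0, 2]  # assumed to be row positions
predictions = np.array(["collection2", "collection1"])

# Translate positions into row labels for .loc.
row_labels = data.index[test_indices]

# Pitfall: a Series indexed 0, 1 shares no labels with 223, 12,
# so alignment leaves the target rows as NaN.
misaligned = data.copy()
misaligned.loc[row_labels, "pred"] = pd.Series(predictions)

# Option A: assign the plain array, which is written in order with no alignment.
data.loc[row_labels, "pred"] = predictions

# Option B: give the Series the same row labels so alignment matches.
aligned = data.copy()
aligned.loc[row_labels, "pred"] = pd.Series(predictions, index=row_labels)
```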

Answer 1 (score: 1)

I prefer .ix over .loc. You can use

data.ix[bool_series, 'pred'] = this

Here, bool_series is a boolean Series that is True for the rows whose values you want to update and False otherwise. For example:

bool_series = ((data['col1'] > some_number) & (data['col2'] < some_other_number))

However, make sure you already have a 'pred' column before using data.ix[bool_series, 'pred']. Otherwise, it will raise an error.
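
Note that .ix was deprecated in pandas 0.20 and removed in pandas 1.0, so on current versions the same boolean-mask idea would be written with .loc. The sketch below is a minimal illustration under assumptions: the frame, the columns col1 and col2, the threshold values, and new_values are all made up for the example. A plain list is assigned (one value per True row, in row order) so that no index alignment is involved.

```python
import pandas as pd

# Hypothetical frame; col1/col2 and the thresholds are illustrative only.
data = pd.DataFrame(
    {"col1": [1, 5, 8, 3], "col2": [10, 2, 1, 7], "pred": [None] * 4},
    index=[223, 20, 12, 13],
)
some_number, some_other_number = 2, 5

# True for the rows that should be updated, False otherwise.
bool_series = (data["col1"] > some_number) & (data["col2"] < some_other_number)

# One new value per True row, in the order the rows appear in the frame.
new_values = ["collection1", "collection2"]
data.loc[bool_series, "pred"] = new_values
print(data)
```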