Question

I'm having trouble understanding this line of code written in Python:

drop_index = assets_df[assets_df['id']==Id].sort_values(['date_signed_contract','start_date'], ascending = False)[1::].index.tolist()

Does ascending = False mean the sort is descending? Also, what does [1::] refer to? I just want to make sure I have the right idea.

Answer 1

import numpy as np
import pandas as pd

# fixed seed so the random sample data is reproducible
np.random.seed(41)

N = 10
L = pd.date_range('2015-01-01', '2016-01-01')
assets_df = pd.DataFrame({'date_signed_contract': np.random.choice(L, N),
                          'start_date':np.random.choice(L, N),
                          'id':np.random.randint(3, size=N),
                          'col':np.random.randint(10, size=N)})
print (assets_df)
   col date_signed_contract  id start_date
0    4           2015-03-22   1 2015-07-18
1    1           2015-11-18   2 2015-03-26
2    2           2015-09-01   2 2015-12-09
3    3           2015-03-31   1 2015-04-16
4    4           2015-10-10   0 2015-06-29
5    5           2016-01-01   1 2015-12-11
6    4           2015-08-25   1 2015-07-23
7    5           2015-05-12   1 2015-04-03
8    7           2015-06-13   2 2015-06-30
9    6           2015-06-28   2 2015-05-26

Id = 1
drop_index = (assets_df[assets_df['id']==Id]
              .sort_values(['date_signed_contract','start_date'], ascending = False)
              .iloc[1:]
              .index.tolist())
print (drop_index)
[6, 7, 3, 0]

Explanation:

First, filter the rows by the condition using boolean indexing:

Id = 1 
print (assets_df[assets_df['id']==Id])
   col date_signed_contract  id start_date
0    4           2015-03-22   1 2015-07-18
3    3           2015-03-31   1 2015-04-16
5    5           2016-01-01   1 2015-12-11
6    4           2015-08-25   1 2015-07-23
7    5           2015-05-12   1 2015-04-03
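
To make the filtering step concrete, here is a minimal sketch on a small made-up frame (the data and the names `df`, `mask` and `filtered` are only for illustration, they are not from the question). The comparison produces a boolean Series, and indexing with it keeps only the rows where it is True:

```python
import pandas as pd

# Hypothetical toy data, not the frame used in the answer
df = pd.DataFrame({'id': [1, 2, 1, 0], 'col': [4, 1, 3, 7]})

# Element-wise comparison gives one True/False value per row
mask = df['id'] == 1

# Indexing with the boolean Series keeps the rows where mask is True;
# the original index labels are preserved
filtered = df[mask]

print(mask.tolist())
print(filtered)
```

Note that the surviving rows keep their original index labels, which is what makes the later `.index.tolist()` step useful.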

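Then comes sort_values; ascending = False means the rows are sorted in descending order (by date_signed_contract first, with start_date breaking ties):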

print (assets_df[assets_df['id']==Id]
              .sort_values(['date_signed_contract','start_date'], ascending = False))
   col date_signed_contract  id start_date
5    5           2016-01-01   1 2015-12-11
6    4           2015-08-25   1 2015-07-23
7    5           2015-05-12   1 2015-04-03
3    3           2015-03-31   1 2015-04-16
0    4           2015-03-22   1 2015-07-18
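
The sample above has no tied contract dates, so the second sort key never comes into play. A small sketch with made-up data (the frame and the names `both_desc` and `mixed` are assumptions for illustration) shows how the second column breaks ties, and that `ascending` can also be a list with one flag per column:

```python
import pandas as pd

# Hypothetical toy data with a tie on date_signed_contract (rows 1 and 2)
df = pd.DataFrame({
    'date_signed_contract': pd.to_datetime(['2015-03-01', '2015-05-01', '2015-05-01']),
    'start_date': pd.to_datetime(['2015-04-01', '2015-06-01', '2015-07-01']),
})

# A single False applies to every sort column: newest first on both keys
both_desc = df.sort_values(['date_signed_contract', 'start_date'], ascending=False)

# One flag per column: contract date descending, start date ascending
mixed = df.sort_values(['date_signed_contract', 'start_date'], ascending=[False, True])

print(both_desc.index.tolist())
print(mixed.index.tolist())
```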

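Drop the first row by slicing (the trailing : in [1::] is not needed, because the step defaults to 1); iloc is the better choice here, since it makes the positional selection explicit: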

print (assets_df[assets_df['id']==Id]
             .sort_values(['date_signed_contract','start_date'], ascending = False).iloc[1:])
   col date_signed_contract  id start_date
6    4           2015-08-25   1 2015-07-23
7    5           2015-05-12   1 2015-04-03
3    3           2015-03-31   1 2015-04-16
0    4           2015-03-22   1 2015-07-18
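
As a quick check of the slicing claim, the sketch below compares `[1::]`, `[1:]` and `.iloc[1:]` on made-up data (the list and frame are hypothetical). It assumes the usual pandas behaviour that a plain integer slice inside `[]` selects rows by position:

```python
import pandas as pd

# Plain Python list: [1::] and [1:] both skip the first element
items = ['a', 'b', 'c', 'd']
same_on_list = items[1::] == items[1:]

# Hypothetical frame whose index labels are deliberately out of order
df = pd.DataFrame({'x': [10, 20, 30, 40]}, index=[5, 6, 7, 3])

by_brackets = df[1::]     # integer slice in [] selects rows by position
by_iloc = df.iloc[1:]     # explicit positional slicing

same_on_frame = by_brackets.equals(by_iloc)
remaining_labels = by_iloc.index.tolist()

print(same_on_list, same_on_frame, remaining_labels)
```

In both cases only the first row (by position, not by label) is dropped.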

Get the index (the row labels) of the remaining rows:

print (assets_df[assets_df['id']==Id]
              .sort_values(['date_signed_contract','start_date'], ascending = False).iloc[1:]
              .index)
Int64Index([6, 7, 3, 0], dtype='int64')

Convert the index to a list:

print (assets_df[assets_df['id']==Id]
              .sort_values(['date_signed_contract','start_date'], ascending = False).iloc[1:]
              .index.tolist())
[6, 7, 3, 0]
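
The question does not show what happens to drop_index afterwards. Judging only by its name, one plausible use is passing it to DataFrame.drop so that just the most recent row for that id remains. The following is a sketch under that assumption, with a small made-up frame (the data and the name `cleaned` are hypothetical):

```python
import pandas as pd

# Hypothetical toy data, smaller than the random sample above
assets_df = pd.DataFrame({
    'id': [1, 2, 1, 1, 2],
    'date_signed_contract': pd.to_datetime(
        ['2015-03-22', '2015-11-18', '2016-01-01', '2015-08-25', '2015-06-13']),
    'start_date': pd.to_datetime(
        ['2015-07-18', '2015-03-26', '2015-12-11', '2015-07-23', '2015-06-30']),
})

Id = 1
# Labels of every row for this id except the most recent one
drop_index = (assets_df[assets_df['id'] == Id]
              .sort_values(['date_signed_contract', 'start_date'], ascending=False)
              .iloc[1:]
              .index.tolist())

# Assumed follow-up step: remove those rows by label
cleaned = assets_df.drop(drop_index)

print(drop_index)
print(cleaned)
```

Rows belonging to other ids are untouched, because drop_index only ever contains labels from the filtered subset.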

