Python and Pandas - Determining whether a bill is overdue

Date: 2016-01-19 18:08:34

Tags: python pandas time

I have a DataFrame containing time series and categorical data.

  ╔═════════════════════════════════════════════╗
  ║ Name       BillDate             Bill Status ║
  ╠═════════════════════════════════════════════╣
  ║ Company A  2015-07-22 15:51:00  Paid        ║
  ║ Company B  2015-01-31 12:01:00  Unpaid      ║
  ║ Company C  2016-01-12 00:00:00  Unpaid      ║
  ╚═════════════════════════════════════════════╝

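I'm trying to add another column that tells me whether a bill is overdue, based on two factors. The first is that the current date is BillDate + 180 days or later; the second is that Bill Status is Unpaid.

I think I'm fairly close to working out how to do this. My idea is to do the following: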

   from datetime import datetime, timedelta

   billpayperiod = timedelta(days=180)
   currentdate = datetime.now()
   df['Bill Due Date'] = df['BillDate'].apply(lambda x: x + billpayperiod)

Then create a function that checks whether

 currentdate > Bill Due Date and Bill Status = Unpaid.
 If True = Overdue
 If False = No Due,
 If Bill Status = Paid, then Paid.

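I'd appreciate your thoughts on: 1. Does this approach make sense? 2. Help creating the function that performs the check.

Since I'm much better at Excel, this is how I would do it there: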

  Create the Bill Date + 180 column (name it DueDate)
  Set a cell = currentdate
  Create a new column: formula    IF(BillStatus="Paid","Paid",IF(AND(BillStatus="Unpaid",currentdate>DueDate),"Overdue","Not Overdue"))    

2 Answers:

Answer 0 (score: 1)

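IIUC, this will do what you want. We can call dt.days on the timedeltas and compare the absolute value (dt in the code is the datetime module, i.e. import datetime as dt):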

In [21]:
df[(((df['BillDate'] - dt.datetime.now()).dt.days).abs() > 180) & (df['Bill Status'] == 'Unpaid')]

Out[21]:
        Name            BillDate Bill Status
1  Company B 2015-01-31 12:01:00      Unpaid
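
One caveat with taking the absolute value: a bill dated more than 180 days in the future would pass the test too. A minimal sketch of a sign-aware variant follows. It assumes the sample data from the question and a fixed reference timestamp `now` (hypothetical, chosen near the post date so the result is reproducible); it is not the answerer's code.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Name": ["Company A", "Company B", "Company C"],
    "BillDate": pd.to_datetime([
        "2015-07-22 15:51:00",
        "2015-01-31 12:01:00",
        "2016-01-12 00:00:00",
    ]),
    "Bill Status": ["Paid", "Unpaid", "Unpaid"],
})

# Assumption: a fixed "current" timestamp instead of datetime.now().
now = pd.Timestamp("2016-01-19 18:08:34")

# Elapsed time since the bill date; it is positive only for past bills.
age = now - df["BillDate"]

# Overdue means at least 180 days old AND still unpaid.
overdue = (age >= pd.Timedelta(days=180)) & (df["Bill Status"] == "Unpaid")

overdue_bills = df[overdue]
print(overdue_bills)
```

Comparing the timedelta directly against `pd.Timedelta(days=180)` also avoids the rounding that `.dt.days` introduces for partial days.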

For reference, the day differences before and after taking the absolute value:

In [25]:
(df['BillDate'] - dt.datetime.now()).dt.days

Out[25]:
0   -182
1   -354
2     -8
Name: BillDate, dtype: int64

In [24]:
((df['BillDate'] - dt.datetime.now()).dt.days).abs()

Out[24]:
0    182
1    354
2      8
Name: BillDate, dtype: int64

Edit

To set a new status, you can define a couple of masks and use np.where.
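
The np.where step itself is not spelled out here. A minimal sketch of how two masks might be combined with it, assuming the question's sample data, a fixed reference timestamp `now`, and a hypothetical output column named 'Overdue Status' (none of these names come from the answer):

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Name": ["Company A", "Company B", "Company C"],
    "BillDate": pd.to_datetime([
        "2015-07-22 15:51:00",
        "2015-01-31 12:01:00",
        "2016-01-12 00:00:00",
    ]),
    "Bill Status": ["Paid", "Unpaid", "Unpaid"],
})

# Assumption: a fixed "current" timestamp instead of datetime.now().
now = pd.Timestamp("2016-01-19 18:08:34")

# Two boolean masks, one per condition from the question.
unpaid = df["Bill Status"] == "Unpaid"
past_due = (now - df["BillDate"]) >= pd.Timedelta(days=180)

# Nested np.where mirrors the nested IF in the Excel formula:
# paid bills first, then overdue unpaid bills, everything else last.
df["Overdue Status"] = np.where(
    df["Bill Status"] == "Paid",
    "Paid",
    np.where(unpaid & past_due, "Overdue", "Not Overdue"),
)
print(df)
```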

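Answer 1 (score: 1)

You can easily add a column in pandas by assigning to it:

# create column 'newStatus' and set the default to 'No Due'
df['newStatus'] = 'No Due'

Then you can use .loc with the boolean indices from the answer above to set it to specific values: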

df.loc[indices,column] = value

For example:

import datetime as dt

# create boolean masks for unpaid bills and for bills that are due
iUnpaid = df['Bill Status'] == 'Unpaid'
iDue = ((df['BillDate'] - dt.datetime.now()).dt.days).abs() > 180

# update the corresponding values
df.loc[iUnpaid & iDue, 'newStatus'] = 'Due'
df.loc[iUnpaid & ~iDue, 'newStatus'] = 'No Due'
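
Note that paid bills keep the default 'No Due' with this approach, whereas the question asked for a separate 'Paid' label. A sketch extending the same .loc pattern to three outcomes, assuming the question's sample data and a fixed reference timestamp `now` (hypothetical, used so the result does not depend on when the code runs):

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Name": ["Company A", "Company B", "Company C"],
    "BillDate": pd.to_datetime([
        "2015-07-22 15:51:00",
        "2015-01-31 12:01:00",
        "2016-01-12 00:00:00",
    ]),
    "Bill Status": ["Paid", "Unpaid", "Unpaid"],
})

# Assumption: a fixed "current" timestamp instead of datetime.now().
now = pd.Timestamp("2016-01-19 18:08:34")

# Boolean masks for each condition.
is_paid = df["Bill Status"] == "Paid"
is_unpaid = df["Bill Status"] == "Unpaid"
is_due = (now - df["BillDate"]) >= pd.Timedelta(days=180)

# Start from the default, then overwrite the rows that match each mask.
df["newStatus"] = "No Due"
df.loc[is_paid, "newStatus"] = "Paid"
df.loc[is_unpaid & is_due, "newStatus"] = "Due"
print(df)
```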