Finding the number of repeated values with pandas

Time: 2020-09-13 03:35:33

Tags: python pandas dataframe

I have a dataframe:

import pandas as pd

df = pd.DataFrame(
{'id': ['1', '2', '3', '4', '5', '6', '7', '8'],
 'datetime': ['24.06.2013 00:13:49',
  '24.06.2013 00:14:27',
  '24.06.2013 00:17:45',
  '24.06.2013 00:21:54',
  '24.06.2013 00:21:59',
  '24.06.2013 00:22:05',
  '24.06.2013 00:25:14',
  '24.06.2013 00:26:04'],
 'card_num': ['10', '10', '27', '10', '34', '10', '7', '3'],
 'type': ['cash_withdrawal',
  'cash_withdrawal',
  'refill',
  'cash_withdrawal',
  'payment',
  'cash_withdrawal',
  'payment',
  'cash_withdrawal'],
 'result': ['refusal',
  'refusal',
  'successful',
  'refusal',
  'successful',
  'successful',
  'successful',
  'successful'],
 'summ': [10000, 8000, 42431, 4000, 2347, 3500, 105, 999]})

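I need to find transactions that match conditions typical of fraud:

  • the card transactions were made within 20 minutes
  • the card transactions are cash withdrawals or payments
  • the card has more than 3 transactions
  • the first three or more card transactions have the "refusal" status, and the fourth or later ones have the "successful" status
  • the amount of each transaction is less than the previous one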
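The conditions above can be sketched as a per-card check. This is only one possible reading, and it rests on assumptions that the question does not state: "within 20 minutes" is taken to mean the span between a card's first and last transaction, the status rule is taken to mean a leading run of at least three refusals followed only by successes, and "less than the previous one" is taken to mean strictly decreasing `summ`. The helper name `looks_fraudulent` is hypothetical.

```python
import pandas as pd

df = pd.DataFrame({
    'datetime': ['24.06.2013 00:13:49', '24.06.2013 00:14:27',
                 '24.06.2013 00:17:45', '24.06.2013 00:21:54',
                 '24.06.2013 00:21:59', '24.06.2013 00:22:05',
                 '24.06.2013 00:25:14', '24.06.2013 00:26:04'],
    'card_num': ['10', '10', '27', '10', '34', '10', '7', '3'],
    'type': ['cash_withdrawal', 'cash_withdrawal', 'refill',
             'cash_withdrawal', 'payment', 'cash_withdrawal',
             'payment', 'cash_withdrawal'],
    'result': ['refusal', 'refusal', 'successful', 'refusal',
               'successful', 'successful', 'successful', 'successful'],
    'summ': [10000, 8000, 42431, 4000, 2347, 3500, 105, 999],
})

# Parse the day-first timestamps so that time differences can be computed.
df['datetime'] = pd.to_datetime(df['datetime'], format='%d.%m.%Y %H:%M:%S')


def looks_fraudulent(group: pd.DataFrame) -> bool:
    """Hypothetical helper: check one card's transactions against the rules."""
    g = group.sort_values('datetime')
    # More than 3 transactions on the card.
    if len(g) <= 3:
        return False
    # Assumption: first and last transaction are at most 20 minutes apart.
    if g['datetime'].iloc[-1] - g['datetime'].iloc[0] > pd.Timedelta(minutes=20):
        return False
    # Assumption: at least three refusals come first, then only successes.
    refused = g['result'].eq('refusal').to_numpy()
    n_refused = int(refused.sum())
    if n_refused < 3 or n_refused == len(g) or not refused[:n_refused].all():
        return False
    if not g['result'].iloc[n_refused:].eq('successful').all():
        return False
    # Each amount is strictly smaller than the previous one.
    return bool(g['summ'].diff().iloc[1:].lt(0).all())


# Only cash withdrawals and payments are considered.
candidates = df[df['type'].isin(['cash_withdrawal', 'payment'])]
flagged = [card for card, g in candidates.groupby('card_num')
           if looks_fraudulent(g)]
print(flagged)
```

On the sample data only card `'10'` satisfies all five checks under this reading; a different interpretation of the 20-minute window or the status rule would need a different test.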

Here is what I have done so far:

df_report = df[(df.type != 'refill') & (df.result == 'successful')]
# keep the rows where the type is not 'refill' and the result is 'successful'
card = df_report.card_num
# get the card numbers of those rows
suspicious = df[df.card_num.isin(card)]
# keep every row of the main dataframe whose card number
# is among the filtered card numbers

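Next, I need to drop the cards that have fewer than 4 operations, and I don't know how to do that. Can you tell me how? In addition, the dataframe needs to be filtered on the result column so that only the cards with both "successful" and "refusal" results remain.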
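One way to approach both filters is a sketch based on `groupby(...).transform`, which computes a per-card value and broadcasts it back to every row so it can be used as a boolean mask. The variable names below (`ops_per_card`, `has_refusal`, `has_success`) are illustrative, and the sample keeps only the two columns the filters need.

```python
import pandas as pd

df = pd.DataFrame({
    'card_num': ['10', '10', '27', '10', '34', '10', '7', '3'],
    'result': ['refusal', 'refusal', 'successful', 'refusal',
               'successful', 'successful', 'successful', 'successful'],
})

# Number of operations per card, repeated on every row of that card.
ops_per_card = df.groupby('card_num')['card_num'].transform('size')

# Per-card flags: does the card have at least one refusal / one success?
has_refusal = df['result'].eq('refusal').groupby(df['card_num']).transform('any')
has_success = df['result'].eq('successful').groupby(df['card_num']).transform('any')

# Keep cards with at least 4 operations and both kinds of result.
suspicious = df[(ops_per_card >= 4) & has_refusal & has_success]
print(suspicious)
```

With the sample data this leaves only the four rows of card `'10'`. `groupby(...).filter(...)` with a lambda would also work, but `transform` tends to be faster on large frames because it avoids calling a Python function per group.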

1 Answer:

Answer 0 (score: 0)

To find consecutive runs of repeated categorical values, you can do the following:
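A common pandas idiom for labelling consecutive runs is to compare each value with the previous one using `shift()` and take the cumulative sum of the changes. The sketch below is a minimal illustration of that idiom, applied to the `result` values of card `'10'` from the question; it is an assumption about the approach, not necessarily the exact code this answer had in mind.

```python
import pandas as pd

df = pd.DataFrame({
    'card_num': ['10', '10', '27', '10', '34', '10', '7', '3'],
    'result': ['refusal', 'refusal', 'successful', 'refusal',
               'successful', 'successful', 'successful', 'successful'],
})

# Results of a single card, in their original (chronological) order.
card = df.loc[df['card_num'] == '10', 'result']

# A new run starts wherever the value differs from the previous row;
# the cumulative sum of those change points gives each run an id.
run_id = card.ne(card.shift()).cumsum()

# One row per run: the repeated value and how many times it repeats.
runs = pd.DataFrame({
    'value': card.groupby(run_id).first(),
    'length': card.groupby(run_id).size(),
})
print(runs)
```

For card `'10'` this yields a run of three `refusal` values followed by a run of one `successful` value, which is the pattern described in the question.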
