Pandas: count occurrences of a value in a dictionary

Time: 2020-06-01 15:45:14

Tags: python pandas

Given a DF:

   A                                                  B
0  1  {'Mon': 'Closed', 'Tue': 'Open', 'Wed': 'Closed'}
1  2    {'Mon': 'Open', 'Tue': 'Open', 'Wed': 'Closed'}
2  3      {'Mon': 'Open', 'Tue': 'Open', 'Wed': 'Open'}
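For anyone who wants to reproduce the answers, a frame matching the rows shown in the answers' output tables could be built roughly like this. The column names `A` and `B` and the values are taken from those tables; everything else is just a minimal sketch.

```python
import pandas as pd

# Values copied from the output tables shown in the answers.
df = pd.DataFrame({
    'A': [1, 2, 3],
    'B': [
        {'Mon': 'Closed', 'Tue': 'Open', 'Wed': 'Closed'},
        {'Mon': 'Open', 'Tue': 'Open', 'Wed': 'Closed'},
        {'Mon': 'Open', 'Tue': 'Open', 'Wed': 'Open'},
    ],
})
print(df)
```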

How can I count the number of times 'Closed' appears in each dict?


I really don't know how to approach this.

7 answers:

Answer 0 (score: 4)

You can use apply:

df['count'] = df.B.apply(pd.Series).eq('Closed').sum(1)

Output:

   A                                                  B  count
0  1  {'Mon': 'Closed', 'Tue': 'Open', 'Wed': 'Closed'}      2
1  2    {'Mon': 'Open', 'Tue': 'Open', 'Wed': 'Closed'}      1
2  3      {'Mon': 'Open', 'Tue': 'Open', 'Wed': 'Open'}      0

Answer 1 (score: 3)

You can convert the Series of dictionaries into a DataFrame, stack it, and then sum the matches for 'Closed' over index level 0 to get the count per row:

# Series.sum(level=0) was removed in pandas 2.0; groupby(level=0).sum() is the equivalent
df['Count_closed'] = pd.DataFrame(df['B'].tolist()).stack().eq("Closed").groupby(level=0).sum()

   A                                                  B  Count_closed
0  1  {'Mon': 'Closed', 'Tue': 'Open', 'Wed': 'Closed'}             2
1  2    {'Mon': 'Open', 'Tue': 'Open', 'Wed': 'Closed'}             1
2  3      {'Mon': 'Open', 'Tue': 'Open', 'Wed': 'Open'}             0
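To see what each step of that chain contributes, here is a sketch that splits it into named intermediates. The variable names (`wide`, `stacked`, `is_closed`) are made up for illustration, and it assumes `df` has the default RangeIndex so the result lines up with the original rows. It uses `groupby(level=0).sum()` for the per-row total, because `Series.sum(level=...)` is no longer available from pandas 2.0 on.

```python
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, 3],
    'B': [
        {'Mon': 'Closed', 'Tue': 'Open', 'Wed': 'Closed'},
        {'Mon': 'Open', 'Tue': 'Open', 'Wed': 'Closed'},
        {'Mon': 'Open', 'Tue': 'Open', 'Wed': 'Open'},
    ],
})

# 1. One column per dict key (Mon, Tue, Wed), one row per original row.
wide = pd.DataFrame(df['B'].tolist())

# 2. Long format: a Series indexed by (row number, key).
stacked = wide.stack()

# 3. Boolean Series, True where the value is exactly 'Closed'.
is_closed = stacked.eq('Closed')

# 4. Sum the booleans per original row (index level 0).
df['Count_closed'] = is_closed.groupby(level=0).sum()
print(df)
```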

Answer 2 (score: 2)

I would do

df.B.astype(str).str.count('Closed')
Out[30]: 
0    2
1    1
2    0
Name: B, dtype: int64

# assign the result to df['Cnt'] to store it as a column
pd.DataFrame(df.B.tolist()).eq('Closed').sum(axis=1)
Out[35]: 
0    2
1    1
2    0
dtype: int64
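One thing worth checking before picking between the two: `str.count` works on the text form of each dict, so it also matches keys and substrings, while `eq` only matches values that are exactly 'Closed'. The sample data below is invented to show the difference and is not from the question.

```python
import pandas as pd

# Hypothetical data: a value that merely contains 'Closed', and a key named 'Closed'.
s = pd.Series([
    {'Mon': 'Closed', 'Tue': 'Temporarily Closed'},
    {'Closed': 'Open', 'Tue': 'Open'},
])

# Counts every occurrence of the substring in str(dict), keys included.
as_text = s.astype(str).str.count('Closed')

# Counts only values that are exactly equal to 'Closed'.
exact = pd.DataFrame(s.tolist()).eq('Closed').sum(axis=1)

print(as_text.tolist())  # [2, 1]
print(exact.tolist())    # [1, 0]
```

With data like the question's, where the only values are 'Open' and 'Closed', both give the same numbers.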

Answer 3 (score: 0)

A direct .apply() solution:

df['Count'] = df.B.apply(lambda x: sum('Closed' in v for v in x.values()))
print(df)

Prints:

   A                                                  B  Count
0  1  {'Mon': 'Closed', 'Tue': 'Open', 'Wed': 'Closed'}      2
1  2    {'Mon': 'Open', 'Tue': 'Open', 'Wed': 'Closed'}      1
2  3      {'Mon': 'Open', 'Tue': 'Open', 'Wed': 'Open'}      0

Benchmark:

import perfplot
import pandas as pd


def f1(df):
    df['Count'] = df.B.apply(lambda x: sum('Closed' in v for v in x.values()))
    return df

def f2(df):
    df['count'] = df.B.astype(str).str.count('Closed')
    return df

# Commented out because it timed out:
# def f3(df):
#     df['count'] = df.B.apply(pd.Series).eq('Closed').sum(1)
#     return df

def f4(df):
    # sum(level=0) was removed in pandas 2.0; groupby(level=0).sum() is the equivalent
    df['count'] = pd.DataFrame(df['B'].tolist()).stack().eq("Closed").groupby(level=0).sum()
    return df

def setup(n):
    A = [*range(n)]
    B = [{'Mon': 'Closed', 'Tue': 'Open', 'Wed': 'Closed'} for _ in range(n)]
    df = pd.DataFrame({'A': A,
                       'B': B})
    return df

perfplot.show(
    setup=setup,
    kernels=[f1, f2, f4],
    labels=['apply(sum)', 'str.count()', 'stack.eq()'],
    n_range=[10**i for i in range(1, 7)],
    xlabel='N (* len(df))',
    equality_check=None,
    logx=True,
    logy=True)

Results:

[image: benchmark plot]

So using apply() together with sum() appears to be the fastest approach.
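If perfplot is not installed, a rough comparison can also be made with the standard-library `timeit` module. This is only a sketch: the function names and the row count are arbitrary choices, the apply variant compares with `==` instead of `in`, and the numbers will differ by machine and pandas version, so it makes no claim about which approach wins.

```python
import timeit
import pandas as pd

def make_df(n):
    # Same shape of data as the benchmark setup, n rows.
    rows = [{'Mon': 'Closed', 'Tue': 'Open', 'Wed': 'Closed'} for _ in range(n)]
    return pd.DataFrame({'A': range(n), 'B': rows})

def count_apply(df):
    return df['B'].apply(lambda d: sum(v == 'Closed' for v in d.values()))

def count_str(df):
    return df['B'].astype(str).str.count('Closed')

def count_stack(df):
    flat = pd.DataFrame(df['B'].tolist()).stack()
    return flat.eq('Closed').groupby(level=0).sum()

df = make_df(1_000)
timings = {}
for name, fn in [('apply', count_apply), ('str.count', count_str), ('stack', count_stack)]:
    # Best of 3 repeats, 5 calls each, to dampen noise.
    timings[name] = min(timeit.repeat(lambda: fn(df), number=5, repeat=3))
    print(f'{name}: {timings[name]:.4f} s for 5 calls')
```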

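Answer 4 (score: 0)

Please don't put dictionaries in DataFrame columns. You lose all the speed of vectorized operations and make the values hard to access.

Clean up your df: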

>>> df = pd.concat([df['A'], df['B'].apply(pd.Series)], axis=1)
>>> df 
   A     Mon   Tue     Wed
0  1  Closed  Open  Closed
1  2    Open  Open  Closed
2  3    Open  Open    Open

Now counting 'Closed' is easy.

>>> df['count'] = df.eq('Closed').sum(1)
>>> df
   A     Mon   Tue     Wed  count
0  1  Closed  Open  Closed      2
1  2    Open  Open  Closed      1
2  3    Open  Open    Open      0
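As a variation on the same clean-up, the dicts can be expanded with the DataFrame constructor instead of `apply(pd.Series)`, which answer 3's benchmark reports as timing out on large inputs. The sketch below also restricts the count to the expanded day columns so column `A` is never compared; the names `days` and `flat` are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, 3],
    'B': [
        {'Mon': 'Closed', 'Tue': 'Open', 'Wed': 'Closed'},
        {'Mon': 'Open', 'Tue': 'Open', 'Wed': 'Closed'},
        {'Mon': 'Open', 'Tue': 'Open', 'Wed': 'Open'},
    ],
})

# Expand the dicts into columns, keeping the original index for alignment.
days = pd.DataFrame(df['B'].tolist(), index=df.index)

# Replace the dict column with the expanded columns.
flat = df[['A']].join(days)

# Count 'Closed' across the day columns only.
flat['count'] = flat[days.columns].eq('Closed').sum(axis=1)
print(flat)
```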

Answer 5 (score: 0)

Using a helper function:

def aux_func(x):

    week_days = x.keys()
    count=0
    for day in week_days:
        if x[day]=='Closed':
            count+=1

    return count

counts = [aux_func(c) for c in df.loc[:,'B'] ]

df['counts'] = counts

Answer 6 (score: 0)

You can use a Counter in a simple list comprehension.

from collections import Counter

df['count'] = [Counter(x.values())['Closed'] for x in df.B]

#   A                                                  B  count
#0  1  {'Mon': 'Closed', 'Tue': 'Open', 'Wed': 'Closed'}      2
#1  2    {'Mon': 'Open', 'Tue': 'Open', 'Wed': 'Closed'}      1
#2  3      {'Mon': 'Open', 'Tue': 'Open', 'Wed': 'Open'}      0
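A side benefit of Counter is that one pass over each dict tallies every status, and a status that never occurs simply comes back as 0. A small sketch along those lines, where the column names `closed` and `open` are arbitrary:

```python
from collections import Counter

import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, 3],
    'B': [
        {'Mon': 'Closed', 'Tue': 'Open', 'Wed': 'Closed'},
        {'Mon': 'Open', 'Tue': 'Open', 'Wed': 'Closed'},
        {'Mon': 'Open', 'Tue': 'Open', 'Wed': 'Open'},
    ],
})

# One Counter per row, built from the dict values.
tallies = [Counter(d.values()) for d in df['B']]

# A missing key in a Counter yields 0 rather than raising KeyError.
df['closed'] = [t['Closed'] for t in tallies]
df['open'] = [t['Open'] for t in tallies]
print(df[['A', 'closed', 'open']])
```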