Question

I have a large dataset stored as a pandas Panel. For each item in the panel, I want to count the occurrences of values < 1.0 along the minor_axis. What I have so far:

    import numpy as np
    import pandas as pd

    #%% Creating the first DataFrame
    dates1 = pd.date_range('2014-10-19','2014-10-20',freq='H')
    df1 = pd.DataFrame(index = dates1)
    n1 = len(dates1)

    df1.loc[:,'a'] = np.random.uniform(3,10,n1)
    df1.loc[:,'b'] = np.random.uniform(0.9,1.2,n1)

    #%% Creating the second DataFrame
    dates2 = pd.date_range('2014-10-18','2014-10-20',freq='H')
    df2 = pd.DataFrame(index = dates2)
    n2 = len(dates2)

    df2.loc[:,'a'] = np.random.uniform(3,10,n2)
    df2.loc[:,'b'] = np.random.uniform(0.9,1.2,n2)

    #%% Creating the panel from both DataFrames
    dictionary = {}
    dictionary['First_dataset'] = df1
    dictionary['Second dataset'] = df2

    P = pd.Panel.from_dict(dictionary)

    #%% I want to count the number of values < 1.0 for all datasets in the panel
    ## Only for minor axis b, not minor axis a, stored separately for each dataset
    for dataset in P:
        P.loc[dataset,:,'b'] # I need to count the number of values < 1.0 in this pandas Series

Answer 1

To count all "b" values < 1.0, I would first isolate b in its own DataFrame by swapping the minor axis and the items.

    In [43]: b = P.swapaxes("minor","items").b

    In [44]: b.where(b<1.0).stack().count()
    Out[44]: 30
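
For readers on a newer pandas release, where `Panel` was deprecated and later removed, the same idea can be sketched with a plain dict of DataFrames. This is only a sketch under assumptions: the helper `make_frame`, the name `frames`, and the seeded generator are illustrative and not part of the original post, and the counts depend on the random data.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # seeded only so repeated runs agree (assumption)

def make_frame(start, end):
    # Hypothetical helper: hourly index with the two columns from the question
    index = pd.date_range(start, end, freq=pd.Timedelta(hours=1))
    return pd.DataFrame({"a": rng.uniform(3, 10, len(index)),
                         "b": rng.uniform(0.9, 1.2, len(index))}, index=index)

frames = {"First_dataset": make_frame("2014-10-19", "2014-10-20"),
          "Second dataset": make_frame("2014-10-18", "2014-10-20")}

# One column of 'b' values per dataset, aligned on the union of the dates
b = pd.concat({name: df["b"] for name, df in frames.items()}, axis=1)

# NaN < 1.0 evaluates to False, so dates missing from a dataset are not counted
per_dataset = (b < 1.0).sum()    # count per dataset
total = int(per_dataset.sum())   # count over all datasets
print(per_dataset)
print(total)
```

Here `b` plays the role of the swapped-axes frame above: one column per dataset, so a single boolean comparison gives both the per-dataset counts and the overall total.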

Answer 2

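Thanks for thinking along with me, but after hours of trying I managed to find a very simple solution. I thought I should share it in case anyone else is looking for something similar.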

    for dataset in P:
        # Series of 'b' values for this dataset
        abc = P.loc[dataset,:,'b']
        # Summing the booleans counts the values below 1.0
        abc_low = sum(i < 1.0 for i in abc)
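
Since the question asks for the count to be stored separately for each dataset, here is a minimal sketch that keeps one result per dataset in a dict. It assumes the datasets are held in a plain dict of DataFrames rather than a `Panel`; the names `frames` and `counts` and the tiny hand-written values are illustrative only.

```python
import pandas as pd

# Small hand-written stand-ins for the two datasets (hypothetical values)
frames = {
    "First_dataset": pd.DataFrame({"a": [3.5, 7.2, 9.1],
                                   "b": [0.95, 1.10, 0.99]}),
    "Second dataset": pd.DataFrame({"a": [4.0, 5.5, 6.1, 8.8],
                                    "b": [1.05, 0.92, 1.15, 1.00]}),
}

counts = {}
for name, df in frames.items():
    # The boolean mask sums to the number of 'b' values strictly below 1.0
    counts[name] = int((df["b"] < 1.0).sum())

print(counts)  # {'First_dataset': 2, 'Second dataset': 1}
```

The vectorized comparison replaces the element-by-element generator, which tends to matter on large datasets, and the dict keeps each dataset's count instead of overwriting it on every pass of the loop.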

Python Pandas Panel: counting occurrences of values