Question

Not sure whether the title makes sense. I'll try to elaborate.

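I am just trying to get the sensors with the highest values, by percentage. For example, I want the top 10% of sensors, ranked by the highest values they measured. There are two attempts in the code: one tries to get this from the measured (cumulative) volume, and the other works per hour (and produces an error message).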
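To make the goal concrete, here is a minimal sketch of one way to select the top 10% of sensors by their highest cumulative reading. The helper name `top_fraction_of_sensors`, the sensor IDs, and the values are all made up for illustration; only the `DeviceEui` and `Volume` column names come from the data in this question.

```python
import math

import pandas as pd


def top_fraction_of_sensors(df, fraction=0.1):
    """Hypothetical helper: sensors whose highest Volume is in the top `fraction`."""
    # Highest cumulative reading per sensor.
    per_sensor_max = df.groupby("DeviceEui")["Volume"].max()
    # Round up, and keep at least one sensor so small inputs never yield nothing.
    n = max(1, math.ceil(len(per_sensor_max) * fraction))
    return per_sensor_max.nlargest(n)


# Made-up data: ten sensors (S00..S09), two readings each.
sample = pd.DataFrame({
    "DeviceEui": [f"S{i:02d}" for i in range(10) for _ in range(2)],
    "Volume": [3.8, 4.1, 16.4, 16.9, 33.1, 33.2, 31.1, 31.6, 7.0, 7.3,
               9.2, 9.9, 48.8, 49.5, 12.4, 12.9, 25.8, 26.0, 2.6, 2.7],
})

top = top_fraction_of_sensors(sample)
print(top)
```

Note that the sketch rounds the group size up, whereas `int(len(x) * a)` truncates and gives 0 for anything with fewer than ten entries. Whether rounding up is the right choice depends on what "top 10%" should mean for small groups.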

The "Total" column is the volume measured between consecutive timestamps. I would be forever grateful for any reply.
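If "Total" is indeed the difference between consecutive cumulative readings of the same sensor, it could be reproduced roughly as follows. This is a sketch under that assumption, using made-up sensor IDs and values rather than the real file.

```python
import pandas as pd

# Made-up cumulative readings for two hypothetical sensors, "A" and "B".
readings = pd.DataFrame({
    "Time": pd.to_datetime([
        "2019-11-28 06:55:30", "2019-11-28 07:55:30", "2019-11-28 08:55:30",
        "2019-11-28 06:15:29", "2019-11-28 07:15:29",
    ]),
    "DeviceEui": ["A", "A", "A", "B", "B"],
    "Volume": [0.000, 0.000, 0.011, 1.500, 1.750],
})

# Order each sensor's readings in time, then subtract the previous reading.
readings = readings.sort_values(["DeviceEui", "Time"])
readings["Total"] = (
    readings.groupby("DeviceEui")["Volume"]
    .diff()      # NaN for the first reading of each sensor
    .fillna(0)   # assumption: treat the first reading as zero consumption
)
print(readings)
```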

The dataset/DataFrame currently looks like this:

                   Time      DeviceEui  Volume  Total  Day_Of_Week  Day_Of_Month  Year  Month  Day  Hour
 0  2019-11-12 09:50:22  0007090000CA1   3.822  0.013            1            30  2019     11   12     9
 1  2019-11-12 09:51:35  000709000099F  16.473  0.008            1            30  2019     11   12     9
 2  2019-11-12 09:51:41  0007090000CCE  33.170  0.000            1            30  2019     11   12     9
 3  2019-11-12 09:51:48  00070900009A4  31.163  0.016            1            30  2019     11   12     9
 4  2019-11-12 09:54:10  00070900009C9   7.030  0.026            1            30  2019     11   12     9
 5  2019-11-12 09:55:46  0007090000CA6  31.621  0.001            1            30  2019     11   12     9
 6  2019-11-12 09:56:53  00070900009CF   9.296  0.000            1            30  2019     11   12     9
 7  2019-11-12 09:57:40  00070900009B1  48.864  0.041            1            30  2019     11   12     9
 8  2019-11-12 09:58:17  0007090000145  33.384  0.006            1            30  2019     11   12     9
 9  2019-11-12 10:00:17  0007090000CAB  12.458  0.003            1            30  2019     11   12    10
10  2019-11-12 10:00:56  0007090000CAE  25.885  0.000            1            30  2019     11   12    10
11  2019-11-12 10:01:54  0007090000983  34.486  0.001            1            30  2019     11   12    10
12  2019-11-12 10:02:10  00070900009D8   2.658  0.000            1            30  2019     11   12    10
13  2019-11-12 10:02:25  0007090000139  12.466  0.002            1            30  2019     11   12    10
14  2019-11-12 10:03:25  0007090000C98   4.062  0.030            1            30  2019     11   12    10
15  2019-11-12 10:08:30  0007090000C85   5.880  0.084            1            30  2019     11   12    10
16  2019-11-12 10:09:40  0007090000CA0  33.731  0.000            1            30  2019     11   12    10
17  2019-11-12 10:13:59  00070900009CB   5.684  0.000            1            30  2019     11   12    10
18  2019-11-12 10:15:02  0007090000151   3.673  0.027            1            30  2019     11   12    10
19  2019-11-12 10:15:32  0007090000CA5   9.718  0.013            1            30  2019     11   12    10

The code looks like this:

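import numpy as np
import pandas as pd

df = pd.read_csv('WatersensorWeek4Exported.csv')
pd.set_option('display.max_colwidth', None)  # None disables truncation; -1 is deprecated
df['Time'] = pd.to_datetime(df['Time'])
df['Day_Of_Week'] = df['Time'].dt.dayofweek
# Note: daysinmonth is the number of days in the month (30 for November), not the day of the month
df['Day_Of_Month'] = df['Time'].dt.daysinmonth
df['Year'] = df['Time'].dt.year
df['Month'] = df['Time'].dt.month
df['Day'] = df['Time'].dt.day
df['Hour'] = df['Time'].dt.hour
t = df['Time']
# Convert only the measurement columns and keep the result (the original call discarded it);
# values that cannot be parsed become NaN and are zero-filled on the next line
df[['Volume', 'Total']] = df[['Volume', 'Total']].apply(pd.to_numeric, errors='coerce')
df.fillna(0, inplace=True)
a = 0.1  # fraction to keep (top 10%)
print(df.head(20))

# Attempt 1: by measured (cumulative) volume
v_group = df.groupby('DeviceEui')
volume_sensor = v_group['Volume'].agg('max')
# Top 10% of rows per sensor by Volume, then the column-wise maximum (result is now kept)
top_rows_max = v_group.apply(lambda x: x.nlargest(int(len(x) * a), 'Volume')).agg('max')
print(v_group.describe())  # describe must be called, not just referenced
for name, group in v_group:
    print(name, group)

print(v_group)

# Attempt 2: by hour
eui_group = df.groupby(['DeviceEui', 'Hour'])['Volume'].mean()
# This is the line that raises the AttributeError: eui_group is a Series of floats here
eui_group = eui_group.apply(lambda x: x.nlargest(int(len(x) * a), 'Volume'))
print(eui_group)
print(eui_group.dtypes)
for index, name in eui_group.items():  # iteritems() was removed in pandas 2.0
    print(index, name)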

Snippet of the output produced by the code:

[177 rows x 10 columns]
0007090000885
                      Time      DeviceEui  Volume  Total  Day_Of_Week  Day_Of_Month  Year  Month  Day  Hour
44033  2019-11-28 06:55:30  0007090000885   0.000  0.000            3            30  2019     11   28     6
44034  2019-11-28 06:55:41  0007090000885   0.000  0.000            3            30  2019     11   28     6
44141  2019-11-28 07:55:30  0007090000885   0.000  0.000            3            30  2019     11   28     7
44142  2019-11-28 07:55:41  0007090000885   0.000  0.000            3            30  2019     11   28     7
44261  2019-11-28 08:55:30  0007090000885   0.011  0.011            3            30  2019     11   28     8
...                    ...            ...     ...    ...           ..            ..   ...     ..   ..    ..
60887  2019-12-04 03:56:49  0007090000885   0.971  0.000            2            31  2019     12    4     3
61000  2019-12-04 04:56:49  0007090000885   0.971  0.000            2            31  2019     12    4     4
61001  2019-12-04 04:56:49  0007090000885   0.971  0.000            2            31  2019     12    4     4
61108  2019-12-04 05:56:49  0007090000885   0.989  0.018            2            31  2019     12    4     5
61200  2019-12-04 06:56:49  0007090000885   1.005  0.016            2            31  2019     12    4     6

[195 rows x 10 columns]
0007090000FFF
                      Time      DeviceEui  Volume  Total  Day_Of_Week  Day_Of_Month  Year  Month  Day  Hour
58167  2019-12-03 05:15:29  0007090000FFF   0.000  0.000            1            31  2019     12    3     5
58168  2019-12-03 05:15:39  0007090000FFF   0.000  0.000            1            31  2019     12    3     5
58274  2019-12-03 06:15:29  0007090000FFF   0.000  0.000            1            31  2019     12    3     6
58275  2019-12-03 06:15:39  0007090000FFF   0.000  0.000            1            31  2019     12    3     6
58392  2019-12-03 07:15:29  0007090000FFF   0.011  0.011            1            31  2019     12    3     7
58393  2019-12-03 07:15:39  0007090000FFF   0.011  0.011            1            31  2019     12    3     7

Error message:

Traceback (most recent call last):
  File "C:\Users\xxx\source\repos\MLwater\MLwater\ForbruksNivå.py", line 45, in <module>
    eui_group = eui_group.apply(lambda x: x.nlargest(int(len(x) * a), 'Volume'))
  File "C:\Users\xxx\Anaconda3\lib\site-packages\pandas\core\series.py", line 4042, in apply
    mapped = lib.map_infer(values, f, convert=convert_dtype)
  File "pandas\_libs\lib.pyx", line 2228, in pandas._libs.lib.map_infer
  File "C:\Users\xxx\source\repos\MLvannmålere\MLwater\Forbruks.py", line 45, in <lambda>
    eui_group = eui_group.apply(lambda x: x.nlargest(int(len(x) * a), 'Volume'))
AttributeError: 'float' object has no attribute 'nlargest'
Press any key to continue . . .
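The traceback is consistent with how `Series.apply` behaves: it calls the function once per element, so inside the lambda `x` is a single float (one hourly mean) rather than a group of rows. A minimal sketch of a per-hour variant that keeps the grouping instead is shown here. The helper name `top_sensors_per_hour` and the sample values are hypothetical, and rounding the group size up is an assumption about what "top 10%" should mean for small groups.

```python
import math

import pandas as pd


def top_sensors_per_hour(df, fraction=0.1):
    """Hypothetical helper: per hour, the sensors with the highest mean Volume."""
    # One mean per (Hour, DeviceEui) pair; the result is a Series of floats.
    hourly_mean = df.groupby(["Hour", "DeviceEui"])["Volume"].mean()

    def pick(group):
        # `group` is a Series holding all sensors of one hour, so nlargest exists.
        n = max(1, math.ceil(len(group) * fraction))
        return group.nlargest(n)

    # Group the means again by hour; group_keys=False avoids a duplicated Hour level.
    return hourly_mean.groupby(level="Hour", group_keys=False).apply(pick)


# Made-up readings for three sensors over two hours.
sample = pd.DataFrame({
    "Hour": [9, 9, 9, 9, 10, 10, 10, 10],
    "DeviceEui": ["A", "A", "B", "C", "A", "B", "C", "C"],
    "Volume": [1.0, 3.0, 5.0, 4.0, 7.0, 2.0, 6.0, 10.0],
})

result = top_sensors_per_hour(sample)
print(result)
```

Note that `Series.nlargest` takes only the count, so the `'Volume'` argument used with the DataFrame version does not apply here.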

Get the groups with the highest percentage

0 answers: