Question

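I'm having trouble creating a DataFrame that holds temperature measurements over time intervals. Right now the DataFrame has time as its index and one other column with the measurements. I'd like to convert the time into 12-hour intervals, with each measurement being the mean of the values that fall in that interval.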

                         measurement
time
2016-11-04 08:49:25    17.730000
2016-11-04 10:23:52    18.059999
2016-11-04 11:02:09    18.370001
2016-11-04 12:04:20    18.090000
2016-11-04 14:26:43    18.320000

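So instead of every timestamp being paired with a single measurement, I'd like the mean of the values over each 12-hour interval, like this: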

                                              measurement
time
2016-11-04 00:00:00 - 2016-11-04 12:00:00     17.730000
2016-11-04 12:00:00 - 2016-11-05 00:00:00     18.059999
2016-11-05 00:00:00 - 2016-11-05 12:00:00     18.370001
2016-11-05 12:00:00 - 2016-11-06 00:00:00     18.090000
2016-11-06 00:00:00 - 2016-11-06 12:00:00     18.320000

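Is there an easy way to do this with pandas?

Later I'd also like to convert the measurements into intervals, so that the data becomes boolean, like this: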

                                              17.0-18.0   18.0-19.0  19.0-20
time
2016-11-04 00:00:00 - 2016-11-04 12:00:00         1           0         0
2016-11-04 12:00:00 - 2016-11-05 00:00:00         0           1         0
2016-11-05 00:00:00 - 2016-11-05 12:00:00         0           1         0
2016-11-05 12:00:00 - 2016-11-06 00:00:00         0           1         0
2016-11-06 00:00:00 - 2016-11-06 12:00:00         0           1         0

EDIT: I used the solution that Coldspeed posted first.

df = pd.DataFrame({'timestamp':time.values, 'readings':readings.values})
df = df.groupby(pd.Grouper(key='timestamp', freq='12H'))['readings'].mean()
v = pd.cut(df, bins=[17,18,19,20,21,22,23,24,25,26,27,28], labels=['17-18','18-19','19-20','20-21','21-22','22-23','23-24','24-25','25-26','26-27','27-28'])

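I know the bins and labels could have been generated with a for loop, but this was just a quick fix. The groupby function groups the values by 'timestamp' at a 12-hour frequency and takes the mean of the readings in each interval.

The cut function is then used to sort the mean values into their categories.

Result: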

                     17-18  18-19  19-20  20-21  21-22  22-23  23-24  24-25  \
timestamp
2016-11-04 00:00:00      0      1      0      0      0      0      0      0
2016-11-04 12:00:00      0      1      0      0      0      0      0      0
2016-11-05 00:00:00      0      0      0      0      0      0      0      0
2016-11-05 12:00:00      1      0      0      0      0      0      0      0
2016-11-06 00:00:00      1      0      0      0      0      0      0      0
2016-11-06 12:00:00      0      0      0      0      0      0      0      0
2016-11-07 00:00:00      0      1      0      0      0      0      0      0
2016-11-07 12:00:00      1      0      0      0      0      0      0      0
2016-11-08 00:00:00      0      0      0      0      0      0      0      0
2016-11-08 12:00:00      0      0      0      0      0      0      0      0
2016-11-09 00:00:00      1      0      0      0      0      0      0      0
2016-11-09 12:00:00      1      0      0      0      0      0      0      0
2016-11-10 00:00:00      0      1      0      0      0      0      0      0
2016-11-10 12:00:00      0      0      0      0      0      0      0      0
2016-11-11 00:00:00      0      0      0      0      0      0      0      0
2016-11-11 12:00:00      0      0      0      0      0      0      0      0
2016-11-12 00:00:00      0      0      0      0      0      0      0      0
2016-11-12 12:00:00      0      0      0      0      0      0      0      0
2016-11-13 00:00:00      0      0      0      0      0      0      0      0
2016-11-13 12:00:00      0      0      0      0      0      0      0      0
2016-11-14 00:00:00      0      0      0      0      0      0      0      0
2016-11-14 12:00:00      0      1      0      0      0      0      0      0
2016-11-15 00:00:00      0      0      0      1      0      0      0      0
2016-11-15 12:00:00      0      0      0      0      0      1      0      0
2016-11-16 00:00:00      0      0      0      0      0      0      1      0
2016-11-16 12:00:00      0      0      0      0      0      0      0      0
2016-11-17 00:00:00      0      0      0      0      0      0      0      0
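
The edit above notes that the bins and labels could be generated rather than typed out. A minimal sketch of that idea follows. It assumes a small hypothetical sample in place of the `time` and `readings` series, and it uses the lowercase '12h' alias because recent pandas releases warn about the uppercase 'H'.

```python
import pandas as pd

# Hypothetical sample standing in for the `time` and `readings` series above.
df = pd.DataFrame({
    "timestamp": pd.to_datetime([
        "2016-11-04 08:49:25", "2016-11-04 10:23:52", "2016-11-04 11:02:09",
        "2016-11-04 12:04:20", "2016-11-04 14:26:43",
    ]),
    "readings": [17.73, 18.059999, 18.370001, 18.09, 18.32],
})

# Mean reading per 12-hour window, keyed by the window's start time.
means = df.groupby(pd.Grouper(key="timestamp", freq="12h"))["readings"].mean()

# Build the bin edges and their labels instead of typing them out.
edges = list(range(17, 29))  # 17, 18, ..., 28
labels = [f"{lo}-{hi}" for lo, hi in zip(edges[:-1], edges[1:])]

# Assign each mean to a bin, then expand into one 0/1 column per bin.
binned = pd.cut(means, bins=edges, labels=labels)
dummies = pd.get_dummies(binned).astype(int)
print(dummies)
```

Because `pd.cut` returns a categorical, `pd.get_dummies` should keep a column for every label, including ranges that no window falls into.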

Answer 1

Use groupby + pd.cut:

pd.get_dummies
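
A minimal sketch of how groupby, pd.cut and pd.get_dummies could be chained, assuming the sample rows from the question with `time` as the index. The range labels are illustrative, and this is not necessarily the exact code the answer used.

```python
import pandas as pd

# Sample rows from the question; `time` is the index, as in the original frame.
df = pd.DataFrame(
    {"measurement": [17.73, 18.059999, 18.370001, 18.09, 18.32]},
    index=pd.to_datetime([
        "2016-11-04 08:49:25", "2016-11-04 10:23:52", "2016-11-04 11:02:09",
        "2016-11-04 12:04:20", "2016-11-04 14:26:43",
    ]),
)
df.index.name = "time"

# Step 1: mean per 12-hour window (a Grouper without `key` groups on the index).
means = df.groupby(pd.Grouper(freq="12h"))["measurement"].mean()

# Step 2: assign each mean to a temperature range (illustrative labels).
binned = pd.cut(
    means,
    bins=[17, 18, 19, 20],
    labels=["17.0-18.0", "18.0-19.0", "19.0-20.0"],
)

# Step 3: one indicator column per range.
out = pd.get_dummies(binned).astype(int)
print(out)
```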

Answer 2

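IIUC, you want to resample in 12-hour chunks and then make dummies. pd.cut is a perfectly acceptable way to cut the resulting data into bins. However, I use np.searchsorted to accomplish the task.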

bins = np.array([17, 18, 19, 20])
labels = np.array(['<17', '17-18', '18-19', '19-20', '>20'])
resampled = df.resample('12H').measurement.mean()
pd.get_dummies(pd.Series(labels[bins.searchsorted(resampled.values)], resampled.index))

                     17-18  18-19  19-20  >20
2018-03-20 00:00:00      0      1      0    0
2018-03-20 12:00:00      1      0      0    0
2018-03-21 00:00:00      0      1      0    0
2018-03-21 12:00:00      0      0      0    1
2018-03-22 00:00:00      0      0      1    0
2018-03-22 12:00:00      0      0      0    1

Setup

import numpy as np
import pandas as pd

np.random.seed(int(np.pi * 1E6))

tidx = pd.date_range(pd.Timestamp('now'), freq='3H', periods=20)
df = pd.DataFrame(dict(measurement=np.random.rand(len(tidx)) * 6 + 17), tidx)

df

                            measurement
2018-03-20 06:58:30.484383    17.960744
2018-03-20 09:58:30.484383    18.572100
2018-03-20 12:58:30.484383    17.646766
2018-03-20 15:58:30.484383    19.025463
2018-03-20 18:58:30.484383    17.521399
2018-03-20 21:58:30.484383    17.318663
2018-03-21 00:58:30.484383    19.388553
2018-03-21 03:58:30.484383    19.520969
2018-03-21 06:58:30.484383    19.060640
2018-03-21 09:58:30.484383    17.106034
2018-03-21 12:58:30.484383    22.887546
2018-03-21 15:58:30.484383    18.437271
2018-03-21 18:58:30.484383    18.426362
2018-03-21 21:58:30.484383    20.558928
2018-03-22 00:58:30.484383    22.555121
2018-03-22 03:58:30.484383    17.139489
2018-03-22 06:58:30.484383    17.209499
2018-03-22 09:58:30.484383    19.466367
2018-03-22 12:58:30.484383    21.765692
2018-03-22 15:58:30.484383    19.680785
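
To make the searchsorted step easier to follow, here is a small sketch of how values map to label positions. The test values are made up for illustration; the bins and labels are the ones used in this answer.

```python
import numpy as np

bins = np.array([17, 18, 19, 20])
labels = np.array(['<17', '17-18', '18-19', '19-20', '>20'])

# Made-up values, including some that sit exactly on a bin edge.
values = np.array([16.5, 17.0, 17.5, 18.0, 19.9, 20.0, 22.9])

# With the default side='left', position i satisfies bins[i-1] < v <= bins[i],
# so a value equal to an edge lands in the lower range.
idx = bins.searchsorted(values)
picked = labels[idx]

for v, i, name in zip(values, idx, picked):
    print(v, i, name)
```

Since the labels are plain strings here, pd.get_dummies only creates columns for labels that actually occur, which would explain why the output above has no '<17' column.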

Answer 3

You can use pd.cut() + pd.get_dummies():

df["measurement"] = pd.cut(df["measurement"], bins=[17.0,18.0,19.0,20.0])
dummies = pd.get_dummies(df["measurement"])
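
One thing to keep in mind with this approach is what happens to readings outside the bin edges. The sketch below uses three made-up readings to show it; note that it bins the raw values and does no 12-hour averaging.

```python
import pandas as pd

# Made-up readings; the last one lies above the highest bin edge.
s = pd.Series([17.73, 18.06, 22.89], name="measurement")

# Without labels, pd.cut produces right-closed intervals such as (17.0, 18.0].
binned = pd.cut(s, bins=[17.0, 18.0, 19.0, 20.0])

# Values outside every bin become NaN and turn into an all-zero dummy row.
dummies = pd.get_dummies(binned).astype(int)
print(binned)
print(dummies)
```

If out-of-range readings matter, widening the bins or passing `dummy_na=True` to pd.get_dummies are two possible options.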

Answer 4

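For your first question: you can use pandas.Grouper to group by every 12 hours (or any other frequency) and then take the mean of each group. Older pandas versions offered pandas.TimeGrouper for this; it has since been deprecated and removed in favor of pd.Grouper.

df.groupby(pd.Grouper(freq='12H')).mean()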
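
The grouped result is indexed by the start of each 12-hour window. As a rough sketch of how the "start - end" labels from the question could be built on top of it, assuming the question's sample rows (the `labeled` name and the format string are illustrative choices):

```python
import pandas as pd

# Sample rows from the question, with the timestamps as the index.
df = pd.DataFrame(
    {"measurement": [17.73, 18.059999, 18.370001, 18.09, 18.32]},
    index=pd.to_datetime([
        "2016-11-04 08:49:25", "2016-11-04 10:23:52", "2016-11-04 11:02:09",
        "2016-11-04 12:04:20", "2016-11-04 14:26:43",
    ]),
)

# Mean per 12-hour window; resample is a shorthand giving the same values.
means = df.groupby(pd.Grouper(freq="12h")).mean()
same = df.resample("12h").mean()

# Turn each window start into a "start - end" string label.
fmt = "%Y-%m-%d %H:%M:%S"
window = pd.Timedelta(hours=12)
interval_labels = [
    f"{start.strftime(fmt)} - {(start + window).strftime(fmt)}"
    for start in means.index
]

labeled = means.copy()
labeled.index = interval_labels
print(labeled)
```

String labels are convenient for display; keeping the DatetimeIndex is usually easier if further time-based operations are planned.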
