Thank you very much for your help.
I got this result set from InfluxDB. It is effectively a dictionary:
{u'current': [[0.03341725795376516, u'2018-10-10T12:41:27Z']], u'voltage': [[12.95246814679179, u'2018-10-10T12:41:27Z']], u'temperature': [[0.035324635690852216, u'2018-10-10T12:41:27Z']], u'tags': {u'product': u'00000000000000'}}
Another example is:
u'data': {
u'measurement': u'telemetry'},
u'tags': {u'product_imei': u'000000000000000'},
u'current': [
[1.234, u'2016-01-01T00:00:00Z'], [2.234, u'2016-01-01T04:00:00Z'], [3.234, u'2016-01-01T08:00:00Z'], [1.234, u'2016-01-01T12:00:00Z'], [2.345, u'2016-01-01T16:00:00Z'], [2.678, u'2016-01-01T20:00:00Z'], [2.91, u'2016-01-02T00:00:00Z'], [2.345, u'2016-01-02T04:00:00Z'], [2.678, u'2016-01-02T08:00:00Z'], [2.91, u'2016-01-02T12:00:00Z'], [2.345, u'2016-01-02T16:00:00Z'], [2.678, u'2016-01-02T20:00:00Z'], [2.91, u'2016-01-03T00:00:00Z']
],
u'voltage': [
[14.243, u'2016-01-01T00:00:00Z'], [14.723, u'2016-01-01T04:00:00Z'], [14.826, u'2016-01-01T08:00:00Z'], [13.284, u'2016-01-01T12:00:00Z'], [12.345, u'2016-01-01T16:00:00Z'], [12.678, u'2016-01-01T20:00:00Z'], [12.91, u'2016-01-02T00:00:00Z'], [12.345, u'2016-01-02T04:00:00Z'], [12.678, u'2016-01-02T08:00:00Z'], [12.91, u'2016-01-02T12:00:00Z'], [12.345, u'2016-01-02T16:00:00Z'], [12.678, u'2016-01-02T20:00:00Z'], [12.91, u'2016-01-03T00:00:00Z']
],
u'temperature': [
[21.345, u'2016-01-01T00:00:00Z'], [None, u'2016-01-01T04:00:00Z'], [21.345, u'2016-01-01T08:00:00Z'], [None, u'2016-01-01T12:00:00Z'], [21.345, u'2016-01-01T16:00:00Z'], [None, u'2016-01-01T20:00:00Z'], [21.91, u'2016-01-02T00:00:00Z'], [None, u'2016-01-02T04:00:00Z'], [21.678, u'2016-01-02T08:00:00Z'], [None, u'2016-01-02T12:00:00Z'], [21.345, u'2016-01-02T16:00:00Z'], [None, u'2016-01-02T20:00:00Z'], [21.91, u'2016-01-03T00:00:00Z']
]
}
I want to use Python to create a pandas DataFrame similar to this:
time current product voltage temperature
------------------------------------------------------------------
2016-01-01 00:00:00 1.234 000000000000000 14.243 21.345
2016-01-01 04:00:00 2.234 000000000000000 14.723
2016-01-01 08:00:00 3.234 000000000000000 14.826 21.345
2016-01-01 12:00:00 1.234 000000000000000 13.284
2016-01-01 16:00:00 2.345 000000000000000 12.345 21.345
2016-01-01 20:00:00 2.678 000000000000000 12.678
2016-01-02 00:00:00 2.910 000000000000000 12.910 21.910
2016-01-02 04:00:00 2.345 000000000000000 12.345
2016-01-02 08:00:00 2.678 000000000000000 12.678 21.678
2016-01-02 12:00:00 2.910 000000000000000 12.910
2016-01-02 16:00:00 2.345 000000000000000 12.345 21.345
2016-01-02 20:00:00 2.678 000000000000000 12.678
2016-01-03 00:00:00 2.910 000000000000000 12.910 21.910
I have tried a very inefficient way of doing this, essentially writing the frame row by row. It takes far too long; I have already spent ages on it.
for i, line in enumerate(results['voltage']):
    aux_dict = {}
    for key in results.keys():
        try:
            results[key]
            aux_dict[key] = results[key][i][0]
            aux_dict['time'] = pd.to_datetime(line[1], infer_datetime_format=True)
            output.append(aux_dict)
        except:
            "Column '" + key + "' does not have data."
            continue
df = pd.DataFrame(output)
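Thanks in advance for your help.
Answer 0 (score: 0)
I had been meaning to answer this question for a while. In the end, I wrote a function that handles the different data inputs and creates a DataFrame with column names. I will only post the part that answers the question here.
Background: a request is made to an endpoint, and the result is in r.json()['data'], a dictionary keyed by label (e.g. "voltage", "current") where each value is a list (multiple measurements) of lists (measured value, time). Example: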
import pandas as pd
d = {
    'current': [[-1.8795300221255817, '2018-09-14T13:36:00Z']],
    'voltage': [[12.0, '2018-09-14T13:36:00Z']]
}
fields = ['current', 'voltage']
df = pd.DataFrame()
for field in fields:
    df_aux = pd.DataFrame(d[field], columns=[field, 'time'])  # see the definition of d above
    df_aux.set_index('time', inplace=True)
    df[field] = df_aux[field]
df.index = pd.to_datetime(df.index, errors='coerce')  # convert the index to datetime
print(df.head())
# When converting to datetime, remember to check that the format was read correctly.
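The same idea can be extended to the larger result set from the question, where some readings are None and the tags should become a column. The following is only a sketch under a few assumptions: the helper name results_to_frame and the shortened sample data are hypothetical, the result is assumed to be shaped like {field: [[value, time], ...]} with an optional 'tags' dictionary, and timestamps are assumed to be unique within each field.

```python
import pandas as pd

# Hypothetical sample in the same shape as the question's result set.
results = {
    'tags': {'product': '000000000000000'},
    'current': [[1.234, '2016-01-01T00:00:00Z'], [2.234, '2016-01-01T04:00:00Z']],
    'voltage': [[14.243, '2016-01-01T00:00:00Z'], [14.723, '2016-01-01T04:00:00Z']],
    'temperature': [[21.345, '2016-01-01T00:00:00Z'], [None, '2016-01-01T04:00:00Z']],
}


def results_to_frame(results, fields):
    """Build one time-indexed DataFrame from {field: [[value, time], ...]}."""
    series = []
    for field in fields:
        if not results.get(field):
            continue  # skip fields that are missing or empty instead of raising
        part = pd.DataFrame(results[field], columns=[field, 'time'])
        part['time'] = pd.to_datetime(part['time'])
        series.append(part.set_index('time')[field])
    if not series:
        return pd.DataFrame()
    # axis=1 aligns the fields on the time index; gaps show up as missing values.
    df = pd.concat(series, axis=1)
    # Broadcast each tag (e.g. product) as a constant column.
    for tag, value in results.get('tags', {}).items():
        df[tag] = value
    return df


df = results_to_frame(results, ['current', 'voltage', 'temperature'])
print(df)
```

Building each field as a Series and concatenating once avoids the row-by-row loop from the question, and missing readings simply appear as missing values in the corresponding column.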
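Thanks!
Answer 1 (score: 0)
I suggest using the Pinform library (an ORM for InfluxDB) to easily create measurement classes and read from or write to the database.
For your use case: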
import datetime
from pinform import Measurement, MeasurementUtils
from pinform.fields import FloatField
from pinform.tags import Tag
class CurrentAndVoltage(Measurement):
    class Meta:
        measurement_name = 'current_voltage'
    current = FloatField(null=False)
    voltage = FloatField(null=False)
# Create a single measurement point and convert it to a DataFrame
item = CurrentAndVoltage(time_point=datetime.datetime.now(), current=-1.87, voltage=12.0)
df = MeasurementUtils.to_dataframe([item])
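Answer 2 (score: 0)
Using the influxdb Python module, here is a lean solution that relies on parsing the ResultSet object returned by the InfluxDBClient.query method, without a GROUP BY clause in the query.
Suppose you have the following in Influx: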
> SELECT P FROM device WHERE time > now()-24h
name: device
time P
---- -
1612958108000000000 238
1612958108000000000 0
1612958108000000000 357
1612958108000000000 0
1612958108000000000 0
from os import environ
import pandas as pd
from influxdb import InfluxDBClient
def client(database=None):
    # Connection settings are read from environment variables
    return InfluxDBClient(
        username=environ['INFLUXDB_USER'],
        password=environ['INFLUXDB_PASS'],
        host=environ['INFLUXDB_HOST'],
        port=environ['INFLUXDB_PORT'],
        database=database
    )
r = client(database='test').query('SELECT P FROM device WHERE time > now()-24h')
df = pd.DataFrame(columns=['measurement', 'time', 'P'])
# Note: DataFrame.append was removed in pandas 2.0, so this loop needs pandas < 2.0.
for k, v in r.items():
    data = {'measurement': k[0]}  # k is (measurement name, tags)
    for p in v:
        data.update({'time': p['time'], 'P': p['P']})
        df = df.append(data, ignore_index=True)
df.head()
measurement time P
0 device 2021-02-10T11:55:08Z 238.0
1 device 2021-02-10T11:55:08Z 0.0
2 device 2021-02-10T11:55:08Z 357.0
3 device 2021-02-10T11:55:08Z 0.0
4 device 2021-02-10T11:55:08Z 0.0
If you query with a GROUP BY clause, suppose you have the following in Influx:
> SELECT P FROM device WHERE time > now()-24h GROUP BY "device_id", "asset_id"
name: device
tags: asset_id=57, device_id=44
time P
---- -
1612958108000000000 0
1612958108000000000 327
1612958108000000000 0
1612958108000000000 238
1612958108000000000 357
Make sure to parse the tags from the keys of the ResultSet:
r = client(database='test').query('SELECT P FROM device WHERE time > now()-24h GROUP BY "device_id", "asset_id"')
df = pd.DataFrame(columns=['measurement', 'time', 'P', 'device_id', 'asset_id'])
# Note: DataFrame.append was removed in pandas 2.0, so this loop needs pandas < 2.0.
for k, v in r.items():
    # k[1] holds the tag values of the GROUP BY series
    data = {'measurement': k[0], 'device_id': k[1]['device_id'], 'asset_id': k[1]['asset_id']}
    for p in v:
        data.update({'time': p['time'], 'P': p['P']})
        df = df.append(data, ignore_index=True)
df.head()
measurement time P device_id asset_id
0 device 2021-02-10T11:55:08Z 0.0 44 57
1 device 2021-02-10T11:55:08Z 327.0 44 57
2 device 2021-02-10T11:55:08Z 0.0 44 57
3 device 2021-02-10T11:55:08Z 238.0 44 57
4 device 2021-02-10T11:55:08Z 357.0 44 57
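On recent pandas versions, where DataFrame.append is no longer available, one possible variant is to collect plain dictionaries in a list and build the frame once at the end. The sketch below is not part of the original answer: items_to_frame and fake_items are hypothetical names, and fake_items is a hand-written stand-in for what r.items() is assumed to yield, namely pairs of ((measurement, tags), points), so that the example runs without an InfluxDB server.

```python
import pandas as pd


def items_to_frame(items):
    """Flatten ((measurement, tags), points) pairs into one DataFrame."""
    rows = []
    for (measurement, tags), points in items:
        for point in points:
            row = {'measurement': measurement}
            row.update(tags or {})  # tags is assumed to be None without GROUP BY
            row.update(point)       # adds 'time' and the selected fields
            rows.append(row)
    return pd.DataFrame(rows)


# Hypothetical stand-in for r.items() from a GROUP BY query.
fake_items = [
    (('device', {'device_id': '44', 'asset_id': '57'}),
     iter([{'time': '2021-02-10T11:55:08Z', 'P': 0},
           {'time': '2021-02-10T11:55:08Z', 'P': 327}])),
]

df = items_to_frame(fake_items)
df['time'] = pd.to_datetime(df['time'])  # parse the ISO 8601 strings
print(df)
```

With a real client this would presumably be called as items_to_frame(r.items()). Because the tags are merged generically, the same helper should cover both the plain query and the GROUP BY query.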