I have this df:
import pandas as pd

df = pd.DataFrame(
    {
        "Time": ["2020-04-09 04:40:40.559719", "2020-04-09 04:40:40.559719", "2020-04-09 04:40:40.559719",
                 "NaN", "NaN", "NaN", "NaN", "NaN", "2020-04-29 16:50:38.559871"],
        "Power": [7500, 6000, 6000, 0, 0, 0, 0, 0, 4200],
        "Total Energy": [5000, 5100, 5300, 5300, 5300, 5300, 5300, 5300, 5500],
        "ID": [1, 1, 1, "-", "-", "-", "-", "-", 2],
        "Energy": [500, 600, 800, 0, 0, 0, 0, 0, 200],
    },
    # "min" is the minute alias; the older "T" alias is deprecated in recent pandas
    index=pd.date_range(start="2020-04-09 6:45", periods=9, freq="min"),
)
# Parse the strings ("NaN" becomes NaT), then attach the Europe/Berlin time zone
df["Time"] = pd.to_datetime(df["Time"])
df["Time"] = df["Time"].dt.tz_localize("Europe/Berlin")
# Convert to numeric; values that cannot be parsed (such as "-") become NaN
df["Power"] = pd.to_numeric(df["Power"], errors="coerce")
df["Total Energy"] = pd.to_numeric(df["Total Energy"], errors="coerce")
df["ID"] = pd.to_numeric(df["ID"], errors="coerce")
df["Energy"] = pd.to_numeric(df["Energy"], errors="coerce")
df
Output:
                                                Time   Power  Total Energy   ID  Energy
2020-04-09 06:45:00 2020-04-09 04:40:40.559719+02:00  7500.0        5000.0  1.0   500.0
2020-04-09 06:46:00 2020-04-09 04:40:40.559719+02:00  6000.0        5100.0  1.0   600.0
2020-04-09 06:47:00 2020-04-09 04:40:40.559719+02:00  6000.0        5300.0  1.0   800.0
2020-04-09 06:48:00                              NaT       0        5300.0    -       0
2020-04-09 06:49:00                              NaT       0        5300.0    -       0
2020-04-09 06:50:00                              NaT       0        5300.0    -       0
2020-04-09 06:51:00                              NaT       0        5300.0    -       0
2020-04-09 06:52:00                              NaT       0        5300.0    -       0
2020-04-09 06:53:00 2020-04-29 04:50:38.559871+02:00  4200.0        5500.0  2.0   200.0
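A side note on the Time column: `tz_localize` only attaches a time zone to naive timestamps, which is why the wall-clock time above stays at 04:40:40 and merely gains the +02:00 offset. The expected result further down shows 06:40:40 instead, which is what a conversion from UTC to Europe/Berlin would give. A minimal sketch of that difference, assuming (this is only a guess) that the raw strings are meant as UTC; the names `raw`, `localized`, `converted` and `plain` are just for illustration:

```python
import pandas as pd

# Two sample values: one timestamp from the frame above and one missing entry
raw = pd.Series(pd.to_datetime(["2020-04-09 04:40:40.559719", None]))

# tz_localize declares the naive values to be Berlin local time (wall clock unchanged)
localized = raw.dt.tz_localize("Europe/Berlin")

# Assumption: the raw values are UTC. Localize to UTC first, then convert to Berlin.
converted = raw.dt.tz_localize("UTC").dt.tz_convert("Europe/Berlin")

# Drop the time zone again and cut off the sub-second part
plain = converted.dt.tz_localize(None).dt.floor("s")

print(localized[0])  # 2020-04-09 04:40:40.559719+02:00
print(plain[0])      # 2020-04-09 06:40:40
```

Missing entries stay NaT through all of these steps.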
I have to do two things:
Expected result:
                                   Time   Power  Total Energy   ID  Energy
2020-04-09 06:45:00 2020-04-09 06:40:40  7500.0        5000.0  1.0   500.0
2020-04-09 06:46:00 2020-04-09 06:40:40  6000.0        5100.0  1.0   600.0
2020-04-09 06:47:00 2020-04-09 06:40:40  6000.0        5300.0  1.0   800.0
2020-04-09 06:48:00                 NaT       0        5300.0    -       0
2020-04-09 06:49:00                 NaT       0        5300.0    -       0
2020-04-09 06:50:00                 NaT       0        5300.0    -       0
2020-04-09 06:51:00 2020-04-29 06:50:38       0        5300.0  2.0       0
2020-04-09 06:52:00 2020-04-29 06:50:38  7800.0        5400.0  2.0   130.0
2020-04-09 06:53:00 2020-04-29 06:50:38  4200.0        5500.0  2.0   200.0
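In the expected result, the Time and ID of the next session (ID 2) are carried back into the two rows directly before it. One possible way to get that part is `bfill` with `limit=2`. This is a rough sketch on a small hypothetical frame (`demo` is a made-up name, and the limit of 2 is simply read off the table above); it does not cover how Power, Total Energy and Energy should be filled in those rows.

```python
import numpy as np
import pandas as pd

# Hypothetical miniature of the frame above: last row of session 1,
# a gap of five empty rows, then the first row of session 2
demo = pd.DataFrame(
    {
        "Time": pd.to_datetime(
            ["2020-04-09 06:40:40"] + [None] * 5 + ["2020-04-29 06:50:38"]
        ),
        "ID": [1, np.nan, np.nan, np.nan, np.nan, np.nan, 2],
    },
    index=pd.date_range("2020-04-09 06:47", periods=7, freq="min"),
)

# Backfill at most the 2 rows directly before the next valid value;
# rows further back in the gap stay NaT / NaN
demo[["Time", "ID"]] = demo[["Time", "ID"]].bfill(limit=2)
print(demo)
```

With this, the rows at 06:51 and 06:52 pick up the Time and ID of the 06:53 row, while 06:48 to 06:50 remain empty.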
Thanks for your help :)