我有一张带有两张纸的Excel文件。
('表1')包含数据:
DATE TMAX TMIN
20110706 317 211
20110707 322 211
20110708 317 211
20110709 322 211
20110710 328 222
20110711 333 244
20110712 356 250
20110713 356 222
和另一个('表2')包括:
Start Date End Date Rep Month Cost kWh kW
7/6/2011 8/3/2011 July 5,065.17 76,640 205
8/3/2011 9/7/2011 August 5,572.38 86,640 195
我的目标是在('表1和#39;)上写下另一栏,了解kwh值是否属于特定日期范围('表2')
例如:
DATE TMAX TMIN kWh
20110706 317 211 76640
20110707 322 211 76640
20110708 317 211 76640
20110709 322 211 76640
20110710 328 222 76640
20110711 333 244 76640
20110712 356 250 76640
20110713 356 222 76640
20110801 344 228 76640
20110802 356 200 76640
20110803 367 200 86640
20110804 361 228 86640
我不知道为什么我的代码导致df [" kWh"]为空(' NaN')导致写入空白的kWh列('工作表Sheet&#39)
以下是我的代码:
import pandas as pd
from pandas import ExcelWriter
df = pd.read_excel("thecddhddtest.xlsx",'Sheet1')
df2 = pd.read_excel("thecddhddtest.xlsx",'Sheet2')
df.head()
df["DATE"] = pd.to_datetime(df["DATE"], format="%Y%m%d")
pd.to_datetime(df2["Start Date"], format="%m/%d/%Y")
df3 = df2.set_index("Start Date")
df["DATE"] = pd.to_datetime(df["DATE"], format="%Y%m%d")
df2["Start Date"] = pd.to_datetime(df2["Start Date"], format="%m/%d/%Y")
df3["kWh"].reindex(df["DATE"], method="ffill")
df["kWh"] = df3["kWh"].reindex(df["DATE"], method="ffill")
print(df["kWh"])
writer = ExcelWriter('thecddhddtestkWh.xlsx')
df.to_excel(writer,'Sheet1',index=False)
df2.to_excel(writer,'Sheet2',index=False)
writer.save()
导致:
DATE TMAX TMIN kWh
20110706 317 211
20110707 322 211
20110708 317 211
20110709 322 211
20110710 328 222
20110711 333 244
20110712 356 250
20110713 356 222
答案 0 :(得分:1)
尝试我的解决方案 - 仅将DATE
中的列df
设置为index
,然后设置reindex
。
df["DATE"] = pd.to_datetime(df["DATE"], format="%Y%m%d")
#set column DATE to index
df = df.set_index("DATE")
df2["Start Date"] = pd.to_datetime(df2["Start Date"], format="%m/%d/%Y")
df3 = df2.set_index("Start Date")
#reindex by index of df
df["kWh"] = df3["kWh"].reindex(df.index, method="ffill")
print(df["kWh"])
#DATE
#2011-07-06 76,640
#2011-07-07 76,640
#2011-07-08 76,640
#2011-07-09 76,640
#2011-07-10 76,640
#2011-07-11 76,640
#2011-07-12 76,640
#2011-07-13 76,640
#2011-08-01 76,640
#2011-08-02 76,640
#2011-08-03 86,640
#2011-08-04 86,640
#Name: kWh, dtype: object