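I have a data frame containing exchange rate data. I want to insert rows for the base currency (Norwegian krone) with a rate of 1 and a unit of "Units" across the whole date range (from the minimum date to the maximum date).
I tried merging data frames, but had no luck. The data is needed for further tasks.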
Currency Date Rate UoM
0 Swedish krona 2016-01-05 1.0395 Hundreds
1 Swedish krona 2016-01-06 1.0422 Hundreds
2 Swedish krona 2016-01-07 1.0452 Hundreds
3 Swedish krona 2016-01-08 1.0450 Hundreds
4 Swedish krona 2016-01-11 1.0437 Hundreds
5 Swedish krona 2016-01-12 1.0422 Hundreds
6 Swedish krona 2016-01-13 1.0338 Hundreds
7 Swedish krona 2016-01-14 1.0347 Hundreds
8 Swedish krona 2016-01-15 1.0279 Hundreds
9 Swedish krona 2016-01-18 1.0371 Hundreds
... ... ... ... ...
3313 US dollar 2019-03-15 8.5674 Units
3314 US dollar 2019-03-18 8.5223 Units
3315 US dollar 2019-03-19 8.5178 Units
3316 US dollar 2019-03-20 8.5358 Units
3317 US dollar 2019-03-21 8.4463 Units
3318 US dollar 2019-03-22 8.5315 Units
3319 US dollar 2019-03-25 8.5289 Units
The expected output is new rows in the data frame, i.e.
3320 Norwegian krone 2016-01-06 1 Units
3321 Norwegian krone 2016-01-07 1 Units
3322 Norwegian krone 2016-01-08 1 Units
3323 Norwegian krone 2016-01-11 1 Units
... ... ... ... ...
XXXX Norwegian krone 2019-03-21 1 Units
XXXX Norwegian krone 2019-03-22 1 Units
XXXX Norwegian krone 2019-03-25 1 Units
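Answer 0 (score: 0)
The trick is to get a date range that has the same gaps as the source data, then efficiently construct the repeated rows to append and sort. When constructing the DataFrame, a single dictionary of scalar values is enough to populate it, because pandas broadcasts each scalar to every row of the supplied index.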
For example, with a sample of the question's data:
from io import StringIO  # pandas.compat.StringIO is not available in recent pandas versions

import pandas as pd

print(pd.__version__)
csvdata = StringIO("""Currency,Date,Rate,UoM
Swedish krona,2016-01-05,1.0395,Hundreds
Swedish krona,2016-01-06,1.0422,Hundreds
Swedish krona,2016-01-07,1.0452,Hundreds
Swedish krona,2016-01-08,1.0450,Hundreds
Swedish krona,2016-01-11,1.0437,Hundreds
Swedish krona,2016-01-12,1.0422,Hundreds
Swedish krona,2016-01-13,1.0338,Hundreds
Swedish krona,2016-01-14,1.0347,Hundreds
Swedish krona,2016-01-15,1.0279,Hundreds
Swedish krona,2016-01-18,1.0371,Hundreds
US dollar,2019-03-15,8.5674,Units
US dollar,2019-03-18,8.5223,Units
US dollar,2019-03-19,8.5178,Units
US dollar,2019-03-20,8.5358,Units
US dollar,2019-03-21,8.4463,Units
US dollar,2019-03-22,8.5315,Units
US dollar,2019-03-25,8.5289,Units""")
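df = pd.read_csv(csvdata, sep=",", parse_dates=['Date'])
df = df.set_index('Date')
# Distinct dates present in the source data, so the same gaps (e.g. weekends) are kept
# and a date shared by several currencies yields only one Norwegian krone row
date_range = df.index.unique()
# Scalar values in the dict are broadcast to every date in the index
nk_df = pd.DataFrame(index=date_range, data={'Currency': 'Norwegian krone', 'Rate': 1, 'UoM': 'Units'})
df = pd.concat([df, nk_df])
print(df.sort_index().head(10))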
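The question's expected output keeps Date as an ordinary column and continues the integer row numbers, so a variant along the following lines may be closer to what was asked. This is only a minimal sketch: the helper name add_base_currency is hypothetical, and the US dollar rates in the sample are made-up illustrative values, not real data.

```python
from io import StringIO

import pandas as pd

# Small sample in the question's layout; the US dollar rates are made up for illustration
csvdata = StringIO("""Currency,Date,Rate,UoM
Swedish krona,2016-01-05,1.0395,Hundreds
Swedish krona,2016-01-06,1.0422,Hundreds
US dollar,2016-01-05,8.0,Units
US dollar,2016-01-06,8.1,Units
""")


def add_base_currency(df, name="Norwegian krone"):
    """Append one row per distinct date for the base currency (hypothetical helper)."""
    # Distinct dates only, so a date shared by several currencies is added once
    dates = df["Date"].drop_duplicates().sort_values()
    base = pd.DataFrame(
        {"Currency": name, "Date": dates.to_numpy(), "Rate": 1.0, "UoM": "Units"}
    )
    # ignore_index=True renumbers the rows so the new ones continue the existing index
    return pd.concat([df, base], ignore_index=True)


df = pd.read_csv(csvdata, parse_dates=["Date"])
result = add_base_currency(df)
print(result)
```

Because the dates are taken from the data itself rather than generated with pd.date_range, days without quotes in the source are not added for the base currency either.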