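I'm trying to build a country-specific index based on the import shares of certain food commodities.
I have two data sets: prices contains time series of prices for a number of food commodities, and weights contains the country-specific import shares of those commodities (see the mock data below).
What I want to create is a country-specific food price index: the sum, over the imported commodities, of each price series multiplied by its import share.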
So, in the mock data, the food price index for Australia would be:
FOOD_ct = 0.12 * WHEAT_t + 0.08 * SUGAR_t
where c denotes the country and t the time period.
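To make the formula concrete, here is a one-month worked example. The prices 100 and 150 are made-up values chosen only for illustration; the shares are Australia's from the mock weights table.

```r
# Hypothetical prices for a single month (illustrative values only)
wheat_t <- 100
sugar_t <- 150

# Australia's import shares from the mock weights table
share_wheat <- 0.12
share_sugar <- 0.08

# Index = sum of share * price over the imported items
food_australia_t <- share_wheat * wheat_t + share_sugar * sugar_t
food_australia_t
# [1] 24
```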
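So basically my question is: how do I multiply the columns by the rows for each country?
I have some experience with R, but in trying to solve this I seem to be punching above my weight. I haven't found any useful pointers elsewhere either, so I'm hoping one of you has a good suggestion.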
## Code to create mock data:
## Generate data on country weights
country<-c(rep("Australia",2),rep("Zimbabwe",3))
item<-c("Wheat","Sugar","Wheat","Sugar","Soybeans")
itemcode<-c(1,2,1,2,3)
share<-c(0.12,0.08,0.16,0.08,0.03)
weights<-data.frame(country,item,itemcode,share)
## Generate data on price index
date<-seq(as.Date("2005/1/1"),by="month",length.out=12)
Wheat<-runif(12,80,160)
Sugar<-runif(12,110,230)
Soybeans<-runif(12,60,130)
prices<-data.frame(date,Wheat,Sugar,Soybeans)
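One note on the mock data: runif() draws new random numbers on every run, so the prices (and any index computed from them) change each time. A small sketch of how a seed makes the draws repeatable; the seed value 42 is arbitrary and not part of the original post.

```r
set.seed(42)  # arbitrary seed, chosen only for illustration
wheat_a <- runif(12, 80, 160)

set.seed(42)  # resetting the same seed reproduces the same draws
wheat_b <- runif(12, 80, 160)

identical(wheat_a, wheat_b)
# [1] TRUE
```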
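Edit: solution
Thanks to alexwhan for his suggestion (unfortunately I can't upvote it, for lack of Stack Overflow street cred). The solution from dnlbrky was the easiest to implement with my original data.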
## Load data.table package
library(data.table)
## Convert data to data table
prices<-data.table(prices)
weights<-data.table(weights,key="item")
## Extract names for all the food commodities
vars<-names(prices)[!names(prices) %in% "date"]
## Unstack items to create table in long format
prices<-data.table(date=prices[,date], stack(prices,vars),key="ind")
## Rename the columns
setnames(prices,c("values","ind"),c("price","item"))
## Calculate the food price index
priceindex<-weights[prices,allow.cartesian=TRUE][,list(index=sum(share*price)),
                                                 by=list(country,date)]
## Order food price index if not done automatically
priceindex<-priceindex[order(country,date)]
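Answer 0 (score: 2)
Here's one option. There's definitely a neater way to do it, but it should get you going.
First, I cast weights into wide format, which is easier to work with for our purposes: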
library(reshape2)
weights.c <- dcast(weights, country ~ item, value.var = "share")
# country Soybeans Sugar Wheat
# 1 Australia NA 0.08 0.12
# 2 Zimbabwe 0.03 0.08 0.16
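As an aside, the same wide layout can be sketched in base R with xtabs(), which fills country/item combinations that do not occur with 0 rather than NA. This is an alternative I am suggesting, not part of the original answer; the name W is just for illustration.

```r
weights <- data.frame(
  country = c(rep("Australia", 2), rep("Zimbabwe", 3)),
  item    = c("Wheat", "Sugar", "Wheat", "Sugar", "Soybeans"),
  share   = c(0.12, 0.08, 0.16, 0.08, 0.03)
)

# xtabs() sums `share` within each country/item cell; empty cells become 0
W <- xtabs(share ~ country + item, data = weights)
W
```

Having 0 instead of NA for Australia's soybeans means a later sum of share * price is not turned into NA.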
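Then I use apply to iterate over each row of weights.c and calculate the 'food price index' (tell me if I've calculated this incorrectly; I think I followed the example correctly...).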
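# Missing shares (items a country does not import) count as 0
weights.c[is.na(weights.c)] <- 0
# Look up each share by item name so it is paired with the matching price column
FOOD <- as.data.frame(apply(weights.c, 1, function(x)
  as.numeric(x["Soybeans"]) * prices$Soybeans +
  as.numeric(x["Sugar"]) * prices$Sugar +
  as.numeric(x["Wheat"]) * prices$Wheat))
Add the country and date identifiers:
colnames(FOOD) <- weights.c$country
FOOD$date <- prices$date
FOOD
# Output as originally posted; the prices are unseeded random draws, so your numbers will differ.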
# Australia Zimbabwe date
# 1 35.04337 39.99131 2005-01-01
# 2 38.95579 44.72377 2005-02-01
# 3 33.45708 38.50418 2005-03-01
# 4 30.42181 34.04647 2005-04-01
# 5 36.03443 39.90905 2005-05-01
# 6 46.21269 52.29347 2005-06-01
# 7 41.88694 48.15334 2005-07-01
# 8 34.47848 39.83654 2005-08-01
# 9 36.32498 40.60091 2005-09-01
# 10 33.74768 37.17185 2005-10-01
# 11 38.84855 44.87495 2005-11-01
# 12 36.45119 40.11678 2005-12-01
Hope that's close enough to what you're after...
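If the list of commodities is long, writing out one term per item gets tedious. A possible generalisation (my sketch, not the answerer's code) is a matrix product of the price matrix with the transposed share matrix. The two-month price table below uses made-up numbers so the result can be checked by hand; W and P are illustrative names.

```r
weights <- data.frame(
  country = c(rep("Australia", 2), rep("Zimbabwe", 3)),
  item    = c("Wheat", "Sugar", "Wheat", "Sugar", "Soybeans"),
  share   = c(0.12, 0.08, 0.16, 0.08, 0.03)
)
# Small fixed price table (made-up numbers, two months only)
prices <- data.frame(
  date     = as.Date(c("2005-01-01", "2005-02-01")),
  Wheat    = c(100, 110),
  Sugar    = c(150, 140),
  Soybeans = c(80, 90)
)

# Country-by-item share matrix; items a country does not import get 0
W <- unclass(xtabs(share ~ country + item, data = weights))

# Date-by-item price matrix, with columns in the same order as W
P <- as.matrix(prices[, colnames(W)])

# Each cell of P %*% t(W) is the sum over items of price * share
FOOD <- data.frame(date = prices$date, P %*% t(W))
FOOD
#         date Australia Zimbabwe
# 1 2005-01-01      24.0     30.4
# 2 2005-02-01      24.4     31.5
```

Selecting the price columns by colnames(W) keeps items aligned even if the two tables list them in a different order.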
Answer 1 (score: 1)
I would unstack/reshape the items in the prices table into long format, and then use data.table to join the prices and the weights.
## Load the data.table package:
library(data.table)
## Generate data table for country weights:
weights<-data.table(country=c(rep("Australia",2),rep("Zimbabwe",3)),
item=c("Wheat","Sugar","Wheat","Sugar","Soybeans"),
itemcode=c(1,2,1,2,3),
share=c(0.12,0.08,0.16,0.08,0.03),
key="item")
## Generate data table for price index:
prices<-data.table(date=seq(as.Date("2005/1/1"),by="month",length.out=12),
Wheat=runif(12,80,160),
Sugar=runif(12,110,230),
Soybeans=runif(12,60,130))
## Get column names of all the food types:
vars<-names(prices)[!names(prices) %in% "date"]
## Unstack the items and create a "long" table:
prices<-data.table(date=prices[,date], stack(prices,vars),key="ind")
## Rename the columns:
setnames(prices,c("values","ind"),c("price","item"))
prices[1:5]
## date price item
## 1: 2005-01-01 88.25818 Soybeans
## 2: 2005-02-01 71.61261 Soybeans
## 3: 2005-03-01 77.91082 Soybeans
## 4: 2005-04-01 129.05806 Soybeans
## 5: 2005-05-01 74.63005 Soybeans
## Join the weights and prices tables, multiply the share by the price, and sum by country and date:
weights[prices,allow.cartesian=TRUE][,list(index=sum(share*price)),by=list(country,date)]
## country date index
## 1: Zimbabwe 2005-01-01 27.05711
## 2: Zimbabwe 2005-02-01 34.72842
## 3: Zimbabwe 2005-03-01 35.23615
## 4: Zimbabwe 2005-04-01 39.05027
## 5: Zimbabwe 2005-05-01 39.48388
## 6: Zimbabwe 2005-06-01 33.43677
## 7: Zimbabwe 2005-07-01 32.55172
## 8: Zimbabwe 2005-08-01 34.86790
## 9: Zimbabwe 2005-09-01 33.29748
## 10: Zimbabwe 2005-10-01 38.31180
## 11: Zimbabwe 2005-11-01 31.29709
## 12: Zimbabwe 2005-12-01 40.70930
## 13: Australia 2005-01-01 21.07165
## 14: Australia 2005-02-01 27.47660
## 15: Australia 2005-03-01 27.03025
## 16: Australia 2005-04-01 29.34917
## 17: Australia 2005-05-01 31.95188
## 18: Australia 2005-06-01 26.22890
## 19: Australia 2005-07-01 24.58945
## 20: Australia 2005-08-01 27.44728
## 21: Australia 2005-09-01 27.02199
## 22: Australia 2005-10-01 31.58282
## 23: Australia 2005-11-01 24.42326
## 24: Australia 2005-12-01 31.70109
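For what it's worth, the same reshape-and-join idea can also be sketched with data.table's melt() and an explicit on= join, which avoids relying on keys. This is a variant under my own assumptions (a recent data.table version, and a two-month price table with made-up numbers so the result can be checked by hand), not the answerer's code.

```r
library(data.table)

weights <- data.table(
  country = c(rep("Australia", 2), rep("Zimbabwe", 3)),
  item    = c("Wheat", "Sugar", "Wheat", "Sugar", "Soybeans"),
  share   = c(0.12, 0.08, 0.16, 0.08, 0.03)
)
# Small fixed price table (made-up numbers, two months only)
prices <- data.table(
  date     = as.Date(c("2005-01-01", "2005-02-01")),
  Wheat    = c(100, 110),
  Sugar    = c(150, 140),
  Soybeans = c(80, 90)
)

# Wide -> long: one row per date/item pair, item kept as character
prices_long <- melt(prices, id.vars = "date",
                    variable.name = "item", value.name = "price",
                    variable.factor = FALSE)

# Join each price row to every country importing that item, then aggregate
priceindex <- weights[prices_long, on = "item", allow.cartesian = TRUE][
  , .(index = sum(share * price)), by = .(country, date)]
setorder(priceindex, country, date)
priceindex
```

With these inputs the Australia index should come out as 24 and 24.4, and the Zimbabwe index as 30.4 and 31.5.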