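I have two data frames of different dimensions: one holds observations, and the other holds constants used in calculations on those observations. I want to pick the appropriate constant from df2 for each row of observations in df1 to produce df3.
I've attached sample data and a simple equation: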
# df1 with annual observation data from different commodities
df1 <- data.frame(Region = c("Europe", "Asia", "N.Amer", "Africa"),
                  Item = c("Wheat", "Barley", "Oats", "Rice"),
                  Year = c(1961, 1961, 1961, 1961),
                  Production = c(2000, 1000, 1500, 500),
                  Imports = c(1000, 200, 3000, 100),
                  Stock.Var = c(-100, 300, 50, 0),
                  Exports = c(250, 150, 100, 200))
# df2 with constants for losses by commodity in different regions
df2 <- data.frame(Area = c("Asia", "N.Amer", "Europe", "Africa"),
                  Item = c("Wheat", "Oats", "Rice", "Barley"),
                  LF1 = c(0.02, 0.1, 0.15, 0.05))
# df3 would contain the outputs from calculating losses from df1 by df2 by row
# Equation: L1 = (Production + Imports + Stock.Var - Exports) * LF1
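The value of LF1 in the equation is taken from df2, matched on the Item and Region names in df1.
The full df1 is several hundred thousand rows by 16 columns; df2 is about 150 rows by 20 columns.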
Answer 0 (score: 0)
You can do this with dplyr's inner_join and mutate:
library(dplyr)
df1 %>% inner_join(df2, by = c(Region = "Area", "Item")) %>%
mutate(L1 = (Production + Imports + Stock.Var - Exports) * LF1)
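Note that inner_join keeps only the rows of df1 whose Region/Item pair also exists in df2. In the sample data as posted, only N.Amer/Oats appears in both frames, so the result has a single row. If unmatched observations should be kept (with NA for L1), a left_join is one option. The following is a minimal sketch on the sample data; it assumes df2's other columns are not needed in the output, and stringsAsFactors = FALSE is added here only so the key columns are character on older R versions. The names keys and unmatched are illustrative.

```r
library(dplyr)

# Sample data from the question
df1 <- data.frame(Region = c("Europe", "Asia", "N.Amer", "Africa"),
                  Item = c("Wheat", "Barley", "Oats", "Rice"),
                  Year = c(1961, 1961, 1961, 1961),
                  Production = c(2000, 1000, 1500, 500),
                  Imports = c(1000, 200, 3000, 100),
                  Stock.Var = c(-100, 300, 50, 0),
                  Exports = c(250, 150, 100, 200),
                  stringsAsFactors = FALSE)
df2 <- data.frame(Area = c("Asia", "N.Amer", "Europe", "Africa"),
                  Item = c("Wheat", "Oats", "Rice", "Barley"),
                  LF1 = c(0.02, 0.1, 0.15, 0.05),
                  stringsAsFactors = FALSE)

# Join keys: df1$Region pairs with df2$Area, Item pairs with Item
keys <- c("Region" = "Area", "Item" = "Item")

# Keep every row of df1; rows without a constant get NA for LF1 and L1
df3 <- df1 %>%
  left_join(select(df2, Area, Item, LF1), by = keys) %>%
  mutate(L1 = (Production + Imports + Stock.Var - Exports) * LF1)

# Rows of df1 that have no matching constant in df2
unmatched <- anti_join(df1, df2, by = keys)
```

Inspecting unmatched before running the full calculation is a cheap way to spot spelling differences between the Region and Area names.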
If you would rather stick with base R, you can use merge:
m <- merge(df1, df2, by.x = c("Region", "Item"), by.y = c("Area", "Item"))
m$L1 <- (m$Production + m$Imports + m$Stock.Var - m$Exports) * m$LF1
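By default, merge sorts the result on the key columns and drops rows of df1 that have no match in df2. If the original row order and all observations need to be preserved, a lookup with match is one base R alternative. This is a sketch on the sample data, not part of the original answer; it assumes each Region/Item pair occurs at most once in df2 and that the names never contain the "|" separator used to build the keys. key1 and key2 are illustrative names.

```r
# Sample data from the question
df1 <- data.frame(Region = c("Europe", "Asia", "N.Amer", "Africa"),
                  Item = c("Wheat", "Barley", "Oats", "Rice"),
                  Year = c(1961, 1961, 1961, 1961),
                  Production = c(2000, 1000, 1500, 500),
                  Imports = c(1000, 200, 3000, 100),
                  Stock.Var = c(-100, 300, 50, 0),
                  Exports = c(250, 150, 100, 200))
df2 <- data.frame(Area = c("Asia", "N.Amer", "Europe", "Africa"),
                  Item = c("Wheat", "Oats", "Rice", "Barley"),
                  LF1 = c(0.02, 0.1, 0.15, 0.05))

# Build one composite key per row in each data frame
key1 <- paste(df1$Region, df1$Item, sep = "|")
key2 <- paste(df2$Area, df2$Item, sep = "|")

# For each df1 row, find the position of its key in df2 (NA if absent)
df3 <- df1
df3$LF1 <- df2$LF1[match(key1, key2)]

# Equation from the question; rows without a constant yield NA
df3$L1 <- (df3$Production + df3$Imports + df3$Stock.Var - df3$Exports) * df3$LF1
```

With the posted sample data only the N.Amer/Oats row finds a constant, giving L1 = (1500 + 3000 + 50 - 100) * 0.1 = 445; the other three rows are NA.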