Question

Possible Duplicate:
create new data frame from a function of other data frames

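I got some help with my first question on SOF, but I did not know how to respond to the people who answered. So I am posting again with sample code (which I should have done the first time - I am learning).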

I have two data frames. Let's pretend, for the sake of explanation:

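DF1: columns represent types of grain: corn, oats, wheat, etc. Rows represent months of the year: jan, feb, etc. Elements are the price per ton for that type of grain purchased in that particular month.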

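DF2: columns represent countries: Spain, Chile, Mexico, etc. The rows of this frame represent additional costs for each country, perhaps: packaging costs, shipping costs, country import taxes, inspection fees, etc.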

Now I want to construct a third data frame:

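DF3: it represents, for each country, the total cost of a mix of grains (e.g. 10% corn, 50% oats, ...) together with the associated shipping, tax, etc. costs. Assume there exists an equation (using data from df1 and df2) that calculates the total cost per month for each country, for a given grain mix plus that country's additional costs.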

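In other words, df3 has 12 rows (months) and as many columns as there are countries. Its elements are the total cost of grain + additional costs for each country, for each month.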

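This takes two minutes in Excel / Gnumeric and 15 minutes in Fortran or C, yet I have spent two days struggling with the R Cookbook and internet searches. And I don't have anyone down the hall to holler at, "Hey, Kevin, how do you do this in R ...?"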

It seems so simple, but as a newbie I am overlooking some fundamental point...

Thanks in advance. Here is my pretend code that illustrates my problem.


# build df1 - cost of grains (with goofy data so I can track the arithmetic)
  v1 <- c(1:12)
  v2 <- c(13:24)
  v3 <- c(25:36)
  v4 <- c(37:48)
  grain <- data.frame("wheat"=v1,"oats"=v2,"corn"=v3,"rye"=v4)

  grain


# build df2 - additional costs (again, with goofy data to see what is being used where and when)
  w1 <- c(1.3:4.3)
  w2 <- c(5.3:8.3)
  w3 <- c(9.3:12.3)
  w4 <- c(13.3:16.3)
  cost <- data.frame("Spain"=w1,"Peru"=w2,"Mexico"=w3,"Kenya"=w4)
  row.names(cost) <- c("packing","shipping","tax","inspection")

  cost


# assume 10% wheat, 30% oats and 60% rye with some clown-equation for total cost

# now for my feeble attempt at getting a data frame that has 12 rows (months) and 4 columns (countries)

  total_cost <- data.frame( 0.1*grain[,"wheat"] +
                            0.3*grain[,"oats"] +
                            0.6*grain[,"rye"] +
                            cost["packing","Mexico"] +
                            cost["shipping","Mexico"] +
                            cost["tax","Mexico"]  +
                            cost["inspection","Mexico"] )
  total_cost

# this gives the correct values for the total cost for Mexico, for each month.

# and if I plug in the other countries, I get correct answers for that country
# I guess I can run a loop over the countries, but this is R, not Fortran or C.

# btw, my real equation is considerably more complicated, using functions involving
# multiple columns of df1 and df2 data, so there is no "every column of df1 gets
# multiplied by..." or any one-to-one column-row matches.
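
One possible way to extend the Mexico-only calculation to every country without an explicit loop is to wrap the equation in a function of the country name and apply it over names(cost). This is only a minimal sketch using the toy data above: total_for is a hypothetical helper name, and the 10% / 30% / 60% blend is the clown-equation from the attempt, not the real formula.

```r
# Same toy data as in the question
grain <- data.frame(wheat = 1:12, oats = 13:24, corn = 25:36, rye = 37:48)
cost <- data.frame(Spain  = c(1.3, 2.3, 3.3, 4.3),
                   Peru   = c(5.3, 6.3, 7.3, 8.3),
                   Mexico = c(9.3, 10.3, 11.3, 12.3),
                   Kenya  = c(13.3, 14.3, 15.3, 16.3),
                   row.names = c("packing", "shipping", "tax", "inspection"))

# Hypothetical helper: monthly total cost for a single country.
# The body can be replaced by any equation that uses grain and cost.
total_for <- function(country) {
  blend <- 0.1 * grain$wheat + 0.3 * grain$oats + 0.6 * grain$rye
  blend + sum(cost[, country])  # add that country's extra costs to every month
}

# sapply() calls total_for once per country and binds the twelve-element
# results into a 12 x 4 matrix whose columns are named after the countries
total_cost <- as.data.frame(sapply(names(cost), total_for))
rownames(total_cost) <- month.abb  # built-in month abbreviations: Jan, Feb, ...

total_cost
```

Because total_for receives the country name and can index into both data frames freely, the same pattern should still apply when the real equation mixes several columns of df1 and several rows of df2.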

data frame arithmetic

0 Answers: