Matching fields in one data frame to column names in another data frame

Time: 2016-07-21 20:42:08

Tags: dataframe rstudio

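I have two data frames. In the last column ("Bill") of the first data frame, I want to apply a function (fixed price + quantity * price per quantity). To apply the function, R should match the values in the first column of df1 against the column names of df2.

I solved this by writing a function with several ifelse statements, but I would like a statement that automatically matches the values in df1 to the column names in df2. My data set has more than 2 million rows, and I need to apply the same reasoning to build other similar functions. Something that does not need loops and does not take too long to process would be ideal.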


2 answers:

Answer 0: (score: 1)

### Set up your data frames like so ###
Code <- c("a1", "a2", "c3", "a1")
Name <- c("Dan", "David", "Anna", "Lisa")
Quantity <- c(30, 12, 10, 10)

# data.frame() keeps Quantity numeric, so no conversion step is needed
df1 <- data.frame(Code, Name, Quantity, stringsAsFactors = FALSE)

fixed_price <- c(12, 5, 23)
price_per_qty <- c(1, 4, 7)

df2 <- as.data.frame(rbind("fixed_price" = fixed_price, "price_per_qty" = price_per_qty))
colnames(df2) <- c("a1", "a2", "c3")

### Combine dataframe 1 and 2 into a single dataframe ###

# Code below pulls individual columns from df2 based on the 
# index provided by the "Code" column in df1, transposes them
# so they'll line up with df1, then column binds them to df1
df3 <- cbind(df1, t(df2[,df1$Code]))

# the bill is calculated simply enough
bill <- df3[4] + df3[3] * df3[5]
colnames(bill) <- "bill"
# Finally, output the results as you wanted
cbind(df3, bill)
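
The column lookup above can also be written with `match()`, which returns the position of each code among the column names of df2 in one vectorized call. This is only a minimal sketch under the same sample data; the variable names `idx`, `fixed`, and `per_qty` are illustrative and not from the original answer, and it has not been timed on a 2-million-row data set.

```r
# Sample data mirroring the answer above
df1 <- data.frame(
  Code = c("a1", "a2", "c3", "a1"),
  Name = c("Dan", "David", "Anna", "Lisa"),
  Quantity = c(30, 12, 10, 10),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  a1 = c(12, 1), a2 = c(5, 4), c3 = c(23, 7),
  row.names = c("fixed_price", "price_per_qty")
)

# Position of each df1$Code among the column names of df2 (NA if absent)
idx <- match(df1$Code, colnames(df2))

# Turn each price row into a plain numeric vector, then index it by idx
fixed   <- unlist(df2["fixed_price", ], use.names = FALSE)[idx]
per_qty <- unlist(df2["price_per_qty", ], use.names = FALSE)[idx]

# Bill = fixed price + quantity * price per quantity
df1$Bill <- fixed + df1$Quantity * per_qty
print(df1)
```

A code that does not appear in df2 would give `NA` in `idx`, and therefore an `NA` bill, which may be easier to spot than a silently dropped row.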

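Answer 1: (score: 0)

My answer is fairly similar to graggsd's, but this is what worked for me. I merged the two data frames on the key "Code" and stored the result in one large data frame, combined_data. I then used a function, which I believe is the one you defined above, and passed the corresponding columns through it.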

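# Build the price table with one row per code
df2 <- t(data.frame(c(12,1),c(5,4),c(23,7)))
rownames(df2) <- c("a1","a2","c3")
test <- rownames(df2)
# stringsAsFactors = FALSE keeps "Code" as character in R versions before 4.0
df2 <- cbind.data.frame(df2,test, stringsAsFactors = FALSE)
colnames(df2) <- c("fixed price","price/qty","Code")

df1 <- data.frame(c("a1","a2","c3","a1"), c("Dan","David","Anna","Lisa"),c(30,12,10,10),
                  stringsAsFactors = FALSE)
colnames(df1) <- c("Code","Name","Quantity")

# Join on "Code" so every row of df1 picks up its prices
combined_data <- dplyr::inner_join(df1,df2, by = "Code")

# bill = fixed price + quantity * price per quantity
f1 <- function(x,y,z){
  x + y * z
}
bill <- f1(combined_data[,4],combined_data[,3],combined_data[,5])

finalDataSet <- cbind.data.frame(combined_data,bill)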

Final data set:

   Code  Name Quantity fixed price price/qty bill
1   a1   Dan       30          12         1   42
2   a2 David       12           5         4   53
3   c3  Anna       10          23         7   93
4   a1  Lisa       10          12         1   22
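
If dplyr is not available, a similar join can be sketched with base R's `merge()`. This assumes the prices are already stored with one row per code; the data frame name `prices` and its column names are hypothetical and chosen only for illustration. Note that `merge()` sorts the result by the key column by default, so the row order may differ from the table above.

```r
df1 <- data.frame(
  Code = c("a1", "a2", "c3", "a1"),
  Name = c("Dan", "David", "Anna", "Lisa"),
  Quantity = c(30, 12, 10, 10),
  stringsAsFactors = FALSE
)

# Hypothetical price table: one row per code
prices <- data.frame(
  Code = c("a1", "a2", "c3"),
  fixed_price = c(12, 5, 23),
  price_per_qty = c(1, 4, 7),
  stringsAsFactors = FALSE
)

# Inner join on "Code"; each row of df1 gains the two price columns
merged <- merge(df1, prices, by = "Code")

# bill = fixed price + quantity * price per quantity
merged$bill <- merged$fixed_price + merged$Quantity * merged$price_per_qty
print(merged)
```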