Aggregating two data frame columns without any existing pattern or logic

Time: 2016-06-01 12:55:01

Tags: r dataframe

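I need to aggregate the values of a data frame by quasi-merging two columns.

On the one hand, some row values need to be changed (renamed); on the other hand, the aggregation has to be done manually, because it follows no pattern or logic. Since this may sound complicated or hard to follow, please see the sample code and images.

The data set looks like this: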

set.seed(1253)
dates <- as.Date(c(Sys.Date()+10))
fruits <- c("Apple","Apple","Apple","Apple","Banana","Banana","Banana","Banana",
  "Strawberry","Strawberry","Strawberry","Strawberry","Grape", "Grape",
  "Grape","Grape", "Kiwi","Kiwi","Kiwi","Kiwi")
parts <- c("Big Green Apple","Default","Blue Apple","XYZ Apple4",
  "Yellow Banana1","Small Banana","Banana3","Banana4",
  "Red Small Strawberry","Red StrawberryY","Big Strawberry", "StrawberryZ",
  "Green Grape", "Green Grape", "Blue Grape", "Blue Grape", 
  "Big Kiwi","Small Kiwi", "Kiwi","Default")
stock <- as.vector(sample(1:20))

theDF <- data.frame(dates, fruits, parts, stock)

theDF


Intermediate step toward the correct aggregation:

[image]

The final data frame should look like this:

[image]

I hope there is a solution. Thanks in advance!

1 answer:

Answer 0 (score: 4)

set.seed(1253)
dates <- as.Date(c(Sys.Date()+10))
fruits <- c("Apple","Apple","Apple","Apple","Banana","Banana","Banana","Banana",
            "Strawberry","Strawberry","Strawberry","Strawberry","Grape", "Grape",
            "Grape","Grape", "Kiwi","Kiwi","Kiwi","Kiwi")
parts <- c("Big Green Apple","Default","Blue Apple","XYZ Apple4",
           "Yellow Banana1","Small Banana","Banana3","Banana4",
           "Red Small Strawberry","Red StrawberryY","Big Strawberry", "StrawberryZ",
           "Green Grape", "Green Grape", "Blue Grape", "Blue Grape", 
           "Big Kiwi","Small Kiwi", "Kiwi","Default")
stock <- as.vector(sample(1:20))

theDF <- data.frame(dates, fruits, parts, stock)

theDF

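There are several ways to do this. If you had many more distinct `parts` values, I would suggest writing some custom regular expressions to help. With a manageable number like this, it is easier to relabel the affected rows by hand and then aggregate, as follows.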

theDF$fruits <- as.character(theDF$fruits)

theDF$fruits[theDF$fruits == "Grape" & theDF$parts == "Blue Grape"]  <- "Small Grape"
theDF$fruits[theDF$fruits == "Grape" & theDF$parts == "Green Grape"] <- "Big Grape"

df <- aggregate(theDF$stock, by = list(theDF$dates, theDF$fruits), FUN = sum)
colnames(df) <- c("dates", "fruits", "stock")

df
       dates      fruits stock
1 2016-06-11       Apple    40
2 2016-06-11      Banana    37
3 2016-06-11   Big Grape    15
4 2016-06-11        Kiwi    33
5 2016-06-11 Small Grape    21
6 2016-06-11  Strawberry    64
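
As a possible variation on the `aggregate()` call above, the formula interface keeps the original column names, so the `colnames()` step may not be needed. This is only a minimal sketch on a small hypothetical data frame; `toyDF`, `toySums`, and the stock values are made up for illustration and are not the question's data.

```r
# Hypothetical toy data, not the question's data set
toyDF <- data.frame(
  dates  = as.Date("2016-06-11"),
  fruits = c("Apple", "Apple", "Big Grape", "Small Grape", "Small Grape"),
  stock  = c(3, 4, 5, 1, 2),
  stringsAsFactors = FALSE
)

# Sum stock within each dates/fruits combination.
# The formula interface names the result columns after the variables used.
toySums <- aggregate(stock ~ dates + fruits, data = toyDF, FUN = sum)
toySums
```

The left-hand side of the formula is the value being summed and the right-hand side lists the grouping columns, which mirrors the `by = list(...)` argument used in the answer.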
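
The regular-expression route mentioned in the answer could look roughly like the following. This is a sketch under assumptions: the patterns `^Blue` and `^Green`, and the toy objects `grapeDF`, `isGrape`, and `grapeSums`, are chosen here for illustration only, and real `parts` values would likely need patterns of their own.

```r
# Hypothetical toy data, not the question's data set
grapeDF <- data.frame(
  fruits = c("Grape", "Grape", "Grape", "Kiwi", "Kiwi"),
  parts  = c("Blue Grape", "Green Grape", "Blue Grape", "Big Kiwi", "Default"),
  stock  = c(1, 2, 3, 4, 5),
  stringsAsFactors = FALSE
)

# Remember which rows are grapes before any relabelling happens
isGrape <- grapeDF$fruits == "Grape"

# Relabel by pattern match on `parts` instead of exact string comparison
grapeDF$fruits[isGrape & grepl("^Blue",  grapeDF$parts)] <- "Small Grape"
grapeDF$fruits[isGrape & grepl("^Green", grapeDF$parts)] <- "Big Grape"

# Aggregate on the relabelled column
grapeSums <- aggregate(stock ~ fruits, data = grapeDF, FUN = sum)
grapeSums
```

Because `grepl()` matches on a pattern, one line can cover several spellings of a part name (for example anything starting with "Blue"), which is where this approach would pay off over exact comparisons once the list of parts grows.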