R - How to compare two data frames and update list-column values

Time: 2017-10-20 11:11:24

Tags: r

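I have two data frames, dataframe 1 and dataframe 2. How can I compare them on the columns P.Name, Name, and Q.Name, update the values for rows whose keys match, and append the rows that differ? Please see below.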

Dataframe 1

P.Name  Name   Q.Name         values
Read    Mike   salseID        list(value = "Y2TS", countofvalues = 1)
Write   jhon   Purchasedcust  list(value = "ANDERSON", countofvalues = 1)
Write   jhon   shippingname   list(value = "Mikel", countofvalues = 5)
Read    peter  ordername      list(value = c("july", "mary", "petersonavail"), countofvalues = c(1, 2, 1))
Write   jack   deliveredadd   list(value = c("IICC PS LOL UY", "IICC UYY LOL UY"), countofvalues = c(2, 1))

Dataframe 2

P.Name  Name   Q.Name         values
Read    Mike   salseID        list(value = "Y2TS", countofvalues = 1)
Write   jhon   Purchasedcust  list(value = "vjantony", countofvalues = 1)
Write   jhon   CustaAddress   list(value = "Mikel", countofvalues = 5)
Read    peter  ordername      list(value = c("july", "mary", "parker"), countofvalues = c(1, 2, 1))

Expected dataframe:

P.Name  Name   Q.Name         values
Read    Mike   salseID        list(value = "Y2TS", countofvalues = 2)
Write   jhon   Purchasedcust  list(value = c("ANDERSON", "vjantony"), countofvalues = c(1, 1))
Write   jhon   shippingname   list(value = "Mikel", countofvalues = 5)
Write   jhon   CustaAddress   list(value = "Mikel", countofvalues = 5)
Read    peter  ordername      list(value = c("july", "mary", "petersonavail", "parker"), countofvalues = c(2, 4, 1, 1))
Write   jack   deliveredadd   list(value = c("IICC PS LOL UY", "IICC UYY LOL UY"), countofvalues = c(2, 1))

Input data for dataframe 1:

structure(list(P.Name = c("Read", "Write", "Write", "Read", "Write"
), Name = c("Mike", "jhon", "jhon", "peter", "jack"), Q.Name = c("salseID", 
"Purchasedcust", "shippingname", "ordername", "deliveredadd"), 
    values = list(structure(list(value = "Y2TS", countofvalues = 1L), .Names = c("value", 
    "countofvalues"), row.names = c(NA, -1L), class = c("tbl_df", 
    "tbl", "data.frame")), structure(list(value = "ANDERSON", 
        countofvalues = 1L), .Names = c("value", "countofvalues"
    ), row.names = c(NA, -1L), class = c("tbl_df", "tbl", "data.frame"
    )), structure(list(value = "Mikel", countofvalues = 5L), .Names = c("value", 
    "countofvalues"), row.names = c(NA, -1L), class = c("tbl_df", 
    "tbl", "data.frame")), structure(list(value = c("july", "mary", 
    "petersonavail"), countofvalues = c(1L, 2L, 1L)), .Names = c("value", 
    "countofvalues"), row.names = c(NA, -5L), class = c("tbl_df", 
    "tbl", "data.frame")), structure(list(value = c("IICC PS LOL UY", 
    "IICC UYY LOL UY"), countofvalues = c(2L, 1L)), .Names = c("value", 
    "countofvalues"), row.names = c(NA, -3L), class = c("tbl_df", 
    "tbl", "data.frame")))), .Names = c("P.Name", "Name", "Q.Name", 
"values"), row.names = c(NA, -5L), class = "data.frame")

Input data for dataframe 2:

structure(list(P.Name = c("Read", "Write", "Write", "Read"), 
    Name = c("Mike", "jhon", "jhon", "peter"), Q.Name = c("salseID", 
    "Purchasedcust", "CustaAddress", "ordername"), values = list(
        structure(list(value = "Y2TS", countofvalues = 1L), .Names = c("value", 
        "countofvalues"), row.names = c(NA, -1L), class = c("tbl_df", 
        "tbl", "data.frame")), structure(list(value = "vjantony", 
            countofvalues = 1L), .Names = c("value", "countofvalues"
        ), row.names = c(NA, -1L), class = c("tbl_df", "tbl", 
        "data.frame")), structure(list(value = "Mikel", countofvalues = 5L), .Names = c("value", 
        "countofvalues"), row.names = c(NA, -4L), class = c("tbl_df", 
        "tbl", "data.frame")), structure(list(value = c("july", 
        "mary", "parker"), countofvalues = c(1L, 2L, 1L)), .Names = c("value", 
        "countofvalues"), row.names = c(NA, -3L), class = c("tbl_df", 
        "tbl", "data.frame")))), .Names = c("P.Name", "Name", 
"Q.Name", "values"), row.names = c(NA, -4L), class = "data.frame")

1 answer:

Answer 0 (score: 0)

You can try a tidyverse/dplyr solution:

library(tidyverse)
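# d1 and d2 are the two data frames from the question's dput() output.
# Drop the NA rows inside each nested tibble first; otherwise the pipeline
# below will not work. I don't know whether those rows are important.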
d1$values <- lapply(d1$values, function(x) x[!is.na(x[,1]),])
d2$values <- lapply(d2$values, function(x) x[!is.na(x[,1]),])

d1 %>% 
  unnest() %>% 
  bind_rows(unnest(d2)) %>% 
  group_by(P.Name, Name, Q.Name, value) %>% 
  summarise(countofvalues=sum(countofvalues)) 
# A tibble: 11 x 5
# Groups:   P.Name, Name, Q.Name [?]
   P.Name  Name        Q.Name           value countofvalues
    <chr> <chr>         <chr>           <chr>         <int>
 1   Read  Mike       salseID            Y2TS             2
 2   Read peter     ordername            july             2
 3   Read peter     ordername            mary             4
 4   Read peter     ordername          parker             1
 5   Read peter     ordername   petersonavail             1
 6  Write  jack  deliveredadd  IICC PS LOL UY             2
 7  Write  jack  deliveredadd IICC UYY LOL UY             1
 8  Write  jhon  CustaAddress           Mikel             5
 9  Write  jhon Purchasedcust        ANDERSON             1
10  Write  jhon Purchasedcust        vjantony             1
11  Write  jhon  shippingname           Mikel             5
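
The group_by() plus summarise(sum) step above can also be written with dplyr's count() and its wt argument, which sums a weight column per group. This is only a minimal sketch: long1 and long2 are hypothetical stand-ins for the unnested d1 and d2, restricted to the peter/ordername rows, and the name argument assumes a reasonably recent dplyr.

```r
library(dplyr)

# Hypothetical long-form stand-ins for unnest(d1) and unnest(d2),
# limited to the rows with Name == "peter" and Q.Name == "ordername".
long1 <- tibble(
  P.Name = "Read", Name = "peter", Q.Name = "ordername",
  value = c("july", "mary", "petersonavail"),
  countofvalues = c(1L, 2L, 1L)
)
long2 <- tibble(
  P.Name = "Read", Name = "peter", Q.Name = "ordername",
  value = c("july", "mary", "parker"),
  countofvalues = c(1L, 2L, 1L)
)

# Stack both tables, then sum countofvalues for each key + value combination.
combined <- bind_rows(long1, long2) %>%
  count(P.Name, Name, Q.Name, value, wt = countofvalues, name = "countofvalues")

print(combined)
```

Values present in both inputs (july, mary) end up with their counts added together, while values present in only one input (petersonavail, parker) keep their original count.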

Then you can nest the last column again using nest():

d1 %>% 
  unnest() %>% 
  bind_rows(unnest(d2)) %>% 
  group_by(P.Name, Name, Q.Name, value) %>% 
  summarise(countofvalues=sum(countofvalues)) %>% 
  nest(.key = "values")
# A tibble: 6 x 4
  P.Name  Name        Q.Name           values
   <chr> <chr>         <chr>           <list>
1   Read  Mike       salseID <tibble [1 x 2]>
2   Read peter     ordername <tibble [4 x 2]>
3  Write  jack  deliveredadd <tibble [2 x 2]>
4  Write  jhon  CustaAddress <tibble [1 x 2]>
5  Write  jhon Purchasedcust <tibble [2 x 2]>
6  Write  jhon  shippingname <tibble [1 x 2]>
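
On newer package versions, bare unnest() and nest(.key = ) are deprecated, so the same pipeline may be easier to maintain with the columns named explicitly. The following is a sketch under assumptions, not a replacement for the answer above: it assumes tidyr 1.0 or later and dplyr 1.0 or later, and it rebuilds d1 and d2 directly with tibble() (same keys and values as in the question) so that there are no NA rows to strip first.

```r
library(dplyr)
library(tidyr)

# Stand-ins for the question's data frames, built without padded NA rows.
d1 <- tibble(
  P.Name = c("Read", "Write", "Write", "Read", "Write"),
  Name   = c("Mike", "jhon", "jhon", "peter", "jack"),
  Q.Name = c("salseID", "Purchasedcust", "shippingname",
             "ordername", "deliveredadd"),
  values = list(
    tibble(value = "Y2TS", countofvalues = 1L),
    tibble(value = "ANDERSON", countofvalues = 1L),
    tibble(value = "Mikel", countofvalues = 5L),
    tibble(value = c("july", "mary", "petersonavail"),
           countofvalues = c(1L, 2L, 1L)),
    tibble(value = c("IICC PS LOL UY", "IICC UYY LOL UY"),
           countofvalues = c(2L, 1L))
  )
)
d2 <- tibble(
  P.Name = c("Read", "Write", "Write", "Read"),
  Name   = c("Mike", "jhon", "jhon", "peter"),
  Q.Name = c("salseID", "Purchasedcust", "CustaAddress", "ordername"),
  values = list(
    tibble(value = "Y2TS", countofvalues = 1L),
    tibble(value = "vjantony", countofvalues = 1L),
    tibble(value = "Mikel", countofvalues = 5L),
    tibble(value = c("july", "mary", "parker"),
           countofvalues = c(1L, 2L, 1L))
  )
)

# Long form: one row per key + value, with counts summed across both inputs.
long <- bind_rows(unnest(d1, values), unnest(d2, values)) %>%
  group_by(P.Name, Name, Q.Name, value) %>%
  summarise(countofvalues = sum(countofvalues), .groups = "drop")

# Nest value and countofvalues back into a list column named "values".
nested <- long %>%
  nest(values = c(value, countofvalues))

print(nested)
```

Naming the nested columns in nest() means the result no longer depends on the grouping left behind by summarise(), which is what the .key form relied on.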