Fill a Combination column with the corresponding IDs in R

Time: 2016-12-29 19:30:36

Tags: r dataframe dplyr

I have a data frame like this

ID <- c("A","B","C","D","E","F",
        "ALL","ALL","ALL")
Measurement <- c("Length","Length","Breadth","Height","Width","Width"
           ,"Length","Height_Breadth","Width")
Combination <- NA
df1 <- data.frame(ID,Measurement,Combination)

I am trying to fill the Combination column, for the rows where ID is "ALL", with the corresponding IDs based on the Measurement column.

If ID -> ALL and Measurement is Length then Combination is A,B
If ID -> ALL and Measurement is Height_Breadth then Combination is D,C

My desired output is

   ID    Measurement Combination
    A         Length        <NA>
    B         Length        <NA>
    C        Breadth        <NA>
    D         Height        <NA>
    E          Width        <NA>
    F          Width        <NA>
  ALL         Length         A,B
  ALL Height_Breadth         D,C
  ALL          Width         E,F

I get an error when I try to do it like this

if(df1$ID = 'ALL' & df1$Measurement = 'Length')
{
  df1$Combination <- paste(df1$ID, collapse=",")
}

Can someone point me in the right direction to achieve this?

2 Answers:

Answer 0: (score: 0)

You can try this:

library(dplyr)
df1 %>% 
       # create a new grouping variable that maps Breadth and Height onto Height_Breadth
       group_by(Group = ifelse(Measurement %in% c("Breadth", "Height"), "Height_Breadth", Measurement)) %>% 

       # within each group, paste the non-ALL IDs and assign the result to the rows where ID is "ALL"
       mutate(Combination = ifelse(ID == "ALL", toString(ID[ID != "ALL"]), NA)) %>% 

       # drop the helper Group column
       ungroup() %>% select(-Group)


# A tibble: 9 x 3
#     ID    Measurement Combination
#  <chr>          <chr>       <chr>
#1     A         Length        <NA>
#2     B         Length        <NA>
#3     C        Breadth        <NA>
#4     D         Height        <NA>
#5     E          Width        <NA>
#6     F          Width        <NA>
#7   ALL         Length        A, B
#8   ALL Height_Breadth        C, D
#9   ALL          Width        E, F

Answer 1: (score: 0)

Here is an option with dplyr
library(dplyr)
df1 %>% 
   filter(ID != "ALL") %>% 
   group_by(Measurement = replace(Measurement, 
                Measurement %in% c("Breadth", "Height"), "Height_Breadth")) %>%
   summarise(Combination = toString(ID), ID = "ALL") %>%
   bind_rows(filter(df1, ID != "ALL"), .) 
#    ID    Measurement Combination
#1   A         Length        <NA>
#2   B         Length        <NA>
#3   C        Breadth        <NA>
#4   D         Height        <NA>
#5   E          Width        <NA>
#6   F          Width        <NA>
#7 ALL Height_Breadth        C, D
#8 ALL         Length        A, B
#9 ALL          Width        E, F

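Or, using base R, we change the levels of 'Measurement' from 'Breadth' and 'Height' to 'Height_Breadth', aggregate the 'ID' by 'Measurement' by pasting them together, create the 'ID' column, and merge with the original dataset.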
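
The base R code for this answer is not shown above, so the following is only a minimal sketch of the same idea, not the answerer's original code. It assumes the columns are character rather than factor (hence stringsAsFactors = FALSE), and it uses match() instead of merge() for the final step so that the original row order is kept. The helper names ids, combos and is_all are illustrative.

```r
# Rebuild the example data with character columns (assumption: no factors)
df1 <- data.frame(
  ID = c("A", "B", "C", "D", "E", "F", "ALL", "ALL", "ALL"),
  Measurement = c("Length", "Length", "Breadth", "Height", "Width", "Width",
                  "Length", "Height_Breadth", "Width"),
  Combination = NA_character_,
  stringsAsFactors = FALSE
)

# Keep only the individual IDs
ids <- df1[df1$ID != "ALL", c("ID", "Measurement")]

# Map Breadth and Height onto the combined label used by the "ALL" rows
ids$Measurement[ids$Measurement %in% c("Breadth", "Height")] <- "Height_Breadth"

# Paste the IDs together within each measurement group
combos <- aggregate(ids["ID"], by = ids["Measurement"], FUN = toString)

# Look up the pasted IDs for each "ALL" row; all other rows stay NA
is_all <- df1$ID == "ALL"
df1$Combination[is_all] <-
  combos$ID[match(df1$Measurement[is_all], combos$Measurement)]

df1
```

As in the dplyr answers, toString() joins the IDs with ", " and keeps them in their original order, so the Height_Breadth row gets C before D. If the exact "D,C" form from the question is required, the pasting step would need its own ordering and separator.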