Question

I have a survey data frame that contains multiple entries for the same dates and IDs. I need to group this dataset so that each date/ID combination ends up on a single row, but I am running into problems using gather, spread and group_by.

Here is a sample of the data:

# surveys dataset
user_id <- c(100, 100, 100, 200, 200, 200)
int_id <- c(1000, 1000, 1000, 2000, 2000, 2000)
fech <- c('01/01/2019', '01/01/2019','01/01/2019','02/01/2019','02/01/2019','02/01/2019')
order <- c(1,2,3,1,2,3)
questions <- c('question1','question2','question3','question1','question2','question3')
answers <- c('answ1','answ2','answ3','answ1','answ2','answ3')

survey.data <- data.frame(user_id, int_id, fech, order, questions,answers)

> survey.data
  user_id int_id       fech order questions answers
1     100   1000 01/01/2019     1 question1   answ1
2     100   1000 01/01/2019     2 question2   answ2
3     100   1000 01/01/2019     3 question3   answ3
4     200   2000 02/01/2019     1 question1   answ1
5     200   2000 02/01/2019     2 question2   answ2
6     200   2000 02/01/2019     3 question3   answ3

I use spread to turn the question values into columns:

survey.data %>% 
  spread(key= questions, value=answers) %>%
  group_by(user_id,int_id, fech) %>% 
  select(-order)

I tried to group the resulting dataset, but I always get 6 rows instead of the 2 I expect.

This is the output I get:

# A tibble: 6 x 6
  user_id int_id       fech question1 question2 question3
*   <dbl>  <dbl>     <fctr>    <fctr>    <fctr>    <fctr>
1     100   1000 01/01/2019     answ1        NA        NA
2     100   1000 01/01/2019        NA     answ2        NA
3     100   1000 01/01/2019        NA        NA     answ3
4     200   2000 02/01/2019     answ1        NA        NA
5     200   2000 02/01/2019        NA     answ2        NA
6     200   2000 02/01/2019        NA        NA     answ3

My question is very similar to this one, but I can't figure out how to apply it to my case.

Answer 1

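I think you need to remove order before calling spread. Otherwise order remains an identifying column, and because its value differs within each group, every original row stays on its own row: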

survey.data %>% 
  select(-order) %>% 
  # group_by(user_id, int_id, fech) %>% # as pointed out, unnecessary
  spread(questions, answers)

Result:

# A tibble: 2 x 6
# Groups:   user_id, int_id, fech [2]
  user_id int_id fech       question1 question2 question3
    <int>  <int> <chr>      <chr>     <chr>     <chr>    
1     100   1000 01/01/2019 answ1     answ2     answ3    
2     200   2000 02/01/2019 answ1     answ2     answ3
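
To make the effect of the order column concrete, here is a minimal sketch that rebuilds the data frame from the question and compares the row counts with and without order. The names with_order and without_order are introduced only for this illustration, and the sketch assumes dplyr and tidyr are installed.

```r
library(dplyr)
library(tidyr)

# Same data as in the question, written more compactly
survey.data <- data.frame(
  user_id   = c(100, 100, 100, 200, 200, 200),
  int_id    = c(1000, 1000, 1000, 2000, 2000, 2000),
  fech      = rep(c("01/01/2019", "02/01/2019"), each = 3),
  order     = c(1, 2, 3, 1, 2, 3),
  questions = rep(c("question1", "question2", "question3"), times = 2),
  answers   = rep(c("answ1", "answ2", "answ3"), times = 2)
)

# order kept: each (user_id, int_id, fech, order) combination is unique,
# so spread cannot merge any rows
with_order <- survey.data %>%
  spread(key = questions, value = answers)

# order dropped: rows are identified by (user_id, int_id, fech) only
without_order <- survey.data %>%
  select(-order) %>%
  spread(key = questions, value = answers)

nrow(with_order)     # 6
nrow(without_order)  # 2
```

A later group_by cannot repair the first result, because group_by only marks groups and does not combine rows.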

Answer 2

I found what I think is another possible solution:

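library(dplyr)     # %>% and select
library(reshape2)  # assumption: dcast is taken from reshape2
survey.data %>% 
  select(-order) %>% 
  dcast(... ~ questions, value.var = "answers")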
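
For readers on tidyr 1.0.0 or later, where spread is superseded, the same reshape can probably be written with pivot_wider. This is only a sketch and not something either answer proposes: the variable name wide is made up for the example, and id_cols is used so that order is left out of the result without a separate select step.

```r
library(tidyr)

# Same data as in the question
survey.data <- data.frame(
  user_id   = c(100, 100, 100, 200, 200, 200),
  int_id    = c(1000, 1000, 1000, 2000, 2000, 2000),
  fech      = rep(c("01/01/2019", "02/01/2019"), each = 3),
  order     = c(1, 2, 3, 1, 2, 3),
  questions = rep(c("question1", "question2", "question3"), times = 2),
  answers   = rep(c("answ1", "answ2", "answ3"), times = 2)
)

# id_cols names the columns that identify one output row; columns not
# listed in id_cols, names_from or values_from (here: order) are dropped
wide <- pivot_wider(
  survey.data,
  id_cols     = c(user_id, int_id, fech),
  names_from  = questions,
  values_from = answers
)

wide
```

With this data the result should have one row per user_id, int_id and fech combination and one column per question.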

How to use spread and group_by in a single-row dataset
