Question

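I have a database containing text variables and codes used for qualitative analysis. A row is generated every time a code is applied, which means that if a sentence has 3 codes applied, the database will have 3 rows for it. I want to merge these rows, keeping the data in the remaining variables and summing the code variables.

I have been looking for a way to do this and cannot find one.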

library(tibble)  # provides tibble()

example<-tibble(segments=c('Brexit is bad','Brexit is bad','We need a sit on the table','We need a sit on the table'),
   actor=c("SNP", "SNP", "Labour", "Labour"),
   year=c(2015, 2015, 2017,2017),
   TL_Brexit=c(1,0,0,0),
   Bre_negative=c(0,1,0,0),
   TL_participation=c(0,0,1,0),
   TD_other=c(0,0,0,1))

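As you can see, there are two quotes, each coded with 2 codes. I want to merge them so that there are 2 rows instead of 4, with the 1s and 0s in the code variables summed (the year, segments and actor variables stay unchanged, since they are identical within each quote). The result should look like this: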

desiredoutput<-tibble(segments=c('Brexit is bad','We need a sit on the table'),
   actor=c("SNP", "Labour"),
   year=c(2015, 2017),
   TL_Brexit=c(1,0),
   Bre_negative=c(1,0),
   TL_participation=c(0,1),
   TD_other=c(0,1))

Any help would be welcome!

Answer 1

If you group by segments, actor and year, you can summarise each group by taking the sum of the other columns.

library(dplyr)

example %>% 
  group_by(segments, actor, year) %>% 
  summarise_all(sum)

# # A tibble: 2 x 7
# # Groups:   segments, actor [2]
#   segments                 actor  year TL_Brexit Bre_negative TL_participation TD_other
#   <chr>                    <chr> <dbl>     <dbl>        <dbl>            <dbl>    <dbl>
# 1 Brexit is bad            SNP    2015         1            1                0        0
# 2 We need a sit on the ta~ Labo~  2017         0            0                1        1
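
In dplyr 1.0.0 and later, `summarise_all()` is marked as superseded in favour of `across()`. The following is a minimal sketch of the same grouping written with that interface, assuming dplyr >= 1.0.0 is installed; the name `merged` is only illustrative, and `.groups = "drop"` is an optional choice that returns an ungrouped tibble.

```r
library(dplyr)
library(tibble)

# Same data as in the question: one row per applied code.
example <- tibble(
  segments = c('Brexit is bad', 'Brexit is bad',
               'We need a sit on the table', 'We need a sit on the table'),
  actor = c("SNP", "SNP", "Labour", "Labour"),
  year = c(2015, 2015, 2017, 2017),
  TL_Brexit = c(1, 0, 0, 0),
  Bre_negative = c(0, 1, 0, 0),
  TL_participation = c(0, 0, 1, 0),
  TD_other = c(0, 0, 0, 1)
)

# Group by the identifying columns, then sum every remaining column.
# across(everything(), ...) skips the grouping columns automatically.
merged <- example %>%
  group_by(segments, actor, year) %>%
  summarise(across(everything(), sum), .groups = "drop")

print(merged)
```

If only some of the code columns should be summed, `everything()` could be swapped for a narrower selection such as `starts_with("TL_")`.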

How to merge rows that are identical in some variables while combining the values of other variables
