Question

y       app          baby           blackberry          dear   
Neg      20          33.33           100                100       
Neutral  80          66.66            0                  0     
Pos      0            0               0                  0

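In the data frame above, the term "app" has its highest proportion in the Neutral class, so I need to associate "app" with the Neutral sentiment. Similarly, the term "blackberry" has its highest proportion in the Neg class, so "blackberry" must be associated with the Neg class.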

Can someone help me with this?

Answer 1

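The expected output is not clear. If we need to find the 'y' entry that corresponds to the maximum value in each of the other columns, use summarise_each with which.max to get the numeric index of the maximum, and use that index to subset the 'y' values for the columns 'app' to 'dear'.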

library(dplyr)
res <- df1 %>%
         summarise_each(funs(y[which.max(.)]), app:dear)
res
#       app    baby blackberry dear
# 1 Neutral Neutral        Neg  Neg

This can be converted to 'long' format with gather from tidyr

library(tidyr)
res %>%
    gather()
#        key   value
#1        app Neutral
#2       baby Neutral
#3 blackberry     Neg
#4       dear     Neg
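
In current tidyr releases gather() is superseded by pivot_longer(), so the same result can probably be reached by reshaping first and summarising per term. The following is only a sketch under that assumption: it rebuilds df1 from the question so it runs on its own, and the names long_res and pct are made up for illustration.

```r
library(dplyr)
library(tidyr)

# Same data as in the question, rebuilt here so the sketch is self-contained
df1 <- data.frame(
  y          = c("Neg", "Neutral", "Pos"),
  app        = c(20, 80, 0),
  baby       = c(33.33, 66.66, 0),
  blackberry = c(100, 0, 0),
  dear       = c(100, 0, 0),
  stringsAsFactors = FALSE
)

# Reshape to one row per (class, term), then keep the class
# with the largest percentage for each term
long_res <- df1 %>%
  pivot_longer(-y, names_to = "key", values_to = "pct") %>%
  group_by(key) %>%
  summarise(value = y[which.max(pct)])

long_res
```

This should give one row per term with 'key' and 'value' columns, matching the gather() output above.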

Or we can melt it into 'long' format (melt from data.table) and use which.max

library(data.table)
melt(setDT(df1), id.var = "y")[, .(value = y[which.max(value)]),.(key= variable)]
#          key   value
#1:        app Neutral
#2:       baby Neutral
#3: blackberry     Neg
#4:       dear     Neg

Or using base R

df1$y[sapply(df1[-1], which.max)]
#[1] "Neutral" "Neutral" "Neg"     "Neg"
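
The base R result above is an unnamed vector, so it may help to keep the term names attached. A minimal sketch, assuming df1 is a plain data.frame as in the question (the names idx and top_class are made up for illustration). Note that which.max returns the first maximum, so a term with tied percentages is assigned to the earliest class in 'y'.

```r
# Same data as in the question, rebuilt here so the sketch is self-contained
df1 <- data.frame(
  y          = c("Neg", "Neutral", "Pos"),
  app        = c(20, 80, 0),
  baby       = c(33.33, 66.66, 0),
  blackberry = c(100, 0, 0),
  dear       = c(100, 0, 0),
  stringsAsFactors = FALSE
)

# Row index of the maximum in every term column (named by column)
idx <- sapply(df1[-1], which.max)

# Look up the class label for each index and carry the term names over
top_class <- setNames(df1$y[idx], names(idx))
top_class
```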

Data

df1 <- structure(list(y = c("Neg", "Neutral", "Pos"), app = c(20L, 80L, 
0L), baby = c(33.33, 66.66, 0), blackberry = c(100L, 0L, 0L), 
dear = c(100L, 0L, 0L)), .Names = c("y", "app", "baby", "blackberry", 
"dear"), class = "data.frame", row.names = c(NA, -3L))
