Split a column with multiple variables into 2 columns (not key and value)

Time: 2016-10-13 13:52:47

Tags: r

I have this data frame tk, which is a subset of my original data:

tk

> ##    document   term count    sentiment
> ## 1       111 happen     1 anticipation
> ## 2       111   time     1 anticipation
> ## 3       112 mother     1 anticipation
> ## 4       112 mother     1          joy
> ## 5       112 mother     1     negative
> ## 6       112 mother     1     positive
> ## 7       112 mother     1      sadness
> ## 8       112 mother     1        trust
> ## 9       112    sue     1        anger
> ## 10      112    sue     1     negative
> ## 11      112    sue     1      sadness
> ## 12      112  wrong     1     negative
> ## 13      113   suck     1     negative
> ## 14      114   gate     1        trust

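I need to:

  • Add a new column (tk$positive_negative) that contains only the values "positive" and "negative" from the sentiment variable.
  • Add another new column (tk$emotions) that contains every value other than "positive" and "negative", also from the sentiment variable.

I tried a loop, but could not get it to work: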

for (i in tk$sentiment){
  ifelse(i=="positive",tk$positive_negative<-"positive",ifelse(i=="negative",tk$positive_negative<-"negative",tk$emotions<-paste(print(i))))
}

> ## [1] "anticipation"
> ## [1] "anticipation"
> ## [1] "anticipation"
> ## [1] "joy"
> ## [1] "sadness"
> ## [1] "trust"
> ## [1] "anger"
> ## [1] "sadness"
> ## [1] "trust"

tk

> ##    document   term count    sentiment emotions positive_negative
> ## 1       111 happen     1 anticipation    trust          negative
> ## 2       111   time     1 anticipation    trust          negative
> ## 3       112 mother     1 anticipation    trust          negative
> ## 4       112 mother     1          joy    trust          negative
> ## 5       112 mother     1     negative    trust          negative
> ## 6       112 mother     1     positive    trust          negative
> ## 7       112 mother     1      sadness    trust          negative
> ## 8       112 mother     1        trust    trust          negative
> ## 9       112    sue     1        anger    trust          negative
> ## 10      112    sue     1     negative    trust          negative
> ## 11      112    sue     1      sadness    trust          negative
> ## 12      112  wrong     1     negative    trust          negative
> ## 13      113   suck     1     negative    trust          negative
> ## 14      114   gate     1        trust    trust          negative

Please advise, thanks.

1 answer:

Answer 0: (score: 1)

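See @Sotos's comment. ifelse is already vectorized, which means it evaluates the test for every element of the vector by itself, so no loop is needed. Vectorized functions are also usually much faster than an explicit loop.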
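To illustrate what "vectorized" means here, a minimal sketch on a small made-up vector (x, is_polarity, polarity and df are hypothetical names, not from the question). It also suggests why the loop in the question left the same value in every row: assigning a single string to a column recycles it into all rows, so only the last assignment survives.

```r
# Hypothetical example vector, not taken from the question's data
x <- c("joy", "negative", "positive", "trust")

# ifelse() evaluates the test for every element and returns a vector
# of the same length, so no loop is required
is_polarity <- x %in% c("positive", "negative")
polarity <- ifelse(is_polarity, x, "")
polarity
# [1] ""         "negative" "positive" ""

# Assigning one value to a column recycles it into every row,
# which is what each pass of the loop in the question did
df <- data.frame(x = x, stringsAsFactors = FALSE)
df$flag <- "negative"
df$flag
# [1] "negative" "negative" "negative" "negative"
```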

That said, to solve your problem, all you need to do is:

tk$positive_negative <- ifelse(tk$sentiment %in% c("positive","negative"),tk$sentiment,"")
tk$emotions <- ifelse(tk$sentiment %in% c("positive","negative"),"",tk$sentiment)

tk
   document   term count    sentiment positive_negative     emotions
1       111 happen     1 anticipation                   anticipation
2       111   time     1 anticipation                   anticipation
3       112 mother     1 anticipation                   anticipation
4       112 mother     1          joy                            joy
5       112 mother     1     negative          negative             
6       112 mother     1     positive          positive             
7       112 mother     1      sadness                        sadness
8       112 mother     1        trust                          trust
9       112    sue     1        anger                          anger
10      112    sue     1     negative          negative             
11      112    sue     1      sadness                        sadness
12      112  wrong     1     negative          negative             
13      113   suck     1     negative          negative             
14      114   gate     1        trust                          trust
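One caveat, offered as a sketch rather than as part of the original answer: the two ifelse calls above rely on tk$sentiment being a character vector, as it is in the data given in this answer. If the column were a factor, ifelse would return the underlying integer codes rather than the labels, so converting with as.character first is safer. The small data frame demo and the use of NA instead of "" are assumptions made purely for illustration.

```r
# Hypothetical small data frame with sentiment stored as a factor
demo <- data.frame(
  sentiment = factor(c("joy", "negative", "positive", "trust"))
)

# Convert to character first so ifelse() returns labels, not integer codes
s <- as.character(demo$sentiment)
is_polarity <- s %in% c("positive", "negative")

# NA_character_ marks "no value" explicitly instead of an empty string
demo$positive_negative <- ifelse(is_polarity, s, NA_character_)
demo$emotions <- ifelse(is_polarity, NA_character_, s)
demo
```

Whether to use "" or NA for the empty cells depends on what happens next: NA works with is.na() and is skipped by most summary functions, while "" matches the output shown in the answer.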

Data:

    tk <- structure(list(document = c(111L, 111L, 112L, 112L, 112L, 112L, 
112L, 112L, 112L, 112L, 112L, 112L, 113L, 114L), term = structure(c(2L, 
6L, 3L, 3L, 3L, 3L, 3L, 3L, 5L, 5L, 5L, 7L, 4L, 1L), .Label = c("gate", 
"happen", "mother", "suck", "sue", "time", "wrong"), class = "factor"), 
    count = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
    1L, 1L), sentiment = c("anticipation", "anticipation", "anticipation", 
    "joy", "negative", "positive", "sadness", "trust", "anger", 
    "negative", "sadness", "negative", "negative", "trust")), .Names = c("document", 
"term", "count", "sentiment"), row.names = c(NA, -14L), class = "data.frame")