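I have a data table. For each datatable$Ppt and each datatable$nitem, whenever there is an "fffword" in datatable$Region, I need to extract the number of that "fffword" and compare it with the number of the "word" that follows it. If the two numbers are the same, I need the value 0 in datatable$Output; if they differ, I need the value 1 in datatable$Output.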
I tried:
datatable %>% group_by(Ppt, nitem) %>%
mutate(Output = ifelse(as.numeric(gsub("fffword([0-9]+).*","\\1",Region) == lag(as.numeric(gsub("word([0-9]+).*","\\1",Region)), 0L,ifelse(as.numeric(gsub("fffword([0-9]+).*","\\1",Region) != lag(as.numeric(gsub("word([0-9]+).*","\\1",Region)), 1L)
But it did not work.
#Ppt Region nitem Output
#1 "fffword8" 93 0 (current fffword n=8, following word n=8)
#1 "word8" 93 0 (previous fffword n=8, current word n=8)
#1 "fffword9" 93 1 (current fffword n=9, no following word for this ppt and this nitem)
#1 "word2" 122 1 (no previous fffword for this ppt and this nitem and this n Region)
#1 "fffword3" 122 0 (current fffword n=3, following word n=3)
#1 "word3" 122 0 (previous fffword n=3, current word n=3)
#1 "word6" 122 1 (no previous fffword for this ppt and this nitem and this n Region)
#1 "fffword7" 122 0
#1 "word7" 122 0
#1 "fffword8" 122 0
#1 "word8" 122 0
#54 "fffword8" 4 0
#54 "word8" 4 0
#54 "fffword9" 4 1
#54 "word2" 4 1
#54 "fffword2" 10 0
#54 "word4" 10 1
#54 "word6" 10 1
#54 "fffword23" 10 0
#54 "word23" 10 0
#54 "fffword24" 5 0
#54 "word24" 5 0
Answer 0 (score: 0)
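One attempt using nested ifelse and dplyr can give you the desired result. The approach is:
If the Region of the current row is an fffword, its number should match the number in the next row (lead); if the Region of the current row is a word, its number should be compared with the number in the previous row (lag).
If the next or previous row is not available, Output should be treated as 1.
The digits from Region are compared as character strings. There is no apparent reason to convert them to numeric before the comparison.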
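The approach described in this answer could be sketched roughly as follows. This is a minimal illustration, not necessarily the original code of the answer: the helper columns digit and neighbour are names introduced here, the sample data contains only the Ppt 1 rows from the question, and the regular expression assumes every Region value has the form "fffword<number>" or "word<number>".

```r
library(dplyr)

# Sample rows for Ppt 1, copied from the question
datatable <- data.frame(
  Ppt    = rep(1, 11),
  Region = c("fffword8", "word8", "fffword9",
             "word2", "fffword3", "word3", "word6",
             "fffword7", "word7", "fffword8", "word8"),
  nitem  = c(rep(93, 3), rep(122, 8)),
  stringsAsFactors = FALSE
)

result <- datatable %>%
  group_by(Ppt, nitem) %>%
  mutate(
    # Strip the "fffword" / "word" prefix; the number stays a character string
    digit = sub("^(fff)?word", "", Region),
    # fffword rows look at the next row, word rows at the previous row;
    # lead() and lag() return NA at the edges of each group
    neighbour = ifelse(grepl("^fffword", Region), lead(digit), lag(digit)),
    # 0 when the neighbouring number exists and matches, 1 otherwise
    Output = ifelse(!is.na(neighbour) & neighbour == digit, 0L, 1L)
  ) %>%
  ungroup() %>%
  select(Ppt, Region, nitem, Output)

print(result)
```

Because the data are grouped by Ppt and nitem, lead() and lag() never look across group boundaries, which is how a missing next or previous row ends up with Output equal to 1. Note that this sketch compares only the numbers and does not check whether the neighbouring row is itself a word or an fffword row.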