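My data frame df contains a simple conversation flow (columns partner.id and conversation.label). The data represents the type of question asked by the admin role and the answers given by the interviewee.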
df = data.frame('partner.id' = c('admin', 'interviewee', 'interviewee', 'admin', 'interviewee', 'admin', 'interviewee', 'interviewee', 'admin', 'interviewee')
, 'conversation.label' = c('intro', NA, NA, 'intro', NA, 'open', NA, NA, 'closed', NA))
partner.id conversation.label
1 admin intro
2 interviewee <NA>
3 interviewee <NA>
4 admin intro
5 interviewee <NA>
6 admin open
7 interviewee <NA>
8 interviewee <NA>
9 admin closed
10 interviewee <NA>
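I want to set each interviewee's conversation label (all the NA values) to the label of the preceding admin row, so that the interviewee's answer gets the label of the question the admin asked: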
partner.id conversation.label
1 admin intro
2 interviewee intro (HERE!)
...
To do this, I tried to use sapply recursively, like this:
df$conversation.label[2 : length(df$conversation.label)] = sapply(seq(2, length(df$conversation.label)), function(i){
print(is.na(df$conversation.label[i]))
if(is.na(df$conversation.label[i])){
df$conversation.label[i] = df$conversation.label[i-1]
} else {
df$conversation.label[i] = df$conversation.label[i]
}
})
This works as long as no two NAs follow each other.
The script above outputs:
partner.id conversation.label
1 admin intro
2 interviewee intro
3 interviewee <NA>
4 admin intro
5 interviewee intro
6 admin open
7 interviewee open
8 interviewee <NA>
9 admin closed
10 interviewee closed
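How can I rewrite this script so that the recursive part keeps looking back until it finds a conversation.label that is not NA, instead of just returning the value at [n-1]?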
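
One possible reason the second NA in a run is left untouched: the assignment to df inside the function passed to sapply only changes a local copy, so every iteration still reads the original column, where row i-1 is still NA. A minimal sketch of a fix, assuming a plain loop over the vector is acceptable, is to fill a copy of the column in place so each step sees the values already filled. The helper name fill_down is hypothetical and not part of the original script.

```r
# Rebuild the example data from the question.
df <- data.frame(
  partner.id = c("admin", "interviewee", "interviewee", "admin", "interviewee",
                 "admin", "interviewee", "interviewee", "admin", "interviewee"),
  conversation.label = c("intro", NA, NA, "intro", NA,
                         "open", NA, NA, "closed", NA),
  stringsAsFactors = FALSE
)

# fill_down (hypothetical helper): replace each NA with the value before it.
# Because x is updated as the loop runs, a run of several NAs is filled
# step by step from the last non-NA value. A leading NA stays NA.
fill_down <- function(x) {
  for (i in seq_along(x)[-1]) {
    if (is.na(x[i])) {
      x[i] <- x[i - 1]
    }
  }
  x
}

df$conversation.label <- fill_down(df$conversation.label)
print(df)
```

If adding a package is an option, tidyr::fill(df, conversation.label) or zoo::na.locf(df$conversation.label) should give the same kind of "last observation carried forward" result.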