I have a data frame that I read from a CSV file, and I am trying to clean it up. This is what it looks like:
A B C
1 0 X;Y;Z true
2 2 Y;Z false
3 5 X:Y false
What I want is to split column B into binary indicator columns, like this:
A B C has.x has.y has.z
1 0 X;Y;Z true 1 1 1
2 2 Y false 0 1 0
3 5 X:Y false 1 1 0
I tried using ifelse with an assignment, but it applies the same value to the whole column. How can I make it evaluate each row individually?
raw <- read.csv("data.csv")
raw$has.x <- ifelse("x" %in% raw[,"B"], 1, 0)
Answer 0 (score: 0)
x <- read.table(text = 'A B C
1 0 X;Y;Z true
2 2 Y;Z false
3 5 X;Y false', stringsAsFactors = FALSE)
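## use tidyr::separate (not dplyr); tidyr also re-exports %>%
library(tidyr)
## split B on ';' -- note the pieces are assigned by position, not by letter
x1 <- x %>% separate(B, sep = ';',
into = c('has.x', 'has.y', 'has.z'), remove = FALSE)
## put the values in the desired format: 1 where a piece exists, 0 where it is NA
x1[, c('has.x', 'has.y', 'has.z')] <- sapply(x1[, c('has.x', 'has.y', 'has.z')],
function(x) as.numeric(!is.na(x)))
x1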
x1
A B has.x has.y has.z C
1 0 X;Y;Z 1 1 1 true
2 2 Y;Z 1 1 0 false
3 5 X;Y 1 1 0 false
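Note that row 2 comes out as 1 1 0 here even though B is Y;Z, because separate() fills the new columns by position rather than by letter. If the goal is to flag which letters are actually present, one possible alternative is a base R sketch with grepl(), which is vectorised over the strings it searches and so tests each row separately (unlike "x" %in% raw[,"B"], which yields a single TRUE/FALSE for the whole column). The sample data below is rebuilt by hand and assumes ';' is the separator in every row and that the tokens are exactly the upper-case letters X, Y and Z.

```r
# Sample data mirroring the question (assumption: every row uses ';' as separator)
raw <- data.frame(
  A = c(0, 2, 5),
  B = c("X;Y;Z", "Y;Z", "X;Y"),
  C = c(TRUE, FALSE, FALSE),
  stringsAsFactors = FALSE
)

# grepl() returns one TRUE/FALSE per element of raw$B, so each row gets its own flag
for (letter in c("X", "Y", "Z")) {
  col <- paste0("has.", tolower(letter))
  raw[[col]] <- as.integer(grepl(letter, raw$B, fixed = TRUE))
}

raw
```

This is a plain substring match, so it is only safe while no token is contained in another; with longer tokens, splitting B with strsplit() and comparing whole pieces would be the more careful route.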