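ifelse returns only numeric values

Time: 2019-04-27 13:27:34

Tags: r

I am new to R and have just started working with data frames. I am trying to create a new column in a data frame (using the code below). The problem is that the new column contains numeric values, even though all of the columns used in the code are non-numeric.

I tried searching online but could not find an answer.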

dataframe$newcol <- ifelse(dataframe$colA == "London", dataframe$colA, dataframe$colB)

2 answers:

Answer 0: (score: 2)

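By default, R turns many character columns into factors (this was the default for data.frame() and read.csv() before R 4.0.0), which can be tricky: ifelse() does not keep the factor attributes of its inputs, so you get the underlying integer codes instead of the labels.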
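To make the cause concrete, here is a minimal sketch that reproduces the behaviour. The data frame `df` and its values are hypothetical, and `stringsAsFactors = TRUE` is set explicitly so the example behaves the same on any R version.

```r
# Hypothetical example data; stringsAsFactors = TRUE mimics the old default.
df <- data.frame(
  colA = c("London", "Paris"),
  colB = c("not in France", "in France"),
  stringsAsFactors = TRUE
)

# ifelse() drops the factor attributes, so the level codes come back
# instead of the labels.
codes <- ifelse(df$colA == "London", df$colA, df$colB)
class(codes)

# Converting to character first returns the labels themselves.
labels <- ifelse(df$colA == "London",
                 as.character(df$colA),
                 as.character(df$colB))
class(labels)
```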

You can check the class of each variable like this:

sapply( dataframe, class )

str( dataframe )

You can convert several columns at once like this:

dataframe[ , c("colA" ,"colB") ] <- sapply( dataframe[ , c("colA" ,"colB") ] , as.character )

Or you can convert one column at a time:

dataframe$colA <- as.character( dataframe$colA )

If you need to convert a factor column that holds numbers, do it like this:

dataframe$colX <- as.numeric( as.character( dataframe$colX ))
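The intermediate as.character() matters because calling as.numeric() directly on a factor returns the internal level codes rather than the numbers shown in the labels. A small sketch with a hypothetical factor `x`:

```r
# Hypothetical factor whose labels look like numbers.
x <- factor(c("10", "20", "10"))

direct   <- as.numeric(x)                # level codes: 1 2 1
via_char <- as.numeric(as.character(x))  # the labelled values: 10 20 10
```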

Your code should now work. Note that I changed == to %in%:

dataframe$newcol <- ifelse(dataframe$colA %in% "London", dataframe$colA, dataframe$colB)
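The answer does not say why it switched operators. One practical difference, shown in the sketch below with hypothetical vectors, is how missing values are handled: == yields NA for a missing value, which ifelse() passes through as NA, whereas %in% yields FALSE, so the "else" branch is used.

```r
# Hypothetical values with one missing entry.
colA <- c("London", NA, "Paris")
colB <- c("x", "y", "z")

eq_test <- colA == "London"     # TRUE NA FALSE
in_test <- colA %in% "London"   # TRUE FALSE FALSE

with_eq <- ifelse(eq_test, colA, colB)  # "London" NA "z"
with_in <- ifelse(in_test, colA, colB)  # "London" "y" "z"
```

Which behaviour is preferable depends on whether rows with a missing colA should stay NA or fall back to colB.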

You can use transform here to save yourself some typing:

dataframe <- transform( dataframe , newcol = ifelse( colA %in% "London", colA, colB))

Answer 1: (score: 0)

For this, you can write a small new function, ifelse.fac.

ifelse.fac <- Vectorize(function(x, y, z) if (x) y else z)

Applied to the data, this yields:

dat$newcol <- ifelse.fac(dat$colA == "London", dat$colA, dat$colB)
dat
#         colA          colB    newcol
# 1     London not in France    London
# 2     London not in France    London
# 3     London not in France    London
# 4     London not in France    London
# 5      Paris     in France in France
# 6  Marseille     in France in France
# 7      Paris     in France in France
# 8      Paris     in France in France
# 9     London not in France    London
# 10 Marseille     in France in France

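The result is still a factor (its five levels are the three levels of colA followed by the two levels of colB):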

str(dat)
# 'data.frame': 10 obs. of  3 variables:
# $ colA  : Factor w/ 3 levels "London","Marseille",..: 1 1 1 1 3 2 3 3 1 2
# $ colB  : Factor w/ 2 levels "in France","not in France": 2 2 2 2 1 1 1 1 2 1
# $ newcol: Factor w/ 5 levels "London","Marseille",..: 1 1 1 1 4 4 4 4 1 4

Data

dat <- structure(list(colA = structure(c(1L, 1L, 1L, 1L, 3L, 2L, 3L, 
3L, 1L, 2L), .Label = c("London", "Marseille", "Paris"), class = "factor"), 
    colB = structure(c(2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 2L, 1L
    ), .Label = c("in France", "not in France"), class = "factor")), row.names = c(NA, 
-10L), class = "data.frame")

head(dat)
#        colA          colB
# 1    London not in France
# 2    London not in France
# 3    London not in France
# 4    London not in France
# 5     Paris     in France
# 6 Marseille     in France