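I'm new to R and have just started working with data frames. I'm trying to create a new column in a data frame using the code below. The problem is that the new column contains numeric values, even though every column used in the code is non-numeric.
I searched online for an answer but couldn't find one.
dataframe$newcol <- ifelse(dataframe$colA == "London", dataframe$colA, dataframe$colB)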
Answer 0 (score: 2)
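Before R 4.0.0, data.frame() and read.csv() converted character columns to factors by default (stringsAsFactors = TRUE), which can be tricky: ifelse() drops the factor attributes, so you get the underlying integer codes instead of the labels.
You can check the class of each variable like this: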
sapply( dataframe, class )
or
str( dataframe )
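To see the effect in isolation, here is a minimal sketch on made-up data. The data frame `df` and its values are hypothetical, and `stringsAsFactors = TRUE` is set explicitly to mimic the older default:

```r
# Hypothetical toy data; stringsAsFactors = TRUE mimics the pre-4.0.0 default
df <- data.frame(
  colA = c("London", "Paris"),
  colB = c("not in France", "in France"),
  stringsAsFactors = TRUE
)

sapply(df, class)  # both columns report "factor"

# ifelse() drops the factor attributes, so the integer codes leak through
codes <- ifelse(df$colA == "London", df$colA, df$colB)
class(codes)

# Converting to character inside the call keeps the labels
labels <- ifelse(df$colA == "London",
                 as.character(df$colA),
                 as.character(df$colB))
labels
```

If the columns already show up as "character" in your own data, the cause is likely something else and this conversion is unnecessary.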
You can convert several columns at once like this:
dataframe[ , c("colA" ,"colB") ] <- sapply( dataframe[ , c("colA" ,"colB") ] , as.character )
Or convert one column at a time:
dataframe$colA <- as.character( dataframe$colA )
If a factor column actually holds numbers, convert it via character first:
dataframe$colX <- as.numeric( as.character( dataframe$colX ))
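The detour through as.character() matters because as.numeric() applied directly to a factor returns the level codes, not the values. A small sketch with a hypothetical factor `f`:

```r
# Hypothetical factor whose labels look like numbers
f <- factor(c("10", "2.5", "10"))

# Direct conversion returns the internal level codes
wrong <- as.numeric(f)

# Going through character recovers the actual values
right <- as.numeric(as.character(f))

wrong
right
```

Here the levels are sorted as strings ("10" before "2.5"), so the codes bear no relation to the numeric values.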
Your code should now work. Note that I changed == to %in%, which returns FALSE rather than NA when colA is missing:
dataframe$newcol <- ifelse(dataframe$colA %in% "London", dataframe$colA, dataframe$colB)
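The difference between the two operators only shows up when colA contains NA. A minimal sketch on hypothetical character vectors:

```r
# Hypothetical vectors; the second element of colA is missing
colA <- c("London", NA, "Paris")
colB <- c("x", "y", "z")

# == propagates the NA into the result
with_eq <- ifelse(colA == "London", colA, colB)

# %in% treats the missing value as "not London" and falls back to colB
with_in <- ifelse(colA %in% "London", colA, colB)

with_eq
with_in
```

Which behaviour is preferable depends on whether a missing colA should yield a missing newcol or the colB value.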
You can use transform here to save yourself some typing:
dataframe <- transform( dataframe , newcol = ifelse( colA %in% "London", colA, colB))
Answer 1 (score: 0)
For this, you could write a small new function, ifelse.fac:
ifelse.fac <- Vectorize(function(x, y, z) if (x) y else z)
Applied to the data, this yields:
dat$newcol <- ifelse.fac(dat$colA == "London", dat$colA, dat$colB)
dat
#         colA          colB    newcol
# 1     London not in France    London
# 2     London not in France    London
# 3     London not in France    London
# 4     London not in France    London
# 5      Paris     in France in France
# 6  Marseille     in France in France
# 7      Paris     in France in France
# 8      Paris     in France in France
# 9     London not in France    London
# 10 Marseille     in France in France
The factor structure is preserved:
str(dat)
# 'data.frame': 10 obs. of 3 variables:
# $ colA : Factor w/ 3 levels "London","Marseille",..: 1 1 1 1 3 2 3 3 1 2
# $ colB : Factor w/ 2 levels "in France","not in France": 2 2 2 2 1 1 1 1 2 1
# $ newcol: Factor w/ 5 levels "London","Marseille",..: 1 1 1 1 4 4 4 4 1 4
Data
dat <- structure(list(colA = structure(c(1L, 1L, 1L, 1L, 3L, 2L, 3L,
3L, 1L, 2L), .Label = c("London", "Marseille", "Paris"), class = "factor"),
colB = structure(c(2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 2L, 1L
), .Label = c("in France", "not in France"), class = "factor")), row.names = c(NA,
-10L), class = "data.frame")
head(dat)
#        colA          colB
# 1    London not in France
# 2    London not in France
# 3    London not in France
# 4    London not in France
# 5     Paris     in France
# 6 Marseille     in France