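In my data table every cell is numeric, and I want to replace each number with a string as follows:
Numbers in [0,2]: replace with the string "Bad"
Numbers in [3,4]: replace with the string "Good"
Numbers > 4: replace with the string "Excellent"
Here is an example of my original table, named "data.active":
My attempt was: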
x <- c("churches","resorts","beaches","parks","Theatres",.....)
for(i in x){
data.active$i <- as.character(data.active$i)
data.active$i[data.active$i <= 2] <- "Bad"
data.active$i[data.active$i >2 && data.active$i <=4] <- "Good"
data.active$i[data.active$i >4] <- "Excellent"
}
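But it doesn't work. Is there another way to do this?
Edit
Here is the link to my dataset, GoogleReviews_Dataset, and this is how I produced the table shown in the image above: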
library(FactoMineR)
library(factoextra)
data<-read.csv2(file.choose())
data.active <- data[1:10, 4:8]
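Answer 0 (score: 1)
cols <- c('x','y','z')  # columns to recode
# cut() bins each column into (-Inf,2], (2,4], (4,Inf) and returns a labelled factor
df[, cols] <- lapply(df[, cols], function(v)
  cut(v, breaks = c(-Inf, 2, 4, Inf), labels = c('Bad', 'Good', 'Excellent')))
Data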
df<-structure(list(x = 1:5, y = c(1L, 2L, 2L, 2L, 3L), z = c(1L,3L, 3L, 3L, 2L),
a = c(1L, 5L, 6L, 4L, 8L),b = c(1L, 3L, 4L, 7L, 1L)),
class = "data.frame", row.names = c(NA, -5L))
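For comparison, the loop from the question can also be made to work in base R. The sketch below is one possible repair, not the asker's final code. It assumes three changes: index with `[[i]]` (because `$i` looks for a column literally named "i"), combine conditions with the element-wise `&` rather than `&&`, and compare while the values are still numeric instead of converting to character first. The column selection in `cols` and the names `recoded` and `via_cut` are chosen for illustration only.

```r
# Sample data with the same shape as the example data in this answer.
df <- data.frame(
  x = 1:5,
  y = c(1L, 2L, 2L, 2L, 3L),
  z = c(1L, 3L, 3L, 3L, 2L),
  a = c(1L, 5L, 6L, 4L, 8L),
  b = c(1L, 3L, 4L, 7L, 1L)
)

cols <- c("x", "y", "z")  # columns to recode (illustrative choice)
recoded <- df             # work on a copy so df keeps its numbers

for (i in cols) {
  v <- df[[i]]                          # [[i]] uses the value stored in i
  out <- rep(NA_character_, length(v))  # unmatched values stay NA
  out[v <= 2] <- "Bad"                  # compare while v is still numeric
  out[v > 2 & v <= 4] <- "Good"         # & works element by element
  out[v > 4] <- "Excellent"
  recoded[[i]] <- out
}

# Cross-check against cut(), whose intervals are right-closed by default:
# (-Inf, 2], (2, 4], (4, Inf)
via_cut <- lapply(df[cols], function(v) {
  as.character(cut(v, breaks = c(-Inf, 2, 4, Inf),
                   labels = c("Bad", "Good", "Excellent")))
})

print(recoded)
```

The loop produces character columns, while `cut()` produces factors. Wrapping `cut()` in `as.character()` as above gives the same strings if plain text is preferred.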
Answer 1 (score: 1)
You can use mutate_all from the tidyverse to recode the values by range:
library(tidyverse)
df<-structure(
list(
x = 1:5,
y = c(1L, 2L, 2L, 2L, 3L),
z = c(1L,3L, 3L, 3L, 2L),
a = c(1L, 5L, 6L, 4L, 8L),
b = c(1L, 3L, 4L, 7L, 1L)
),
class = "data.frame",
row.names = c(NA, -5L)
)
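df %>% mutate_all(
  ~ case_when(
    . <= 2 ~ 'Bad',
    (. > 2) & (. <= 4) ~ 'Good',  # lower bound is 2, so that 3 is labelled 'Good'
    (. > 4) ~ 'Excellent',
    TRUE ~ as.character(.)        # fallback for anything unmatched (e.g. NA)
  )
)
The . above stands for the column values being evaluated. The result is
          x    y    z         a         b
1       Bad  Bad  Bad       Bad       Bad
2       Bad  Bad Good Excellent      Good
3      Good  Bad Good Excellent      Good
4      Good  Bad Good      Good Excellent
5 Excellent Good  Bad Excellent       Bad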
To change only selected columns, use mutate_at:
df %>% mutate_at(
  vars(
    c('a', 'x', 'b')
  ),
  ~ case_when(
    . <= 2 ~ 'Bad',
    (. > 2) & (. <= 4) ~ 'Good',  # lower bound is 2, so that 3 is labelled 'Good'
    (. > 4) ~ 'Excellent',
    TRUE ~ as.character(.)        # fallback for anything unmatched (e.g. NA)
  )
)
This yields
          x y z         a         b
1       Bad 1 1       Bad       Bad
2       Bad 2 3 Excellent      Good
3      Good 2 3 Excellent      Good
4      Good 2 3      Good Excellent
5 Excellent 3 2 Excellent       Bad
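Here is how you can use the tidyverse to download and classify the source data directly:
df <- read_csv(
  'https://archive.ics.uci.edu/ml/machine-learning-databases/00485/google_review_ratings.csv'
) %>% select(
  -any_of(c('X26', '...26')) # Last column is blank (faulty CSV); its name depends on the readr version
) %>% select_at(
  vars(
    paste('Category', 1:10) # Pick only Category 1-Category 10
  )
) %>% mutate_all(
  ~ case_when(
    . <= 2 ~ 'Bad',
    (. > 2) & (. <= 4) ~ 'Good',  # lower bound is 2, so ratings between 2 and 3 are covered
    (. > 4) ~ 'Excellent',
    TRUE ~ as.character(.)        # fallback for anything unmatched (e.g. NA)
  )
)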
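In newer dplyr releases the same recoding is usually written with across() instead of mutate_all and mutate_at. The following is a minimal sketch, assuming dplyr 1.0.0 or later is installed. The helper name `rate` and the result names `all_cols` and `some_cols` are hypothetical, and the thresholds are the ones from the question.

```r
library(dplyr)

# Sample data, identical to the df used earlier in this answer.
df <- data.frame(
  x = 1:5,
  y = c(1L, 2L, 2L, 2L, 3L),
  z = c(1L, 3L, 3L, 3L, 2L),
  a = c(1L, 5L, 6L, 4L, 8L),
  b = c(1L, 3L, 4L, 7L, 1L)
)

# Hypothetical helper: case_when() checks conditions in order,
# so the second branch is only reached when v > 2.
rate <- function(v) {
  case_when(
    v <= 2 ~ "Bad",
    v <= 4 ~ "Good",
    v > 4  ~ "Excellent"  # values matching no branch (e.g. NA) become NA
  )
}

# Recode every column (counterpart of mutate_all).
all_cols <- df %>% mutate(across(everything(), rate))

# Recode only a, x and b (counterpart of mutate_at).
some_cols <- df %>% mutate(across(c(a, x, b), rate))

print(all_cols)
print(some_cols)
```

Keeping the thresholds in one helper function means they only have to be changed in one place when the same recoding is applied to several column selections.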