I have the following data frame in R:
ID Season Year Weekday
1 Winter 2017 Monday
2 Winter 2018 Tuesday
3 Summer 2017 Monday
4 Summer 2018 Wednsday
I want to convert these factor levels to integers. This is the data frame I want:
ID Season Year Weekday
1 1 1 1
2 1 2 2
3 2 1 1
4 2 2 3
Winter = 1, Summer = 2
2017 = 1, 2018 = 2
Monday = 1, Tuesday = 2, Wednesday = 3
Currently, I am doing the three transformations above with nested ifelse calls:
otest_xgb$Weekday <- as.integer(ifelse(otest_xgb$Weekday == "Monday",1,
ifelse(otest_xgb$Weekday == "Tuesday",2,
ifelse(otest_xgb$Weekday == "Wednesday",3,
ifelse(otest_xgb$Weekday == "Thursday",4,5)))))
Is there any way to avoid writing long ifelse chains?
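Answer 0: (score: 3)
# Copy the data frame, then recode every column by order of first appearance
m <- dat
m[] <- lapply(dat, function(x) as.integer(factor(x, levels = unique(x))))
m
#   ID Season Year Weekday
# 1  1      1    1       1
# 2  2      1    2       2
# 3  3      2    1       1
# 4  4      2    2       3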
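To see why the second argument to factor() matters here, a small sketch using the Season values from the question (the variable names are only illustrative): by default the levels are sorted, whereas passing unique(x) numbers the values in order of first appearance.

```r
# Season values as they appear in the question's data frame
x <- c("Winter", "Winter", "Summer", "Summer")

# Default: levels are sorted alphabetically, so Summer = 1 and Winter = 2
by_sorted_levels <- as.integer(factor(x))

# Levels given by unique(x): numbered by first appearance, so Winter = 1
by_first_seen <- as.integer(factor(x, levels = unique(x)))

print(by_sorted_levels)
print(by_first_seen)
```

Note that the first-appearance numbering depends on row order, so sorting the data frame differently would change the codes.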
Answer 1: (score: 1)
You can simply use as.numeric() to convert a factor to a number. Each value is changed to the integer code of its factor level:
library(dplyr)
### Change factor levels to the levels you specified
otest_xgb$Season <- factor(otest_xgb$Season , levels = c("Winter", "Summer"))
otest_xgb$Year <- factor(otest_xgb$Year , levels = c(2017, 2018))
otest_xgb$Weekday <- factor(otest_xgb$Weekday, levels = c("Monday", "Tuesday", "Wednesday"))
otest_xgb %>%
dplyr::mutate_at(c("Season", "Year", "Weekday"), as.numeric)
# ID Season Year Weekday
# 1 1 1 1 1
# 2 2 1 2 2
# 3 3 2 1 1
# 4 4 2 2 NA   <- "Wednsday" (as spelled in the question's data) matches none of the levels
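Values that are not listed in levels silently become NA, as in the last row of the output above. It may help to check for them before converting. A minimal sketch, assuming a hypothetical helper to_level_codes() (it is not part of base R or dplyr):

```r
# Hypothetical helper: convert to integer codes, but stop on values
# that are not covered by the requested levels.
to_level_codes <- function(x, levels) {
  unknown <- setdiff(unique(as.character(x)), as.character(levels))
  if (length(unknown) > 0) {
    stop("values not in levels: ", paste(unknown, collapse = ", "))
  }
  as.integer(factor(x, levels = levels))
}

weekday_levels <- c("Monday", "Tuesday", "Wednesday")

codes <- to_level_codes(c("Monday", "Tuesday", "Monday", "Wednesday"),
                        weekday_levels)
print(codes)

# A misspelled value is reported instead of quietly turning into NA
result <- tryCatch(
  to_level_codes(c("Monday", "Wednsday"), weekday_levels),
  error = function(e) conditionMessage(e)
)
print(result)
```

Whether to stop, warn, or map unknown values to a catch-all code is a design choice that depends on the analysis.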
Answer 2: (score: 1)
We can use match together with the unique elements:
library(dplyr)
dat %>%
mutate_all(~ match(., unique(.)))  # formula instead of funs(), which is deprecated since dplyr 0.8.0
# ID Season Year Weekday
#1 1 1 1 1
#2 2 1 2 2
#3 3 2 1 1
#4 4 2 2 3
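match() also works against an explicit vector of values when the codes should follow a chosen order rather than the order of appearance, which replaces the nested ifelse from the question. A sketch, where weekday_order and weekday are illustrative names:

```r
# Desired order of the codes (illustrative; extend as needed)
weekday_order <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday")

weekday <- c("Monday", "Tuesday", "Monday", "Wednesday")

# match() returns the position of each value in weekday_order,
# or NA for values that do not occur in it
weekday_code <- match(weekday, weekday_order)
print(weekday_code)
```

Unlike the question's ifelse chain, which maps every remaining value to 5, this version yields NA for anything missing from weekday_order.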
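Answer 3: (score: 1)
Ordered and nominal factor variables need to be handled separately. Converting a factor column directly to integer or numeric gives codes that follow the sorted (lexicographic) order of the levels.
Here, Weekday is conceptually ordinal, Year is an integer, and Season is generally nominal. This is somewhat subjective, though, and depends on the kind of analysis required.
For example, when converting directly from factor to integer, Wednesday in the Weekday column gets a higher value than both Saturday and Tuesday: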
dat[] <- lapply(dat, function(x)as.integer(factor(x)))
dat
# ID Season Year Weekday
#1 1 2 1 1
#2 2 2 2 3
#3 3 1 1 2 (Saturday)
#4 4 1 2 4 (Wednesday): assigned value greater than that of Saturday
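Therefore, only the Season and Year columns can be converted directly from factor to integer. Note that for the Year column this works well because its lexicographic order matches its ordinal order.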
dat[c('Season', 'Year')] <- lapply(dat[c('Season', 'Year')],
function(x) as.integer(factor(x)))
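The sorted order matching the numeric order is not guaranteed in general. A small sketch with made-up values (not from the data above) showing that the result depends on whether the column holds numbers or character strings:

```r
# Numeric input: levels are sorted numerically, so 9 comes before 10
codes_numeric <- as.integer(factor(c(9, 10)))

# Character input: levels are sorted as strings, so "10" comes before "9"
codes_character <- as.integer(factor(c("9", "10")))

print(codes_numeric)
print(codes_character)
```

In the data used here, read.table reads Year as a number, and 2017 and 2018 also sort the same way as strings, so either case gives 1 and 2.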
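Weekday needs to be converted from an ordered factor that has the desired level order. The direct conversion may be harmless for ordinary aggregation, but it can seriously affect the results when fitting statistical models.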
dat$Weekday <- as.integer(factor(dat$Weekday,
levels = c("Monday", "Tuesday", "Wednesday", "Thursday",
"Friday", "Saturday", "Sunday"), ordered = TRUE))
dat
# ID Season Year Weekday
#1 1 2 1 1
#2 2 2 2 2
#3 3 1 1 6 (Saturday)
#4 4 1 2 3 (Wednesday): assigned value less than that of Saturday
Data used:
dat <- read.table(text=" ID Season Year Weekday
1 Winter 2017 Monday
2 Winter 2018 Tuesday
3 Summer 2017 Saturday
4 Summer 2018 Wednesday", header = TRUE)
Answer 4: (score: 0)
After converting Season, Year and Weekday to factors, use this code to change them into dummy indicator variables
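One way to build such indicator variables in base R is model.matrix(). The following is a sketch under that assumption and is not necessarily the code this answer originally used; the small data frame is typed in by hand from the question.

```r
# Season column from the question, typed in by hand for a self-contained example
dat <- data.frame(
  Season = factor(c("Winter", "Winter", "Summer", "Summer"))
)

# One 0/1 indicator column per Season level; "- 1" drops the intercept
# so that every level gets its own column
season_dummies <- model.matrix(~ Season - 1, data = dat)
print(season_dummies)
```

Unlike integer codes, indicator columns do not impose any order on the levels, which suits nominal variables such as Season.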