I have the following data frame:
library(tidyverse)
df <- structure(list(rank = structure(c(1L, 10L, 11L, 12L, 13L, 14L,
15L, 16L, 17L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L), .Label = c("1",
"10", "11", "12", "13", "14", "15", "16", "17\n*", "2", "3",
"4", "5", "6", "7", "8", "9"), class = "factor"), p_value = structure(c(2L,
5L, 17L, 16L, 13L, 12L, 11L, 10L, 9L, 8L, 4L, 3L, 14L, 7L, 6L,
1L, 15L), .Label = c("1e-12", "1e-12262", "1e-164", "1e-176",
"1e-2381", "1e-26", "1e-27", "1e-274", "1e-369", "1e-397", "1e-413",
"1e-422", "1e-429", "1e-57", "1e-6", "1e-855", "1e-919"), class = "factor")), row.names = c(NA,
-17L), class = c("tbl_df", "tbl", "data.frame"), .Names = c("rank",
"p_value"))
df
It looks like this:
# A tibble: 17 x 2
   rank      p_value
   <fctr>    <fctr>
 1 1         1e-12262
 2 2         1e-2381
 3 3         1e-919
 4 4         1e-855
 5 5         1e-429
 6 6         1e-422
 7 7         1e-413
 8 8         1e-397
 9 9         1e-369
10 10        1e-274
11 11        1e-176
12 12        1e-164
13 13        1e-57
14 14        1e-27
15 15        1e-26
16 16        1e-12
17 "17\n*"   1e-6
My question is how to convert the p_value column from fctr to numeric so that I can perform mathematical operations on it.
I tried this and got an error:
> df %>% mutate(logp = log(p_value))
Error in mutate_impl(.data, dots) :
Evaluation error: ‘log’ not meaningful for factors.
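Answer 0 (score: 1)
You can convert the values to numbers like this. You first need to convert the factor to character before converting to numeric; otherwise you just get the integer codes of the factor levels.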
df %>% mutate(logp = log(as.numeric(as.character(p_value))))
# A tibble: 17 x 3
   rank      p_value        logp
   <fctr>    <fctr>        <dbl>
 1 1         1e-12262       -Inf
 2 2         1e-2381        -Inf
 3 3         1e-919         -Inf
 4 4         1e-855         -Inf
 5 5         1e-429         -Inf
 6 6         1e-422         -Inf
 7 7         1e-413         -Inf
 8 8         1e-397         -Inf
 9 9         1e-369         -Inf
10 10        1e-274   -630.90832
11 11        1e-176   -405.25498
12 12        1e-164   -377.62396
13 13        1e-57    -131.24735
14 14        1e-27     -62.16980
15 15        1e-26     -59.86721
16 16        1e-12     -27.63102
17 "17\n*"   1e-6      -13.81551
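The -Inf rows appear because values below roughly 1e-323 cannot be represented as doubles, so as.numeric() returns 0 for them. If the logarithm itself is what you need, one possible workaround is to compute it from the text of the value instead. The sketch below is not part of the original answer. It assumes every p_value string has the form "<mantissa>e<exponent>" (true for the data shown), and the helper columns p_chr, p_num, mantissa, exponent and log10p are names made up for illustration. It also shows the level-code pitfall mentioned above.

```r
library(dplyr)

# Small stand-in for the question's data (three of its p_value strings)
df <- tibble::tibble(p_value = factor(c("1e-12262", "1e-274", "1e-6")))

# Pitfall: as.numeric() on a factor returns integer level codes, not the values
codes <- as.numeric(df$p_value)

res <- df %>%
  mutate(
    p_chr    = as.character(p_value),
    p_num    = as.numeric(p_chr),                   # underflows to 0 for tiny values
    # Assumption: strings look like "<mantissa>e<exponent>"
    mantissa = as.numeric(sub("e.*$", "", p_chr)),  # text before the "e"
    exponent = as.numeric(sub("^.*e", "", p_chr)),  # text after the "e"
    log10p   = log10(mantissa) + exponent,          # log10 without forming p itself
    logp     = log10p * log(10)                     # natural log, comparable to log(p_num)
  )

res
```

For values that do fit in a double, logp agrees with log(as.numeric(as.character(p_value))). For the smaller ones it stays finite instead of becoming -Inf.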