Converting a df from factor to numeric

Time: 2019-11-16 18:42:03

Tags: r encode sapply datamatrix

I am struggling to convert a data set to numeric values. The data set I have looks like this:

customer_id 2012 2013 2013 2014  2015 2016 2017
15251        X     N     U    D     S    C    L

X1-X7 are labelled as factors. An excerpt of dput(head(df)):

    structure(list(`2012` = structure(c(2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("N", 
"X"), class = "factor"), `2013` = structure(c(6L, 6L, 6L, 6L, 
6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L
), .Label = c("C", "D", "N", "S", "U", "X"), class = "factor"), 
    `2014` = structure(c(8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 
    8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L), .Label = c("C", 
    "D", "L", "N", "R", "S", "U", "X"), class = "factor"), ... 

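I would like the data to have numeric values, but I do not know how to convert it accordingly. My goal is to feed the df into a heatmap so that I can explore the differences visually. As far as I know, this is only possible with a numeric matrix, because I get the error: Error in heatmap.2(input, trace = "none", : 'x' must be a numeric matrix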

Does anyone have an idea?

Thank you very much for your support!

1 answer:

Answer 0: (score: 1)

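It is doable. Including the complete df next time would help. heatmap.2 does not work because you gave it a character matrix. Getting heatmap.2 to show the color legend as letters is a bit more complicated, so I suggest using ggplot as shown below.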

library(ggplot2)
library(dplyr)
library(tidyr)   # pivot_longer() comes from tidyr, not dplyr
library(viridis)

# simulate data
df = data.frame(id=1:5,
replicate(7,sample(LETTERS[1:10],5)))
colnames(df)[-1] = 2012:2018

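# convert to long format for plotting and refactor
# levels are taken from the values themselves, so this also works
# when the columns are character rather than factor (the default in R >= 4.0)
df <- df %>% pivot_longer(-id) %>%
  mutate(value = factor(as.character(value),
                        levels = sort(unique(as.character(value)))))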

#define color scale
# sorted in alphabetical order
present_letters = levels(df$value)
COLS = viridis_pal()(length(present_letters))
names(COLS) = present_letters

#plot
ggplot(data=df,aes(x=name,y=id,fill=value)) + 
geom_tile() + 
scale_fill_manual(values=COLS)

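If a numeric matrix is still wanted, as the question originally asked, one possible approach is to recode every factor column against a single shared set of levels. The sketch below is only an illustration: the toy data frame and the names `all_levels` and `m` are assumptions, not taken from the question's data. Base R's `data.matrix(df)` also turns factors into their internal integer codes, but those codes are assigned per column, so the same letter could end up with different numbers in different years.

```r
# Toy data frame of factors (hypothetical values mirroring the question's layout)
df <- data.frame(
  `2012` = factor(c("X", "N", "X")),
  `2013` = factor(c("N", "U", "D")),
  `2014` = factor(c("U", "D", "S")),
  check.names = FALSE
)

# One shared, alphabetically sorted set of levels across all columns,
# so that a given letter maps to the same integer everywhere
all_levels <- sort(unique(unlist(lapply(df, as.character))))

# Recode each column against the shared levels; sapply() returns a matrix
m <- sapply(df, function(col) {
  as.integer(factor(as.character(col), levels = all_levels))
})
rownames(m) <- rownames(df)

is.numeric(m)  # TRUE, so the matrix can be passed to a heatmap function

# Assuming the gplots package is installed:
# gplots::heatmap.2(m, trace = "none")
```

Note that the resulting integers are arbitrary labels: their order follows the alphabet, not any meaningful ranking of the categories, so a continuous color scale or clustering on this matrix should be interpreted with care.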