Data frame error when trying to create a heatmap

Asked: 2020-01-24 17:23:44

Tags: r ggplot2

I am trying to create a heatmap similar to the one shown here, and I am using the same code: https://nitinahuja.github.io/2017/heatmaps-in-r/

Here is a sample of the data:

    Date        Change  Hour    Day
1   2020-01-22  -0.01   02:00   Wednesday
2   2020-01-22  -0.24   01:00   Wednesday
3   2020-01-22  0.12    00:00   Wednesday
4   2020-01-21  0.16    23:00   Tuesday
5   2020-01-21  -0.12   22:00   Tuesday
6   2020-01-21  -0.02   21:00   Tuesday
7   2020-01-21  2.46    20:00   Tuesday
8   2020-01-21  -1.22   19:00   Tuesday
9   2020-01-21  -0.26   18:00   Tuesday
10  2020-01-21  0.1     17:00   Tuesday
11  2020-01-21  -0.07   16:00   Tuesday
12  2020-01-21  -0.1054 15:00   Tuesday
13  2020-01-21  -0.069  14:00   Tuesday
14  2020-01-21  0.0477  13:00   Tuesday
15  2020-01-21  -0.02   12:00   Tuesday
16  2020-01-21  -0.02   11:00   Tuesday
17  2020-01-21  0.34    10:00   Tuesday
18  2020-01-21  -0.22   09:00   Tuesday
19  2020-01-21  0.21    08:00   Tuesday
20  2020-01-21  -0.11   07:00   Tuesday
21  2020-01-21  -0.12   06:00   Tuesday
22  2020-01-21  -0.19329 5:00   Tuesday
23  2020-01-21  0.0213  4:00    Tuesday
24  2020-01-21  0.09    3:00    Tuesday
25  2020-01-21  0.1306  2:00    Tuesday
26  2020-01-21  0.1960  1:00    Tuesday
27  2020-01-21  -0.09   0:00    Tuesday
28  2020-01-20  -0.23   23:00   Monday

I ran the following code:

ggplot(Change , aes(x=Hour, y=Day, fill = Change)) + 
    geom_tile(color = "white", size = 0.1) + 
    scale_x_discrete(expand=c(0,0)) + 
    scale_y_discrete(expand=c(0,0)) + 
    scale_fill_viridis(name="Price Change", option = "plasma") + 
    coord_equal() + 
    labs(x="Call hour", y=NULL, title=sprintf("price change by hr", vendor)) + 
    theme_tufte(base_family="Helvetica") +
    theme(axis.text.x = element_text(angle = 45, vjust = 1, hjust = 1))

I managed to get the ggplot call to run, but the resulting chart is completely wrong.

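Since my data is essentially the same as the data in the linked example, and I followed the same commands, I am not sure what went wrong. See the image below: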

[image of the resulting chart]

1 answer:

Answer 0 (score: 0)

Is this the heatmap you are looking for?

library(ggplot2)
library(ggthemes)
ggplot(df , aes(x=Hour, y=Day, fill = Change)) + 
  geom_tile(color = "white", size = 0.1) + 
  scale_x_discrete(expand=c(0,0)) + 
  scale_y_discrete(expand=c(0,0)) + 
  scale_fill_viridis_c(name="Price Change", option = "plasma") + 
  coord_equal() + 
  #labs(x="Call hour", y=NULL, title=sprintf("price change by hr", vendor)) + 
  theme_tufte(base_family="Helvetica") +
  theme(axis.text.x = element_text(angle = 45, vjust = 1, hjust = 1))

[image of the resulting heatmap]

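The only change I made was to replace scale_fill_viridis with scale_fill_viridis_c. scale_fill_viridis_c is the continuous viridis fill scale that ships with ggplot2 itself, so the viridis package does not need to be loaded.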

I used a data frame with the following structure (called df here):

Classes ‘data.table’ and 'data.frame':  28 obs. of  5 variables:
 $ ROw   : int  1 2 3 4 5 6 7 8 9 10 ...
 $ Date  : chr  "2020-01-22" "2020-01-22" "2020-01-22" "2020-01-21" ...
 $ Change: num  -0.01 -0.24 0.12 0.16 -0.12 -0.02 2.46 -1.22 -0.26 0.1 ...
 $ Hour  : chr  "02:00" "01:00" "00:00" "23:00" ...
 $ Day   : chr  "Wednesday" "Wednesday" "Wednesday" "Tuesday" ...
 - attr(*, ".internal.selfref")=<externalptr> 
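The structure above lists Hour and Day as character columns, so ggplot2 treats them as discrete categories. In the sample data, Hour mixes zero-padded labels such as "02:00" with unpadded ones such as "2:00", and distinct strings like these would presumably end up as separate columns on a discrete x axis. The following is only a minimal base R sketch of one way to normalise the labels, assuming you want the two spellings merged; the vectors hour and day are illustrative excerpts from the sample data, not part of the original answer.

```r
# Illustrative excerpts from the sample data (hypothetical variable names)
hour <- c("02:00", "23:00", "5:00", "0:00")
day  <- c("Wednesday", "Tuesday", "Tuesday", "Tuesday")

# Zero-pad single-digit hours so "2:00" and "02:00" share one label.
# The pattern only matches when a single digit is followed by ":".
hour_padded <- sub("^([0-9]):", "0\\1:", hour)

# Set the weekday order explicitly; a plain character column is
# otherwise ordered alphabetically on a discrete axis.
day_ordered <- factor(day, levels = c("Monday", "Tuesday", "Wednesday"))

print(hour_padded)
print(levels(day_ordered))
```

The same two steps could be applied to the Hour and Day columns of the data frame before it is passed to ggplot().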

You can compare this structure with your own data.frame by running:

str(Change)

If the Change column is listed as "chr", convert it to numeric like this:

Change$Change <- as.numeric(Change$Change)

If it is listed as a factor, convert it to numeric like this:

Change$Change <- as.numeric(as.character(Change$Change))
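To illustrate why the factor case goes through as.character() first, here is a small self-contained sketch in base R. The three-row data frame is a hypothetical stand-in for the asker's Change data frame, built on the assumption that the Change column was read in as text.

```r
# Hypothetical stand-in for the asker's data frame, with Change read as text
Change <- data.frame(
  Hour   = c("02:00", "01:00", "00:00"),
  Day    = "Wednesday",
  Change = c("-0.01", "-0.24", "0.12"),
  stringsAsFactors = FALSE
)

# Inspect the class of every column; Change is reported as "character"
col_classes <- sapply(Change, class)
print(col_classes)

# Character column: as.numeric() parses the text directly
from_chr <- as.numeric(Change$Change)

# Factor column: as.numeric() alone returns the internal integer level
# codes, so convert to character first to recover the real values
as_fct   <- factor(Change$Change)
codes    <- as.numeric(as_fct)
from_fct <- as.numeric(as.character(as_fct))

print(codes)
print(from_fct)
```

Once the column is numeric, fill = Change maps to a continuous scale, which is what scale_fill_viridis_c expects.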

Sample data:

df <- structure(list(ROw = 1:28, Date = c("2020-01-22", "2020-01-22", 
"2020-01-22", "2020-01-21", "2020-01-21", "2020-01-21", "2020-01-21", 
"2020-01-21", "2020-01-21", "2020-01-21", "2020-01-21", "2020-01-21", 
"2020-01-21", "2020-01-21", "2020-01-21", "2020-01-21", "2020-01-21", 
"2020-01-21", "2020-01-21", "2020-01-21", "2020-01-21", "2020-01-21", 
"2020-01-21", "2020-01-21", "2020-01-21", "2020-01-21", "2020-01-21", 
"2020-01-20"), Change = c(-0.01, -0.24, 0.12, 0.16, -0.12, -0.02, 
2.46, -1.22, -0.26, 0.1, -0.07, -0.1054, -0.069, 0.0477, -0.02, 
-0.02, 0.34, -0.22, 0.21, -0.11, -0.12, -0.19329, 0.0213, 0.09, 
0.1306, 0.196, -0.09, -0.23), Hour = c("02:00", "01:00", "00:00", 
"23:00", "22:00", "21:00", "20:00", "19:00", "18:00", "17:00", 
"16:00", "15:00", "14:00", "13:00", "12:00", "11:00", "10:00", 
"09:00", "08:00", "07:00", "06:00", "5:00", "4:00", "3:00", "2:00", 
"1:00", "0:00", "23:00"), Day = c("Wednesday", "Wednesday", "Wednesday", 
"Tuesday", "Tuesday", "Tuesday", "Tuesday", "Tuesday", "Tuesday", 
"Tuesday", "Tuesday", "Tuesday", "Tuesday", "Tuesday", "Tuesday", 
"Tuesday", "Tuesday", "Tuesday", "Tuesday", "Tuesday", "Tuesday", 
"Tuesday", "Tuesday", "Tuesday", "Tuesday", "Tuesday", "Tuesday", 
"Monday")), row.names = c(NA, -28L), class = c("data.table", 
"data.frame"))