Describing my dataset:
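I have a dataset similar to this one: a data frame with 3 columns, 2 of which are factors describing how a conversation progresses as students talk to one another (Discourse Code) and build a model with words (Modeling Code) over roughly 30 minutes (Time_Processed). In total I have a data frame of about 386 rows × 9 columns, but to keep it confidential I am only sharing a random sample of about 100 rows here.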
Discourse Code has the following levels:
"AG" "C" "D" "DA" "G" "J" "OFF" "Q" "S"
Modeling Code has the following levels:
"A" "MA" "OFF" "P" "SM" "V"
And Time_Processed is simply numeric minutes, such as:
3.4500 11.2500 12.2500 14.4667 17.9333 18.0167 18.0667 18.6333 32.9000 33.1333, etc ...
I tried to build a surface plot with ggplot2, but I have far more time data points than fit into a (6 x 9) or (9 x 6) matrix, which makes it hard to build a surface plot.
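To make the mismatch concrete, here is a small sketch that cross-tabulates the first ten rows of the sample printed further down. The data frame `toy` and its short column names (`modeling`, `discourse`) are made up for illustration; only the values come from my data. It shows that some cells of the 9 x 6 grid receive several observations while most receive none.

```r
# Hypothetical toy data: the first ten printed rows of the sample,
# with shortened (made-up) column names.
toy <- data.frame(
  modeling  = factor(c("SM", "OFF", "P", "MA", "MA", "A", "V", "MA", "V", "MA"),
                     levels = c("A", "MA", "OFF", "P", "SM", "V")),
  discourse = factor(c("J", "OFF", "Q", "S", "S", "S", "S", "S", "J", "C"),
                     levels = c("AG", "C", "D", "DA", "G", "J", "OFF", "Q", "S"))
)

# Rows = Discourse Code levels, columns = Modeling Code levels.
counts <- table(toy$discourse, toy$modeling)

dim(counts)          # 9 x 6 grid of cells
counts["S", "MA"]    # 3 observations land in the same cell
sum(counts == 0)     # 46 of the 54 cells are empty in this tiny sample
```

So a single height per cell is not defined until the repeated observations are summarised in some way.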
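Consider the popular volcano dataset that ships with R. If you melt it, you get exactly the height at each grid position on the volcano, because that is how it is modeled: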
library(dplyr)
reshape2::melt(volcano) %>% head
Var1 Var2 value
1 1 1 100 # x = row 1, y = column 1, z = 100
2 2 1 101 # x = row 2, y = column 1, z = 101
3 3 1 102 # etc ...
4 4 1 103
5 5 1 104
6 6 1 105
etc ...
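All of these coordinates line up nicely with this plot_ly plot, and the third dimension "z" is each value of the big matrix at the corresponding (x, y) coordinate pair. This type of plot has just two variables, row and column, and the height z is the matrix indexed by them.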
library(plotly)
plot_ly(z = ~ volcano) %>% add_surface()
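As a sanity check on the row/column/height reading above, the following sketch melts volcano into long form and casts it back into a matrix with reshape2::acast. It only illustrates that the long (Var1, Var2, value) table and the z matrix carry the same information; the names `long` and `wide` are mine.

```r
library(reshape2)

# Long form: one row per (row, column) cell of the 87 x 61 volcano matrix.
long <- melt(volcano)

# Cast back to a matrix: Var1 becomes the rows, Var2 the columns.
wide <- acast(long, Var1 ~ Var2, value.var = "value")

nrow(long)             # 87 * 61 = 5307 cells
dim(wide)              # same shape as volcano
all(wide == volcano)   # TRUE: nothing was lost in the round trip
```

The round trip works because every (row, column) pair occurs exactly once in volcano, which is precisely what my data lacks.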
More interactive versions: https://plot.ly/r/3d-surface-plots/
That kind of plot is too simple for my case. I do not have 2-3 numeric variables.
My dataset looks like this:
`Modeling Code` `Discourse Code` Time_Processed
1 SM J 5.05
2 OFF OFF 6.95
3 P Q 3.18
4 MA S 20.4
5 MA S 20.6
6 A S 32.7
7 V S 17.8
8 MA S 20.5
9 V J 8.05
10 MA C 14.4
etc ...
More precisely, it is this:
df <- structure(list(`Modeling Code` = structure(c(5L, 3L, 4L, 2L,
2L, 1L, 6L, 2L, 6L, 2L, 2L, 2L, 4L, 6L, 2L, 2L, 1L, 2L, 2L, 4L,
2L, 4L, 3L, 2L, 2L, 6L, 5L, 2L, 3L, 2L, 5L, 5L, 3L, 6L, 6L, 6L,
2L, 2L, 5L, 2L, 2L, 2L, 1L, 6L, 2L, 4L, 2L, 1L, 5L, 2L, 4L, 3L,
1L, 4L, 6L, 5L, 2L, 4L, 1L, 2L, 5L, 5L, 2L, 2L, 4L, 2L, 5L, 2L,
3L, 2L, 2L, 2L, 5L, 3L, 2L, 6L, 2L, 2L, 5L, 3L, 2L, 2L, 2L, 4L,
6L, 5L, 2L, 2L, 5L, 2L, 4L, 6L, 3L, 2L, 5L, 4L, 3L, 5L, 2L, 2L
), .Label = c("A", "MA", "OFF", "P", "SM", "V"), class = "factor"),
`Discourse Code` = structure(c(6L, 7L, 8L, 9L, 9L, 9L, 9L,
9L, 6L, 2L, 6L, 1L, 9L, 8L, 6L, 6L, 6L, 9L, 2L, 9L, 1L, 6L,
7L, 8L, 8L, 6L, 4L, 1L, 7L, 1L, 2L, 6L, 7L, 9L, 1L, 1L, 9L,
2L, 1L, 2L, 8L, 2L, 2L, 2L, 9L, 3L, 9L, 9L, 9L, 1L, 5L, 7L,
1L, 3L, 9L, 6L, 3L, 9L, 9L, 2L, 6L, 6L, 1L, 8L, 9L, 2L, 4L,
3L, 7L, 3L, 6L, 2L, 6L, 7L, 2L, 2L, 8L, 4L, 6L, 7L, 9L, 6L,
4L, 8L, 8L, 1L, 8L, 2L, 9L, 2L, 5L, 6L, 7L, 2L, 6L, 2L, 7L,
6L, 4L, 8L), .Label = c("AG", "C", "D", "DA", "G", "J", "OFF",
"Q", "S"), class = "factor"), Time_Processed = c(5.05, 6.95,
3.1833, 20.4333, 20.55, 32.7333, 17.75, 20.5167, 8.05, 14.3667,
20.7167, 15.4833, 4, 26.2667, 14.6667, 17.2167, 33.1833,
3.1, 18.9833, 9.35, 17.35, 30.7167, 23.1167, 18.1667, 27.1833,
26.4667, 3.7, 20.5, 23.0833, 24.6833, 22.8833, 7.8333, 10.3,
24.8, 26.5, 26.1667, 15.05, 15.6, 3.8, 24.8, 5.6, 27.0833,
32.6667, 32.0167, 3.0333, 2.1667, 12.7167, 32.6167, 22.6,
24.25, 1.4333, 28.6333, 9.9667, 3.45, 32.7, 12.9667, 6.25,
30.3, 9.9, 16.9667, 20.8667, 3.6, 16.3833, 26.7, 13.55, 23.45,
11.4167, 17.55, 19.6333, 3.05, 11.35, 26.8, 12.85, 10.15,
26.6667, 4.6, 15.8667, 15.7333, 9.5167, 27.7667, 13.95, 23.9833,
15.6333, 6.9, 4.4833, 17.1167, 12.25, 24.7833, 17.0833, 13.9,
11.85, 29, 10.2667, 20.8167, 17.0333, 13.3167, 27.8, 8.55,
27.2, 21.2167)), .Names = c("Modeling Code", "Discourse Code",
"Time_Processed"), row.names = c(NA, -100L), class = c("tbl_df",
"tbl", "data.frame"))
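Because I know some math and managed to get a contour plot working in R, I know I can show time as a contour plot over each Discourse Code and Modeling Code, with the heat given by Time_Processed: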
plot_ly(data = df, x = ~ `Modeling Code`, y = ~ `Discourse Code`, z = ~ `Time_Processed`, type = "contour") # if I try %>% add_surface() to this then I get z must be a numeric matrix
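The "z must be a numeric matrix" error suggests that add_surface() wants one height per grid cell. Below is a minimal sketch of one possible way to get there, not necessarily the right one: collapse repeated (Discourse Code, Modeling Code) pairs to their mean Time_Processed with tapply(). The choice of the mean, the data frame `toy` and its short column names are all assumptions for illustration; the values are the ten rows printed above.

```r
# Hypothetical toy data: the ten rows printed above, with shortened column names.
toy <- data.frame(
  modeling  = factor(c("SM", "OFF", "P", "MA", "MA", "A", "V", "MA", "V", "MA"),
                     levels = c("A", "MA", "OFF", "P", "SM", "V")),
  discourse = factor(c("J", "OFF", "Q", "S", "S", "S", "S", "S", "J", "C"),
                     levels = c("AG", "C", "D", "DA", "G", "J", "OFF", "Q", "S")),
  time      = c(5.05, 6.95, 3.18, 20.4, 20.6, 32.7, 17.8, 20.5, 8.05, 14.4)
)

# Assumption: one height per cell = mean time of all rows falling in that cell.
# Result is a 9 x 6 numeric matrix (rows = discourse, columns = modeling);
# cells with no observations are NA.
z <- tapply(toy$time, list(toy$discourse, toy$modeling), mean)

if (requireNamespace("plotly", quietly = TRUE)) {
  p <- plotly::plot_ly(z = ~z)
  p <- plotly::add_surface(p)
  # With only z supplied, x and y are 0-based cell indices, so the factor
  # levels are attached as tick labels.
  p <- plotly::layout(p, scene = list(
    xaxis = list(title = "Modeling Code",
                 tickvals = seq_len(ncol(z)) - 1, ticktext = colnames(z)),
    yaxis = list(title = "Discourse Code",
                 tickvals = seq_len(nrow(z)) - 1, ticktext = rownames(z)),
    zaxis = list(title = "Mean Time_Processed")
  ))
  # Print p in an interactive session to view the surface.
}
```

With so few rows most cells are NA, so the surface would be mostly holes; whether to leave them empty or fill them, and whether a mean is even a meaningful summary of time here, is exactly what I am unsure about.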
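A contour plot is one way of looking at it: it is like viewing a mountain range from the sky. Hot values (yellow) are high up toward the sky, while cold values (blue) are down in the valleys or at the foot of the mountains.
So my question is: how can I give this contour plot a "Time_Processed" height component, so that I can show the heat of "Time_Processed" better by looking at a 3-dimensional graph?
The fact that I can make a contour plot suggests that my 3-variable data should be presentable in 3 dimensions. This is what the volcano looks like as a contour plot: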
plot_ly(z = ~ volcano, type = "contour")
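It is easy to see that this is a contour plot of a volcano. My data is not as pretty, but if I could trace a 3D surface of Time_Processed, I could get a good picture of the problem-solving and communication skills my subjects exercised over time.
It does not matter to me which graphics library you suggest. I just want to know how to do this, or whether there really is a way to build a logical matrix from 2 categorical variables and 1 numeric one.
Thanks!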