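I am trying to draw a geom_point plot with ggplot2 from multivariate data, and I am having trouble color-coding the data for visualization. My data is shared below. I am interested in Effort (x-axis) versus hair change (y-axis), with the points color-coded by hair type (the type of hair loss: diffuse, frontal/temporal, and/or vertex). The survey is multivariate by nature: patients could endorse more than one type of hair loss (hair type 1, 2, and/or 3). The data for the first 20 participants is as follows: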
Figure3Data = structure(list(MonthsMassage = c(0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1),
MinutesPerDayMassage = c("0-10 minutes daily", "0-10 minutes daily",
"0-10 minutes daily", "0-10 minutes daily", "0-10 minutes daily",
"0-10 minutes daily", "0-10 minutes daily", "0-10 minutes daily",
"0-10 minutes daily", "0-10 minutes daily",
"11-20 minutes daily", "11-20 minutes daily", "11-20 minutes daily",
"0-10 minutes daily", "0-10 minutes daily", "0-10 minutes daily",
"0-10 minutes daily", "0-10 minutes daily", "0-10 minutes daily",
"0-10 minutes daily"), Minutes = c(5, 5, 5, 5, 5, 5, 5, 5, 5,
5, 15, 15, 15, 5, 5, 5, 5, 5, 5, 5), hairchange = c(-1, -1, 0,
-1, 0, -1, -1, 0, 0, -1, 0, -1, -1, 0, 0, -1, 0, -1, 0, -1),
HairType1 = c("Templefrontal", "Templefrontal", "Templefrontal",
"Templefrontal", "Templefrontal", "Templefrontal", "Templefrontal",
"other", "Templefrontal", "Templefrontal", "Templefrontal",
"Templefrontal", "Templefrontal", "Templefrontal", "Templefrontal",
"Templefrontal", "Templefrontal", "Templefrontal", "Templefrontal",
"Templefrontal"), HairType2 = c("other", "other", "other",
"other", "other", "other", "other", "other", "other", "Vertexthinning",
"Vertexthinning", "other", "Vertexthinning", "other", "other",
"Vertexthinning", "other", "Vertexthinning", "Vertexthinning",
"other"), HairType3 = c("other", "Diffusethinning", "other",
"Diffusethinning", "other", "other", "Diffusethinning", "Diffusethinning",
"Diffusethinning", "other", "Diffusethinning", "Diffusethinning",
"other", "other", "Diffusethinning", "Diffusethinning", "other",
"Diffusethinning", "Diffusethinning", "Diffusethinning"),
Effort = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2.5, 2.5,
2.5, 2.5, 2.5, 2.5, 2.5), EffortGroup = c("<5", "<5", "<5",
"<5", "<5", "<5", "<5", "<5", "<5", "<5", "<5", "<5", "<5",
"<5", "<5", "<5", "<5", "<5", "<5", "<5")), row.names = c(NA,
-20L), class = c("tbl_df", "tbl", "data.frame"))
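Because patients endorsed hair-loss types that are spread across several columns, I cannot visually separate the data with the following code: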
ggplot(data, aes(x = Effort, y = hairchange, color = hairtype????)) + geom_point()
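If the hair-loss type were somehow stored in a single column, this would be easy to visualize.
So I would like to know whether there is a way to organize the data so that the three hair-loss types can be visualized and color-coded. I have tried reshape2 and melt without any luck. I would like to avoid creating a fourth category of "multiple types reported", because that would exclude many people from the insight I want to gain.
Alternatively, suggestions for other ways to plot this data (density or line plots) would be much appreciated. One idea I had was to make four separate line plots, one per hair-loss type (i.e. average, diffuse, vertex, temporal), with Effort on the x-axis and mean perceived hair change on the y-axis.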
Answer 0: (score: 0)
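You can create an entirely new column that combines the three hair types by pasting the fifth, sixth and seventh columns together into a new "CombinedHair" column: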
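If you plot the data in this data.table you will see that it suffers from overplotting, so a function that reduces overplotting (for example, jittering) is advisable. The data is set up as follows:
library(ggplot2)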
library(data.table)
dt <- data.table(MonthsMassage = c(0, 0, 0, 0, 0, 0, 0, 0,0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1),
MinutesPerDayMassage = c("0-10 minutes daily", "0-10 minutes daily",
"0-10 minutes daily", "0-10 minutes daily", "0-10 minutes daily",
"0-10 minutes daily", "0-10 minutes daily", "0-10 minutes daily",
"0-10 minutes daily", "0-10 minutes daily",
"11-20 minutes daily", "11-20 minutes daily", "11-20 minutes daily",
"0-10 minutes daily", "0-10 minutes daily", "0-10 minutes daily",
"0-10 minutes daily", "0-10 minutes daily", "0-10 minutes daily",
"0-10 minutes daily"),
Minutes = c(5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 15, 15, 15, 5, 5, 5, 5, 5, 5, 5),
hairchange = c(-1, -1, 0, -1, 0, -1, -1, 0, 0, -1, 0, -1, -1, 0, 0, -1, 0, -1, 0, -1),
HairType1 = c("Templefrontal", "Templefrontal", "Templefrontal",
"Templefrontal", "Templefrontal", "Templefrontal", "Templefrontal",
"other", "Templefrontal", "Templefrontal", "Templefrontal",
"Templefrontal", "Templefrontal", "Templefrontal", "Templefrontal",
"Templefrontal", "Templefrontal", "Templefrontal", "Templefrontal",
"Templefrontal"),
HairType2 = c("other", "other", "other", "other", "other", "other", "other", "other",
"other", "Vertexthinning", "Vertexthinning", "other", "Vertexthinning",
"other", "other", "Vertexthinning", "other", "Vertexthinning",
"Vertexthinning", "other"),
HairType3 = c("other", "Diffusethinning", "other", "Diffusethinning", "other", "other",
"Diffusethinning", "Diffusethinning", "Diffusethinning", "other",
"Diffusethinning", "Diffusethinning", "other", "other", "Diffusethinning",
"Diffusethinning", "other", "Diffusethinning", "Diffusethinning", "Diffusethinning"),
Effort = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2.5, 2.5, 2.5, 2.5, 2.5, 2.5, 2.5),
EffortGroup = c("<5", "<5", "<5", "<5", "<5", "<5", "<5", "<5", "<5", "<5", "<5", "<5", "<5",
"<5", "<5", "<5", "<5", "<5", "<5", "<5"))
The combined column is then created with:
dt[, CombinedHair:=do.call(paste0,.SD), .SDcols=c(5,6,7)]
If you want cleaner class names, you can replace "default" with an empty string.
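As a rough sketch of that renaming step: the answer does not name a function, so the use of `gsub` here is an assumption, as is treating the "other" placeholder from the data as the label to drop. The small `hair` data frame is a hypothetical stand-in for three rows of the real data, and the "+" separator is only an illustrative choice to keep the combined labels readable.

```r
# Hypothetical three-row stand-in for the HairType columns above
hair <- data.frame(
  HairType1 = c("Templefrontal", "other", "Templefrontal"),
  HairType2 = c("other", "other", "Vertexthinning"),
  HairType3 = c("other", "Diffusethinning", "Diffusethinning"),
  stringsAsFactors = FALSE
)

# Paste the three columns row by row, with a separator between the parts
combined <- do.call(paste, c(hair, sep = "+"))

# Assumption: "other" is the placeholder to blank out
combined <- gsub("other", "", combined, fixed = TRUE)

# Tidy up the separators left behind by the removed placeholders
combined <- gsub("\\++", "+", combined)       # collapse runs of "+"
combined <- gsub("^\\+|\\+$", "", combined)   # trim a leading or trailing "+"

print(combined)
```

With data.table the same calls could be applied to the `CombinedHair` column in place, although a participant who endorsed several types still ends up in a combined class rather than in each individual type.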
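Answer 1: (score: 0)
Here is an approach that moves the location into its own variable (not shown here, but you could map it to facets, point shape, or another aesthetic if you like) and then colors the points by hair type, dropping the "other" hair type.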
library(tidyverse)
Figure3Data_long <- Figure3Data %>%
gather(location, hairtype, HairType1:HairType3) %>%
filter(hairtype != "other")
ggplot(Figure3Data_long,
aes(Effort, hairchange, color = hairtype)) +
# geom_point() +
geom_jitter(width = 0.03, height = 0.01) # illustrative to show overplots
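The question also floats the idea of line plots of mean perceived hair change per hair-loss type. A minimal sketch of the summary step follows, assuming the data is already in the long shape produced above. The `long` data frame is a hypothetical stand-in, not the real data, and the plotting part only runs if ggplot2 is installed.

```r
# Hypothetical long-format stand-in: one row per participant per endorsed type
long <- data.frame(
  Effort     = c(0, 0, 0, 2.5, 2.5, 2.5),
  hairchange = c(-1, 0, -1, 0, -1, 0),
  hairtype   = c("Templefrontal", "Templefrontal", "Diffusethinning",
                 "Templefrontal", "Diffusethinning", "Diffusethinning"),
  stringsAsFactors = FALSE
)

# Mean hair change for every combination of Effort and hair type
means <- aggregate(hairchange ~ Effort + hairtype, data = long, FUN = mean)
print(means)

# Optional: one line per hair type, drawn only when ggplot2 is available
if (requireNamespace("ggplot2", quietly = TRUE)) {
  p <- ggplot2::ggplot(means,
                       ggplot2::aes(Effort, hairchange, color = hairtype)) +
    ggplot2::geom_line() +
    ggplot2::geom_point()
}
```

Because a participant who endorsed several types contributes one row per type in the long data, each mean includes everyone who reported that type, which avoids a separate "multiple types" category.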