I have a year of data that looks like this:
datetime, key, value
1/1/15, 7k Steps, 1
1/1/15, Ate Poorly, 1
1/1/15, Audiobook, 1
1/1/15, Befriend, 1
1/1/15, Called Mom, 1
1/1/15, Code, 1
1/1/15, Create, 1
1/1/15, Critical, 1
1/1/15, Emailed Friend, 1
1/2/15, 10k Steps, 1
1/2/15, Ate Poorly, 1
1/2/15, Audiobook, 1
1/2/15, Befriend, 1
1/2/15, Called Mom, 1
1/2/15, Create, 1
1/2/15, Emailed Friend, 1
1/2/15, Exercise, 1
1/2/15, Friend Contact, 1
1/2/15, Great Day, 1
1/2/15, Write, 1
1/3/15, 7k Steps, 1
1/3/15, Ate Poorly, 1
1/3/15, Befriend, 1
1/3/15, Create, 1
1/3/15, Emailed Friend, 1
1/3/15, Friend Contact, 1
1/3/15, Great Day, 1
1/3/15, Happiness, 1
1/3/15, Health, 1
1/3/15, Videogame, 1
1/3/15, Walked With Michelle, 1
1/3/15, Write, 1
1/4/15, 7k Steps, 1
1/4/15, Ate Poorly, 1
1/4/15, Audiobook, 1
1/4/15, Great Day, 1
1/4/15, Happiness, 1
1/4/15, Health, 1
1/4/15, Impatient, 1
1/4/15, Love, 1
1/4/15, Movie With Michelle, 1
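I want to create a chart that shows one row per key, with a barcode-style mark for each day on which that key is 1. Here is an example of the output I want:
This is the one I painfully rendered with Python and Matplotlib.
I am looking for the best and simplest way to render a plot like this in R, perhaps with ggplot2. I had planned to use a bar chart in ggplot2, looping over each key. Here is a sample of my code: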
library(ggplot2)
library(reshape)
#library(ggtheme)
# 2015 Lifedata Processing
d <- read.csv("lifedata_2015.csv")
d$datetime <- as.Date(d$datetime, "%m/%d/%Y")
# Create a new dataframe with a subset of keys
r <- d[d$key %in% c("Read", "Audiobook"), ]
# Put 1s in all values.
r$value <- 1
# Generate a data frame for each day with a value of 1 and a key of "alldates"
mydates <- data.frame("datetime" = seq(as.Date("2015/1/1"), as.Date("2015/12/31"), "days"), "key" = "alldates", "value" = 1)
# combine two data frames, one after the other
n <- rbind(r, mydates)
# Transform into a wide data frame based on datetime and key with mean as the value.
c <- cast(n, datetime~key, mean)
# Turn NaNs into 0
c[is.na(c)] = 0
for(name in c("Read", "Audiobook")){
  plt <- c(plt, ggplot(data=c, aes_string(x="datetime", y=name)) +
    geom_bar(stat="Identity", width=1))
  print(plot)
}
svg("~/Desktop/tagplot.svg")
grid.arrange(plt, ncol = 1, main = "Read")
dev.off()
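This technique doesn't seem to work.
What is a better way to plot event data like the example?
Answer 0 (score: 6)
Here is another approach that borrows heavily from @TylerRinker's answer. As far as I can tell, his answer only shows an activity if it was done on two consecutive days.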
library(dplyr)
library(ggplot2)
First, we borrow these pieces from Tyler. We need nice labels.
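# Parse the two-digit-year dates (e.g. "1/1/15").
d <- d %>%
  mutate(datetime = as.Date(datetime, "%m/%d/%y"))
# One row per key: number of days logged and share of all days in the data.
key <- d %>%
  group_by(key) %>%
  summarize(n = n(), perc = n / length(unique(d$datetime))) %>%
  arrange(perc) %>%
  mutate(
    # Round the percentage so the label stays short.
    new = paste0(key, " - ", n, " (", round(100 * perc), "%)"),
    # Fix the level order so rows are drawn from least to most frequent.
    new = factor(new, levels = new)
  )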
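To make the label step concrete, here is a minimal base R sketch of the same idea on a handful of rows taken from the question's sample data. The names `n_days`, `counts`, `labels` and `ord` are illustrative and not part of the answer, and the rounding via `sprintf` is an addition. Note the `%y` in the date format: it reads two-digit years such as `15`, whereas `%Y` expects the full year with century.

```r
# A few rows from the question's sample data.
d <- data.frame(
  datetime = as.Date(c("1/1/15", "1/1/15", "1/2/15", "1/2/15", "1/3/15", "1/4/15"),
                     format = "%m/%d/%y"),
  key = c("7k Steps", "Audiobook", "Audiobook", "Write", "7k Steps", "Audiobook"),
  stringsAsFactors = FALSE
)

n_days <- length(unique(d$datetime))   # distinct days present in the data
counts <- table(d$key)                 # number of days logged per key
perc   <- as.numeric(counts) / n_days  # share of days on which each key occurs

# Build "key - n (pct%)" labels with the percentage rounded to a whole number.
labels <- sprintf("%s - %d (%.0f%%)", names(counts), as.integer(counts), 100 * perc)

# Order the factor levels by frequency, mirroring arrange(perc) above.
ord <- order(perc)
new <- factor(labels[ord], levels = labels[ord])
print(levels(new))
```

Because ggplot2 draws the first factor level at the bottom of a discrete y axis, this ordering puts the most frequent key in the top row.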
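We use geom_tile rather than geom_line to get a filled rectangle for each day with a value of 1, while missing days stay empty. We use geom_hline to create some separation in the y direction.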
left_join(d, key, by = "key") %>%
  ggplot(aes(datetime, y = new)) +
  geom_tile(show.legend = FALSE, fill = 'grey50') +
  # White separators on the row boundaries. nrow(key) counts the keys even
  # when d$key is a character vector, for which levels() would be NULL.
  geom_hline(yintercept = seq(0.5, nrow(key) + 0.5),
             color = 'white', size = 2) +
  theme_classic() +
  scale_x_date(date_breaks = "1 month", date_labels = "%b", expand = c(0, 0)) +
  ylab(NULL) +
  xlab(NULL)
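The `yintercept` values passed to geom_hline are meant to fall on the boundaries between rows. On a discrete y axis ggplot2 places the i-th level at position i, so for n keys the boundaries sit at 0.5, 1.5, and so on up to n + 0.5. The base R sketch below works that out for three keys from the sample data; `keys`, `n_from_levels`, `n_keys` and `separators` are illustrative names, not part of the answer. It also shows why counting keys with `levels()` is fragile: it only works when the column is a factor.

```r
# Three distinct keys from the question's sample data, stored as plain
# character strings (what read.csv returns by default in R 4.0 and later).
keys <- c("Audiobook", "Write", "7k Steps", "Audiobook", "Write", "Audiobook")

# levels() is NULL for a character vector, so this count is 0 rather than 3.
n_from_levels <- length(levels(keys))

# Counting distinct values works for character vectors and factors alike.
n_keys <- length(unique(keys))

# Row i is drawn at y = i, so separators belong halfway between rows,
# plus one below the first row and one above the last.
separators <- seq(0.5, n_keys + 0.5, by = 1)
print(separators)
```

If the column is known to be a factor, `nlevels()` would give the same count; the distinct-value count is simply the more forgiving choice.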
Answer 1 (score: 4)
This is a decent start, but a few smaller details still need to be worked out:
library(ggplot2)
library(tidyr)
library(dplyr)
d <- d %>%
  mutate(datetime = as.Date(datetime, "%m/%d/%y"))
key <- d %>%
  group_by(key) %>%
  summarize(
    n = length(datetime),
    perc = n/length(unique(d$datetime))
  ) %>%
  arrange(perc) %>%
  mutate(
    new = paste0(key, " - ", n, "(", 100*perc, "%)"),
    new = factor(new, levels = new)
  )
left_join(d, key) %>%
  ggplot(aes(datetime, y = new)) +
  geom_line(size = 6, alpha=.3) +
  theme_minimal() +
  scale_x_date(date_breaks = "1 month", date_labels = "%b", expand = c(0, 0)) +
  ylab(NULL) +
  xlab(NULL)