I'm just starting to use R, and would like to create a generic function where I can specify the "fill" parameter passed to the qplot() function.
graph_search <- function(group) {
  qplot(time, data=subset, geom="density", fill=path, alpha=I(.5))
}
graph_search("path")
graph_search("code") # does not work
Ideally I would replace fill=path with fill=group, in the same way that this works:
data_max <- function(size = 5) {
  print(tail(subset[order(subset$time),], n=size))
}
data_max(10)
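I'm using this to look through some web server logs, where each record (request) has a time (how long it took to execute, in seconds), a path (the requested URL without the query string), a response code (e.g. 200, 301), the ID of the logged-in user, etc.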
The subset variable is created with one of the following queries:
subset <- subset(data, code != 302 & time > 0.2 & path!="/not/this/path/")
subset <- subset(data, code != 302 & grepl("^/admin/", path) & time > 0)
subset <- subset(data, code == 500)
These work with:
graph_frequency <- function() {
  # hist(subset$time, xlab="time", col="lightblue", main="Web 1")
  qplot(time, data=subset, geom="density", fill=code, alpha=I(.5))
}
graph_history <- function() {
  # plot(time ~ timestamp, data=subset, type='h', xlab='date', ylab='time')
  plot(subset$timestamp, subset$time, type='h', xlab='date', ylab='time')
}
Although it isn't related to the question (but feel free to comment on how it could be improved), the Apache config uses:
LogFormat "%h %l %u [%{LOG_INFO}n] [%{%Y-%m-%d %H:%M:%S}t] [%D/%{TIME_INFO}n] \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" inc_info
The non-Apache variables (LOG_INFO and TIME_INFO) are set from PHP:
if (function_exists('apache_note')) {
    apache_note('LOG_INFO', USER_ID);
    apache_note('TIME_INFO', number_format(round((microtime(true) - FRAMEWORK_START), 4), 4));
}
The R script starts with:
library("stringr")
library("ggplot2")
The access log is parsed with:
data_load <- function(log_path) {
  data = read.table(log_path, sep=" ")
  data$timestamp = as.POSIXct(strptime(paste(data[,5], data[,6]), '[%Y-%m-%d %H:%M:%S]'))
  data$timings <- str_match(data[,7], "\\[([0-9]*)/(.*)\\]")[,c(2,3)]
  data$info <- str_match(data[,4], "\\[(.*)\\]")[,2]
  data$request <- str_match(data[,8], "([A-Z]+) (/.*) HTTP")[,c(2,3)]
  data = cbind(
    timestamp = data[13],
    apache = data[,14][,1],
    time = data[,14][,2],
    ip = data[,1],
    info = data[,15],
    method = data[,16][,1],
    url = data[,16][,2],
    code = data[,9],
    size = data[,10],
    referrer = data[,11],
    agent = data[,12])
  data$time <- as.numeric(as.character(data$time))
  data$info <- as.numeric(as.character(data$info))
  data$code <- as.character(data$code)
  data$path <- gsub("\\?.*", "", data$url)
  # Drop apache/referrer/agent
  data = data[,-c(2,10,11)]
  # Drop url (optional)
  data = data[,-c(6)]
  return(data)
}
Answer 0 (score: 2)
Thanks to @joran, this seems to work:
graph_search <- function(group) {
  ggplot(subset, aes(x = time)) + geom_density(aes_string(fill=group), alpha=I(.5))
}
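This uses ggplot directly, rather than the shorthand qplot (a.k.a. "quick plot"). The column name is passed as a string, and aes_string() maps the column with that name to the fill aesthetic.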
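As a possible alternative: aes_string() has been deprecated in later ggplot2 releases, and the same string-to-column lookup can be written with the .data pronoun inside aes(). The sketch below is a minimal illustration, not the original poster's code. It assumes ggplot2 3.0.0 or newer, and both the log_sample data frame and the extra df argument are hypothetical additions used only to make the example self-contained.

```r
library(ggplot2)

# Hypothetical stand-in for the parsed access log (not the asker's real data).
log_sample <- data.frame(
  time = c(0.21, 0.35, 0.50, 0.80, 1.20, 0.30, 0.45, 0.90),
  path = rep(c("/admin/a", "/admin/b"), each = 4),
  code = rep(c("200", "500"), times = 4)
)

# `group` is a column name given as a string. .data[[group]] looks that
# column up in the data frame passed to ggplot(), so no aes_string() is needed.
# Passing the data frame as `df` (an assumption of this sketch) avoids relying
# on a global variable called `subset`.
graph_search <- function(df, group) {
  ggplot(df, aes(x = time)) +
    geom_density(aes(fill = .data[[group]]), alpha = 0.5) +
    labs(fill = group)
}

p_path <- graph_search(log_sample, "path")
p_code <- graph_search(log_sample, "code")
```

Printing p_path or p_code draws the density plot, filled by the chosen column. The labs(fill = group) call is only there so that the legend title shows the column name.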