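I am currently doing the Reproducible Research course on Coursera, and one of the questions asks for the mean and median of the number of steps per day. I have computed both, but when I check them with the summary function, the Mean and Median it reports are different. I am running this through knitr. Why does this happen?
Edit: below is my entire script so far, including a link to the raw data: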
## Download the data
You have to change https to http to get this to work in knitr
target_url <- "http://d396qusza40orc.cloudfront.net/repdata%2Fdata%2Factivity.zip"
target_localfile = "ActivityMonitoringData.zip"
if (!file.exists(target_localfile)) {
  download.file(target_url, destfile = target_localfile)
}
Unzip the file to the temporary directory
unzip(target_localfile, exdir="extract", overwrite=TRUE)
List the extracted files
list.files("./extract")
## [1] "activity.csv"
Load the extracted data into R
activity.csv <- read.csv("./extract/activity.csv", header = TRUE)
activity1 <- activity.csv[complete.cases(activity.csv),]
str(activity1)
## 'data.frame': 15264 obs. of 3 variables:
## $ steps : int 0 0 0 0 0 0 0 0 0 0 ...
## $ date : Factor w/ 61 levels "2012-10-01","2012-10-02",..: 2 2 2 2 2 2 2 2 2 2 ...
## $ interval: int 0 5 10 15 20 25 30 35 40 45 ...
Use a histogram to view the number of steps taken each day
histData <- aggregate(steps ~ date, data = activity1, sum)
h <- hist(histData$steps,                  # Save histogram as object
          breaks = 11,                     # "Suggests" 11 bins
          freq = TRUE,
          col = "thistle1",
          main = "Histogram of Activity",
          xlab = "Number of daily steps")
Obtain the Mean and Median of the daily steps
steps <- histData$steps
mean(steps)
## [1] 10766
median(steps)
## [1] 10765
summary(histData$steps)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 41 8840 10800 10800 13300 21200
summary(steps)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 41 8840 10800 10800 13300 21200
sessionInfo()
## R version 3.1.1 (2014-07-10)
## Platform: i386-w64-mingw32/i386 (32-bit)
##
## locale:
## [1] LC_COLLATE=English_Australia.1252 LC_CTYPE=English_Australia.1252
## [3] LC_MONETARY=English_Australia.1252 LC_NUMERIC=C
## [5] LC_TIME=English_Australia.1252
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] knitr_1.6
##
## loaded via a namespace (and not attached):
## [1] evaluate_0.5.5 formatR_1.0 stringr_0.6.2 tools_3.1.1
Answer 0 (score: 6)
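Actually, the answers are correct; they are just being printed with too few digits. The digits option is being set somewhere.
Put this before the script: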
options(digits=12)
and you will get:
mean(steps)
# [1] 10766.1886792
median(steps)
# [1] 10765
summary(steps)
# Min. 1st Qu. Median Mean 3rd Qu. Max.
# 41.0000 8841.0000 10765.0000 10766.1887 13294.0000 21194.0000
Note that summary uses max(3, getOption("digits") - 3) as the number of significant digits to print, so it still rounds slightly more than mean does (10766.1887 instead of 10766.1886792).
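To see the rounding in isolation, here is a minimal sketch that does not need the activity data. It hard-codes the mean from the output above as x. The helper summary_digits is a hypothetical name introduced only for this illustration. The value digits = 4 is an assumption about what was in effect in the knitr session, chosen because it reproduces the 10800 shown in the question.

```r
# Mean of the daily steps, copied from the output above.
x <- 10766.1886792

# Hypothetical helper: the number of significant digits summary() prints
# for a given value of the "digits" option, per max(3, digits - 3).
summary_digits <- function(d) max(3, d - 3)

# Assumed setting in the question's session (digits = 4): 3 significant digits.
low <- signif(x, summary_digits(4))    # equals 10800

# Setting from the answer (digits = 12): 9 significant digits.
high <- signif(x, summary_digits(12))  # equals 10766.1887

# Plain printing never drops digits from the integer part, which is why
# mean(steps) showed 10766 while summary() showed 10800.
plain <- format(x, digits = 4)         # the string "10766"

print(c(low = low, high = high), digits = 12)
print(plain)
```

The underlying numbers are identical in every case; only the number of significant digits used for display changes.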