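I have just started learning R in the second semester of my undergraduate degree, and I am finding it hard to locate much help. I was hoping you could point me in the right direction.
At first I was not sure why the require lines were failing. Then I realised they were trying to load packages that were not in my R installation, so I installed them. I also worked out that the data I was given should go in a file named auto_mpg.txt. But the errors after that line are hard for me to figure out.
Can you help me understand them?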
R version 3.4.1 (2017-06-30) -- "Single Candle"
Copyright (C) 2017 The R Foundation for Statistical Computing
Platform: x86_64-w64-mingw32/x64 (64-bit)
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.
R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.
Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
[Workspace loaded from ~/R/my_project/.RData]
> #################################################################
> ################# NIKITA TIWARI ############################
> #################################################################
> ################# DATA SUMMARY PJ1 ############################
> ## mpg cyl wt region
> ## Min. : 9.00 Min. :3.000 Min. :1613 Min. :1.000
> ## 1st Qu.:17.50 1st Qu.:4.000 1st Qu.:2224 1st Qu.:1.000
> ## Median :23.00 Median :4.000 Median :2804 Median :1.000
> ## Mean :23.51 Mean :5.455 Mean :2970 Mean :1.573
> ## 3rd Qu.:29.00 3rd Qu.:8.000 3rd Qu.:3608 3rd Qu.:2.000
> ## Max. :46.60 Max. :8.000 Max. :5140 Max. :3.000
> ##
> ## model
> ## ford pinto : 6
> ## amc matador : 5
> ## ford maverick : 5
> ## toyota corolla: 5
> ## amc gremlin : 4
> ## amc hornet : 4
> ## (Other) :369
> #################################################################
> #################################################################
>
> require(gridExtra) #given to me
Loading required package: gridExtra
Warning message:
package ‘gridExtra’ was built under R version 3.4.2
> require(ggplot2) #given to me
Loading required package: ggplot2
Warning message:
package ‘ggplot2’ was built under R version 3.4.2
>
>
> auto <- read.table("auto_mpg.txt", sep="\t", header = TRUE) #given to me
> head(auto)
mpg.............cyl..............wt...........region
1 Min. : 9.00 Min. :3.000 Min. :1613 Min. :1.000
2 1st Qu.:17.50 1st Qu.:4.000 1st Qu.:2224 1st Qu.:1.000
3 Median :23.00 Median :4.000 Median :2804 Median :1.000
4 Mean :23.51 Mean :5.455 Mean :2970 Mean :1.573
5 3rd Qu.:29.00 3rd Qu.:8.000 3rd Qu.:3608 3rd Qu.:2.000
6 Max. :46.60 Max. :8.000 Max. :5140 Max. :3.000
> summary(auto) #given to me
mpg.............cyl..............wt...........region
1st Qu.:17.50 1st Qu.:4.000 1st Qu.:2224 1st Qu.:1.000 :1
3rd Qu.:29.00 3rd Qu.:8.000 3rd Qu.:3608 3rd Qu.:2.000 :1
Max. :46.60 Max. :8.000 Max. :5140 Max. :3.000 :1
Mean :23.51 Mean :5.455 Mean :2970 Mean :1.573 :1
Median :23.00 Median :4.000 Median :2804 Median :1.000 :1
Min. : 9.00 Min. :3.000 Min. :1613 Min. :1.000 :1
>
>
> auto$cyl <- as.factor(auto$cyl)
Error in `$<-.data.frame`(`*tmp*`, cyl, value = integer(0)) :
replacement has 0 rows, data has 6
> auto$region <- as.factor(auto$region)
Error in `$<-.data.frame`(`*tmp*`, region, value = integer(0)) :
replacement has 0 rows, data has 6
> auto$cyl[auto$cyl == 1] <- "USA"
Error in `$<-.data.frame`(`*tmp*`, cyl, value = character(0)) :
replacement has 0 rows, data has 6
> auto$cyl[auto$cyl == 2] <- "EUR"
Error in `$<-.data.frame`(`*tmp*`, cyl, value = character(0)) :
replacement has 0 rows, data has 6
> auto$cyl[auto$cyl == 3] <- "ASIA"
Error in `$<-.data.frame`(`*tmp*`, cyl, value = character(0)) :
replacement has 0 rows, data has 6
> auto$region[auto$region == 1] <- "USA"
Error in `$<-.data.frame`(`*tmp*`, region, value = character(0)) :
replacement has 0 rows, data has 6
> auto$region[auto$region == 2] <- "EUR"
Error in `$<-.data.frame`(`*tmp*`, region, value = character(0)) :
replacement has 0 rows, data has 6
> auto$region[auto$region == 3] <- "ASIA"
Error in `$<-.data.frame`(`*tmp*`, region, value = character(0)) :
replacement has 0 rows, data has 6
>
> auto$cyl <- factor(auto$cyl, levels=c("USA","EUR","ASIA"))
Error in `$<-.data.frame`(`*tmp*`, cyl, value = integer(0)) :
replacement has 0 rows, data has 6
> auto$region <- factor(auto$region, levels=c("USA","EUR","ASIA"))
Error in `$<-.data.frame`(`*tmp*`, region, value = integer(0)) :
replacement has 0 rows, data has 6
> summary(auto)
mpg.............cyl..............wt...........region
1st Qu.:17.50 1st Qu.:4.000 1st Qu.:2224 1st Qu.:1.000 :1
3rd Qu.:29.00 3rd Qu.:8.000 3rd Qu.:3608 3rd Qu.:2.000 :1
Max. :46.60 Max. :8.000 Max. :5140 Max. :3.000 :1
Mean :23.51 Mean :5.455 Mean :2970 Mean :1.573 :1
Median :23.00 Median :4.000 Median :2804 Median :1.000 :1
Min. : 9.00 Min. :3.000 Min. :1613 Min. :1.000 :1
>
> ###################################################################
> # calculate the mean mpg for cars, broken out by the number of
> # cylinders in the car.
> ###################################################################
>
>
> ###################################################################
> #Provide a description of what you notice above
> ###################################################################
> # write a line describing the purpose of the next code chunk
> ###################################################################
>
>
> ###################################################################
> # provide a description of what you notice above
> ###################################################################
> # Histograms
> ###################################################################
> # the next chunk of code is creating bar graphs and filling them
> # with either region or cyl , with count on the y axis and
> # mpg on the x axis
> ###################################################################
> ggplot(auto, aes(x = mpg, y = count, fill = cyl)) + geom_bar()
Don't know how to automatically pick scale for object of type tbl_df/tbl/data.frame. Defaulting to continuous.
Error in FUN(X[[i]], ...) : object 'count' not found
> ggplot(auto, aes(x = mpg, y = count, fill = region)) + geom_bar()
Don't know how to automatically pick scale for object of type tbl_df/tbl/data.frame. Defaulting to continuous.
Error in FUN(X[[i]], ...) : object 'count' not found
> ###################################################################
> # provide a description of what you notice above
> ###################################################################
> # write a line describing the purpose of the next chunk of code
> ###################################################################
> b1 <- ggplot(auto, aes(x=cyl, fill=cyl)) + geom_bar()
> b2 <- ggplot(auto, aes(x=region, fill=region)) + geom_bar()
> ###################################################################
> # the above code is used to create a specific bar graph and its
> # x axis is the same as the fill so that it is more clear
> ###################################################################
> # the next chunk of code is used to fill the graph with the bars
> ###################################################################
> b3 <- ggplot(auto, aes(x = cyl, fill = region)) + geom_bar(position = "fill")
> b4 <- ggplot(auto, aes(x = region, fill = cyl)) + geom_bar(position = "fill")
> grid.arrange(b1, b2 ,b3, b4, ncol= 4)
Error in FUN(X[[i]], ...) : object 'cyl' not found
> ###################################################################
> # provide a description of what you notice above
> ###################################################################
> # the code below is going to make box plots of the data
> ###################################################################
> bp1 <- ggplot ( auto, aes(x = cyl, y = mpg, fill = cyl)) + geom_boxplot()
> bp2 <- ggplot ( auto, aes(x = region, y = mpg, fill = region)) + geom_boxplot()
> grid.arrange(bp1, bp2, ncol = 2)
Don't know how to automatically pick scale for object of type tbl_df/tbl/data.frame. Defaulting to continuous.
Error in FUN(X[[i]], ...) : object 'cyl' not found
> ###################################################################
> # provide a description of what you notice above
> ###################################################################
> # the code below is going to make box plots of the data split into
> # 3, 4, 5, 6, 8 data with mpg being y axis, region being x axis
> # and region being the fill
> ###################################################################
>
>
>
>
> ###################################################################
> # provide a description of what you notice above
> ###################################################################
> # the code below is going to make a jitter plot, so basically it's just a
> # lot of dots on a graph
> ###################################################################
> jp1 <- ggplot ( auto, aes(x = wt, y = mpg, fill = cyl)) + geom_jitter()
> jp2 <- ggplot ( auto, aes(x = wt, y = mpg, fill = region)) + geom_jitter()
> grid.arrange(jp1, jp2, ncol = 2)
Don't know how to automatically pick scale for object of type tbl_df/tbl/data.frame. Defaulting to continuous.
Error in FUN(X[[i]], ...) : object 'wt' not found
> ###################################################################
> # provide a description of what you notice above
> ###################################################################
> # the code below is going to a jitter plot but separate the graphs
> # into the different regions and only that region shows up on the
> # graph
> ###################################################################
> fg1 <- ggplot ( auto, aes(x = wt, y = mpg, fill = cyl)) + geom_jitter()
> + facet_grid(region)
Error in facet_grid(region) : object 'region' not found
> fg1
Don't know how to automatically pick scale for object of type tbl_df/tbl/data.frame. Defaulting to continuous.
Error in FUN(X[[i]], ...) : object 'wt' not found
>
>
> ###################################################################
> # provide a description of what you notice above
> ###################################################################
> # the code below is going to a jitter plot but separate the graphs
> # into the different regions and into different cyl and only that
> # cyl and that region show up in that graph
> ###################################################################
>
Answer 0 (score: 1)
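As a starting point, the error message
Error in `$<-.data.frame`(`*tmp*`, cyl, value = integer(0)) :
replacement has 0 rows, data has 6
occurs because, for example, auto$cyl returns NULL. The auto object has no column named cyl, so as.factor(auto$cyl) produces a zero-length result, which cannot be assigned to a data frame with 6 rows.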
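The mechanism can be reproduced in isolation. The sketch below is only an illustration: auto_bad is a hypothetical two-row data frame whose single column name mimics the header printed by head(auto) in the question, and tryCatch() is used so the error is captured rather than stopping the script.

```r
# Hypothetical one-column data frame standing in for the mis-read file;
# check.names = FALSE keeps the odd column name exactly as written.
auto_bad <- data.frame(
  `mpg.............cyl..............wt...........region` = c("row 1", "row 2"),
  check.names = FALSE
)

# There is no column called "cyl", so `$` returns NULL ...
missing_col <- auto_bad$cyl
is.null(missing_col)

# ... and as.factor(NULL) is a factor of length 0.
empty_factor <- as.factor(missing_col)
length(empty_factor)

# Assigning a zero-length value to a column of a 2-row data frame fails.
# tryCatch() returns the error object instead of halting.
assign_result <- tryCatch(
  {
    auto_bad$cyl <- empty_factor
    "assignment succeeded"
  },
  error = function(e) e
)
print(assign_result)
```

Printing assign_result should show the same kind of "replacement has 0 rows" message as in the question, with the row count of the hypothetical data frame.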
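This is because, after executing
auto <- read.table("auto_mpg.txt", sep="\t", header = TRUE)
your columns are not named the way your code expects them to be.
Instead of separate columns such as cyl, the head(auto) output suggests that the auto object contains a single column named mpg.............cyl..............wt...........region. Its six rows (Min., 1st Qu., Median, ...) also look like a printed summary table rather than the raw car data, so it is worth checking what the file itself contains.
So the first thing to do is cross-check that your read.table call works as expected and that the columns are named as they should be.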
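One way to do that cross-check is to compare names() and ncol() against the columns the script expects. The following is a minimal sketch under stated assumptions: the file contents are made up for illustration (the real auto_mpg.txt may be laid out differently), and tmp_file, good, bad and expected are hypothetical names.

```r
# Hypothetical stand-in for the data file: five tab-separated columns.
tmp_file <- tempfile(fileext = ".txt")
writeLines(
  c(
    "mpg\tcyl\twt\tregion\tmodel",
    "18\t8\t3504\t1\texample car a",
    "24\t4\t2372\t3\texample car b",
    "26\t4\t1835\t2\texample car c"
  ),
  tmp_file
)

# Columns the rest of the script relies on.
expected <- c("mpg", "cyl", "wt", "region")

# Reading with the separator the file really uses gives one column per field.
good <- read.table(tmp_file, sep = "\t", header = TRUE)
names(good)
good_ok <- all(expected %in% names(good))

# Reading with the wrong separator collapses every line into one column,
# so none of the expected names exist.
bad <- read.table(tmp_file, sep = ";", header = TRUE)
names(bad)
bad_ok <- all(expected %in% names(bad))

c(good = good_ok, bad = bad_ok)

unlink(tmp_file)  # remove the temporary file
```

If the check comes back FALSE for the real file, opening the file in a text editor usually shows quickly whether the separator or the contents are the problem.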
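Additionally, the error
Don't know how to automatically pick scale for object of type tbl_df/tbl/data.frame. Defaulting to continuous.
Error in FUN(X[[i]], ...) : object 'count' not found
indicates that there is no column named count in your auto object (none is listed when executing summary(auto) either). I therefore suppose that you have to calculate the values of the count column yourself.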
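Two ways of getting counts are sketched below, assuming a correctly read data frame. auto_small is a hypothetical miniature data set with made-up values; only the column names mpg and cyl come from the question. The first option computes the counts with table(). The second relies on the fact that geom_bar() and geom_histogram() count rows themselves when no y aesthetic is mapped, and it only runs if ggplot2 is installed.

```r
# Hypothetical miniature version of the data; values are made up.
auto_small <- data.frame(
  mpg = c(18, 15, 24, 26, 31, 22),
  cyl = factor(c(8, 8, 4, 4, 4, 6))
)

# Option 1: compute the counts yourself with table().
cyl_counts <- as.data.frame(table(cyl = auto_small$cyl), responseName = "count")
print(cyl_counts)

# Option 2: drop the y mapping and let ggplot2 do the counting.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  # Bar chart of a categorical variable: one bar per cylinder count.
  p_bar <- ggplot(auto_small, aes(x = cyl, fill = cyl)) + geom_bar()
  # Histogram of a continuous variable: mpg is binned, then counted.
  p_hist <- ggplot(auto_small, aes(x = mpg, fill = cyl)) +
    geom_histogram(bins = 5)
}
```

For a continuous variable such as mpg, a histogram is usually the closer match to "count on the y axis and mpg on the x axis" than a bar chart.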
Finally,
Don't know how to automatically pick scale for object of type tbl_df/tbl/data.frame. Defaulting to continuous.
Error in FUN(X[[i]], ...) : object 'cyl' not found
again points to the wrong column names. The same applies to the remaining error messages at the end of the code snippet.
Answer 1 (score: 0)
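Welcome to Stack Overflow! With R you are learning an important programming language and starting to use a useful platform. You may want to read how you can create a minimal example and ask great questions. Also, when posting code, it always helps to state what you want the code to achieve; I suggest using # comments at every important step.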
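As @rui-barradas suspected, I think there is a problem with reading in your data (which is a perfectly normal situation :) ), although it is hard to say without a comparable sample of the data. You could run head(auto) and add the result to your question, provided the data is not private in any way.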
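In any case, you may want to use View(auto) to check whether R read the data set correctly and produced the data frame you want. (You can run class(auto) to check that it really is a data.frame.) If the result looks ugly (for example, the column names look odd), use ?read.table and explore the options of read.table() (for example stringsAsFactors = FALSE, or sep = ";", i.e. whichever separator is actually used in your *.txt file) to control how R reads the data.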
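These checks can be sketched as follows. Everything here is an assumption made for illustration: raw_text is made-up semicolon-separated data passed through the text argument of read.table() instead of a file, and str() is used as a non-interactive alternative to View().

```r
# Hypothetical semicolon-separated data, supplied inline instead of a file.
raw_text <- "mpg;cyl;wt;region;model
18;8;3504;1;example car a
24;4;2372;3;example car b"

# Keep text columns as character vectors.
auto_chr <- read.table(text = raw_text, sep = ";", header = TRUE,
                       stringsAsFactors = FALSE)

class(auto_chr)  # should be "data.frame"
str(auto_chr)    # one line per column: name, type, first values

# Same data, but text columns are converted to factors.
auto_fct <- read.table(text = raw_text, sep = ";", header = TRUE,
                       stringsAsFactors = TRUE)

class(auto_chr$model)
class(auto_fct$model)
```

Setting stringsAsFactors explicitly makes the script behave the same across R versions, since the default changed from TRUE to FALSE in R 4.0.0.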