ggplot syntax for data distribution

Time: 2018-12-06 17:00:43

Tags: r ggplot2 histogram data-visualization distribution

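I am trying to plot the data distribution of the beforeMinWageLaw and afterMinWageLaw variables, but when I store them in df instead of seattleData, R says "Error: Aesthetics must be either length 1 or the same as the data (43): x". How can I fix this? Also, how can I make a normal probability plot to assess the normality of the data? Thanks.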

#Import Data
#seattleData <- read.table(file=file.choose(),
#                          header=T, sep=",",)

library(ggplot2)

#Define Variables
 food_drink_workers <- seattleData$food_drink_workers
 MinWage <- seattleData$washington_state_minwage
 afterMinWageLaw <- food_drink_workers[304:346]
 beforeMinWageLaw <- food_drink_workers[1:303]
 df <- data.frame(seattleData)

#Display Data Distribution with ggplot
 x <- ggplot(df, aes(x = food_drink_workers)) + 
   geom_histogram(mapping = aes(y = ..density..), color = "black", fill = "white") +
   geom_density(alpha = .2, fill = "blue")
 x + geom_vline(xintercept = c(108.8636), linetype = "dashed", color = "red") + 
   ggtitle("Distribution of the Data") + xlab("Seattle MSA Food and Drink Workers") + ylab("Density")

#Conduct Two Sample t-test
 options(scipen = 100)
 tTest <- t.test(beforeMinWageLaw, afterMinWageLaw, mu=0, alternative = "less",
                conf=.95, var.equal = F, paired = F)

You can download the data here: https://fred.stlouisfed.org/series/SMU53426607072200001SA


1 Answer:

Answer 0: (score: 0)

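You get the error message "Error: Aesthetics must be either length 1 or the same as the data (43): x" because the vector afterMinWageLaw has 43 values while beforeMinWageLaw has 303 values. ggplot2 requires every aesthetic in a layer to have either length 1 or the same length as the number of rows in that layer's data, which is why you cannot reference both vectors in the same aes().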
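The mismatch is easy to reproduce with made-up numbers. The sketch below uses simulated values in place of seattleData (an assumption, since the real file is not part of this post) and passes a 303-element vector as x to a plot whose data frame has only 43 rows. ggplot2 raises the error when the plot is built or printed, not when it is defined.

```r
library(ggplot2)

# Simulated stand-in for seattleData$food_drink_workers (hypothetical values)
set.seed(1)
food_drink_workers <- rnorm(346, mean = 108, sd = 15)

beforeMinWageLaw <- food_drink_workers[1:303]    # 303 values
afterMinWageLaw  <- food_drink_workers[304:346]  # 43 values

# A data frame with 43 rows ...
df_after <- data.frame(food_drink_workers = afterMinWageLaw)

# ... combined with an x aesthetic of length 303
p <- ggplot(df_after, aes(x = beforeMinWageLaw)) +
  geom_histogram(bins = 30)

# Building the plot triggers the length check; capture the error instead of stopping
result <- tryCatch(ggplot_build(p), error = function(e) e)
inherits(result, "error")
```

Replacing beforeMinWageLaw with the column name food_drink_workers, which lives in df_after and therefore has 43 values, makes the lengths agree again.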

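I would use separate layers in one plot, so that each layer can have its own aesthetics and its own data with a different length or number of rows. First, I split your data into two data frames, one for before the law and one for after the law. With ggplot you can reference different data frames in a single plot, which in your case looks like this: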

#set row index ranges for before and after law
row_index_range_before <- 1:303
row_index_range_after <- 304:346

#define two data frames
df_before <- data.frame(seattleData)[row_index_range_before, ]
df_after <- data.frame(seattleData)[row_index_range_after, ]

#display data distributions of both data frames with ggplot
x <- ggplot() + 
  geom_histogram(
    data = df_before
    ,mapping = aes(
      x = food_drink_workers
      ,y = ..density..
      ,color = "blue")
    ,fill = "white") +
  geom_histogram(
    data = df_after
    ,mapping = aes(
      x = food_drink_workers
      ,y = ..density..
      ,color = "red")
    ,fill = "white") +
  geom_density(
    data = df_before
    ,mapping = aes(
      x = food_drink_workers
      ,y = ..density..
      ,fill = "blue")
    ,alpha = .2) +
  geom_density(
    data = df_after
    ,mapping = aes(
      x = food_drink_workers
      ,y = ..density..
      ,fill = "red")
    ,alpha = .2) +
  scale_colour_manual(
    name = "Color"
    ,values = c("blue" = "blue", "red" = "red")
    ,labels = c("blue" = "Before Law", "red" = "After Law")) +
  scale_fill_manual(
    name = "Fill"
    ,values = c("blue" = "blue", "red" = "red")
    ,labels = c("blue" = "Before Law", "red" = "After Law"))

x + geom_vline(
  xintercept = c(108.8636)
  ,linetype = "dashed"
  ,color = "red") + 
  ggtitle("Distribution of the Data") + 
  xlab("Seattle MSA Food and Drink Workers") + 
  ylab("Density")

With this layered approach, I think you could also pass afterMinWageLaw and beforeMinWageLaw directly as x in aes() and remove the data arguments that reference the data frames.
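A minimal sketch of that variant, again with simulated values standing in for seattleData (an assumption). Because ggplot() is called without data, each layer builds its own data from the vector given in aes(), so the two layers may have different lengths. The sketch uses after_stat(density), which assumes ggplot2 3.3.0 or later; on older versions ..density.. plays the same role.

```r
library(ggplot2)

# Simulated stand-in for seattleData$food_drink_workers (hypothetical values)
set.seed(1)
food_drink_workers <- rnorm(346, mean = 108, sd = 15)
beforeMinWageLaw <- food_drink_workers[1:303]
afterMinWageLaw  <- food_drink_workers[304:346]

# No data argument anywhere: the vectors are referenced directly in aes()
p <- ggplot() +
  geom_histogram(
    mapping = aes(x = beforeMinWageLaw, y = after_stat(density), color = "blue"),
    fill = "white", bins = 30) +
  geom_histogram(
    mapping = aes(x = afterMinWageLaw, y = after_stat(density), color = "red"),
    fill = "white", bins = 30) +
  scale_colour_manual(
    name = "Color",
    values = c("blue" = "blue", "red" = "red"),
    labels = c("blue" = "Before Law", "red" = "After Law")) +
  xlab("Seattle MSA Food and Drink Workers") +
  ylab("Density")

# Build the plot to confirm both layers are computed without a length error
built <- ggplot_build(p)
length(built$data)
```

Keeping the values in data frames is usually easier to maintain, but this form shows that the length restriction applies per layer and not to the plot as a whole.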

To also draw a legend, you need to set color and fill inside aes() and add scale_colour_manual() and scale_fill_manual() to the plot.
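The question also asks about a normal probability plot, which is not covered above. One possible sketch uses stat_qq() and stat_qq_line() from ggplot2 (stat_qq_line() assumes ggplot2 3.0.0 or later). The data frame and the period column below are made up for illustration; with the real data you would use seattleData and its food_drink_workers column.

```r
library(ggplot2)

# Simulated stand-in for seattleData (hypothetical values)
set.seed(1)
df <- data.frame(food_drink_workers = rnorm(346, mean = 108, sd = 15))

# Hypothetical grouping column: rows 1 to 303 are before the law, the rest after
df$period <- ifelse(seq_len(nrow(df)) <= 303, "Before Law", "After Law")

# Normal Q-Q plot per period: points close to the line suggest approximate normality
qq <- ggplot(df, aes(sample = food_drink_workers, color = period)) +
  stat_qq() +
  stat_qq_line() +
  ggtitle("Normal Q-Q Plot") +
  xlab("Theoretical Quantiles") +
  ylab("Sample Quantiles")

print(qq)
```

Because period is a real column mapped inside aes(), the legend appears without any manual scale. In base R, qqnorm() followed by qqline() on a single vector gives a comparable plot.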