I have a dataset that looks like this:
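What can I run to see the distribution of wages (the first column)? Specifically, I want to see how many people have a wage below $300.
Which ggplot function can I use?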
Answer 0: (score: 1)
Try this:
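library(dplyr)
library(ggplot2)
# Keep only the rows with a wage below 300 (stored separately so df stays intact)
df_low <- df %>% filter(wage < 300)
# qplot() is deprecated as of ggplot2 3.4.0; ggplot() + geom_histogram() draws the same histogram
ggplot(df_low, aes(wage)) + geom_histogram()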
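Filtering first hides the rest of the distribution. As a minimal sketch of an alternative, assuming the ten sample wages posted at the end of this thread (the names `wages`, `threshold`, and `p` are illustrative, and the bin width of 100 is an arbitrary choice), you could plot every wage and mark the cutoff with a vertical line:

```r
library(ggplot2)

# Sample wages copied from the data posted in this thread
wages <- data.frame(wage = c(769L, 808L, 825L, 650L, 562L, 1400L,
                             600L, 1081L, 1154L, 1000L))

# Hypothetical cutoff taken from the question
threshold <- 300

# Histogram of all wages, with a dashed line at the cutoff
p <- ggplot(wages, aes(wage)) +
  geom_histogram(binwidth = 100) +
  geom_vline(xintercept = threshold, linetype = "dashed")

print(p)
```

Bars to the left of the dashed line are the wages below the cutoff.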
Answer 1: (score: 1)
You can draw a cumulative histogram:
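library(ggplot2)
# after_stat(count) replaces the deprecated ..count.. notation (ggplot2 >= 3.3.0)
ggplot(df, aes(wage)) +
  geom_histogram(aes(y = cumsum(after_stat(count)))) +
  stat_bin(aes(y = cumsum(after_stat(count))), geom = "line", color = "green")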
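If you specifically want the number of entries that meet a condition, subset with base R indexing and count the rows. Note that count() comes from dplyr, not base R:
library(dplyr)
count(df[df$wage > 1000, ])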
## # A tibble: 1 x 1
## n
## <int>
## 1 3
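For a count without any packages, here is a minimal base R sketch. It assumes the ten sample wages posted at the end of this thread; the names `wages`, `n_below`, and `n_above` are illustrative. Summing a logical vector counts its TRUE values:

```r
# Sample wages copied from the data posted in this thread
wages <- c(769L, 808L, 825L, 650L, 562L, 1400L, 600L, 1081L, 1154L, 1000L)

# Each comparison yields TRUE/FALSE; sum() counts the TRUEs.
# na.rm = TRUE skips missing wages instead of returning NA.
n_below <- sum(wages < 300, na.rm = TRUE)
n_above <- sum(wages > 1000, na.rm = TRUE)

print(n_below)  # 0: no sample wage is below 300
print(n_above)  # 3: matches the count() output above
```

On the full dataset, `sum(df$wage < 300, na.rm = TRUE)` would answer the original question directly.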
Data:
df <- structure(list(wage = c(769L, 808L, 825L, 650L, 562L, 1400L,
600L, 1081L, 1154L, 1000L), hours = c(40L, 50L, 40L, 40L, 40L,
40L, 40L, 40L, 45L, 40L), iq = c(93L, 119L, 108L, 96L, 74L, 116L,
91L, 114L, 111L, 95L), kww = c(35L, 41L, 46L, 32L, 27L, 43L,
24L, 50L, 37L, 44L), educ = c(12L, 18L, 14L, 12L, 11L, 16L, 10L,
18L, 15L, 12L), exper = c(11L, 11L, 11L, 13L, 14L, 14L, 13L,
8L, 13L, 16L), tenure = c(2L, 16L, 9L, 7L, 5L, 2L, 0L, 14L, 1L,
16L), age = c(31L, 37L, 33L, 32L, 34L, 35L, 30L, 38L, 36L, 36L
), married = c(1L, 1L, 1L, 1L, 1L, 1L, 0L, 1L, 1L, 1L), black = c(0L,
0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L), south = c(0L, 0L, 0L, 0L,
0L, 0L, 0L, 0L, 0L, 0L), urban = c(1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 0L, 1L), sibs = c(1L, 1L, 1L, 4L, 10L, 1L, 1L, 2L, 2L, 1L
), brthord = c(2L, NA, 2L, 3L, 6L, 2L, 2L, 3L, 3L, 1L), meduc = c(8L,
14L, 14L, 12L, 6L, 8L, 8L, 8L, 14L, 12L)), .Names = c("wage",
"hours", "iq", "kww", "educ", "exper", "tenure", "age", "married",
"black", "south", "urban", "sibs", "brthord", "meduc"), row.names = c(NA,
-10L), class = c("tbl_df", "tbl", "data.frame"))