R - Ordering a factor by another variable returns NA: how to solve?

Asked: 2018-07-28 15:04:24

Tags: r ggplot2 dplyr

I have a small tibble in R called animal_observations, like this:

> animal_observations
# A tibble: 12 x 3
   SPECIES       n_detections detection_rate
   <fct>                <int>          <dbl>
 1 Badger                 203          0.190
 2 Blackbird              463          0.433
 3 Domestic cat           292          0.273
 4 Grey squirrel          788          0.736
 5 Hedgehog               179          0.167
 6 Nothing                960          0.897
 7 Pheasant               476          0.445
 8 Rabbit                 602          0.563
 9 Red fox                424          0.396
10 Roe Deer               621          0.580
11 Small rodent           198          0.185
12 Woodpigeon             381          0.356

n_detections is the number of times I have seen the animal, and detection_rate is how frequently the animal was seen (calculated elsewhere).

Here is the dput():

structure(list(SPECIES = structure(1:12, .Label = c("Badger", "Blackbird",
"Domestic cat", "Grey squirrel", "Hedgehog", "Nothing", "Pheasant", "Rabbit",
"Red fox", "Roe Deer", "Small rodent", "Woodpigeon"), class = "factor"),
n_detections = c(203L, 463L, 292L, 788L, 179L, 960L, 476L, 602L, 424L, 621L,
198L, 381L), detection_rate = c(0.189719626168224, 0.432710280373832,
0.272897196261682, 0.736448598130841, 0.167289719626168, 0.897196261682243,
0.444859813084112, 0.562616822429907, 0.396261682242991, 0.580373831775701,
0.185046728971963, 0.35607476635514)), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -12L))

I want to order my animals (SPECIES) by detection_rate for downstream use, e.g. in ggplot() with geom_col() and aes(x = SPECIES, y = detection_rate), where the columns should be ordered by detection_rate. This is the line I tried to run:

animal_observations$SPECIES <- factor(animal_observations$SPECIES, levels = animal_observations[order(animal_observations$detection_rate, decreasing = F), "SPECIES"])

Oddly, this is the resulting tibble:

> animal_observations
# A tibble: 12 x 3
   SPECIES n_detections detection_rate
   <fct>          <int>          <dbl>
 1 NA               203          0.190
 2 NA               463          0.433
 3 NA               292          0.273
 4 NA               788          0.736
 5 NA               179          0.167
 6 NA               960          0.897
 7 NA               476          0.445
 8 NA               602          0.563
 9 NA               424          0.396
10 NA               621          0.580
11 NA               198          0.185
12 NA               381          0.356

As you can see, every SPECIES value has become NA. What am I doing wrong, and how can I fix it so that the levels of the SPECIES factor are ordered by detection_rate, with all the animal names retained in the SPECIES column of the output tibble? Thanks.

3 Answers:

Answer 0 (score: 1)

It's as simple as this:

library(dplyr)
animal_observations %>% arrange(desc(detection_rate))

Answer 1 (score: 1)

Another option is to use order() like this:

# df = your dput()
with(df, df[order(detection_rate, decreasing = TRUE),])

And the output:

# A tibble: 12 x 3
   SPECIES       n_detections detection_rate
   <fct>                <int>          <dbl>
 1 Nothing                960          0.897
 2 Grey squirrel          788          0.736
 3 Roe Deer               621          0.580
 4 Rabbit                 602          0.563
 5 Pheasant               476          0.445
 6 Blackbird              463          0.433
 7 Red fox                424          0.396
 8 Woodpigeon             381          0.356
 9 Domestic cat           292          0.273
10 Badger                 203          0.190
11 Small rodent           198          0.185
12 Hedgehog               179          0.167
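
One point worth checking before plotting, sketched here on a four-row subset of the question's data (the names obs and sorted are made up for illustration): sorting the rows changes the row order only. The factor levels stay as they were, and the levels are what ggplot2 uses to order a discrete axis.

```r
# Four rows taken from the question's data (rounded as printed)
obs <- data.frame(
  SPECIES = factor(c("Badger", "Blackbird", "Hedgehog", "Nothing")),
  detection_rate = c(0.190, 0.433, 0.167, 0.897)
)

# Sort the rows by detection_rate, highest first
sorted <- obs[order(obs$detection_rate, decreasing = TRUE), ]

# Row order has changed ...
as.character(sorted$SPECIES)
# "Nothing" "Blackbird" "Badger" "Hedgehog"

# ... but the factor levels are still alphabetical
levels(sorted$SPECIES)
# "Badger" "Blackbird" "Hedgehog" "Nothing"
```

So if the goal is an ordered axis in a plot, the levels themselves still need to be reordered.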

Answer 2 (score: 1)

Use reorder() inside ggplot():

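library(dplyr)    # for %>%
library(ggplot2)

# reorder() sorts the SPECIES levels by detection_rate (ascending) for the x axis
animal_observations %>% 
  ggplot(aes(reorder(SPECIES, detection_rate), detection_rate)) +
  geom_bar(stat="identity") + 
  theme(axis.text.x = element_text(angle=90))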
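
To see what reorder() does on its own, here is a minimal base R sketch on a four-row subset of the question's data (obs, by_rate and by_rate_desc are illustrative names). With its default summary function, mean, reorder() returns the same factor with its levels sorted by the second argument in ascending order. Negating that argument is one common way to get descending order.

```r
# Four rows taken from the question's data (rounded as printed)
obs <- data.frame(
  SPECIES = factor(c("Badger", "Blackbird", "Hedgehog", "Nothing")),
  detection_rate = c(0.190, 0.433, 0.167, 0.897)
)

# Levels sorted by increasing detection_rate
by_rate <- reorder(obs$SPECIES, obs$detection_rate)
levels(by_rate)
# "Hedgehog" "Badger" "Blackbird" "Nothing"

# Levels sorted by decreasing detection_rate
by_rate_desc <- reorder(obs$SPECIES, -obs$detection_rate)
levels(by_rate_desc)
# "Nothing" "Blackbird" "Badger" "Hedgehog"

# The values themselves are untouched; only the level order changes
as.character(by_rate)
# "Badger" "Blackbird" "Hedgehog" "Nothing"
```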


Update
To set the new order before going into ggplot(), use mutate() and order() on the existing factor to set the new levels:

animal_observations %>% 
  mutate(species = factor(SPECIES, levels=SPECIES[order(detection_rate)])) %>%
  ggplot(aes(species, detection_rate)) +
  geom_bar(stat="identity") + 
  theme(axis.text.x = element_text(angle=90))
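
As for why the original attempt produced NA, the likely cause is that indexing a tibble with [rows, "SPECIES"] returns a one-column tibble rather than a vector, so factor() receives a table as levels and none of the values match. The base R sketch below uses a plain data.frame with drop = FALSE to mimic that tibble behaviour (an assumption made to keep the example dependency-free; obs, lv_table and lv are illustrative names), then applies the same fix as above by extracting the column as a vector.

```r
# Four rows taken from the question's data (rounded as printed)
obs <- data.frame(
  SPECIES = factor(c("Badger", "Blackbird", "Hedgehog", "Nothing")),
  detection_rate = c(0.190, 0.433, 0.167, 0.897)
)

# What the question's indexing yields on a tibble: still a table, not a vector
# (drop = FALSE mimics the tibble behaviour on a plain data.frame)
lv_table <- obs[order(obs$detection_rate), "SPECIES", drop = FALSE]
is.data.frame(lv_table)
# TRUE

# Fix: pull the column out as a vector first, then use it as the levels
lv <- as.character(obs$SPECIES[order(obs$detection_rate)])
obs$SPECIES <- factor(obs$SPECIES, levels = lv)

levels(obs$SPECIES)
# "Hedgehog" "Badger" "Blackbird" "Nothing"

# The animal names are retained in the column
as.character(obs$SPECIES)
# "Badger" "Blackbird" "Hedgehog" "Nothing"
```

On the real tibble, animal_observations$SPECIES[order(animal_observations$detection_rate)] or animal_observations[["SPECIES"]] gives the vector that levels expects.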