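I have species data in a tidy format. To include it in a report, I need to reduce the width of the table by listing each higher taxonomic rank (kingdom, phylum, class, etc.) only once per group.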
I currently have:
...or something similar
...with each higher rank given only once, and every species within that rank listed beneath it.
The list is long, so the solution needs to be script-based. I have looked at using dplyr, but I cannot see a way to achieve this.
Reproducible example data is below, in case it is needed.
exampledata <- structure(list(KINGDOM = c("Animalia", "Animalia", "Animalia",
"Animalia", "Animalia", "Animalia", "Animalia", "Animalia", "Animalia",
"Animalia", "Animalia", "Animalia"), PHYLYM = c("Chordata", "Chordata",
"Chordata", "Chordata", "Chordata", "Chordata", "Chordata", "Chordata",
"Chordata", "Chordata", "Chordata", "Chordata"), CLASS = c("Amphibia",
"Amphibia", "Amphibia", "Amphibia", "Amphibia", "Aves", "Aves",
"Aves", "Aves", "Aves", "Aves", "Aves"), ORDER = c("Anura", "Anura",
"Anura", "Anura", "Anura", "Accipitriformes", "Ciconiiformes",
"Gruiformes", "Passeriformes", "Passeriformes", "Pelecaniformes",
"Pelecaniformes"), FAMILY = c("Ranidae", "Ranidae", "Rhacophoridae",
"Rhacophoridae", "Rhacophoridae", "Accipitridae", "Ciconiidae",
"Gruidae", "Muscicapidae", "Muscicapidae", "Threskiornithidae",
"Threskiornithidae"), SCIENTIFICNAME = c("Hylarana attigua",
"Hylarana taipehensis", "Philautus", "Polypedates leucomystax",
"Theloderma asperum", "Aviceda jerdoni", "Leptoptilos javanicus",
"Antigone antigone", "Cyanoptila cyanomelana", "Cyornis hainanus",
"Pseudibis davisoni", "Thaumatibis gigantea"), OTHERDATA = c("XYZ",
"ABC", "XYZ", "ABC", "XYZ", "XYZ", "ABC", "XYZ", "ABC", "ABC",
"XYZ", "XYZ")), row.names = c(NA, 12L), class = "data.frame")
Answer 0 (score: 3)
Deleting data is usually not a good idea, but I can see the use case.
As long as your data is sorted correctly, you can do the following:
library(dplyr)
# Keep the first occurrence of each Species and blank out the repeats
iris %>%
  mutate(Species = if_else(duplicated(Species), "", as.character(Species)))
Note that as.character() is only needed because Species is a factor in this dataset.
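The "sorted correctly" condition can be made explicit by sorting before blanking. The following is only a minimal sketch: the toy data frame and its column names are made up for illustration and are not part of the question's data.

```r
library(dplyr)

# Hypothetical toy data, deliberately out of order
toy <- data.frame(
  CLASS = c("Aves", "Amphibia", "Aves"),
  NAME  = c("Aviceda jerdoni", "Hylarana attigua", "Antigone antigone"),
  stringsAsFactors = FALSE
)

tidy <- toy %>%
  arrange(CLASS, NAME) %>%                               # make repeated CLASS values contiguous
  mutate(CLASS = if_else(duplicated(CLASS), "", CLASS))  # keep only the first of each run

print(tidy)
```

Without the arrange() step, the second "Aves" row would still be blanked, but it would sit under "Amphibia" and read as if it belonged to that class.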
Edit, using the example data:
exampledata %>%
  mutate_at(
    vars("KINGDOM", "PHYLYM", "CLASS", "ORDER", "FAMILY", "SCIENTIFICNAME"),
    ~ if_else(duplicated(.x), "", as.character(.x))
  )
This gives the following table:
   KINGDOM  PHYLYM   CLASS    ORDER           FAMILY            SCIENTIFICNAME          OTHERDATA
1  Animalia Chordata Amphibia Anura           Ranidae           Hylarana attigua        XYZ
2                                                               Hylarana taipehensis    ABC
3                                             Rhacophoridae     Philautus               XYZ
4                                                               Polypedates leucomystax ABC
5                                                               Theloderma asperum      XYZ
6                    Aves     Accipitriformes Accipitridae      Aviceda jerdoni         XYZ
7                             Ciconiiformes   Ciconiidae        Leptoptilos javanicus   ABC
8                             Gruiformes      Gruidae           Antigone antigone       XYZ
9                             Passeriformes   Muscicapidae      Cyanoptila cyanomelana  ABC
10                                                              Cyornis hainanus        ABC
11                            Pelecaniformes  Threskiornithidae Pseudibis davisoni      XYZ
12                                                              Thaumatibis gigantea    XYZ
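One caveat with checking each column on its own: a value is blanked whenever it has appeared anywhere earlier in that column, even if it now sits under a different parent rank. A possible variant, sketched here under the assumption that the rank columns are given from highest to lowest, blanks a value only when it and all of its parent ranks have been seen together before. The helper blank_repeats() and the toy data are hypothetical names used for illustration.

```r
# Hypothetical toy data: order "x" occurs under both class "A" and class "B"
toy <- data.frame(
  CLASS = c("A", "A", "B", "B"),
  ORDER = c("x", "x", "x", "y"),
  NAME  = c("sp1", "sp2", "sp3", "sp4"),
  stringsAsFactors = FALSE
)

# Blank a rank only if the combination of that rank and all ranks above it repeats
blank_repeats <- function(df, cols) {
  out <- df
  for (i in seq_along(cols)) {
    parents_and_self <- df[, cols[seq_len(i)], drop = FALSE]
    is_repeat <- duplicated(parents_and_self)
    out[[cols[i]]] <- ifelse(is_repeat, "", as.character(df[[cols[i]]]))
  }
  out
}

result <- blank_repeats(toy, c("CLASS", "ORDER"))
print(result)
```

In the toy data, the per-column check would blank ORDER in the third row, whereas this version keeps it because class "B" starts a new group. The data still needs to be sorted by the rank columns first.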
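Answer 1 (score: 2)
If you want to reduce the data, I would instead suggest using group_by on the higher ranks and collapsing the other details into comma-separated strings.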
library(dplyr)
exampledata %>%
  group_by(KINGDOM, PHYLYM, CLASS, ORDER, FAMILY) %>%
  summarise_at(vars(SCIENTIFICNAME, OTHERDATA), toString)
#  KINGDOM  PHYLYM   CLASS    ORDER         FAMILY         SCIENTIFICNAME                                 OTHERDATA
#  <chr>    <chr>    <chr>    <chr>         <chr>          <chr>                                          <chr>
#1 Animalia Chordata Amphibia Anura         Ranidae        Hylarana attigua, Hylarana taipehensis         XYZ, ABC
#2 Animalia Chordata Amphibia Anura         Rhacophoridae  Philautus, Polypedates leucomystax, Theloderm… XYZ, ABC, X…
#3 Animalia Chordata Aves     Accipitrifor… Accipitridae   Aviceda jerdoni                                XYZ
#4 Animalia Chordata Aves     Ciconiiformes Ciconiidae     Leptoptilos javanicus                          ABC
#5 Animalia Chordata Aves     Gruiformes    Gruidae        Antigone antigone                              XYZ
#6 Animalia Chordata Aves     Passeriformes Muscicapidae   Cyanoptila cyanomelana, Cyornis hainanus       ABC, ABC
#7 Animalia Chordata Aves     Pelecaniform… Threskiornith… Pseudibis davisoni, Thaumatibis gigantea       XYZ, XYZ
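With this approach you do not lose any information, and you also reduce the number of rows in the data frame. You can add or remove columns in group_by and summarise_at as you see fit.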
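In dplyr 1.0.0 and later, the scoped verbs such as summarise_at are superseded by across(). The same collapse could be sketched as follows, assuming a recent dplyr; the toy data frame is a cut-down illustration, not the full example data.

```r
library(dplyr)

# Cut-down illustrative data with a single grouping column
toy <- data.frame(
  FAMILY         = c("Ranidae", "Ranidae", "Gruidae"),
  SCIENTIFICNAME = c("Hylarana attigua", "Hylarana taipehensis", "Antigone antigone"),
  OTHERDATA      = c("XYZ", "ABC", "XYZ"),
  stringsAsFactors = FALSE
)

collapsed <- toy %>%
  group_by(FAMILY) %>%
  # toString() joins the values of each group with ", "
  summarise(across(c(SCIENTIFICNAME, OTHERDATA), toString), .groups = "drop")

print(collapsed)
```

Here .groups = "drop" returns an ungrouped result, so later steps do not inherit the grouping by accident.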
Answer 2 (score: 0)