I have a dataset with two columns and 13,893 rows, which looks like this:
trip species
120318 ADHJ
120918 FJIW
120918 ADHJ
180817 ADHJ
180817 FJIW
180817 FJIW
099217 ADHJ
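I want to reshape it so that the species names become the column headers and each cell holds the count of that species on that trip. The result should look like this: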
trip ADHJ FJIW
120318 1 0
120918 1 1
180817 1 2
099217 1 0
Answer 0 (score: 1)
Data:
# trip is parsed as an integer column, so 099217 is read as 99217
a <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "
trip species
120318 ADHJ
120918 FJIW
120918 ADHJ
180817 ADHJ
180817 FJIW
180817 FJIW
099217 ADHJ")
base R
table(a$trip, a$species)
#
# ADHJ FJIW
# 99217 1 0
# 120318 1 0
# 120918 1 1
# 180817 1 2
xtabs(~ trip + species, data = a)
# species
# trip ADHJ FJIW
# 99217 1 0
# 120318 1 0
# 120918 1 1
# 180817 1 2
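Both table() and xtabs() return a contingency table rather than a data frame with a trip column. If a data frame in the shape shown in the question is needed, one possible sketch is below. The name `wide` is only illustrative and is not part of the original answer.

```r
a <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "
trip species
120318 ADHJ
120918 FJIW
120918 ADHJ
180817 ADHJ
180817 FJIW
180817 FJIW
099217 ADHJ")

# Cross-tabulate, then turn the table into a data frame
# whose rows are trips and whose columns are species.
tab <- table(a$trip, a$species)
wide <- as.data.frame.matrix(tab)

# Move the row names (the trip ids) into a regular column.
wide <- data.frame(trip = rownames(wide), wide, row.names = NULL)
wide
```

Here trip ends up as a character column because it is taken from the row names of the table.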
dplyr
library(dplyr)
library(tidyr)
# count the rows for each trip/species pair, then spread species into columns
a %>%
  group_by(trip, species) %>%
  tally() %>%
  spread(species, n, fill = 0)
# # A tibble: 4 x 3
# # Groups: trip [4]
# trip ADHJ FJIW
# <int> <dbl> <dbl>
# 1 99217 1 0
# 2 120318 1 0
# 3 120918 1 1
# 4 180817 1 2
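spread() has since been superseded in tidyr by pivot_wider(). A rough equivalent of the pipeline above, assuming tidyr 1.0.0 or later is installed, might look like the following. The name `wide` is only illustrative.

```r
library(dplyr)
library(tidyr)

a <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "
trip species
120318 ADHJ
120918 FJIW
120918 ADHJ
180817 ADHJ
180817 FJIW
180817 FJIW
099217 ADHJ")

wide <- a %>%
  # count() is shorthand for group_by() followed by tally()
  count(trip, species) %>%
  # one column per species; missing trip/species pairs become 0
  pivot_wider(names_from = species, values_from = n,
              values_fill = list(n = 0L))
wide
```

Unlike the group_by() version, count() returns an ungrouped result, so the tibble printed here carries no "Groups" line.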
data.table
library(data.table)
aDT <- as.data.table(a)
# no value.var or aggregate function is given, so dcast counts the rows in each cell
dcast(aDT, trip ~ species, fill = 0)
# Using 'species' as value column. Use 'value.var' to override
# Aggregate function missing, defaulting to 'length'
# trip ADHJ FJIW
# 1: 99217 1 0
# 2: 120318 1 0
# 3: 120918 1 1
# 4: 180817 1 2
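In all of the outputs above, trip 099217 appears as 99217, because read.table() converts the trip column to integer. If the leading zero matters, which the expected output in the question suggests, one option is to read the columns as character. This is a sketch under that assumption, and `a_chr` is only an illustrative name.

```r
# colClasses = "character" is recycled across all columns,
# so trip is kept as text and the leading zero survives.
a_chr <- read.table(header = TRUE, colClasses = "character", text = "
trip species
120318 ADHJ
120918 FJIW
120918 ADHJ
180817 ADHJ
180817 FJIW
180817 FJIW
099217 ADHJ")

tab_chr <- table(a_chr$trip, a_chr$species)
tab_chr
```

The same `a_chr` data frame can be passed to the xtabs, dplyr and data.table approaches in place of `a`.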