I am looking for a way to convert this table:
    V1  V2  V3  V4  V5   V6
1: SP1 SP2 SP3  NA  NA   NA
2: SP1 SP3 SP6  NA  NA   NA
3: SP3 SP5 SP7 SP8 SP9 SP10
4: SP4 SP5 SP6 SP7  NA   NA
into this one (presence/absence of each species):
   SP1 SP2 SP3 SP4 SP5 SP6 SP7 SP8 SP9 SP10
1:   1   1   1   0   0   0   0   0   0    0
2:   1   0   1   0   0   1   0   0   0    0
3:   0   0   1   0   1   0   1   1   1    1
4:   0   0   0   1   1   1   1   0   0    0
Note that the first table has many rows and many species (I don't know how many).
Any ideas?
Answer 0 (score: 6)
Try
library(qdapTools)
res1 <- mtabulate(as.data.frame(t(df1)))
Or
library(reshape2)
res2 <- table(melt(as.matrix(df1), na.rm=TRUE)[,-2])
# reorder the columns so that SP10 comes last (assumes the species are exactly SP1..SP10)
res2New <- res2[,paste0('SP',1:10)]
res2New
# value
# Var1 SP1 SP2 SP3 SP4 SP5 SP6 SP7 SP8 SP9 SP10
# 1 1 1 1 0 0 0 0 0 0 0
# 2 1 0 1 0 0 1 0 0 0 0
# 3 0 0 1 0 1 0 1 1 1 1
# 4 0 0 0 1 1 1 1 0 0 0
If we need to convert it to a 'data.frame':
as.data.frame.matrix(res2New)
Data used in the examples above:
df1 <- structure(list(V1 = c("SP1", "SP1", "SP3", "SP4"), V2 = c("SP2",
"SP3", "SP5", "SP5"), V3 = c("SP3", "SP6", "SP7", "SP6"), V4 = c(NA,
NA, "SP8", "SP7"), V5 = c(NA, NA, "SP9", NA), V6 = c(NA, NA,
"SP10", NA)), .Names = c("V1", "V2", "V3", "V4", "V5", "V6"),
class = "data.frame", row.names = c(NA, -4L))
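Since the question says the number of species is not known in advance, here is a minimal base R sketch of the same idea that needs no extra packages and does not hard-code the species names. It is an illustration rather than part of the original answer; the names `m`, `keep`, `counts` and `presence` are made up for this example, and the columns come out in the default alphabetical order (so SP10 sorts right after SP1).

```r
# Same example data as df1 above, rebuilt here so the block is self-contained
df1 <- data.frame(
  V1 = c("SP1", "SP1", "SP3", "SP4"),
  V2 = c("SP2", "SP3", "SP5", "SP5"),
  V3 = c("SP3", "SP6", "SP7", "SP6"),
  V4 = c(NA, NA, "SP8", "SP7"),
  V5 = c(NA, NA, "SP9", NA),
  V6 = c(NA, NA, "SP10", NA),
  stringsAsFactors = FALSE
)

m    <- as.matrix(df1)
keep <- !is.na(m)                      # positions that actually hold a species

# Cross-tabulate the row index of each non-missing cell against its species name
counts <- table(row = row(m)[keep], species = m[keep])

# Collapse counts to 0/1 in case a species is listed twice in the same row
presence <- (counts > 0) * 1L
presence
```

If a data frame is preferred, `as.data.frame.matrix(presence)` should work here just as it does for `res2New`.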
Answer 1 (score: 3)
An answer using reshape2:
library(reshape2)
data <- read.table(text="V1 V2 V3 V4 V5 V6
1: SP1 SP2 SP3 NA NA NA
2: SP1 SP3 SP6 NA NA NA
3: SP3 SP5 SP7 SP8 SP9 SP10
4: SP4 SP5 SP6 SP7 NA NA")
#identify lines
data$line <- 1:nrow(data)
#turn data into long format
melt_data <- melt(data,id.vars="line", variable.name="column",
value.name="species")
#reorder the species levels, as otherwise SP10 would sort right after SP1
melt_data$species_fact <- factor(melt_data$species,
levels=paste0("SP",1:10))
#cast into a (different) wide format for the result
result <- dcast(data=melt_data[!is.na(melt_data$species_fact),],
formula=line~species_fact,value.var="species_fact",
fun.aggregate=length)
result
which yields
> result
line SP1 SP2 SP3 SP4 SP5 SP6 SP7 SP8 SP9 SP10
1 1 1 1 1 0 0 0 0 0 0 0
2 2 1 0 1 0 0 1 0 0 0 0
3 3 0 0 1 0 1 0 1 1 1 1
4 4 0 0 0 1 1 1 1 0 0 0
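The factor levels above are written out as `paste0("SP",1:10)`, which only works when the species are known to be SP1 to SP10. When the set of species is unknown, the levels can probably be derived from the data instead. The sketch below assumes every name has the form "SP" followed by a number; the vector `species` and the names `lv` and `species_fact` are illustrative only.

```r
# A small stand-in for melt_data$species, including a missing value
species <- c("SP1", "SP10", "SP2", NA, "SP9", "SP2")

# Distinct, non-missing species names
lv <- unique(species[!is.na(species)])

# Order them by their numeric suffix so that SP10 follows SP9, not SP1
# (assumes every name looks like "SP<number>")
lv <- lv[order(as.integer(sub("^SP", "", lv)))]

species_fact <- factor(species, levels = lv)
levels(species_fact)
```

In the code above, `lv` would take the place of `paste0("SP",1:10)` in the call to `factor()`.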