I have a data frame that looks like this:
Forest Grass Shrub Water Binary
0.6 0.5 0.3 0.2 1
0.2 0.3 0.4 0.5 0
0.3 0.5 0.2 0.6 1
0.2 0.6 0.3 0.2 0
0.6 0.5 0.3 0.2 1
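I would like R to go through every row and write a new column containing the name of the land cover with the largest value, so that I end up with a table like this: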
Forest Grass Shrub Water Binary Most
0.6 0.5 0.3 0.2 1 Forest
0.2 0.3 0.4 0.5 0 Water
0.3 0.5 0.2 0.6 1 Water
0.2 0.6 0.3 0.2 0 Grass
0.6 0.5 0.3 0.2 1 Forest
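Then I would like R to look at the Binary column and count how often the combinations Forest-1, Forest-0, Water-1 and Water-0 occur.
Unfortunately, I have no clue how to do this, so any help is much appreciated!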
Answer 0 (score: 0)
You can use apply for this, as in the code further down. Instead of apply, you could also use max.col:
names(mydf)[max.col(mydf[-length(mydf)])]
# [1] "Forest" "Water" "Water" "Grass" "Forest"
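One caveat when swapping which.max for max.col: which.max returns the first maximum in a row, while max.col breaks ties at random unless ties.method is set. The sketch below rebuilds the example data from the question so it runs on its own; the names cover_cols, most and tied are illustrative and not part of the original answer.

```r
# Example data from the question, rebuilt so the snippet is self-contained.
mydf <- data.frame(
  Forest = c(0.6, 0.2, 0.3, 0.2, 0.6),
  Grass  = c(0.5, 0.3, 0.5, 0.6, 0.5),
  Shrub  = c(0.3, 0.4, 0.2, 0.3, 0.3),
  Water  = c(0.2, 0.5, 0.6, 0.2, 0.2),
  Binary = c(1, 0, 1, 0, 1)
)

# Select the land cover columns by name instead of assuming "Binary" is last.
cover_cols <- setdiff(names(mydf), "Binary")

# ties.method = "first" makes max.col deterministic, matching which.max.
most <- cover_cols[max.col(mydf[cover_cols], ties.method = "first")]
most
# [1] "Forest" "Water"  "Water"  "Grass"  "Forest"

# A hypothetical row with a tie between Forest and Grass:
tied <- data.frame(Forest = 0.5, Grass = 0.5, Shrub = 0.1, Water = 0.1)
names(tied)[max.col(tied, ties.method = "first")]
# [1] "Forest"
```

With the default ties.method = "random", the tied row could come back as either "Forest" or "Grass" from one run to the next.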
## Create your "Most" column.
## This assumes "Binary" to be the last column.
## Use names along with `setdiff`, index positions, or other approaches
## if this is not the case with your actual data.
mydf$Most <- apply(mydf[-length(mydf)], 1, function(x) names(x)[which.max(x)])
mydf
# Forest Grass Shrub Water Binary Most
# 1 0.6 0.5 0.3 0.2 1 Forest
# 2 0.2 0.3 0.4 0.5 0 Water
# 3 0.3 0.5 0.2 0.6 1 Water
# 4 0.2 0.6 0.3 0.2 0 Grass
# 5 0.6 0.5 0.3 0.2 1 Forest
## This is the tabulation step.
## I'm assuming this is separate from the original data.frame
## as it isn't shown to be as part of your desired output.
table(mydf[c("Most", "Binary")])
# Binary
# Most 0 1
# Forest 0 2
# Grass 1 0
# Water 1 1
By the sounds of it, something like the code above is probably what you are looking for.
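If the counts are needed as a regular data frame, or a single combination such as Forest-1 has to be looked up, the table can be converted or indexed directly. This is a minimal sketch that assumes the Most column has already been created as above; tab and counts are illustrative names.

```r
# Only the two columns needed for the tabulation, taken from the example output.
mydf <- data.frame(
  Most   = c("Forest", "Water", "Water", "Grass", "Forest"),
  Binary = c(1, 0, 1, 0, 1)
)

# Cross-tabulate land cover against the binary flag.
tab <- table(mydf[c("Most", "Binary")])

# Long format: one row per (Most, Binary) pair, including pairs that never occur.
counts <- as.data.frame(tab, responseName = "Freq")
counts

# Look up one combination; dimnames of a table are character strings.
tab["Forest", "1"]
# [1] 2
```

The long format keeps zero-count pairs such as Forest-0, which is convenient when every combination has to be reported.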
Answer 1 (score: -1)
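library(data.table)
## `dt` is assumed to hold the original data (Forest, Grass, Shrub, Water, Binary).
## data.table provides its own melt(), so reshape2 is not needed here.
setDT(dt)
## Add a row identifier so each row can be tracked after reshaping.
dt[, RowNo := .I]
## Reshape the land cover columns to long format: one row per (RowNo, land cover).
dt2 <- melt(dt[, setdiff(colnames(dt), 'Binary'), with = FALSE], id.vars = 'RowNo')
## For each original row, keep the name of the land cover with the largest value.
dt2 <- unique(dt2[, Most := as.character(variable)[which.max(value)], by = RowNo][, list(RowNo, Most)])
## Attach "Most" back to the original data, then count each (Binary, Most) pair.
dt <- merge(dt2, dt, by = 'RowNo')
dt[, list(.N), by = list(Binary, Most)]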
Output:
Binary Most N
1: 1 Forest 2
2: 0 Water 1
3: 1 Water 1
4: 0 Grass 1
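The reshape and merge steps can probably be avoided by computing Most directly on the wide table. The following is a hedged sketch of that alternative, not the answerer's method; it assumes the data.table package is installed and uses max.col on .SD, with cover_cols and res as illustrative names.

```r
library(data.table)

# Example data from the question, rebuilt so the snippet is self-contained.
dt <- data.table(
  Forest = c(0.6, 0.2, 0.3, 0.2, 0.6),
  Grass  = c(0.5, 0.3, 0.5, 0.6, 0.5),
  Shrub  = c(0.3, 0.4, 0.2, 0.3, 0.3),
  Water  = c(0.2, 0.5, 0.6, 0.2, 0.2),
  Binary = c(1, 0, 1, 0, 1)
)

# Land cover columns, excluding the binary flag.
cover_cols <- setdiff(names(dt), "Binary")

# Row-wise argmax over the land cover columns; ties go to the first column.
dt[, Most := cover_cols[max.col(.SD, ties.method = "first")], .SDcols = cover_cols]

# Count how often each (Binary, Most) combination occurs.
res <- dt[, .N, by = .(Binary, Most)]
res
```

Like the original answer, this lists only combinations that actually occur, so Forest-0 does not appear in the result.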