I'm new to R and can't work out how to get the right output from my data.
My data:
row1 101 woody 5
row2 101 woody 0
row3 111 kiln 23
row4 200 weez 2
row5 315 rowt 0
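For example, in row 3 the value in column 3 is greater than 0, and its column 1 value (111) lies between 101 (the row above) and 200 (row 4, below). So the condition, for any row, is: the value in column 3 is greater than 0, and its column 1 value lies between the column 1 values of the rows above and below it.
Required output: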
col1 col2 col3
row1 101 woody After_none
row2 101 woody 0
row3 111 kiln Between_woody_weez
row4 200 weez Between_Kiln_rowt
row5 315 rowt 0
I'd be glad if someone could help. Thanks.
I've added more data to run Akru's code:
col1 col2 col3
255 mwu 21
77031 netw 0
77031 netw 0
77031 netw 0
82513 cuu 91
88206 cxum 0
88206 cxum 0
88206 cxum 0
188450 xaii 25
188450 xaii 0
188450 xaii 0
188450 xaii 0
188450 xaii 0
199800 aau 0
The code runs on this data sample, but the output is not quite right:
col1 col2 col3 colN
255 mwu 21 After_none
77031 netw 0 <NA>
77031 netw 0 <NA>
77031 netw 0 <NA>
82513 cuu 91 Between_mwu_netw
88206 cxum 0 <NA>
88206 cxum 0 <NA>
88206 cxum 0 <NA>
188450 xaii 25 Between_netw_cxum
188450 xaii 0 <NA>
188450 xaii 0 <NA>
188450 xaii 0 <NA>
188450 xaii 0 <NA>
199800 aau 0 <NA>
But the expected output is:
col1 col2 col3
255 mwu 21
77031 netw 0
77031 netw 0
77031 netw 0
82513 Between_mwu_cxum 91
88206 cxum 0
88206 cxum 0
88206 cxum 0
188450 Between_cxum_aau 25
188450 xaii 0
188450 xaii 0
188450 xaii 0
188450 xaii 0
199800 aau 0
Alternatively, putting the labels in an extra column "colN" would be fine.
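Answer 0 (score: 0)
One approach is:
indx <- df$col3 > 0
df$colN <- df$col3
df$colN[indx] <- sapply(which(indx), function(i) {
  # rows above and below row i; seq_len() gives an empty index at the edges
  i1 <- seq_len(i - 1)
  i2 <- seq_len(nrow(df) - i) + i
  indx1 <- with(df, col1[i] > col1[i1])
  indx2 <- with(df, col1[i] < col1[i2])
  # label with the nearest smaller col1 above and the nearest larger col1 below
  if (any(indx1) && any(indx2))
    paste("Between", df$col2[i1][max(which(indx1))], df$col2[i2][min(which(indx2))],
          sep = "_") else df$col3[i]
})
df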
# col1 col2 col3 colN
#1 255 mwu 21 21
#2 77031 netw 0 0
#3 77031 netw 0 0
#4 77031 netw 0 0
#5 82513 cuu 91 Between_netw_cxum
#6 88206 cxum 0 0
#7 88206 cxum 0 0
#8 88206 cxum 0 0
#9 188450 xaii 25 Between_cxum_aau
#10 188450 xaii 0 0
#11 188450 xaii 0 0
#12 188450 xaii 0 0
#13 188450 xaii 0 0
#14 199800 aau 0 0
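A note on the row ranges used for "rows above" and "rows below": in R, `a:b` counts downward when `a > b`, so a range written with `:` is never empty at the first or last row. The short sketch below only illustrates that behaviour; the variable names are made up for the illustration.

```r
n <- 5L   # pretend the data frame has 5 rows
i <- 5L   # a match on the last row

# ":" counts down here, so this "rows below" range points at rows 6 and 5
colon_below <- (i + 1):n
# seq_len() gives a genuinely empty index instead
safe_below <- seq_len(n - i) + i

# the same thing happens for "rows above" when the match is on row 1
colon_above <- 1:(1 - 1)
safe_above <- seq_len(1 - 1)

print(colon_below)
print(length(safe_below))
print(colon_above)
print(length(safe_above))
```

With `:` a match on the last row would index one row past the end of the data frame, which yields `NA` in the comparison, so `seq_len()` is the safer way to build these ranges.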
If you want to change col2 instead, do the following (starting again from the original df):
df$col2[indx] <- sapply(which(indx), function(i) {
  # rows above and below row i; seq_len() gives an empty index at the edges
  i1 <- seq_len(i - 1)
  i2 <- seq_len(nrow(df) - i) + i
  indx1 <- with(df, col1[i] > col1[i1])
  indx2 <- with(df, col1[i] < col1[i2])
  if (any(indx1) && any(indx2))
    paste("Between", df$col2[i1][max(which(indx1))], df$col2[i2][min(which(indx2))],
          sep = "_") else df$col2[i] # replaced here
})
df
# col1 col2 col3
#1 255 mwu 21
#2 77031 netw 0
#3 77031 netw 0
#4 77031 netw 0
#5 82513 Between_netw_cxum 91
#6 88206 cxum 0
#7 88206 cxum 0
#8 88206 cxum 0
#9 188450 Between_cxum_aau 25
#10 188450 xaii 0
#11 188450 xaii 0
#12 188450 xaii 0
#13 188450 xaii 0
#14 199800 aau 0
Data:
df <- structure(list(col1 = c(255L, 77031L, 77031L, 77031L, 82513L,
88206L, 88206L, 88206L, 188450L, 188450L, 188450L, 188450L, 188450L,
199800L), col2 = c("mwu", "netw", "netw", "netw", "cuu", "cxum",
"cxum", "cxum", "xaii", "xaii", "xaii", "xaii", "xaii", "aau"
), col3 = c(21L, 0L, 0L, 0L, 91L, 0L, 0L, 0L, 25L, 0L, 0L, 0L,
0L, 0L)), .Names = c("col1", "col2", "col3"), class = "data.frame",
row.names = c(NA,-14L))
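For reuse, the same lookup can be wrapped in a function and tried on the five-row sample from the question. This is only a sketch under assumptions: `label_between` and `small` are made-up names, not part of the answer above, and a row with no suitable neighbour above or below simply keeps its col3 value as text, so the "After_none" label from the question is not produced.

```r
# Hypothetical helper: for every row whose `value` is > 0, find the nearest
# row above with a smaller key and the nearest row below with a larger key,
# and build a "Between_<above>_<below>" label from their names.
label_between <- function(key, name, value) {
  n <- length(key)
  out <- as.character(value)
  for (i in which(value > 0)) {
    above <- seq_len(i - 1)        # rows before i (empty when i == 1)
    below <- seq_len(n - i) + i    # rows after i (empty when i == n)
    lower <- above[key[above] < key[i]]
    upper <- below[key[below] > key[i]]
    if (length(lower) > 0 && length(upper) > 0) {
      out[i] <- paste("Between", name[max(lower)], name[min(upper)], sep = "_")
    }
  }
  out
}

# The five-row sample from the question
small <- data.frame(
  col1 = c(101L, 101L, 111L, 200L, 315L),
  col2 = c("woody", "woody", "kiln", "weez", "rowt"),
  col3 = c(5L, 0L, 23L, 2L, 0L),
  stringsAsFactors = FALSE
)
small$colN <- with(small, label_between(col1, col2, col3))
print(small)
```

Rows 3 and 4 come out as Between_woody_weez and Between_kiln_rowt, while row 1 has nothing above it and keeps "5". If a separate label is wanted for that case, an extra branch inside the loop would be one way to add it.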