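I want to compare the maximum value in a nested list (the value is extracted from the text in the list) with the number in another, non-nested column, and then gsub elements of the nested list according to that comparison.
Input: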
structure(list(ExtentNumber = list("3", 1, "2",
"4", "1"), BiopsyType = list("2--Biopsy site: Stomach Number of biopsies: 2",
c("4--Biopsy site: D2 - 2nd part of duodenum Number of biopsies: 7",
"2--Biopsy site: Stomach Number of biopsies: 9", "Biopsy site: None",
"3--Biopsy site: Duodenal bulb Number of biopsies: 1"), c("1--Biopsy site: Oesophagus Number of biopsies: 10",
"2--Biopsy site: Stomach Number of biopsies: 6"), "3--Biopsy site: Duodenal bulb Number of biopsies: 4",
c("1--Biopsy site: Oesophagus Number of biopsies: 6", "4--Biopsy site: D2 - 2nd part of duodenum Number of biopsies: 9"
))), .Names = c("ExtentNumber", "BiopsyType"), row.names = c(NA,
5L), class = "data.frame")
I initially tried:
lapply(OGDProcedureDf$BiopsyType, function(p)
ifelse(max(as.numeric(str_match(p,"^(\\d)--")),na.rm=T)>OGDProcedureDf$ExtentNumber,gsub("*.","",p),p)
)
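but realized that I was comparing against all of the numbers in ExtentNumber rather than only the one in the same row.
I then tried wrapping it in an apply call, as follows: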
apply(OGDProcedureDf,1,function(x) lapply(OGDProcedureDf$BiopsyType, function(p)
ifelse(max(as.numeric(str_match(p,"^(\\d)--")),na.rm=T)>OGDProcedureDf$ExtentNumber,gsub("*.","",p),p)
))
But I got the error:
Error in match.fun(FUN) : argument "FUN" is missing, with no default
So, basically, how do I find and replace elements in a nested list based on the value of a non-nested column?
Expected result:
structure(list(ExtentNumber = list("3", 1, "2", "4", "1"), BiopsyType = list("2--Biopsy site: Stomach Number of biopsies: 2",
c("", "", ""), c("1--Biopsy site: Oesophagus Number of biopsies: 10","")
, "3--Biopsy site: Duodenal bulb Number of biopsies: 4",
c("1--Biopsy site: Oesophagus Number of biopsies: 6", ""
))), .Names = c("ExtentNumber", "BiopsyType"), row.names = c(NA, 5L), class = "data.frame")
Answer 0: (score: 1)
This may not be the most efficient way, but as a follow-up to my comment,
l1 <- Map(function(x, y) replace(x > y, is.na(x > y), FALSE) ,
df$ExtentNumber,
lapply(df$BiopsyType, function(i)
as.numeric(gsub('^([0-9]+)--.*$', '\\1', i))))
mapply(function(x, y) paste0(x[y], collapse = ', '),
lapply(df$BiopsyType, function(i) unlist(strsplit(i, ', '))), l1)
#[1] "2--Biopsy site: Stomach Number of biopsies: 2" "" "1--Biopsy site: Oesophagus Number of biopsies: 10" "3--Biopsy site: Duodenal bulb Number of biopsies: 4"
#[5] ""
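Note that the result above collapses each row into a single string. If the goal is instead to keep the nested shape and blank out the unwanted entries, as in the question's expected result, something along these lines might work. This is only a sketch under assumptions: `extent` and `biopsy` stand in for `df$ExtentNumber` and `df$BiopsyType` (cut down to three rows), `site_number` is a hypothetical helper, and the strict `>` comparison is carried over from the code above.

```r
# Stand-ins for df$ExtentNumber and df$BiopsyType (three rows only).
extent <- list("3", 1, "2")
biopsy <- list(
  "2--Biopsy site: Stomach Number of biopsies: 2",
  c("4--Biopsy site: D2 - 2nd part of duodenum Number of biopsies: 7",
    "Biopsy site: None"),
  c("1--Biopsy site: Oesophagus Number of biopsies: 10",
    "2--Biopsy site: Stomach Number of biopsies: 6"))

# Hypothetical helper: the leading number of each entry, NA when absent.
site_number <- function(p) {
  has_prefix <- grepl("^[0-9]+--", p)
  out <- rep(NA_real_, length(p))
  out[has_prefix] <- as.numeric(sub("^([0-9]+)--.*$", "\\1", p[has_prefix]))
  out
}

# Pair each row's extent with that row's entries and blank out every entry
# whose number is not strictly below the extent (NA counts as "not below").
blanked <- Map(function(ext, p) {
  keep <- as.numeric(ext) > site_number(p)
  p[!(keep %in% TRUE)] <- ""
  p
}, extent, biopsy)
```

Because `Map` walks both lists in parallel, each comparison only sees the extent of its own row, which avoids comparing against the whole ExtentNumber column.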
Answer 1: (score: 1)
Map(function(x,y)y[as.numeric(x)>=as.numeric(sub("^(\\d+).*$|.*","\\1",y))],
dat$ExtentNumber,dat$BiopsyType)
[[1]]
[1] "2--Biopsy site: Stomach Number of biopsies: 2"
[[2]]
[1] NA
[[3]]
[1] "1--Biopsy site: Oesophagus Number of biopsies: 10" "2--Biopsy site: Stomach Number of biopsies: 6"
[[4]]
[1] "3--Biopsy site: Duodenal bulb Number of biopsies: 4"
[[5]]
[1] "1--Biopsy site: Oesophagus Number of biopsies: 6"
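The `NA` in `[[2]]` appears because "Biopsy site: None" has no leading number, so the comparison yields `NA`, and logical indexing with `NA` returns `NA`. One possible way around this, sketched here under the assumption that such entries should simply be dropped, is to wrap the condition in `which()`. The lists `extent` and `biopsy` are cut-down stand-ins for `dat$ExtentNumber` and `dat$BiopsyType`.

```r
# Stand-ins for dat$ExtentNumber and dat$BiopsyType (two rows only).
extent <- list(1, "2")
biopsy <- list(
  c("4--Biopsy site: D2 - 2nd part of duodenum Number of biopsies: 7",
    "Biopsy site: None"),
  c("1--Biopsy site: Oesophagus Number of biopsies: 10",
    "2--Biopsy site: Stomach Number of biopsies: 6"))

kept <- Map(function(x, y) {
  # Leading number of each entry; entries without one become "" and then NA.
  num <- as.numeric(sub("^(\\d+).*$|.*", "\\1", y))
  # which() keeps only the positions that are TRUE, so NA comparisons
  # are dropped instead of producing NA elements.
  y[which(as.numeric(x) >= num)]
}, extent, biopsy)
```

With this change a row where nothing qualifies comes back as `character(0)` rather than `NA`.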