I would like to extract all of the digits from the columns below:
head(df$Session, 5)
[1] "Session_01122016" "Session_02122016" "Session_03122016" "Session_04122016" "Session_05122016"
head(df$Date, 5)
[1] "01/12/2016" "02/12/2016" "03/12/2016" "04/12/2016" "05/12/2016"
My expected output is:
head(df$SessionOutput, 5)
[1] "01122016" "02122016" "03122016" "04122016" "05122016"
head(df$DateOutput, 5)
[1] "01122016" "02122016" "03122016" "04122016" "05122016"
Is it possible to do this?
Thanks.
Answer 0: (score: 1)
If the pattern in each column is consistent, you can simply use gsub() to remove the unwanted part:
df <- data.frame(
Session = c("Session_01122016","Session_02122016","Session_03122016","Session_04122016","Session_05122016"),
Date = c("01/12/2016","02/12/2016","03/12/2016","04/12/2016","05/12/2016"),
stringsAsFactors = FALSE
)
df$SessionOutput <- gsub("Session_", "", df$Session)
df$DateOutput <- gsub("/", "", df$Date, fixed = TRUE)
> head(df$SessionOutput )
[1] "01122016" "02122016" "03122016" "04122016" "05122016"
> head(df$DateOutput )
[1] "01122016" "02122016" "03122016" "04122016" "05122016"
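The approach above relies on the prefix and separator being the same in every row. A minimal sketch of how that assumption could be made explicit and then checked; the short vectors `session` and `date` are illustrative stand-ins for the data frame columns, not part of the original answer:

```r
session <- c("Session_01122016", "Session_02122016")
date <- c("01/12/2016", "02/12/2016")

# Anchor the prefix with ^ so only a leading "Session_" is removed
session_out <- sub("^Session_", "", session)

# fixed = TRUE treats "/" as a literal string rather than a regular expression
date_out <- gsub("/", "", date, fixed = TRUE)

# Sanity check: every result should now consist of digits only
only_digits <- all(grepl("^[0-9]+$", c(session_out, date_out)))
only_digits
```

If `only_digits` comes back `FALSE`, some row does not follow the expected pattern, and a more general pattern is probably the safer choice.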
Answer 1: (score: 1)
You can use gsub:
x <- c("01/12/2016", "02/12/2016", "03/12/2016", "04/12/2016", "05/12/2016")
y <- c("Session_01122016", "Session_02122016", "Session_03122016", "Session_04122016", "Session_05122016")
# defines a pattern to be replaced with an empty string
# here, anything that is a punctuation sign or alphabetic character
remove_this <- "[[:punct:]]|[[:alpha:]]"
gsub(remove_this, "", x)
[1] "01122016" "02122016" "03122016" "04122016" "05122016"
gsub(remove_this, "", y)
[1] "01122016" "02122016" "03122016" "04122016" "05122016"
?gsub and ?regex will be helpful.
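Note that the pattern above removes punctuation and letters but leaves other characters, such as spaces, in place. As a hedged alternative, a negated character class drops everything that is not a digit. The input vector `z` below is made up for illustration, with one value deliberately containing a space:

```r
z <- c("01/12/2016", "Session_02122016", "Session 03-12-2016")

# Original pattern: letters and punctuation are removed, the space survives
punct_alpha <- gsub("[[:punct:]]|[[:alpha:]]", "", z)

# Negated class: anything that is not 0-9 is removed
digits_only <- gsub("[^0-9]", "", z)

punct_alpha
digits_only
```

Here `punct_alpha` keeps a leading space in its third element, while `digits_only` contains digits alone.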
Answer 2: (score: 0)
You can use the stringi package:
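library(stringi)
# extract every digit from each column, then join the digits of each element back together
lapply(df, function(x) stri_c_list(stri_extract_all(x, regex = '[0-9]')))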
$Session
[1] "01122016" "02122016" "03122016" "04122016" "05122016"
$Date
[1] "01122016" "02122016" "03122016" "04122016" "05122016"
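The same extract-and-join idea can be sketched in base R, for cases where adding a package is not desirable. The helper name `extract_digits` is hypothetical, and the data frame is rebuilt here only so the example is self-contained:

```r
df <- data.frame(
  Session = c("Session_01122016", "Session_02122016"),
  Date = c("01/12/2016", "02/12/2016"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: pull out every run of digits and paste the runs together
extract_digits <- function(x) {
  pieces <- regmatches(x, gregexpr("[0-9]+", x))
  vapply(pieces, paste, character(1), collapse = "")
}

# lapply returns a list, so assign the results back as new columns
out <- lapply(df, extract_digits)
df$SessionOutput <- out$Session
df$DateOutput <- out$Date
```

An element with no digits at all comes back as an empty string rather than `NA`, which may or may not be what you want.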