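Many Stack Overflow questions deal with extracting the number that follows a pattern. My task is a bit more challenging, though. I have a list of patterns, as follows: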
Customer Id :
C_Id=
CustID=
A snapshot of the data frame looks like this:
Customer Details                  Purchase Amount
Alpha Customer Id:293             500
C_ID= 495;task based              788
Detail PurcCustID=789;982 in k    12345
I would like to end up with the following data frame:
Customer Details                  Purchase Amount    Customer ID
Alpha Customer Id:293             500                293
C_ID= 495;task based              788                495
Detail PurcCustID=789;982 in k    12345              789
Code snippet:
customer_details = c("Alpha Customer Id:293", "C_ID= 495;task based",
                     "DetailPurcCustID=789;982 in k")
purchase_amount = c(500,788,12345)
customer_data = data.frame(customer_details,purchase_amount)
Is there a way to get this done?
Answer 0 (score: 2)
We can use str_extract with a lookbehind on the ID prefix:
library(tidyverse)
customer_data %>%
  mutate(CustomerID = as.numeric(str_extract(customer_details, "(?<=I[Dd][:=])\\s*\\d+")))
# customer_details purchase_amount CustomerID
#1 Alpha Customer Id:293 500 293
#2 C_ID= 495;task based 788 495
#3 DetailPurcCustID=789;982 in k 12345 789
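The three prefixes from the question can also be kept as an explicit list and joined into one regular expression. The following is only a sketch in base R, assuming the prefixes should match case-insensitively and may be followed by optional spaces. The names prefixes, pattern, hits and ids are illustrative and not part of the original answer.

```r
customer_details <- c("Alpha Customer Id:293", "C_ID= 495;task based",
                      "DetailPurcCustID=789;982 in k")
purchase_amount <- c(500, 788, 12345)
customer_data <- data.frame(customer_details, purchase_amount,
                            stringsAsFactors = FALSE)

# Prefixes from the question, written as regex fragments (assumed spelling)
prefixes <- c("Customer Id\\s*:", "C_Id=", "CustID=")

# (?i) ignores case; \\K drops the prefix from the reported match (PCRE only)
pattern <- paste0("(?i)(?:", paste(prefixes, collapse = "|"), ")\\s*\\K\\d+")

hits <- regexpr(pattern, customer_data$customer_details, perl = TRUE)

# Rows without a match stay NA instead of shifting the other values
ids <- rep(NA_character_, nrow(customer_data))
ids[hits != -1] <- regmatches(customer_data$customer_details, hits)

customer_data$CustomerID <- as.numeric(ids)
```

With this layout, supporting another prefix only means adding one more entry to prefixes, and a row that contains none of them gets NA.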
Or use sub from base R:
customer_data$CustomerID <- as.numeric(sub(".*(I(?i)d[:=]\\s*)(\\d+).*",
"\\2", customer_data$customer_details))