Retrieve only the numeric values from a column in R

Time: 2018-01-29 06:50:42

Tags: r regex dataframe grep

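Hello, my dataset looks like the one below, and I need only the numeric values without the units. Please suggest how to do this.

Input: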

Col1

45.90625 %RH
491.25 ppm
523.5 ppm
0 % open
58.59375 cfm
50 deg F
24.3125 % open
0 % open
55.59375 deg F
0 % open
70 deg F

Output:

 Col1

45.90625
491.25 
523.5 
0 
58.59375 
50 
24.3125 
0 
55.59375 
0 
70 

2 answers:

Answer 0 (score: 1)

Try this regular expression:

^\d*(?:\.\d+)?


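Explanation

  • ^ - asserts the start of the string
  • \d* - matches zero or more digits
  • (?:\.\d+)? - matches a literal . followed by one or more digits; the trailing ? makes this group optional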
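The answer gives only the pattern, so here is a minimal sketch of one way it could be applied in R. Using regmatches() with regexpr() and perl = TRUE is an assumption on my part, not something the answer specifies, and the vector is a shortened copy of the question's data.

```r
# Sample values taken from the question (shortened)
col1 <- c("45.90625 %RH", "491.25 ppm", "0 % open", "50 deg F")

# The pattern from the answer, with backslashes doubled for an R string
pattern <- "^\\d*(?:\\.\\d+)?"

# regexpr() locates the first match in each string;
# regmatches() extracts the matched text.
# perl = TRUE makes R use the PCRE engine for \d and (?: )
values <- regmatches(col1, regexpr(pattern, col1, perl = TRUE))

values              # character vector of the leading numbers
as.numeric(values)  # the same values converted to numeric
```

Because every part of the pattern is optional, it also matches the empty string, so a value that does not start with a digit would come back as "" and convert to NA.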

Answer 1 (score: 0)

You can capture the first number that occurs in each string and replace the whole value with it:

col1 <- c('45.90625 %RH', '491.25 ppm', '523.5 ppm', '0 % open', '58.59375 cfm', '50 deg F', '24.3125 % open', 
          '0 % open', '55.59375 deg F', '0 % open', '70 deg F')

gsub(".*?\\b(\\d+(?:\\.\\d+)?)\\b.*", "\\1", col1)

This yields

[1] "45.90625" "491.25"   "523.5"    "0"        "58.59375" "50"       "24.3125"  "0"        "55.59375" "0"       
[11] "70"    

As numeric values:

col1 <- c('45.90625 %RH', '491.25 ppm', '523.5 ppm', '0 % open', '58.59375 cfm', '50 deg F', '24.3125 % open', 
          '0 % open', '55.59375 deg F', '0 % open', '70 deg F')

(col2 <- as.numeric(gsub(".*?\\b(\\d+(?:\\.\\d+)?)\\b.*", "\\1", col1)))
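Since the question refers to a column named Col1 in a dataset, the same replacement can be sketched against a data frame. The data frame df below is hypothetical and uses a shortened copy of the question's data. perl = TRUE is added here so that the pattern is interpreted by the PCRE engine.

```r
# Hypothetical data frame holding the raw readings in a character column
df <- data.frame(
  Col1 = c("45.90625 %RH", "491.25 ppm", "0 % open", "50 deg F"),
  stringsAsFactors = FALSE
)

# Keep only the first number found in each string, then convert to numeric
df$Col1 <- as.numeric(
  gsub(".*?\\b(\\d+(?:\\.\\d+)?)\\b.*", "\\1", df$Col1, perl = TRUE)
)

str(df)  # Col1 is now a numeric column
```

A string that contains no digits at all is left unchanged by gsub(), so as.numeric() would turn it into NA with a warning.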