How do I remove characters from a column in R?

Time: 2016-06-07 04:05:23

Tags: r database

This is my data frame, but I only want the numbers from it. How do I remove the "~th pick" text, i.e. all the letters?

[1] " 1st pick "   " 2nd pick "   " 4th pick "   " 5th pick "   " 6th pick "   " 7th pick "  
[7] " 8th pick "   " 9th pick "   " 10th pick "  " 11th pick "  " 12th pick "  " 13th pick " 
[13] " 14th pick "  " 15th pick "  " 16th pick "  " 17th pick "  " 18th pick "  " 19th pick " 
[19] " 20th pick "  " 21st pick "  " 22nd pick "  " 23rd pick "  " 24th pick "  " 25th pick " 
[25] " 26th pick "  " 27th pick "  " 28th pick "  " 29th pick "  " 30th pick "  " 31st pick " 
[31] " 32nd pick "  " 33rd pick "  " 34th pick "  " 35th pick "  " 36th pick "  " 37th pick " 
[37] " 38th pick "  " 39th pick "  " 40th pick "  " 41st pick "  " 42nd pick "  " 43rd pick " 
[43] " 44th pick "  " 45th pick "  " 46th pick "  " 47th pick "  " 48th pick "  " 49th pick " 
[49] " 50th pick "  " 51st pick "  " 52nd pick "  " 53rd pick "  " 54th pick "  " 55th pick " 
[55] " 56th pick " 

5 Answers:

Answer 0 (score: 3)

Assuming your data frame df has a single column col holding this data, you can use gsub() to extract the numbers you want:

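# Skip leading non-digits first; a greedy leading ".*" would keep only the last digit of values such as 10
df$number <- gsub("^\\D*(\\d+).*$", "\\1", df$col)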

Data:

df <- data.frame(col=c(" 1st pick ", " 2nd pick ", " 4th pick ", " 5th pick ",
                       " 6th pick ", " 7th pick "))

Due to peer pressure, you could also use:

df$number <- gsub("[^0-9]", "", df$col)
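
As a quick check of how such patterns behave on two-digit values, here is a minimal sketch. The vector picks is a made-up sample, not the asker's data, and the variable names are only for illustration. It compares a greedy leading ".*" with a pattern that skips non-digits first, and with deleting every non-digit character.

```r
# Hypothetical sample that includes two-digit picks
picks <- c(" 1st pick ", " 10th pick ", " 23rd pick ", " 56th pick ")

# A greedy leading ".*" consumes the leading digits, so the capture group
# is left with only the final digit of each number
greedy <- sub(".*(\\d+).*", "\\1", picks)

# Skipping non-digits first lets the capture group take the whole number
anchored <- sub("^\\D*(\\d+).*$", "\\1", picks)

# Deleting every non-digit character gives the same result for this sample
stripped <- gsub("[^0-9]", "", picks)

print(greedy)
print(anchored)
print(as.integer(stripped))
```

Both anchored and stripped are character vectors until they are converted, so wrap them in as.integer() or as.numeric() if a numeric column is wanted.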

Answer 1 (score: 3)

Using the data above from @Ronak and the stringr package, you can do the following:

library(stringr)
x <- c(" 1st pick ", " 2nd pick " ," 4th pick " ," 5th pick ", " 6th pick " ,
       " 7th pick ", " 8th pick ", " 9th pick " ," 10th pick " ," 11th pick ",
       " 12th pick " ," 13th pick " )
as.numeric(str_extract_all(x, '\\d+'))

The output is:

[1]  1  2  4  5  6  7  8  9 10 11 12 13

Answer 2 (score: 3)

Here is another option using gsub:

as.numeric(gsub("\\D+", "", x))
#[1]  1  2  4  5  6  7  8  9 10 11 12 13
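
One thing to keep in mind with this approach, sketched here under the assumption that a value might contain more than one number (the vector y is made up for illustration): removing every non-digit joins all digit runs together, whereas matching only the first run of digits keeps them apart.

```r
# Hypothetical values; the second one contains two separate numbers
y <- c(" 10th pick ", "round 2, 10th pick")

# Removing all non-digits concatenates every digit in the string
all_digits <- gsub("\\D+", "", y)

# Matching the first run of digits returns only the first number
first_run <- regmatches(y, regexpr("\\d+", y))

print(all_digits)
print(first_run)
```

For the data in the question each value holds a single number, so both approaches agree there.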

Data

x <- c(" 1st pick ", " 2nd pick " ," 4th pick " ," 5th pick ", " 6th pick " ,
   " 7th pick ", " 8th pick ", " 9th pick " ," 10th pick " ," 11th pick ",
   " 12th pick " ," 13th pick " )

Answer 3 (score: 1)

You can extract all the numbers with:

unlist(regmatches(x, gregexpr("[[:digit:]]+", x)))

#[1] "1"  "2"  "4"  "5"  "6"  "7"  "8"  "9"  "10" "11" "12" "13"

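Following @thelatemail's comment, you can avoid unlist by using regexpr instead of gregexpr:

regmatches(x, regexpr("\\d+",x))

Or, if you have a vector that contains only numbers and letters, you can also remove all the letters:

as.numeric(gsub("[[:alpha:]]", "", x))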

#[1]  1  2  4  5  6  7  8  9 10 11 12 13
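
Note that regmatches() with regexpr() returns only the elements that matched, so the result can be shorter than the input. A minimal sketch of one way to keep the result aligned with the input, assuming some values may contain no digits at all (the vector v and the name out are hypothetical):

```r
# Hypothetical input in which one element has no digits
v <- c(" 1st pick ", "no number", " 12th pick ")

# regexpr() returns -1 for elements without a match
m <- regexpr("[[:digit:]]+", v)

# Start from all-NA and fill in only the positions that matched
out <- rep(NA_character_, length(v))
out[m != -1] <- regmatches(v, m)

nums <- as.numeric(out)
print(nums)
```

This keeps one result per input element, which matters when the numbers are assigned back to a data frame column.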

Answer 4 (score: 0)

Data

x<-c(" 1st pick "," 2nd pick "," 4th pick "," 5th pick "," 6th pick "," 45th pick "," 46th pick "," 47th pick "," 48th pick ")

# "rd" is included so that values such as "3rd pick" are split too instead of becoming NA
a <- function(x) as.numeric(unlist(strsplit(x, "(st|nd|rd|th) pick")))
x <- a(x)
x <- x[!is.na(x)]
[1]  1  2  4  5  6 45 46 47 48

Or you can do it without defining a function:

x <- as.numeric(unlist(strsplit(x, "(st|nd|rd|th) pick")))
x <- x[!is.na(x)]
[1]  1  2  4  5  6 45 46 47 48
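
A possible variation on the same idea, given as a sketch rather than part of the original answer: deleting the suffix with sub() instead of splitting on it leaves one value per element, so no NA filtering is needed. The vector picks is a hypothetical sample.

```r
# Hypothetical sample covering all four ordinal suffixes
picks <- c(" 1st pick ", " 3rd pick ", " 22nd pick ", " 45th pick ")

# Remove the ordinal suffix and the word "pick"; as.numeric() accepts
# the leading and trailing whitespace that remains
nums <- as.numeric(sub("(st|nd|rd|th) pick", "", picks))

print(nums)
```

Like the strsplit version, this relies on every value ending in one of the listed suffixes followed by " pick".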