How can I list the members of a regex character class, such as [:punct:]?

Time: 2014-03-21 17:12:23

Tags: regex r character-class

For example, I know from documentation such as
http://stat.ethz.ch/R-manual/R-devel/library/base/html/regex.html
that [:punct:] includes

! " # $ % & ' ( ) * + , - . / : ; < = > ? @ [ \ ] ^ _ ` { | } ~.

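But I would like to check this from the command line (in R in my case, though it is probably similar in bash), and to list the members of [:alpha:] and the other classes in the same way.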

2 answers:

Answer 0: (score: 2)

 grep("[[:punct:]]", unlist(strsplit(rawToChar(as.raw(1:127)), "")), value = TRUE)
 ## [1] "!"  "\"" "#"  "$"  "%"  "&"  "'"  "("  ")"  "*"  "+"  ","  "-"  "."  "/" 
 ## [16] ":"  ";"  "<"  "="  ">"  "?"  "@"  "["  "\\" "]"  "^"  "_"  "`"  "{"  "|" 
 ## [31] "}"  "~" 

gsub("[^[:punct:]]", "", rawToChar(as.raw(1:127)))
## [1] "!\"#$%&'()*+,-./:;<=>?@[\\]^_`{|}~"

Answer 1: (score: 0)

If you only need to worry about ASCII, something like the following might do (using bash):

$ for n in {0..127}; do awk '{ printf("%c", $0); }' <<< $n | grep '[[:punct:]]' | tr '\n' ' '; done
! " # $ % & ' ( ) * + , - . / : ; < = > ? @ [ \ ] ^ _ ` { | } ~

$ for n in {0..127}; do awk '{ printf("%c", $0); }' <<< $n | grep '[[:alnum:]]' | tr '\n' ' '; done
0 1 2 3 4 5 6 7 8 9 A B C D E F G H I J K L M N O P Q R S T U V W X Y Z a b c d e f g h i j k l m n o p q r s t u v w x y z
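
The two loops above differ only in the class name, so they can be folded into one helper. The following is a minimal sketch, not part of the original answer: `list_class` is a hypothetical function name, and it assumes bash plus a POSIX grep. It pins the locale to C so that the classes mean their plain ASCII definitions, and it skips NUL (bash variables cannot hold it). Newline (code 10) is never reported, because grep treats it as a line terminator rather than as content.

```shell
#!/usr/bin/env bash
# list_class CLASS: print the ASCII characters (codes 1-127) that match
# the POSIX bracket class [[:CLASS:]], separated by single spaces.
# NOTE: `list_class` is a hypothetical helper name, not a standard utility.
list_class() {
    local class=$1 n ch out=""
    for n in {1..127}; do
        # Build the character from its code via an octal escape, e.g. \041 -> "!".
        printf -v ch "\\$(printf '%03o' "$n")"
        # LC_ALL=C keeps the class definitions tied to plain ASCII.
        if LC_ALL=C grep -q "[[:${class}:]]" <<< "$ch"; then
            out+="$ch "
        fi
    done
    # Drop the trailing separator before printing.
    printf '%s\n' "${out% }"
}

list_class punct
list_class alpha
```

Calling `list_class xdigit` or `list_class upper` works the same way. Starting one grep per character is slow but keeps the sketch close to the loop in the answer; for anything beyond ASCII a different approach would be needed.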