SAS, PRXCHANGE, regex, unstructured data

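Time: 2018-11-05 18:35:38

Tags: regex sas

I am reviewing someone else's code. What does this code do? I am getting better at PRXCHANGE and regular expressions; I am not a pro, but I am still learning. It looks to me like it replaces things like P.O. BOX with PO BOX. Does \s? mean an optional space? Does 0? mean the 0 is optional? And does :? mean the : is optional? It looks like I may be understanding some of this. Thanks.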

    DATA _NULL_;
        X='P.O. BOX 123';
        Y=PRXCHANGE('s/0?\s?P\.\s?O\. BOX\:?/PO BOX/',-1,X);
        PUT Y;
    RUN;

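1 answer:

Answer 0 (score: 0)

It is very important to document the regular expressions you use so that other people can understand them.

As you can see in the output: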

  P.O. BOX 123 is converted into PO BOX 123

Some of the important pieces are explained below; see this link for more detail: http://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a003288497.htm

   ? indicates that the preceding item occurs 0 or 1 times
   example: "do(es)?" matches "do" or "does"
   a period (.) is another metacharacter; in a regex it matches any single character
   to match a literal period, escape it with \ so it becomes \.
   \s matches a whitespace character (such as a space)
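
To make these three rules concrete, here is a minimal sketch using PRXMATCH, which returns the position where a match starts, or 0 when nothing matches. The variable names A through E and the test strings are made up for illustration.

```sas
DATA _NULL_;
    /* ? makes the preceding group optional, so both strings match */
    A = PRXMATCH('/^do(es)?$/', 'do');
    B = PRXMATCH('/^do(es)?$/', 'does');
    /* an unescaped . matches any character, \. only a literal period */
    C = PRXMATCH('/P.O/', 'PXO');
    D = PRXMATCH('/P\.O/', 'PXO');
    /* \s matches a whitespace character such as a blank */
    E = PRXMATCH('/P\.\sO/', 'P. O');
    PUT A= B= C= D= E=;
RUN;
```

The log line should read A=1 B=1 C=1 D=0 E=1: only the escaped period refuses to match the X in PXO.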

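In the PRXCHANGE call, any text that matches the pattern 0?\s?P\.\s?O\. BOX\:? is replaced with PO BOX. The -1 argument means every match in the string is replaced, not just the first one.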

0? means an optional 0 (zero) at the start
followed by \s?, an optional whitespace character
followed by a literal P and an escaped period, that is "P."
followed by an optional whitespace character, then a literal O (the letter, not a zero) and a period
followed by exactly one space and BOX (uppercase, since the match is case-sensitive), then an optional colon
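
One way to keep that reading next to the code, and to check it, is to comment the pattern piece by piece and run it over a few sample values. This is only a sketch: the sample addresses, the variable lengths, and the use of DATALINES are assumptions for illustration, not part of the original program.

```sas
DATA _NULL_;
    LENGTH X Y $40;
    INFILE DATALINES TRUNCOVER;
    INPUT X $CHAR40.;
    /* Pattern pieces, left to right:                */
    /*   0?     optional leading zero                */
    /*   \s?    optional whitespace character        */
    /*   P\.    literal "P."                         */
    /*   \s?    optional whitespace character        */
    /*   O\.    literal "O." (the letter O)          */
    /*   " BOX" one blank followed by BOX            */
    /*   \:?    optional colon                       */
    /* -1 asks PRXCHANGE to replace every match.     */
    Y = PRXCHANGE('s/0?\s?P\.\s?O\. BOX\:?/PO BOX/', -1, X);
    PUT X= Y=;
    DATALINES;
P.O. BOX 123
P. O. BOX 123
0 P.O. BOX: 123
PO BOX 123
;
RUN;
```

Every row should come out as Y=PO BOX 123; the last one is left untouched because nothing in it matches. One side effect worth knowing: because \s? sits in front of the P, a blank directly before P.O. is part of the match and is removed with it, so 'MAIL TO P.O. BOX 9' would become 'MAIL TOPO BOX 9'.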

This covers several different scenarios/patterns:

"0 P.O. BOX" will be converted into PO BOX
"0 P.O. BOX:" will be converted into PO BOX
"P.O. BOX:" will be converted into PO BOX
"P.O. BOX" will be converted into PO BOX