I want to separate the first name, last name, and age from a string in SAS

Time: 2017-04-24 18:30:19

Tags: sas

Input:

David30Miller   
Jhonty45Rhodes  
Ahsley63Cummins

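The first-name variable should contain the characters before the age (e.g. David), the age variable should contain the digits (e.g. 30), and the last-name variable should contain the characters after the age (e.g. Miller).

Required output: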

FirstName Age LastName
David     30  Miller
Jhonty    45  Rhodes
Ahsley    63  Cummins

Can anyone help?

2 answers:

Answer 0: (score: 2)

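Step 1: Use compress(string,,"kd") to extract the age. The "kd" modifiers mean "keep digits", so every character that is not a digit is removed.
Step 2: Use age as the delimiter argument of the scan function to create the first and last names. scan(string, n, delimiters): the first argument is the value to process, the second is which word to extract, and the third is the set of characters that separates the words, which in this case is the age.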

data abc;
input string $50.;
cards;
David30Miller
Jhonty45Rhodes
Ahsley63Cummins
;
run;

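data abc;
set abc;
/* keep only the digits, then convert them to a number */
age = input(compress(string,,"kd"),best.);
/* age is numeric, so SAS converts it to text when it is used as the delimiter list */
first_name =scan(string,1,age);  /*or scan(string,1,,"d");*/
last_name = scan(string,2,age);  /*or scan(string,2,,"d");*/
run;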

My output:

|string             |age    |first_name   |last_name
|David30Miller      |30     |David        |Miller
|Jhonty45Rhodes     |45     |Jhonty       |Rhodes
|Ahsley63Cummins    |63     |Ahsley       |Cummins
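
One caveat with passing the numeric age to scan: SAS converts it to text automatically (writing a conversion note to the log), and scan then treats each character of that text as a separate delimiter. This works for the sample data, but an age with a leading zero would not split cleanly. A minimal sketch of the alternative already hinted at in the code comments, using the "d" modifier so that any digit acts as a delimiter, might look like this. The dataset names names and split, the $50 lengths, and the row David05Miller are assumptions added for illustration.

```sas
/* Illustrative input; the third row has a leading zero in the age. */
data names;
input string $50.;
cards;
David30Miller
Jhonty45Rhodes
David05Miller
;
run;

data split;
set names;
length first_name last_name $50;                 /* assumed lengths */
age        = input(compress(string,,"kd"),best.); /* keep digits only */
first_name = scan(string,1,,"d");                 /* text before the digits */
last_name  = scan(string,2,,"d");                 /* text after the digits */
run;
```

With this version the third row should split into David, 5, and Miller, whereas using age 5 as the delimiter list would leave the 0 attached to the first name.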

Let me know if you have any questions.

Answer 1: (score: 0)

You can also use PRXCHANGE, as shown below. A brief explanation of the code follows.

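 Pattern: ^([a-z]+)([0-9]+)([a-z]+)$

 ^ anchors the match at the start of the string, so ^([a-z]+) is group 1: one or more letters

 ([0-9]+) is group 2: one or more digits

 ([a-z]+)$ is group 3: one or more letters, up to the end of the string ($)

 The trailing i in the code makes the match case-insensitive, so [a-z] also matches capital letters.

 $1 refers to group 1; using /$1/ as the replacement replaces the whole string with group 1
 $2 refers to group 2; using /$2/ as the replacement replaces the whole string with group 2
 $3 refers to group 3; using /$3/ as the replacement replaces the whole string with group 3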

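In the first case, we replace the whole string with the first group, which gives the first name, and so on for the age and the last name.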

data want;
set have;
/* replace the whole string with capture group 1, 2, or 3 */
firstname = prxchange('s/^([a-z]+)([0-9]+)([a-z]+)$/$1/i',1,trim(string));
age = input(prxchange('s/^([a-z]+)([0-9]+)([a-z]+)$/$2/i',1,trim(string)),8.);
lastname = prxchange('s/^([a-z]+)([0-9]+)([a-z]+)$/$3/i',1,trim(string));
run;
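
Note that PRXCHANGE returns its input unchanged when the pattern does not match, so a string that does not follow the letters-digits-letters layout would be copied into firstname and lastname as-is. As a hedged alternative (not part of the original answer), the sketch below compiles the pattern once with PRXPARSE, tests it with PRXMATCH, and reads each group with PRXPOSN. The dataset names have and want follow the answer; the sample rows, the variable re, the $50 lengths, and the added \s* that tolerates trailing blanks are assumptions for illustration.

```sas
/* Sample input; the rows are taken from the question. */
data have;
input string $50.;
cards;
David30Miller
Jhonty45Rhodes
Ahsley63Cummins
;
run;

data want;
set have;
length firstname lastname $50;   /* assumed lengths */
retain re;                       /* hypothetical variable holding the compiled pattern id */
/* Compile once: letters, digits, letters, then optional trailing blanks. */
if _n_ = 1 then re = prxparse('/^([a-z]+)([0-9]+)([a-z]+)\s*$/i');
/* Fill the variables only when the whole string matches. */
if prxmatch(re, string) then do;
   firstname = prxposn(re, 1, string);
   age       = input(prxposn(re, 2, string), 8.);
   lastname  = prxposn(re, 3, string);
end;
drop re;
run;
```

Rows that do not match the pattern leave firstname, age, and lastname missing, which makes bad input easier to spot than a silently copied string.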