**Application_id Reason_code Value**
123 AB31AB45 £500
124 AB43RD49TY87 £640
125 RT87 £900
126 CD19RV29 £1000
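What I want is to split the reason_code variable into its individual codes. Each reason is always exactly 4 characters long: 2 letters followed by 2 digits.
The dataset I would like to end up with looks like this: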
Application_id Reason_code Value
123 AB31 £500
123 AB45 £500
124 AB43 £640
124 RD49 £640
124 TY87 £640
125 RT87 £900
Hopefully this makes sense.
Second question: I would like to create flags as shown here:
Application_id Reason_code Value Waterfall_reason Unique_Reason
123 AB31 £500 1 (as it hits AB31 first) 0 (as it hits both AB31 and AB45)
123 AB45 £500 0 (as it hits AB31 first) 0 (as it hits both AB31 and AB45)
124 AB43 £640 1 (as it hits AB43 first) 0 (as it hits AB43, RD49 and TY87)
124 RD49 £640 0 0
124 TY87 £640 0 0
125 RT87 £900 1 (as it hits RT87 first) 1 (as it only hits RT87)
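Answer 0 (score: 3)
Assuming every code is exactly 4 characters, a simple DO loop will do the job. Just keep taking the first four characters until the string is empty. If you create a variable with a length of only 4 and assign a longer string to it, only the first four characters are kept. You can then use the SUBSTR() function to remove the first four characters before the next pass through the loop.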
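data have ;
  input ID Reason_Code :$20. Value ;
cards;
123 AB31AB45 500
124 AB43RD49TY87 640
125 RT87 900
126 CD19RV29 1000
;;;;

data want ;
  set have (rename=(reason_code=reason_list));
  length Reason_code $4 Waterfall_reason 8 Unique_reason 8;
  /* 1 when the list holds a single 4-character code, otherwise 0 */
  unique_reason = length(reason_list) <= 4;
  /* only the first row written for each ID is the waterfall reason */
  waterfall_reason = 1;
  do until (reason_list=' ');
    /* assigning to a $4 variable keeps only the first four characters */
    reason_code = reason_list ;
    output;
    waterfall_reason = 0;
    /* remove the code that was just written out */
    reason_list = substr(reason_list,5);
  end;
run;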
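The truncation that the loop above relies on can be checked in isolation. This is only a minimal sketch; the variable names `code` and `rest` are made up for illustration and are not part of the answer's code.

```sas
data _null_;
  length code $4 rest $20;
  rest = 'AB43RD49TY87';
  /* code is $4, so only the first four characters are stored */
  code = rest;
  /* keep everything from the fifth character onwards */
  rest = substr(rest, 5);
  /* writes code=AB43 rest=RD49TY87 to the log */
  put code= rest=;
run;
```

Repeating those two assignments until `rest` is blank is what the DO UNTIL loop does for every row.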
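Answer 1 (score: 0)
Here is another approach using regular expressions, based on a different assumption: that your substrings are defined by a letters+digits pattern rather than a fixed 4-character layout. The code below picks up strings that match the letters+digits pattern (which includes the 2 letters + 2 digits case) one after another until the whole input string has been consumed. 'waterfall_reason' is flagged only for the first substring picked up, and 'unique_reason' is computed with countw() using letters as delimiters.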
data have;
  input ID Reason_Code :$20. Value;
cards;
123 ABcd31AB45 500
124 AB43RD49T87 640
125 RT87 900
126 C19RV29 1000
;;;;

data want;
  set have;
  /* one or more letters followed by one or more digits, case-insensitive, compiled once */
  _pat=prxparse('/[a-z]+[0-9]+/io');
  _start=1;
  _stop=length(reason_code);
  /* with letters as delimiters, each run of digits counts as one word */
  unique_reason=ifn(countw(reason_code,,'a')=1,1,0);
  do _n=1 by 1 until (_pos = 0);
    call prxnext(_pat,_start,_stop,reason_code,_pos,_len);
    /* _pos is 0 once no further match exists; skip SUBSTR in that case */
    if _pos > 0 then do;
      new_code=substr(reason_code,_pos,_len);
      waterfall_reason=ifn(_n=1,1,0);
      output;
    end;
  end;
  drop _:;
run;
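If the stricter rule from the question (exactly 2 letters followed by 2 digits per code) should be enforced, one option is to flag non-conforming lists before splitting them. The sketch below is an assumption added for illustration, not part of either answer; the dataset name `check` and the variable `valid_list` are hypothetical.

```sas
data check;
  input ID Reason_Code :$20.;
  /* 1 when the whole value is a run of 2-letter + 2-digit codes; */
  /* \s* allows for the trailing blanks of the padded $20 variable */
  valid_list = prxmatch('/^([a-z]{2}\d{2})+\s*$/i', Reason_Code) > 0;
cards;
123 AB31AB45
124 AB43RD49T87
126 C19RV29
;;;;
```

With these sample rows only ID 123 gets valid_list=1, because T87 and C19 each have a single leading letter.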