How do I extract the acres burned from text in a variable

Time: 2018-04-02 23:13:15

Tags: sas

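I need help extracting the acres burned from a SAS file named wildfire_narrative. The variables in the file are episode_id, episode_narrative, event_id, and event_narrative. The acres burned are inside the variable episode_narrative. episode_narrative contains at least one paragraph of text, and the number of acres burned appears somewhere in that text string.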

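Example: Dry Santa winds caused a discarded cigarette butt in the median of Interstate 8 to grow into a 10,353 acre brush fire. Resources used to fight the fire cost over $8 million and involved 2000 fire fighters, nine helicopters, and nine air tankers. Property damaged or destroyed in the fire consisted of 15 single family homes, 65 outbuildings, 15 trailers, and 164 motor vehicles. Several livestock were burned and later euthanized.

Thanks.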

data acres;
set 'C:\Users\scott\Downloads\Wildfire_narrative.sas7bdat';
acresBurned = scan(episode_narrative, findw("acre",0-8,' ')-1, ",");  
run;

1 Answer:

Answer 0: (score: 2)

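Something like this with prxchange should work. It uses a regular expression to replace everything else in the string and keep only the number that comes right before "acre". The code below captures several groups and then replaces the whole string with the number preceding "acre".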

 acres=input(prxchange('s/(.+?)([0-9\,]+)(?=\s?\-?acre)(.+)/$2/i',-1, 
 acresBurned),comma10.);

A brief explanation of the code above.

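(.+?) is the first capture group. It lazily matches everything up to the number that is followed by "acre".

([0-9\,]+) is the second capture group. It matches the number itself, made up of digits and commas.

(?=\s?\-?acre) is a positive lookahead, not a capture group. It makes sure the number is followed by the word "acre", optionally preceded by a space or a hyphen, without consuming that text.

(.+) is the third capture group. It matches everything from there to the end of the string.

/$2/ replaces everything with the second capture group, and the input function with the comma10. informat converts that value to a number.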
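To see what each group actually captures, here is a minimal sketch that runs the same pattern through prxparse and prxposn on a shortened sentence. The dataset name groups, the variable names text, re, g1, g2, g3, and the shortened sentence are made up for illustration. Note that the lookahead does not get a capture buffer of its own, so the pattern has three buffers.

```sas
/* Illustrative only: dataset and variable names are assumptions. */
data groups;
    length text g1 g2 g3 $200;
    text = 'grew into a 10,353 acre brush fire.';
    /* Same pattern as in the answer, used as a match instead of a substitution. */
    re = prxparse('/(.+?)([0-9\,]+)(?=\s?\-?acre)(.+)/i');
    if prxmatch(re, text) then do;
        g1 = prxposn(re, 1, text);  /* text before the number */
        g2 = prxposn(re, 2, text);  /* the number, commas included */
        g3 = prxposn(re, 3, text);  /* everything after the number */
    end;
    /* Write the three buffers to the log, one per line. */
    put g1= / g2= / g3=;
    drop re;
run;
```

The second buffer is the only part that survives the /$2/ replacement, which is why the input function can read it with comma10.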

  

data have;
length acresBurned $500;
acresBurned = "Dry Santa winds caused a discarded cigarette butt in the median 
of Interstate 8 to grow into a 10,353 acre brush fire. Resources used to fight 
the fire cost over $8 million and involved 2000 fire fighters, nine helicopters, 
and nine air tankers. Property damaged or destroyed in the fire consisted of 
15 single family homes, 65 outbuildings, 15 trailers, and 164 motor vehicles. 
Several livestock were burned and later euthanized";
output;
acresBurned = "Dry Santa winds caused a discarded cigarette butt in the median 
of Interstate 8 to grow into a 100,353-acre brush fire. Resources used to fight 
the fire cost over $8 million and involved 2000 fire fighters, nine helicopters, 
and nine air tankers. Property damaged or destroyed in the fire consisted of 
15 single family homes, 65 outbuildings, 15 trailers, and 164 motor vehicles. 
Several livestock were burned and later euthanized";
output;
run;

data have1;
set have;
acres=input(prxchange('s/(.+?)([0-9\,]+)(?=\s?\-?acre)(.+)/$2/i',-1, 
acresBurned),comma10.);
run;
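
One caveat: when a narrative contains no acreage, prxchange finds nothing to replace and returns the text unchanged, so the input function is handed ordinary prose and should log an invalid-data note. A possible alternative, sketched here under the assumption that a missing value is acceptable for such rows, is to test for a match first with prxmatch and read only the captured number with prxposn. The datasets sample and sample_acres and the variables narrative and re are hypothetical. This is not the method from the answer above, just a variation on it.

```sas
/* Hypothetical sample data; the third row has no acreage on purpose. */
data sample;
    length narrative $200;
    narrative = 'The fire grew into a 10,353 acre brush fire.'; output;
    narrative = 'Crews contained the 100,353-acre fire on Friday.'; output;
    narrative = 'Smoke closed the highway for 2 hours.'; output;
run;

data sample_acres;
    set sample;
    retain re;
    /* Compile the pattern once, on the first iteration only. */
    if _n_ = 1 then re = prxparse('/([0-9,]+)(?=\s?-?acre)/i');
    /* prxmatch returns 0 when nothing matches, so acres stays missing. */
    if prxmatch(re, narrative) then
        acres = input(prxposn(re, 1, narrative), comma10.);
    drop re;
run;
```

For the original question, the same step could read the Wildfire_narrative dataset and use episode_narrative in place of narrative.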