Question

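I have a list of customer addresses in a dataset, and I am trying to find the country of residence. For example, NEWSOUTHWALESAUSTRALIA could be indexed to report the country as Australia. I am trying to use a do-loop approach to scan a list of 252 countries so that the country of residence can be associated with a dataset called address_format.

The dataset test lists the 252 countries, which have been upcased and compressed, as has the field concat_address, so there should be no issues with differences in the text.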

%macro counter;
  %do ii = 1 %to 252;

    data test;
      set country_data (obs=&ii.);
      call symput('New_upcase_country', trim(New_upcase_country));
      country_new = compress(trim(country_two));
      call symput('country_new', trim(country_new));
    run;

    data ADDRESS_FORMAT_NEW;
      set ADDRESS_FORMAT;
      length success $70.;
      format success $70.;
      if index(concat_address, "&country_new.") ge 1 then do;
        country = "&country_new.";
      end;
    run;

  %end;
%mend;
%counter;

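For some strange reason, if I hard-code if index(concat_address,'AUSTRALIA'), I get results, but inside the macro the result is blank.

Is there something obvious I am missing that prevents the country index from succeeding?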

Answer 1

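The obs= option can be thought of as lastobs= (there is no lastobs option).

The option for skipping the first n-1 observations is firstobs=.

This example produces 4 rows (8 - 5 + 1):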

data class;
  set sashelp.class (firstobs=5 obs=8);
run;

So you want either

firstobs=&ii obs=&ii, or
firstobs=&ii together with STOP; RUN;
to prevent the step from moving on to row &ii+1.
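
As a minimal sketch of the first variant, the macro below reads exactly one row from sashelp.class and stores one of its values in a macro variable. The macro name show_row and the macro variable picked_name are made up for illustration; in the question's macro the same data set options would go on set country_data.

```sas
%macro show_row(ii);
  data _null_;
    /* firstobs= and obs= set to the same value read only row &ii */
    set sashelp.class (firstobs=&ii obs=&ii);
    /* 'G' stores the value in the global symbol table */
    call symputx('picked_name', name, 'G');
  run;
  %put NOTE: row &ii has name=&picked_name;
%mend show_row;

%show_row(3)
```

call symputx also strips leading and trailing blanks from the stored value, so no separate trim() is needed.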

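Notwithstanding the answer above, I recommend switching to a macro-free approach that performs all 252 checks in a single data step, instead of one step per check. There are many ways to do this; here is one that uses neither arrays nor hashes.

For example: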

data have;
input;
text = _infile_;
datalines;
BONGOBONGOAUSTRALIA
SOMEWHERE IN CHINA
CANADA USA
TIBET LANE, NORWAY
run;

data countries;
length name $50;
input;
name = _infile_;
datalines;
AUSTRALIA
GERMANY
UNITED STATES
TIBET
NORWAY
run;

Output only the first match. An important feature of the code is the use of the point= and nobs= options on the set statement inside the do loop.

data want;
  set have;
  do index = 1 to check_count until (found);
    set countries point=index nobs=check_count;
    found = index(text,trim(name));
    if found then matched_country = name;
  end;
run;

Output all matches:

data want (keep=text matched_country);
  set have;
  do index = 1 to check_count;
    set countries point=index nobs=check_count;
    found = index(text,trim(name));
    if found then do;
      found_count = sum(found_count,1);
      matched_country = name;
      output;
    end;
  end;
  if not found_count > 0 then do;
    matched_country = '** NO MATCH **';
    output;
  end;
run;
