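string = "spanner,span,spaniel,span"; From this string I want to remove all duplicates, keeping one copy of each word, and then output the modified string using SAS. The modified string should look like this: var string = "spanner,span,spaniel";
Answer 0 (score: 1)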
data a;
  string = "spanner,span,spaniel,span,abc,span,bcc";
  length word $100;
  i = 2;
  /* Walk the words from the second one on and compare each with the words before it. */
  do while(scan(string, i, ',') ^= '');
    word = scan(string, i, ',');
    do j = 1 to i - 1;
      if word = scan(string, j, ',') then do;
        /* Locate the second occurrence of word and cut it out together with its leading comma. */
        start = findw(string, word, ',', findw(string, word, ',', 't') + 1, 't');
        string = cats(substr(string, 1, start - 2), substr(string, start + length(word)));
        leave;
      end;
    end;
    i = i + 1;
  end;
  keep string;
run;
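Answer 1 (score: 1)
First, create a data set with a single column holding the words. cats() strips the leading and trailing blanks before the commas are counted; scan() with its default delimiters, which include both blank and comma, drops the spaces around each word.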
data temp(keep=text);
  string = "spanner, span, spaniel, span";
  do i=1 to count(cats(string),",")+1;
    text = scan(string,i);
    output;
  end;
run;
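Use nodup to eliminate the duplicates (nodupkey would also work). Note that the sort puts the words in alphabetical order, so the original word order is not preserved.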
proc sort data=temp nodup;
by text;
run;
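To see which rows the sort discards, PROC SORT can write them to a separate data set. This is a minimal sketch, assuming the temp data set from the previous step; the output names temp_unique and temp_dups are made up for illustration and do not appear in the original answer.

```sas
/* Keep one row per value of text; rows dropped as duplicates go to temp_dups. */
/* temp_unique and temp_dups are hypothetical names chosen for this sketch.    */
proc sort data=temp out=temp_unique nodupkey dupout=temp_dups;
  by text;
run;

/* List the discarded rows. */
proc print data=temp_dups;
run;
```

Because out= is given, temp itself is left unchanged by this step.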
Create a macro variable new_string from the unique words.
proc sql noprint;
SELECT text
INTO :new_string separated by ","
FROM temp
;
quit;
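The macro variable can then be checked in the log or copied back into a data step variable. This is a sketch, assuming the steps above ran without errors; the data set name result and the length of 200 are assumptions made for illustration.

```sas
/* Write the value of the macro variable to the log. */
%put new_string=&new_string;

/* Copy the value into a character variable (result is a hypothetical name). */
data result;
  length string $200;
  string = "&new_string";
run;
```

For the sample data the log should show the words in alphabetical order (span,spaniel,spanner) rather than in their original order, because of the preceding sort.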
A better solution for the new specification:
data temp(keep=i text);
  string = tranwrd("I hate the product. I hate it because it smells bad. I hate wasting money.","."," .");
  do i=1 to count(string," ")+1;
    text = scan(string,i," ");
    if text ne "" then do;
      output;
    end;
  end;
run;
proc sort data=temp;
by text i;
run;
data temp2;
set temp;
by text i;
if first.text OR text eq ".";
run;
proc sort data=temp2;
by i;
run;
proc sql noprint;
SELECT text
INTO :new_string separated by ","
FROM temp2 /* read the deduplicated data set re-sorted by i, not temp */
;
quit;
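Answer 2 (score: 0)
Thanks, Robert. I just wanted to let you know that I found a flaw in your code. The inner loop modifies the string by removing the duplicate word, but the outer loop moves on to the next position regardless. Example: "A,B,C,B,B" becomes "A,B,C,B", because the inner loop removes the "B" in fourth position, and the outer loop then never checks the last "B", which has moved into that fourth position.
My solution: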
data a;
  string = "spanner,span,spaniel,span,abc,span,bcc";
  length word $100;
  i = 2;
  do while(scan(string, i, ',') ^= '');
    hit = 0; /* set to 1 when the word at position i is removed */
    word = scan(string, i, ',');
    do j = 1 to i - 1;
      if word = scan(string, j, ',') then do;
        start = findw(string, word, ',', findw(string, word, ',', 't') + 1, 't');
        string = cats(substr(string, 1, start - 2), substr(string, start + length(word)));
        hit = 1;
        leave;
      end;
    end;
    /* Advance only if nothing was removed; otherwise position i now holds the next word. */
    if hit = 0 then i = i + 1;
  end;
  keep string;
run;
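As a quick check, the corrected loop can be run on the "A,B,C,B,B" example from the explanation above as well as on the original string. This is only a sketch: the data set name check, the use of datalines, and the $40 input length are assumptions added for testing, not part of the answer.

```sas
/* Apply the corrected loop to each test string and print the result to the log. */
data check;                      /* hypothetical data set name */
  infile datalines truncover;
  input string $40.;
  length word $100;
  i = 2;
  do while (scan(string, i, ',') ^= '');
    hit = 0;
    word = scan(string, i, ',');
    do j = 1 to i - 1;
      if word = scan(string, j, ',') then do;
        start = findw(string, word, ',', findw(string, word, ',', 't') + 1, 't');
        string = cats(substr(string, 1, start - 2), substr(string, start + length(word)));
        hit = 1;
        leave;
      end;
    end;
    if hit = 0 then i = i + 1;
  end;
  put string=;
  keep string;
  datalines;
A,B,C,B,B
spanner,span,spaniel,span,abc,span,bcc
;
run;
```

If the fix behaves as described, the log should show string=A,B,C for the first record and string=spanner,span,spaniel,abc,bcc for the second.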
Answer 3 (score: 0)
Build the list of unique words in a new variable.
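data test;
  input string $80.;
  length newstring word $80;
  do i=1 to countw(string,',');
    /* strip() removes the blank that follows each comma in the input */
    word=strip(scan(string,i,','));
    /* blank and comma both act as delimiters, so a repeat of the first word is caught too */
    if not findw(newstring,word,' ,','t') then
      newstring=catx(', ',newstring,word);
  end;
cards;
spanner, span, spaniel, span
;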