I am trying to standardize a set of Age data (i.e., years/months old) using SPSS / SPSS syntax / Excel. My intuition is to use a series of DO IF blocks, i.e.:
DO IF CHAR.INDEX(Age, "y")>1... for years
DO IF CHAR.INDEX(Age, "m")>1... for months
DO IF CHAR.INDEX(Age, "d")>1... for days
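and have the program take the number preceding each string as the number of years/months/days and add it to a running total in a new variable, which could be in days (the smallest unit) and converted to years later.
For example, for a cell containing "3 years 5 months": add 3 * 365 + 5 * 30.5 ≈ 1248 days to a new variable (something like "DaysOld").
Examples of cell contents (a number without any string is assumed to be years):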
2
5 months
11 days
1.7
13 yr
22 yrs
13 months
10 mo
6/19/2016
3y10m
10m
12y
3.5 years
3 years
11 mos
1 year 10 months
1 year, two months
20 Y
13 y/o
3 years in 2014
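Answer 0 (score: 0)
The following syntax will handle many of the cases, but definitely not all of them (for example, "1.7" or "3 years in 2014"). You will need to do some more work, but this should get you off to a good start...
First, I recreate the sample data you want to work with: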
data list list/age (a30).
begin data
"2"
"5 months"
"11 days"
"1.7"
"13 yr"
"22 yrs"
"13 Months"
"10 mo"
"6/19/2016"
"3y10m"
"10m"
"12y"
"3.5 years"
"3 YEARS"
"11 mos"
"1 year 10 months"
"1 year, two months"
"20 Y"
"13 y/o"
"3 years in 2014"
end data.
Now to work:
* some necessary definitions.
string ageCleaned (a30) chr (a1) nm d m y (a5).
compute ageCleaned="".
* my first step is to create a "cleaned" age variable (it's possible to
manage without this variable but using this is better for debugging and
improving the method).
* in the `ageCleaned` variable I only keep digits, periods (for decimal
point) and the characters "d", "m", "y".
do if CHAR.INDEX(lower(age),'ymd',1)>0.
loop #chrN=1 to char.length(age).
compute chr=lower(char.substr(age,#chrN,1)).
if CHAR.INDEX(chr,'0123456789ymd.',1)>0 ageCleaned=concat(rtrim(ageCleaned),chr).
end loop.
end if.
* the following line accounts for the word "days" which in the `ageCleaned`
variable has turned into the characters "dy".
compute ageCleaned=replace(ageCleaned,"dy","d").
exe.
* now I can work through the `ageCleaned` variable, accumulating digits
until I meet a character, then assigning the accumulated number to the
right variable according to that character ("d", "m" or "y").
compute nm="".
loop #chrN=1 to char.length(ageCleaned).
compute chr=char.substr(ageCleaned,#chrN,1).
do if CHAR.INDEX(chr,'0123456789.',1)>0.
compute nm=concat(rtrim(nm),chr).
else.
if chr="y" y=nm.
if chr="m" m=nm.
if chr="d" d=nm.
compute nm="".
end if.
end loop.
exe.
* we now have the numbers in string format, so after turning them into
numbers they are ready for use in calculations.
alter type d m y (f8.2).
compute DaysOld=sum(365*y, 30.5*m, d).
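One possible follow-up for the unit-less entries that the syntax above leaves missing (such as "2" and "1.7", which the question says should be read as years) is sketched below. It is only a sketch under assumptions: `YearsOld` is a name introduced here for illustration, and the `f8` read format is chosen so that no decimal places are implied when the string has no decimal point. Entries that are not plain numbers (for example "6/19/2016") stay missing, and SPSS may issue a warning for each value it cannot read.

```spss
* Sketch: treat plain numbers (no unit) as years, as the question assumes.
* Only rows still missing after the steps above are touched.
if missing(DaysOld) DaysOld=365*number(age, f8).
* YearsOld is a hypothetical name; it uses the same 365-day year as above.
compute YearsOld=DaysOld/365.
execute.
```

Under that assumption "2" corresponds to 730 days and "1.7" to 620.5 days. Rows such as "1 year, two months" are not helped by this step, because the spelled-out number is silently dropped earlier, so those still need checking by hand.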