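How can I get the mean of the last 6 observations in each group of my data set? The first column is the group (Class) and the second column is the observed variable (Height).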
Class Height
1 12.5
1 14.5
1 15.8
1 16.1
1 18.9
1 21.2
1 23.4
1 25.7
2 13.1
2 15.0
2 15.8
2 16.3
2 17.4
2 18.6
2 22.6
2 24.1
2 25.6
3 11.5
3 12.2
3 13.9
3 14.7
3 18.9
3 20.5
3 21.6
3 22.6
3 24.1
3 25.8
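Answer 0: (score: 1)
This is a bit crude, but it should get the job done. We read in the data, record each row number, and sort by descending row number. A second pass through the reversed data then flags the first six observations of each class, which are the last six in the original order. Note that this only works if your observations are already sorted on class.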
* This will read in your data and get a row number;
data one;
input class height;
row_number = _n_;
cards;
1 12.5
1 14.5
1 15.8
1 16.1
1 18.9
1 21.2
1 23.4
1 25.7
2 13.1
2 15.0
2 15.8
2 16.3
2 17.4
2 18.6
2 22.6
2 24.1
2 25.6
3 11.5
3 12.2
3 13.9
3 14.7
3 18.9
3 20.5
3 21.6
3 22.6
3 24.1
3 25.8
;
run;
* Now we sort by row number in descending order;
proc sort data = one out = two;
by descending row_number;
run;
* Now we run through the data again to make a flag for the last
six observations for each class;
data three;
set two;
* This sets up the counter;
retain counter 0;
* This resets the counter to zero at the first instance of each new class;
if class ne lag(class) then counter = 0;
counter = counter + 1;
* This makes a flag (1/0) on whether we want to keep the
observation for analysis;
keep_it = (counter le 6);
run;
* Now we get the means;
proc means data = three mean;
where keep_it gt 0;
class class;
var height;
run;
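As a variation on the flagging step, here is a minimal sketch that uses BY-group processing (`first.class`) instead of comparing against `lag(class)`. It assumes the data set `two` produced by the PROC SORT step above; the name `three_by` is only illustrative. Because `two` is in descending row order, the classes appear as 3, 2, 1, so the sketch relies on the NOTSORTED option of the BY statement.

```sas
* Assumes data set TWO from the PROC SORT step above (descending row_number);
data three_by;
    set two;
    * NOTSORTED because the classes are grouped but appear in descending order;
    by class notsorted;
    * Restart the counter on the first row of each class;
    if first.class then counter = 0;
    * Sum statement increments the counter and retains it across rows;
    counter + 1;
    * Keep the first six rows per class, which are the last six of the original order;
    if counter le 6;
run;

proc means data = three_by mean;
    class class;
    var height;
run;
```

For the sample data, working the last six heights out by hand gives means of about 20.18 for class 1 (121.1 / 6), about 20.77 for class 2 (124.6 / 6), and 22.25 for class 3 (133.5 / 6), which is a useful check on whichever version you run.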
Answer 1: (score: 0)
This example requires the input data to be sorted by class, with at least 6 observations in each class.
data input;
input class height;
cards;
1 12.5
1 14.5
1 15.8
1 16.1
1 18.9
1 21.2
1 23.4
1 25.7
2 13.1
2 15.0
2 15.8
2 16.3
2 17.4
2 18.6
2 22.6
2 24.1
2 25.6
3 11.5
3 12.2
3 13.9
3 14.7
3 18.9
3 20.5
3 21.6
3 22.6
3 24.1
3 25.8
;
run;
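data output;
set input;
* BY-group processing creates the last.class flag;
by class;
* Mean of the current height and the five preceding heights;
average = mean(height, lag1(height), lag2(height), lag3(height), lag4(height), lag5(height));
* Subsetting IF keeps only the last row of each class;
if last.class;
drop height;
run;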
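If the input is not sorted in ascending or descending order but is still grouped by class (all records of each group are stored together, e.g. the sequence 1, 1, 3, 3, 2, 2, 2), the NOTSORTED option on the BY statement will do the trick.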
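A minimal sketch of that change, assuming the same `input` data set as above: only the BY statement differs. The requirement of at least 6 observations per class still applies, because the LAG queues are not reset between groups and would otherwise pull in heights from the preceding class.

```sas
* Sketch for input that is grouped by class but not in sorted order;
data output;
    set input;
    * NOTSORTED lets last.class work on contiguous groups in any order;
    by class notsorted;
    average = mean(height, lag1(height), lag2(height), lag3(height), lag4(height), lag5(height));
    if last.class;
    drop height;
run;
```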