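I have a data set with the return deviations for 20 pairs of stocks, the price deviations, and a trigger value for each pair, covering more than 125 days. I now want to run some calculations to get a data set with the total return for each pair, but somewhere along the way I lose my values. So far I get an output data set in which all my pair names appear as variables, but the values are always missing.
First, I put the names of the price deviation, trigger, and return deviation variables into three macro variables. Then I create my data set, which holds the values of all 60 variables. My data set "all" looks like this: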
date trigger1 trigger2 ...... trigger20 pricedev1 pricedev2....... returndev1 returndev2 ......... returndev20
21/11/2002 0.04 0.23 -0.12 . 0.0012 . .
. 0.04 0.23 0.34
. . . .
. . .
28/04/2004 0.04 0.23 ... 0.11 ..... -0.23
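I created the macro variables with proc sql from a different data set. They contain the variable names trigger1, trigger2, and so on, or pricedev1, pricedev2, and so on, or returndev1, returndev2, and so on. This is what I did: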
data all;
if _n_=1 then set trigger;
set ba.trade1_pdev;
run;
data all;
merge all ba.trade1_rdev;
run;
Proc transpose data=all out=data1 (rename=(_name_=var));
by date;
run;
Then I created a macro:
%macro totret (dsname);
%do d=1 %to 20;
%let pair=%trim(%scan(&pairname.,&d.," "));
%let ret=%trim(%scan(&ret.,&d.," "));
%let trigger=%trim(%scan(&trigger.,&d.," "));
data pair;
set data1;
length all $20;
if var="&pair." then all="pdev";
else if var="&trigger." then all="trigger";
else if var="&ret." then all="rdev";
else delete;
drop var;
run;
proc sort data=pair;
by date;
quit;
proc transpose data=pair out=pair;
by date;
id all;
quit;
data pair;
set pair;
ivar=0;
if pdev>=trigger then ivar=1;
if pdev<=-1*trigger then ivar=-1;
run;
data pair;
set pair;
totret=ivar*rdev;
keep date totret;
run;
data pair;
set pair;
rename totret=&pair.;
run;
proc sort data=pair;
by date;
quit;
proc transpose data=pair out=pair (rename=(_name_=var));
by date;
quit;
%if &d.=1 %then %do;
data &dsname.;
set pair;
run;
%end;
%if &d.>1 %then %do;
data &dsname.;
set &dsname. pair;
run;
%end;
%end;
%mend totret;
%totret (tot_ret);
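Then I transpose it back again, but the result is a data set with the 20 variables and the date variable in which all 20 variables have no values.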
proc sort data=tot_ret;
by date;
quit;
Proc transpose data=tot_ret out=test;
by date;
id var;
quit;
The result data set I want looks like this:
date totret1 totret2 ........... totret20
21/11/2002 . . .
.
. . . .
. . .
28/04/2004 . . .
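but with the correct values, of course ;)
Answer 0 (score: 2)
A process that loops 20 times, each time performing the same transpose, simple operation, transpose, and stack, suggests that you could use arrays instead. If I understand the process correctly, there is no need for a macro that handles one stock at a time through multiple transposes.
Sample code using arrays (untested):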
data want;
  set all end=end;
  by date; * for safety, cause error if rows are not in date order;
  array TRIGGERS trigger1-trigger20;
  array PRICEDEVS pricedev1-pricedev20;
  array RETURNDEVS returndev1-returndev20;
  array IVARS ivar1-ivar20;
  array TOTRETS totret1-totret20;
  * process the price info of all 20 pairs on a given day;
  do index = 1 to 20;
    IVARS[index] = 0; * default of no position, as ivar=0 in the question;
    if PRICEDEVS[index] >= TRIGGERS[index] then
      IVARS[index] = 1;
    else
    if PRICEDEVS[index] <= -TRIGGERS[index] then
      IVARS[index] = -1;
    TOTRETS[index] = IVARS[index] * RETURNDEVS[index];
  end;
  keep date totret1-totret20;
run;
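I apologize if I have misunderstood the problem.
For future handling of this kind of data, consider not adding a new column for each new stock. Instead, consider a categorical (long) form of the data with the columns stock, date, trigger, pricedev, returndev, and totret. A simple WHERE statement can then easily select the stocks and date range of interest.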
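As a rough illustration of that long layout, here is an untested sketch that reshapes the wide data set ALL from the question and computes totret per row. The data set name long, the stock labels pair1 to pair20, and the date range in the WHERE statement are assumptions made for the example, and date is assumed to be a numeric SAS date.

```sas
/* Sketch only: LONG and the labels pair1..pair20 are hypothetical names. */
data long;
  set all;
  array T trigger1-trigger20;
  array P pricedev1-pricedev20;
  array R returndev1-returndev20;
  length stock $8;
  do i = 1 to dim(T);
    stock     = cats('pair', i);   /* one output row per pair and date */
    trigger   = T[i];
    pricedev  = P[i];
    returndev = R[i];
    ivar = 0;                      /* default: no position */
    if pricedev >= trigger then ivar = 1;
    else if pricedev <= -trigger then ivar = -1;
    totret = ivar * returndev;
    output;
  end;
  keep stock date trigger pricedev returndev totret;
run;

proc sort data=long;
  by stock date;
run;

/* Picking one pair and a date range is now a plain WHERE statement */
proc print data=long;
  where stock = 'pair3'
    and date between '21NOV2002'd and '28APR2004'd;
  var stock date totret;
run;
```

With this shape, adding a new stock means adding rows rather than columns, so neither the arrays nor a macro loop need to change.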
** ADDED **
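For semantically named columns that follow a naming pattern for trigger, pricedev, returndev, and totret, you can use a macro to generate the source code for the variables that are to be the array elements. However, the utility of the structure you are working with diminishes if you plan to increase the sophistication of your process.
Example of macro calls that process totret for different groups of stocks: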
%let stocks = BNNESR CNNESR XYZFOO ACHOO SYNOJ;
... data gather process for stocks creates ALL ...
%totret (data=all, id=mrX_set1, stocks=&stocks)
%let stocks = GGL IBM ABC MSFT ORCL;
... data gather process for stocks creates ALL ...
%totret (data=all, id=mrX_set2, stocks=&stocks)
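The macro being called would be an abstraction (a templated version) of the earlier DATA step. When the macro is invoked, it generates the source code for a specific DATA step. The macro would look something like the following (untested):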
%macro totret (data=, id=, stocks=, out=totret_&id);
  %local trigger_vars pricedev_vars returndev_vars totret_vars;
  %local i stock;
  %* use macro to build up variable name lists;
  %* the variable names for the concepts of
  %* trigger, pricedev, returndev, and totret
  %* must follow the expected naming conventions;
  %let i = 1;
  %do %while (%length(%scan(&stocks,&i)) > 0);
    %let stock = %scan(&stocks,&i);
    %*                                  naming convention;
    %*                                  VVVVVVVVVVVVVVVVV;
    %let trigger_vars   = &trigger_vars   &stock._t;
    %let pricedev_vars  = &pricedev_vars  &stock;
    %let returndev_vars = &returndev_vars &stock._r;
    %let totret_vars    = &totret_vars    &stock._totret;
    %let i = %eval (&i + 1);
  %end;
  data &OUT;
    set &DATA end=end;
    by date; * for safety, cause error if rows are not in date order;
    array TRIGGERS &trigger_vars;
    array PRICEDEVS &pricedev_vars;
    array RETURNDEVS &returndev_vars;
    array TOTRETS &totret_vars;
    * IVARS is a scalar work variable and does not need to be an array;
    * process the price info of all listed stocks on a given day;
    do index = 1 to DIM(TRIGGERS);
      IVARS = 0; * reset for each stock so the previous value does not carry over;
      if PRICEDEVS[index] >= TRIGGERS[index] then
        IVARS = 1;
      else
      if PRICEDEVS[index] <= -TRIGGERS[index] then
        IVARS = -1;
      TOTRETS[index] = IVARS * RETURNDEVS[index];
    end;
    keep date &totret_vars;
  run;
%mend;
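Note: the data gathering process should also make use of the run_id abstraction and the list of stocks being processed in the run.
You may also find it useful to manage the parameters for multiple sets of runs in a control data set, similar to the following: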
run_id, index, stock
mrX_set1, 1, BNNESR
mrX_set1, 2, CNNESR
mrX_set1, 3, XYZFOO
mrX_set1, 4, ACHOO
mrX_set1, 5, SYNOJ
...
mrX_set2, 1, GGL
...
mrX_set2, 5, ORCL
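One possible way to feed such a control data set into the macro, sketched under the assumption that it is stored as a data set named run_control (a hypothetical name) with the columns run_id, index, and stock shown above:

```sas
/* Sketch only: RUN_CONTROL is a hypothetical data set name. */
proc sql noprint;
  select stock
    into :stocks separated by ' '   /* space-delimited list for the stocks= parameter */
    from run_control
    where run_id = 'mrX_set1'
    order by index;
quit;

%put NOTE: stocks for mrX_set1 = &stocks;

%totret (data=all, id=mrX_set1, stocks=&stocks)
```

The same query with a different run_id would supply the stock list for each additional run, so the lists no longer need to be typed into %let statements by hand.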
Again, in the long run, consider restructuring the data so that stock is a categorical concept.