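I have a dataset with three columns: station_code, dest_code, and fare. station_code and dest_code hold the same set of station names, and fare is the cost of travelling from one station to the other.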
station_code dest_code fare
station1 station1 0
station1 station2 4.6
station1 station3 10
station1 station4 10
station1 station5 12.3
station1 station6 12.3
station1 station7 12.3
station1 station8 12.3
station1 station9 14.7
station1 station10 14.7
.
.
.
station1 station91 27.5
station2 station1 4.6
station2 station2 0
station2 station3 10
station2 station4 10
station2 station5 12.3
station2 station6 12.3
station2 station7 12.3
station2 station8 12.3
station2 station9 14.7
station2 station10 14.7
.
.
.
(and so on, through station91)
So my question is: how can I use an array technique to create a lookup table that looks like this?
fee 1 2 3 4 ...
1 0 4.6 10 10
2 4.6 0 10 10
3 10 10 0 4.6
4 10 10 4.6 0
5 12.3 12.3 4.6 4.6
... ... ... ... ...
As you can see, the row and column indices both stand for station names, e.g. row 1 = station1, column 1 = station1, column 2 = station2.
Answer 0 (score: 0)
The question was confusing, so I want to clarify my original table and the lookup table once more.
The first image shows the original table. [image]
The second image shows the lookup table I want. [image]
Answer 1 (score: 0)
You can get a table like that with PROC TRANSPOSE, provided the data is sorted beforehand.
proc sort data=have;
  by station_code dest_code fare;
run;
proc transpose data=have out=want;
  by station_code;
  id dest_code;
  var fare;
run;
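To try this on a small scale, here is a minimal sketch that builds `have` from four rows copied from the question and prints the transposed result. The `drop=_name_` option and the PROC PRINT step are additions for illustration, not part of the answer above.

```sas
* Four rows copied from the sample data in the question;
data have;
  input station_code :$10. dest_code :$10. fare;
  datalines;
station1 station1 0
station1 station2 4.6
station2 station1 4.6
station2 station2 0
;
run;

proc sort data=have;
  by station_code dest_code;
run;

* _NAME_ only repeats the VAR name (fare), so it is dropped here;
proc transpose data=have out=want(drop=_name_);
  by station_code;
  id dest_code;
  var fare;
run;

proc print data=want noobs;
run;
```

The result has one row per station_code and one column per dest_code value. Because the codes are character values, the sort is alphabetical (station1, station10, station11, ...), so with the full data the rows and columns will not be in numeric station order unless you sort on a numeric key instead.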
Answer 2 (score: 0)
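What you describe is fare as a two-dimensional array. How you "load" the array depends on how you plan to use the "lookup".
Suppose the fare data (data set fares) has the columns station_code, dest_code, and fare; the station codes are literally station1 … station91; and a second data set (trips) has the columns personid, step_num, and station_code.
Example: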
data totals(keep=personid totalfare);
  * load the station fares into a temporary array for use as a lookup table;
  array fares(91,91) _temporary_;
  do until (lastfare);
    set fares end=lastfare;
    _from = input(substr(station_code,8),best.); * parse the number out of the code;
    _dest = input(substr(dest_code,8),best.);
    fares(_from,_dest) = fare;
  end;
  * compute each person's total fare;
  do until (endtrips);
    totalfare = 0;
    _from = 0;
    do until (last.personid);
      set trips end=endtrips;
      by personid step_num;
      _dest = input(substr(station_code,8),best.);
      * the first step of a trip has no previous station, so nothing is added;
      if _from and _dest then totalfare + fares(_from,_dest);
      _from = _dest;
    end;
    output;
  end;
  stop;
run;
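If what you need is the square lookup table from the question rather than per-person totals, the same temporary array can be written out one row per station. This is only a sketch under the same assumptions as above (a `fares` data set, codes station1 … station91); the output names `lookup`, `station`, and `f1`-`f91` are made up for illustration.

```sas
data lookup(keep=station f1-f91);
  * same loading step as above: fill the 91 x 91 temporary array;
  array fares(91,91) _temporary_;
  array f(91) f1-f91;   * one output column per destination station;
  do until (lastfare);
    set fares end=lastfare;
    _from = input(substr(station_code,8),best.);
    _dest = input(substr(dest_code,8),best.);
    fares(_from,_dest) = fare;
  end;
  * write one row per origin station;
  do station = 1 to 91;
    do _dest = 1 to 91;
      f(_dest) = fares(station,_dest);
    end;
    output;
  end;
  stop;
run;
```

Any station pair that is missing from `fares` is left as a missing value in the corresponding cell.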
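If the station code values are not actually values from which a number 1 … 91 can be parsed, you cannot use an array. Instead, use a hash object with a two-part key as the lookup object.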
data totals (keep=personid totalfare);
  * load the station fares into a hash for use as a lookup table;
  if 0 then set fares; * prep pdv;
  declare hash fares(dataset:'fares');
  fares.defineKey('dest_code', 'station_code'); * reversed keys make it easier to traverse trips;
  fares.defineData('fare');
  fares.defineDone(); * automatically reads dataset:fares and fills hash entries;
  * compute each person's total fare;
  do until (endtrips);
    totalfare = 0;
    dest_code = '';
    do until (last.personid);
      set trips end=endtrips; * read in the station for a person's step;
      by personid step_num;
      if fares.find()=0 then do; * 0 return code means variable fare has the value for the fare from station_code to dest_code;
        totalfare + fare;
      end;
      * prepare for next leg of journey, this is what is meant by easier to traverse;
      dest_code = station_code;
    end;
    output;
  end;
  stop;
run;