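How can I replace data in a table with data from another table, similar to VLOOKUP in Excel?

Time: 2019-01-24 19:35:18

Tags: sql sas

Sorry, my English is not good.

Using SAS, I am trying to replace data in a table; let's call it t1. For the replacement, I compare t1 column 1 with t2 column 1. If there is a match, I want to use the value from t2 column 2.

Table 1 has many columns, and the values in the relevant column can repeat. Table 2 has only two columns: the first holds unique values and is the one compared against table 1; the second holds the values I want to use.

For some reason, I am generating a Cartesian product.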

proc sql;
    create view 
        v1 as
    select
        t2.c2,          /* final result */
        t1.c10,         /* not relevant to the problem */
        SUM(t1.c11)     /* not relevant to the problem */
    from 
        _outres.table1 t1
    left join
        _outres.table2 t2
    on 
        t1.c1=t2.c1     /* comparing the tables */
    where
        t1.c10= "criteria"
    group by
        t2.c2,
        t1.c10
    ;
quit;

In Excel, I would solve it like this:

Table 1
column 1
A
A
A
B
B
C
C

Table 2
Column 1    column 2
A           AA
B           BB
C           CC

=VLOOKUP(Table 1 column 1, Table 2, 2, FALSE)

Result:
Table 1
column 1
AA
AA
AA
BB
BB
CC
CC

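------------------EDIT-----------------

@DCR, based on your reply, this is the code I used for testing. I made a few small changes to better reflect my data and tables. It works as expected, but I could not translate it to my original code.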

data tttttt1;
input col1 $ col11 col10 $;
datalines;
A           10           critA
A           12           critA
A           13           critA
A           13           critB
B           11           critA
B           41           critA
B           19           critA
C           20           critA
C           55           critA

;
run;


data tttttt2;
input col1 $ col2 $ ;
datalines;
A           AA
B           BB
C           CC
;
run;

proc sql noprint;
     create table tttttt3 as
            select  b.col2, SUM(a.col11), a.col10
                from (select * from tttttt1) as a
                left join (select * from tttttt2) as b
                    on a.col1 = b.col1
            where a.col10 = "critA"
            group by b.col2, a.col10
;quit;

The expected and actual results are the same:

AA  35  critA
BB  71  critA
CC  75  critA

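3 answers:

Answer 0: (score: 0)

SAS has a distinctive feature in the form of custom formats. A format maps source values to target values, much like VLOOKUP.

Use a FORMAT statement to associate a format with a variable.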

proc format;
  value $MyFormat
  'A' = 'AA'
  'B' = 'BB'
  'C' = 'CC'
  ;
run;

data have;
  input col1 $ @@; 
  col1_formatted_value = put(col1,$MyFormat.); /* typically you do not have to do this */
  datalines;
  A A A B B C C D D A
run;

proc print data=have;
  title "Data rendered per attributes associated with variables in data set metadata";
run;
proc print data=have;
  title "col1 Format applied at step time";
  format col1 $MyFormat.;
run;

* col1 format attribute saved with data set;
data have2;
  input col1 $ @@; 
  format col1 $MyFormat.;
  datalines;
  A A A B B C C D D A
run;

proc print data=have2;
  title "Data rendered per format attributes associated with variables (in data set metadata)";
run;

A SAS format can also be built directly from data:

data formatMappingData;
input source $ target $;
fmtname = "$MyFormatFromData";
start = source;
label = target;
datalines;
A AA!
B BB!
C CC!
;
run;

proc format cntlin=formatMappingData;
run;

proc print data=have2;
  title "Data rendered per format attributes associated with variables (in data set metadata)";
  format col1 $MyFormatFromData.;
run;
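
To connect the format approach back to the aggregation in the question, here is a minimal sketch, assuming a hypothetical format `$KeyMap` and a hypothetical data set `work.fmt_demo` (neither comes from the original post). It groups on the formatted value inside PROC SQL, so no join is needed:

```sas
/* Hypothetical lookup format standing in for table 2 */
proc format;
    value $KeyMap
        'A' = 'AA'
        'B' = 'BB'
        'C' = 'CC';
run;

/* Hypothetical sample rows standing in for table 1 */
data work.fmt_demo;
    input col1 $ col11 col10 $;
    datalines;
A 10 critA
A 12 critA
B 11 critA
;
run;

proc sql;
    create table work.fmt_sum as
    select put(col1, $KeyMap.) as col2,   /* lookup via the format */
           col10,
           sum(col11) as total
    from work.fmt_demo
    where col10 = 'critA'
    group by calculated col2, col10;      /* group on the mapped value */
quit;
```

Values with no entry in the format pass through unchanged, which differs from a left join, where an unmatched key yields a missing value.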

Answer 1: (score: 0)

I think you may be looking for a left join using proc sql. Try the following:

data t1;
input col1 $ ;
datalines;
A
A
A
B
B
C
C
;
run;

data t2;
input col1 $ col2 $ ;
datalines;
A           AA
B           BB
C           CC
;
run;

proc sql noprint;
     create table t3 as
            select b.col2
                  from (select * from t1) as a
             left join (select * from t2) as b
             on a.col1 = b.col1;
quit;
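
A left join returns a missing value where VLOOKUP would show #N/A. If the original key should be kept in that case, one option is COALESCE. The following is a sketch only; the data sets `work.lookup_src` and `work.lookup_map` and the column name `col2_or_original` are hypothetical:

```sas
/* Hypothetical source table; key D has no entry in the mapping */
data work.lookup_src;
    input col1 $;
    datalines;
A
B
D
;
run;

/* Hypothetical mapping table with unique keys */
data work.lookup_map;
    input col1 $ col2 $;
    datalines;
A AA
B BB
;
run;

proc sql;
    create table work.lookup_out as
    select a.col1,
           /* use the mapped value, or fall back to the original key */
           coalesce(b.col2, a.col1) as col2_or_original
    from work.lookup_src as a
    left join work.lookup_map as b
        on a.col1 = b.col1;
quit;
```

Because the mapping table has unique keys, the join cannot multiply rows: the output has exactly as many rows as the source table.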

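Answer 2: (score: 0)

I found a solution!

Thanks, everyone, for all the answers; they gave me some insight.

@nvioli and @DCR gave me the key insight. I was struggling to understand the Cartesian product being generated. I counted the rows and found that the result had the same number of rows as the original t1 table, but the sum values were clearly wrong. So I realized that, somehow, my code was putting the total in every row instead of the "group by" subtotals.

I solved it in the simplest way: I split the view into two different views. The first does the grouping and summing, since an older version of this code did that correctly. The second view is a simple select that performs the join and swaps in the data. The final code looks like this (a simplified version, as in the original example):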
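
One PROC SQL behavior that produces this symptom (same row count as the input, with a total repeated on every row) is remerging: when the SELECT list contains a non-aggregate column that is not in GROUP BY, SAS merges the summary back onto the detail rows. Whether that was the cause in the original code is not confirmed. The sketch below uses a hypothetical data set `work.demo` purely to show the difference:

```sas
/* Hypothetical sample data, not the real tables */
data work.demo;
    input c1 $ c10 $ c11;
    datalines;
A critA 10
A critA 12
B critA 11
;
run;

proc sql;
    /* c1 is selected but not grouped: PROC SQL remerges, so all three
       detail rows come back, each carrying the total for its c10 group */
    select c1, c10, sum(c11) as total
    from work.demo
    group by c10;

    /* every non-aggregate column is grouped: one row per (c1, c10) */
    select c1, c10, sum(c11) as total
    from work.demo
    group by c1, c10;
quit;
```

For the first query the log contains a note that the query requires remerging summary statistics back with the original data, which is a useful thing to look for when totals seem wrong.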

/* view to group and sum columns from t1 */
proc sql;
    create view 
        v1 as
    select
        t1.c1,                    /* column that will be substituted later */
        t1.c10,                   /* not relevant to the problem, only shows the "criteria"/group by */
        SUM(t1.c11) as sum_c11    /* not relevant to the problem, only shows the sum; alias added so v2 can reference it */
    from 
        _outres.table1 t1
    where
        t1.c10= "criteria"
    group by
        t1.c1,
        t1.c10
    ;
quit;

Then:

/* view to substitute the desired column from t1 (now v1) */
proc sql;
    create view 
        v2 as
    select
        t2.c2,         /* column with new data */
        v1.c10,        /* already grouped */
        v1.sum_c11     /* already summed */
    from 
        v1 
    left join
        _outres.table2 t2
    on
        v1.c1 = t2.c1  /* comparing the view built from t1 with t2 */
    ;
quit;
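
The same aggregate-first, join-second idea can also be written as a single query with an inline view. This is a sketch under assumptions: the data sets `work.iv_t1` and `work.iv_t2` and the column name `total` are hypothetical stand-ins for the real tables:

```sas
/* Hypothetical detail table */
data work.iv_t1;
    input col1 $ col11 col10 $;
    datalines;
A 10 critA
A 12 critA
B 11 critA
B 5 critB
;
run;

/* Hypothetical mapping table with unique keys */
data work.iv_t2;
    input col1 $ col2 $;
    datalines;
A AA
B BB
;
run;

proc sql;
    create table work.iv_result as
    select m.col2, s.col10, s.total
    from (select col1, col10, sum(col11) as total   /* aggregate first */
          from work.iv_t1
          where col10 = 'critA'
          group by col1, col10) as s
    left join work.iv_t2 as m                       /* then look up */
        on s.col1 = m.col1;
quit;
```

Note that aggregating before the lookup groups by the original key. If two different keys mapped to the same replacement value, they would appear as separate rows rather than being summed together.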