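I am trying to read the ticker names of 100 stocks/ETFs from a CSV file. I have two CSV files: one contains 90 days of data for all stocks/ETFs, and the second contains the names of the 100 stock/ETF tickers I am interested in selecting. My code is below. WORK.ETFnames is a single-column dataset holding the 100 ETF names I want to select from Fulldata. How can I use this list of names to correctly select the data I need? In WORK.Fulldata, the names are stored in a column called "Ticker". I have already split the data by type (ETF or Stock), but I can't figure out how to select the rows I'm actually interested in from those tables. Thanks!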
PROC IMPORT OUT=WORK.Fulldata
DATAFILE="/folders/myshortcuts/myfolder/q2_2012_all.csv"
DBMS=CSV REPLACE;
GETNAMES=YES;
DATAROW=2;
RUN;
PROC IMPORT OUT = WORK.ETFnames
DATAFILE = "/folders/myshortcuts/myfolder/ETFs.csv"
DBMS=CSV REPLACE;
GETNAMES=YES;
DATAROW=2;
RUN;
PROC SQL;
CREATE TABLE stocks AS
SELECT *
from Fulldata
where Security EQ "Stock";
QUIT;
PROC SQL;
CREATE TABLE ETF AS
SELECT *
from Fulldata
where Security EQ "ETF";
QUIT;
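Answer 0 (score: 0)
You might want to try merging the two datasets and keeping only the rows with matching "Ticker" values. I will assume that the names in the ETFnames dataset are also stored in a variable called "Ticker".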
PROC IMPORT OUT= WORK.Fulldata
DATAFILE= "/folders/myshortcuts/myfolder/q2_2012_all.csv"
DBMS=CSV REPLACE;
GETNAMES=YES;
DATAROW=2;
RUN;
PROC IMPORT OUT= WORK.ETFnames
DATAFILE= "/folders/myshortcuts/myfolder/ETFs.csv"
DBMS=CSV REPLACE;
GETNAMES=YES;
DATAROW=2;
RUN;
PROC SORT DATA=WORK.Fulldata OUT=WORK.Fulldatasort;
BY Ticker;
RUN;
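PROC SORT DATA=WORK.ETFnames OUT=WORK.ETFnamessort;
BY Ticker;
RUN;
DATA WORK.Partdata;
/* A is 1 for every row whose Ticker appears in ETFnames */
MERGE WORK.Fulldatasort WORK.ETFnamessort(in=A);
BY Ticker;
/* keep only the tickers on the list */
IF A;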
RUN;
PROC SQL;
CREATE TABLE stocks AS
SELECT *
from Partdata
Where Security EQ "Stock";
QUIT;
PROC SQL;
CREATE TABLE ETF AS
SELECT *
from Partdata
Where Security EQ "ETF";
QUIT;
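As far as I can tell, this should give you the result you want. You could also do this with a join in a PROC SQL statement instead of a MERGE, but MERGE is easier to write, IMO.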
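For comparison, the PROC SQL route could look roughly like the sketch below. It uses a subquery filter rather than an explicit join, and it assumes, as above, that ETFnames has a column named Ticker with the same type as the one in Fulldata. No sorting is needed, and only rows that actually exist in Fulldata are kept.

```sas
/* Sketch only: assumes WORK.ETFnames has a character column named Ticker. */
PROC SQL;
  CREATE TABLE WORK.Partdata AS
  SELECT f.*
  FROM WORK.Fulldata AS f
  /* keep a row only if its Ticker is on the list of 100 names */
  WHERE f.Ticker IN (SELECT Ticker FROM WORK.ETFnames);
QUIT;
```

Character comparisons in SAS are case-sensitive, so if the two files spell tickers differently (for example "spy" vs "SPY"), you would probably need to normalize them first, e.g. with UPCASE() on both sides.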