I have a text file that I read into SAS and cleaned up to the point where every line looks like this:
<xsd:element name="ReportingUnit" type="reportingunit:ReportingUnit_def" minOccurs="1" maxOccurs="1"/>
I need to extract the value of name and the value of type. So in this case I need to get ReportingUnit and ReportingUnit_def.
Any help would be greatly appreciated. Thanks.
Answer 0 (score: 3)
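An xsd is xml. Unfortunately, it is not xml in a form that the xmlv2 engine will support. If your xsd is as clean as you state, an input statement using the pointer control @'character-string' will extract the data you want.
Sample code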
filename myxsd temp;
* example xsd from https://docs.microsoft.com/en-us/visualstudio/xml-tools/sample-xsd-file-simple-schema?view=vs-2015;
data _null_;
file myxsd;
input;
put _infile_;
datalines;
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
xmlns:tns="http://tempuri.org/PurchaseOrderSchema.xsd"
targetNamespace="http://tempuri.org/PurchaseOrderSchema.xsd"
elementFormDefault="qualified">
<xsd:element name="PurchaseOrder" type="tns:PurchaseOrderType"/>
<xsd:complexType name="PurchaseOrderType">
<xsd:sequence>
<xsd:element name="ShipTo" type="tns:USAddress" maxOccurs="2"/>
<xsd:element name="BillTo" type="tns:USAddress"/>
</xsd:sequence>
<xsd:attribute name="OrderDate" type="xsd:date"/>
</xsd:complexType>
<xsd:complexType name="USAddress">
<xsd:sequence>
<xsd:element name="name" type="xsd:string"/>
<xsd:element name="street" type="xsd:string"/>
<xsd:element name="city" type="xsd:string"/>
<xsd:element name="state" type="xsd:string"/>
<xsd:element name="zip" type="xsd:integer"/>
</xsd:sequence>
<xsd:attribute name="country" type="xsd:NMTOKEN" fixed="US"/>
</xsd:complexType>
</xsd:schema>
run;
libname myxsd xmlv2;
proc copy in=myxsd out=work;
run;
data weak_parse;
infile myxsd dsd dlm=" />" missover;
length name type $100;
input @"name=" name @"type=" type;
run;
A log error occurs when proc copy tries to read the xsd through the libname, but the input statement runs just fine.
536 libname myxsd xmlv2;
NOTE: Libref MYXSD was successfully assigned as follows:
Engine: XMLV2
Physical Name: C:\Users\Richard\AppData\Local\Temp\SAS Temporary
Files\_TD2764_HELIUM_\#LN00053
537
538 proc copy in=myxsd out=work;
539 run;
ERROR: XML data is not in a format supported natively by the XML libname engine. Files of this
type may require an XMLMap to be input properly.
NOTE: Statements not processed because of errors noted above.
NOTE: PROCEDURE COPY used (Total process time):
real time 0.00 seconds
cpu time 0.01 seconds
NOTE: The SAS System stopped processing this step because of errors.
540
541 data weak_parse;
542 infile myxsd dsd dlm=" />" missover;
543 length name type $100;
544 input @"name=" name @"type=" type;
545 run;
NOTE: The infile MYXSD is:
Filename=C:\Users\Richard\AppData\Local\Temp\SAS Temporary Files\_TD2764_HELIUM_\#LN00053,
RECFM=V,LRECL=32767,File Size (bytes)=1968,
Last Modified=21Sep2018:22:51:56,
Create Time=21Sep2018:22:51:56
NOTE: 24 records were read from the infile MYXSD.
The minimum record length was 80.
The maximum record length was 80.
NOTE: The data set WORK.WEAK_PARSE has 24 observations and 2 variables.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.00 seconds
The data read in will be
The SAS System

Obs    name                 type
  1
  2
  3
  4
  5    PurchaseOrder        tns:PurchaseOrderType
  6    PurchaseOrderType
  7
  8    ShipTo               tns:USAddress
  9    BillTo               tns:USAddress
 10
 11    OrderDate            xsd:date
 12
 13
 14    USAddress
 15
 16    name                 xsd:string
 17    street               xsd:string
 18    city                 xsd:string
 19    state                xsd:string
 20    zip                  xsd:integer
 21
 22    country              xsd:NMTOKEN
 23
 24
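Note that type comes back with its namespace prefix (tns:USAddress), while the question asks for ReportingUnit_def rather than reportingunit:ReportingUnit_def. A minimal sketch of one way to drop the prefix, assuming there is at most one prefix and it ends at a colon; the variable type_local and the data set name parsed are made up for illustration, and the input line is the one from the question:

```sas
data parsed;
  infile datalines dsd dlm=' />' missover;
  length name type type_local $100;
  /* jump to just after name= and type=, then read the quoted values */
  input @'name=' name @'type=' type;
  /* keep only the text after the last colon (assumed prefix separator) */
  type_local = scan(type, -1, ':');
datalines;
<xsd:element name="ReportingUnit" type="reportingunit:ReportingUnit_def" minOccurs="1" maxOccurs="1"/>
;
run;

proc print data=parsed;
run;
```

For that line this should give name=ReportingUnit and type_local=ReportingUnit_def. When type has no colon, scan returns the whole value, so unprefixed types pass through unchanged.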
Answer 1 (score: 0)
Ran into the same problem; try this (it worked for me):
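data want;
infile "M:\some\path\ihave\favorites.xml";
* $100 would truncate records longer than 100 characters, such as the line in the question;
length line $500;
input;
* _infile_ holds the current input record;
line = _infile_;
run;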
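That step only copies each record into line; it does not yet pull out name and type. One possible follow-up is a regular expression applied to _infile_. This is a rough sketch, not a general XML parser: it assumes name= comes directly before type= on the same line, that values are double-quoted, and that any namespace prefix ends at a colon. The data set name want_parsed and the variable rx are illustrative, and the input is the line from the question.

```sas
data want_parsed;
  infile datalines truncover;
  length name type $100;
  retain rx;
  /* compile the pattern once; group 1 = name, group 2 = type without prefix */
  if _n_ = 1 then
    rx = prxparse('/name="([^"]*)"\s+type="(?:[^":]*:)?([^"]*)"/');
  input;
  /* keep only records where both attributes were found */
  if prxmatch(rx, _infile_) then do;
    name = prxposn(rx, 1, _infile_);
    type = prxposn(rx, 2, _infile_);
    output;
  end;
  drop rx;
datalines;
<xsd:element name="ReportingUnit" type="reportingunit:ReportingUnit_def" minOccurs="1" maxOccurs="1"/>
;
run;
```

To read from a file instead, swap infile datalines for the infile statement with your own path and remove the datalines block. Lines with a name but no type (such as complexType) are skipped by this pattern.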