What markup language is this? ...line endings instead of closing tags

Time: 2012-11-14 00:42:43

Tags: xml xml-parsing markup

I am trying to parse a document similar to this:

<PRESOL>
<DATE>1112
<YEAR>12
<AGENCY>Defense Logistics Agency
<OFFICE>DLA Acquisition Locations
<LOCATION>DLA Land and Maritime
<ZIP>43218-3990
<CLASSCOD>59
<DESC>Proposed procurement for NSN 5365013055528 SPACER,PLATE:
Line 0001 Qty 70.00  UI EA  Deliver To: ARIZONA INDUSTRIES FOR THE BLIND By: 0180 DAYS ADOThe solicitation is an RFQ and will be available at the link provided in this notice.  Hard copies of this solicitation are not available.  Digitized drawings and Military Specifications   and Standards may be retrieved, or ordered, electronically.
All responsible sources may submit a quote which, if timely received, shall be considered.
Quotes must be submitted electronically.

<SETASIDE>HUBZone
.......
</PRESOL>

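As you can see, it is rather odd, but perhaps it was once some kind of standard. The whole document seems to use a limited set of whitespace characters: for example, I don't see any [tab] characters, but I do see newlines inside some of the larger data blocks.

Does this look familiar to anyone?

I am looking for a Rails gem that could parse this.

2 Answers:

Answer 0: (score: 5)

This is a FederalBizOpps.gov Presolicitation Notice.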

Presolicitation Notice Template

  

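The Presolicitation Template is used to post notice of a proposed acquisition. FAR, Section 5.2 requires this document to be submitted before any further actions are posted. FBO will reject any other document that refers to a particular solicitation for which no Presolicitation Notice has previously been posted.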

 Tag           Description [Format]
 <PRESOL>                                                              
 <DATE>        Month and day synopsis is submitted [MMDD]
 <YEAR>        Year synopsis is submitted [YY]
 <CBAC>*       User ID for the Office Location. Assigned/managed by your location Administrator.    [string]
 <PASSWORD>*   Password. Assigned/managed by your location Administrator. [string]
 <ZIP>         The Contracting Office's ZIP code [5 Digits]
 <CLASSCOD>*   Either one alphabetic code or a two-digit code for service or supply that the synopsis should be listed under. [Valid classification code (FAR, Section 5.207(g))]
 <NAICS>*      Six-digit code for service or supply that the synopsis would be listed under [Valid NAICS Code]
 <OFFADD>      The complete address of the contracting office [Up to 65535 characters]
 <SUBJECT>     The classification code, two hyphens, and a brief title description of the synopsis. [Up to 255 characters]
 <SOLNBR>*     Unique reference number for the solicitation [Up to 128 characters from the set: a-z A-Z 0-9 - _ ( ) { }]
 <RESPDATE>    Response deadline date [MMDDYY]
 <ARCHDATE>    The date when this notice will be archived. [MMDDYYYY]
 <CONTACT>     The names and phone numbers of officials to contact in regard to this synopsis. If there are two points of contact, their information shall be separated by semicolon [Up to 65535 characters]
 <DESC>        A narrative description of the procurement action. [Up to 65535 characters]
 <LINK>        A structural tag [No data required or accepted]
 <URL>         The Government Agency's URL that will be listed with this award. [Up to 255 characters, consist of a restricted set of characters (see URL specification - RFC 2396)]
 <DESC>        Visible hypertext description provided to the user for linking to the related site [Up to 255 characters]
 <EMAIL>       A structural tag [No data required or accepted]
 <ADDRESS>     The Government Agency contact's email address [Up to 128 characters]
 <DESC>        Visible hypertext description provided for linking to the Government Agency contact's email [Up to 255 characters]
 <SETASIDE>    Identify set-aside acquisitions. [Valid values: 'Competitive 8(a)', 'Emerging Small Business', 'Woman Owned Small Business', 'Economically Disadvantaged Woman Owned Small Business', 'HUBZone', 'Partial HBCU / MI', 'Partial Small Business', 'Service-Disabled Veteran-Owned Small Business', 'Total HBCU / MI', 'Total Small Business', 'Veteran-Owned Small Business']
 <POPADDRESS>  Place of performance address [Up to 65535 characters]
 <POPZIP>      Place of performance ZIP code [Up to 5 digits]
 <POPCOUNTRY>  Place of performance country [Up to 32 characters]
 </PRESOL>     
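
The date fields in the table use three different layouts (`[MMDD]` plus a separate `[YY]`, `[MMDDYY]`, and `[MMDDYYYY]`). Below is a minimal Python sketch of how they could be turned into real dates. The function names are made up for illustration, and the century for two-digit years is an assumption: Python's `%y` maps 69-99 to 1969-1999 and 00-68 to 2000-2068, which may or may not match what FBO intends.

```python
from datetime import datetime


def submission_date(mmdd, yy):
    """Combine <DATE> [MMDD] and <YEAR> [YY] into a datetime.date."""
    # Assumption: two-digit years follow Python's %y pivot (69-99 -> 19xx).
    return datetime.strptime(mmdd.strip() + yy.strip(), "%m%d%y").date()


def response_date(mmddyy):
    """Parse <RESPDATE> [MMDDYY]."""
    return datetime.strptime(mmddyy.strip(), "%m%d%y").date()


def archive_date(mmddyyyy):
    """Parse <ARCHDATE> [MMDDYYYY]."""
    return datetime.strptime(mmddyyyy.strip(), "%m%d%Y").date()


if __name__ == "__main__":
    print(submission_date("1112", "12"))
    print(archive_date(" 07131999 "))
```

The values are stripped first because the template's own example puts spaces around them.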

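Notes

  1. All tags in red represent required data.
    * denotes validated data.
  2. <LINK><URL><DESC> is one group of data and should be provided or omitted together.
  3. <EMAIL><ADDRESS><DESC> is one group of data and should be provided or omitted together.
  4. Example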

    <PRESOL> 
    <DATE> 0521 
    <YEAR> 99 
    <CBAC> demo 
    <PASSWORD> DEMO 
    <ZIP> 22030 
    <CLASSCOD> B 
    <NAICS>123456 
    <OFFADD> Office of Environmental Studies; 1323 Y Street, Washington, DC 22030 
    <SUBJECT> B--ENERGY AND ENVIRONMENTAL SERVICES KNOWLEDGE DEVELOPMENT AND DISSEMINATION ACTIVITIES REGARDING THE HOMELESS MENTALLY ILL POPULATION 
    <SOLNBR> 208-94-0008 
    <RESPDATE> 061399 
    <ARCHDATE> 07131999 
    <CONTACT> Mary Ann Deal, Contract Specialist, 301-443-5329; Contracting Officer, Beatrice L. Woods, 301-443-0043 
    <DESC> The Center for Mental Health Services is soliciting proposals on a full and open competitive basis from qualified organizations to award a 3-year contract to develop and disseminate new knowledge about effective approaches to providing comprehensive community-based services to persons with serious mental illnesses who are homeless. 
    <LINK> 
    <URL> http://www.abc.gov 
    <DESC> Center for Mental Health 
    <EMAIL> 
    <ADDRESS> johndoe@usa.gov 
    <DESC> Center for Mental Health 
    <SETASIDE> Total Small Disadvantage Business 
    <POPADDRESS> Office of Environmental Studies; 1323 Y Street; Washington, DC 22030 
    <POPZIP> 22030 
    <POPCOUNTRY> US 
    </PRESOL>
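
Since I don't know of a gem for this, here is a rough sketch of a line-oriented parser, written in Python for brevity (the same logic should port to Ruby easily). It rests on assumptions that are not taken from any FBO specification: a field starts on a line beginning with `<TAG>` in uppercase, every other non-blank line continues the previous field, and the first tag (e.g. `PRESOL`) wraps the whole notice. `parse_notice` is a hypothetical name.

```python
import re

# Assumption: tags are uppercase letters/digits and appear at the start of a line.
FIELD_RE = re.compile(r"^\s*<([A-Z][A-Z0-9]*)>\s*(.*)$")
CLOSE_RE = re.compile(r"^\s*</([A-Z][A-Z0-9]*)>\s*$")


def parse_notice(text):
    """Parse one notice into (root_tag, [(tag, value), ...]) in document order."""
    root = None
    fields = []
    for line in text.splitlines():
        close = CLOSE_RE.match(line)
        if close:
            if close.group(1) != root:
                raise ValueError(f"unexpected closing tag {close.group(1)!r}")
            break  # end of this notice
        start = FIELD_RE.match(line)
        if start and root is None:
            root = start.group(1)  # first tag wraps the notice
        elif start:
            fields.append([start.group(1), start.group(2).strip()])
        elif fields and line.strip():
            # Continuation line: append it to the most recent field's value.
            fields[-1][1] = (fields[-1][1] + "\n" + line.strip()).strip()
    return root, [(tag, value) for tag, value in fields]


if __name__ == "__main__":
    sample = (
        "<PRESOL>\n"
        "<DATE>1112\n"
        "<YEAR>12\n"
        "<DESC>Proposed procurement for NSN 5365013055528 SPACER,PLATE:\n"
        "Quotes must be submitted electronically.\n"
        "\n"
        "<SETASIDE>HUBZone\n"
        "</PRESOL>\n"
    )
    root, fields = parse_notice(sample)
    print(root)
    for tag, value in fields:
        print(tag, repr(value))
```

The result is a list of pairs rather than a dictionary because `<DESC>` can occur several times in one notice (the description, the link text, and the email text), so the order of the fields carries meaning.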
    

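Answer 1: (score: 4)

(Mind you, I had never seen this before; this is the result of some digging.)

This is the format of a Presolicitation Notice, as published by the U.S. Federal Business Opportunities ... something. It is one of fifteen data interchange formats defined by that organization.

I could not find a description of the underlying format of that template. That is unfortunate, because SGML (as I mentioned in the comments, this certainly looks a lot like SGML) has plenty of pitfalls that will bite you if you are not prepared for them. Here is a fun example from Wikipedia: <QUOTE></QUOTE> can also be written as: <QUOTE//<QUOTE>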
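
One practical consequence of the omitted end tags is that a stock XML parser will not accept this input as-is. A small sketch using Python's standard `xml.etree.ElementTree` shows the difference; the helper name `is_well_formed_xml` and the two snippets are mine, for illustration only.

```python
import xml.etree.ElementTree as ET


def is_well_formed_xml(text):
    """Return True if text parses as XML, False if the parser rejects it."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    # Fields terminated by line endings, as in the notice format.
    print(is_well_formed_xml("<PRESOL>\n<DATE>1112\n<YEAR>12\n</PRESOL>"))
    # The same data with explicit closing tags.
    print(is_well_formed_xml("<PRESOL><DATE>1112</DATE><YEAR>12</YEAR></PRESOL>"))
```

The first call reports `False` and the second `True`, so the input would need either preprocessing or a purpose-built parser.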

The template documentation is limited to the data format expected in each field. For example:

  

<CLASSCOD>

     

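Either one alphabetic code or a two-digit code for service or supply that the synopsis should be listed under. Valid classification code (FAR, Section 5.207(g))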
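
Per-field format notes like this one can be turned into simple shape checks. The following Python sketch covers a few fields. The patterns are my reading of the template's format column, not an official rule set, and `has_valid_shape` is a hypothetical helper. It only checks the shape of a value, not whether a code actually exists in the FAR or NAICS lists.

```python
import re

# Assumed patterns, derived from the template's format descriptions.
FIELD_PATTERNS = {
    "CLASSCOD": re.compile(r"[A-Za-z]|[0-9]{2}"),      # one letter or two digits
    "NAICS": re.compile(r"[0-9]{6}"),                  # six digits
    "ZIP": re.compile(r"[0-9]{5}"),                    # five digits
    "SOLNBR": re.compile(r"[A-Za-z0-9_(){}-]{1,128}"),  # restricted character set
}


def has_valid_shape(tag, value):
    """Return True if value matches the assumed pattern for tag."""
    pattern = FIELD_PATTERNS.get(tag)
    if pattern is None:
        return True  # no rule recorded for this tag
    return pattern.fullmatch(value.strip()) is not None


if __name__ == "__main__":
    print(has_valid_shape("CLASSCOD", "59"))
    print(has_valid_shape("ZIP", "43218-3990"))
```

Note that the ZIP+4 value from the question (`43218-3990`) fails the five-digit check, so real-world files may be looser than the template suggests.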