How to extract country names using XQuery?

Time: 2019-01-21 03:52:56

Tags: xml xquery flwor

tempdata.xml

<ArticleSet>
<Article>
    <LastName>Chang</LastName>
    <ForeName>K W</ForeName>
    <Affiliation>Department of Surgery, Army General Hospital, Taiwan, Republic of
    China.</Affiliation>
</Article>
<Article>       
    <LastName>Ferree</LastName>
    <ForeName>B A</ForeName>
    <Affiliation>Children's Hospital Medical Center, Cincinnati, Ohio.</Affiliation>        
</Article>
<Article>
    <LastName>Dyck</LastName>
    <ForeName>P</ForeName>
    <Affiliation>Department of Neurosurgery, University of Southern California, Los Angeles.</Affiliation>      
</Article>
<Article>
    <LastName>Lonstein</LastName>
    <ForeName>J E</ForeName>
    <Affiliation>Minnesota Spine Center, Minneapolis 55454-1419.</Affiliation>      
</Article>
</ArticleSet>

Countries.xml

<Countries>
    <Country>
        <id>1</id>
        <name>Los Angeles</name>
        <code>ad</code>
    </Country>
    <Country>
        <id>2</id>
        <name>Republic of China</name>
        <code>ae</code>
    </Country>
    <Country>
        <id>3</id>
        <name>China</name>
        <code>af</code>
    </Country>
    <Country>
        <id>4</id>
        <name>Ohio</name>
        <code>ag</code>
    </Country>
</Countries>

XQuery code

declare variable $tokens:="";
declare variable $aff:="";
for $article in doc("tempdata.xml")/ArticleSet/Article
  let $aff:=data($article/Affiliation)
  let $aff:=replace($aff,'[;,.]',',')
  for $tokens in tokenize($aff,',')
    for $countries in doc("countries.xml")/Countries/Country
      return if($countries/name= normalize-space($tokens))
        then <Country>{data($countries/name)}</Country>

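This XQuery code matches the strings in the Affiliation element of tempdata.xml against the list of countries in Countries.xml and displays the country names. First the affiliation string is tokenized, then each token is compared with the list of available countries.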

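Output

<Country>Republic of China</Country>
<Country>Ohio</Country>
<Country>Los Angeles</Country>

I want to print a <Country>-</Country> tag for affiliation strings that contain no country. For example, the 4th affiliation has no country, so in that case I want to insert a tag containing a hyphen. So my question is: where do I write the else part so that I get the following output?

Required output

<Country>Republic of China</Country>
<Country>Ohio</Country>
<Country>Los Angeles</Country>
<Country>-</Country>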

1 answer:

Answer 0 (score: 0)

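Your current query can return several <Country> elements per article, one for each matching part of the affiliation, while you appear to rely on there being at most one match. You can collect all matches, append "-" as a fallback, and then take the first item of that sequence: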

for $article in doc("tempdata.xml")/ArticleSet/Article
let $country :=
  for $aff in tokenize($article/Affiliation, '[;,\.]')
  where doc("countries.xml")/Countries/Country/name = normalize-space($aff)
  return normalize-space($aff)
return <Country>{($country, '-')[1]}</Country>
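
If you prefer to see literally where the "else part" from the question would go, the same fallback can be sketched with an explicit conditional. This is only an illustration under the same assumptions as the query above (the file names and element names from the question); the variable names $tokens, $part and $matches are made up for this sketch.

```xquery
(: Sketch: one <Country> per article, with "-" when nothing matches. :)
for $article in doc("tempdata.xml")/ArticleSet/Article
(: Split the affiliation on ; , . and trim/collapse whitespace in each part. :)
let $tokens :=
  for $part in tokenize($article/Affiliation, '[;,.]')
  return normalize-space($part)
(: Keep every country name that equals at least one token. :)
let $matches := doc("countries.xml")/Countries/Country/name[. = $tokens]
return
  if (exists($matches))
  then <Country>{data($matches[1])}</Country>
  else <Country>-</Country>  (: the "else part": no country found :)
```

One difference to be aware of: when an affiliation matches several names, $matches[1] is the first match in the document order of the countries file, whereas the query above picks the first match in the order of the affiliation's parts. With the sample data each article has at most one match, so both variants should produce the same four elements.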