Question

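I'm about to start a project that uses EpiDoc XML to record information about texts. Here is an example: http://www.stoa.org/epidoc/gl/latest/supp-structure.html I want to store the data in PostgreSQL. I understand XML, and I understand the basics of PostgreSQL. What is the correct/best way to put the two together?

For example, can I use SQL along the lines of select * from db where xmltag = value?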

Answer 1

A very short and simplified mini primer

You create your tables; they would look like:

CREATE TABLE xml_table 
(
    document_id integer /* you'd normally use serial */ PRIMARY KEY,
    xml_data xml
) ;

See the PostgreSQL documentation on the XML data type.

You would populate the table with queries like:

/* If you use XML as content, you'd insert it this way */
INSERT INTO
    xml_table (document_id, xml_data)
VALUES
    (1, xmlparse(content '<doc><title>Doc title</title></doc>')),
    (2, xmlparse(content '<doc>
         <preface>This is the preface</preface>
         <chapter><title>Hello</title><content>This is a content</content></chapter>
         <chapter><title>Good Bye</title><content>This is the end</content></chapter>
     </doc>')),
    (3, xmlparse(content '<doc>
         <preface>Yet a preface</preface>
         <chapter><title>C1</title><content>Content of C1</content></chapter>
         <chapter><title>C2</title><content>Content of C2</content></chapter>
     </doc>')) ;

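For brevity, I'm not using EpiDoc as the example here, but the concept is the same.

Note that you normally wouldn't want to store a whole database as one single XML document (that is inefficient for most databases). Instead, split your data into several (or many) documents, which are more convenient to use and to identify.

If you insert whole documents (and EpiDoc seems to call for this approach):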

/* If your XML are documents, this way */
INSERT INTO
    xml_table (document_id, xml_data)
VALUES
    (4, xmlparse(document '<?xml version="1.0"?><book><title>Oh my God</title><content>Short book</content></book>')) ;

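Note that PostgreSQL will not check that your document conforms to your DTD (that would require the database to query the outside world, which is normally beyond what a database does). If you need to be sure, you must check conformance in your software before inserting the values into the database.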
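One part of this can be done inside the database: checking well-formedness, as opposed to DTD conformance. As a minimal sketch, assuming PostgreSQL 9.1 or later built with libxml support, the built-in functions xml_is_well_formed_document and xml_is_well_formed_content test a text value before it is cast to xml, and the IS DOCUMENT predicate can back a CHECK constraint. The constraint name xml_data_is_document is made up for this illustration.

```sql
/* Test raw text before inserting it; neither call validates against a DTD */
SELECT
    xml_is_well_formed_document('<book><title>Oh my God</title></book>') AS single_root,
    xml_is_well_formed_document('<a/><b/>') AS two_roots_as_document,
    xml_is_well_formed_content('<a/><b/>') AS two_roots_as_content ;

/* Hypothetical constraint: accept only values that are complete documents
   (exactly one root element), rejecting multi-rooted content fragments */
ALTER TABLE xml_table
    ADD CONSTRAINT xml_data_is_document CHECK (xml_data IS DOCUMENT) ;
```

The ALTER TABLE fails if an existing row holds a fragment with several top-level elements; the sample rows above all have a single root, so they should pass.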

You retrieve a whole document (or content) this way:

SELECT xml_data FROM xml_table WHERE document_id = 3 ;

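You would normally use xpath and xpath_exists queries to get specific items, though. For example, imagine you want the title of the last chapter of every book (that has chapters). You would use: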

SELECT
    /* Get the text content of title of the last chapter of every doc */
    xpath('/doc/chapter[last()]/title/text()', xml_data) AS result
FROM
    xml_table
WHERE
    /* Choose only the docs where they have (at least) a chapter with title */
    xpath_exists('/doc/chapter/title', xml_data) ;
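To come back to the where xmltag = value form in the question, here is a sketch against the sample rows above. xpath returns an array of xml values, so one option is to take the first element and cast it to text; another is to push the comparison into the XPath expression itself and use xpath_exists. The literal 'Doc title' is just the value from sample row 1.

```sql
/* Option 1: compare the first matching text node in SQL */
SELECT document_id, xml_data
FROM xml_table
WHERE (xpath('/doc/title/text()', xml_data))[1]::text = 'Doc title' ;

/* Option 2: let XPath do the comparison */
SELECT document_id, xml_data
FROM xml_table
WHERE xpath_exists('/doc[title = "Doc title"]', xml_data) ;
```

Real EpiDoc files declare the TEI namespace, so their XPath expressions would need prefixed element names plus the optional namespace-mapping argument that xpath and xpath_exists accept.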

See the PostgreSQL XML functions and an XPath intro.
