How do I extract Epub metadata (XML) using PowerShell?

Asked: 2012-11-30 17:07:16

Tags: xml powershell epub

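I'm not new to PowerShell, but I am new to XML parsing. Basically, I want to extract the title, creator, and publisher information from an OPF file, which is just an XML file. The book below is Moby Dick from Google's epub v3 sample set.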

<?xml version="1.0" encoding="UTF-8"?>
<package xmlns="http://www.idpf.org/2007/opf" version="3.0" xml:lang="en" unique-identifier="pub-id" prefix="cc: http://creativecommons.org/ns#">
    <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
        <dc:title id="title">Moby-Dick</dc:title>
        <meta refines="#title" property="title-type">main</meta>
        <dc:creator id="creator">Herman Melville</dc:creator>
        <meta refines="#creator" property="file-as">MELVILLE, HERMAN</meta>
        <meta refines="#creator" property="role" scheme="marc:relators">aut</meta>
        <dc:identifier id="pub-id">code.google.com.epub-samples.moby-dick-basic</dc:identifier>
        <dc:language>en-US</dc:language>
        <meta property="dcterms:modified">2012-01-18T12:47:00Z</meta>
        <dc:publisher>Harper &amp; Brothers, Publishers</dc:publisher>
        <dc:contributor id="contrib1">Dave Cramer</dc:contributor>
        <meta refines="#contrib1" property="role" scheme="marc:relators">mrk</meta>
        <dc:rights>This work is shared with the public using the Attribution-ShareAlike 3.0 Unported (CC BY-SA 3.0) license.</dc:rights>        
        <link rel="cc:license" href="http://creativecommons.org/licenses/by-sa/3.0/"/>
        <meta property="cc:attributionURL">http://code.google.com/p/epub-samples/</meta>
    </metadata>
</package>

I tried:

[xml]$opf = gc path/to/package.opf
$opf.package.metadata

With this I only get the tag and attribute information, not the text.

1 Answer:

Answer 0: (score: 3)

You need to use the #text property to get some of the values, like this:

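# Load the OPF file as an XML document
[xml]$opf = Get-Content .\moby.opf

# title and creator carry an id attribute, so they come back as XML elements; read '#text'
$title = $opf.package.metadata.title.'#text'
$creator = $opf.package.metadata.creator.'#text'
# publisher has no attributes, so it is returned as a plain string
$publisher = $opf.package.metadata.publisher

Write-Host "$title written by $creator and published by $publisher"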
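
An alternative that avoids having to remember which elements need '#text' is to select nodes with a namespace-aware XPath query and read InnerText, which behaves the same whether or not the element has attributes. The following is only a minimal sketch under stated assumptions: the inline XML is a trimmed, hypothetical stand-in for the OPF file in the question, and the prefixes `opf` and `dc` are arbitrary names registered for the XPath expressions.

```powershell
# Trimmed stand-in for the OPF file in the question (assumption: same namespaces).
[xml]$opf = @'
<package xmlns="http://www.idpf.org/2007/opf" version="3.0">
    <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
        <dc:title id="title">Moby-Dick</dc:title>
        <dc:creator id="creator">Herman Melville</dc:creator>
        <dc:publisher>Harper &amp; Brothers, Publishers</dc:publisher>
    </metadata>
</package>
'@

# Register the two namespaces so XPath can address the elements.
# The prefix names used here are illustrative choices, not taken from the file.
$ns = New-Object System.Xml.XmlNamespaceManager($opf.NameTable)
$ns.AddNamespace('opf', 'http://www.idpf.org/2007/opf')
$ns.AddNamespace('dc', 'http://purl.org/dc/elements/1.1/')

# InnerText returns the element text regardless of whether attributes are present.
$base = '/opf:package/opf:metadata'
$title     = $opf.SelectSingleNode("$base/dc:title", $ns).InnerText
$creator   = $opf.SelectSingleNode("$base/dc:creator", $ns).InnerText
$publisher = $opf.SelectSingleNode("$base/dc:publisher", $ns).InnerText

Write-Host "$title written by $creator and published by $publisher"
```

To run this against a real file, replace the here-string with `Get-Content` on the path to your OPF file. If an element might be missing, check the result of SelectSingleNode for $null before reading InnerText.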