How can I replace all images in an HTML file with their base64 code? (PowerShell)

Time: 2015-08-09 08:46:30

Tags: html image powershell replace base64

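I use a program called Belarc Advisor, which produces an HTML report of all hardware and software details, including the licenses/keys/serial numbers of installed software. I usually create this report on a new PC or before formatting a PC. However, the file exported by Chrome stores its images in a separate folder, and I need a single self-contained HTML file with all the details and images (including the report's CSS styles).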

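Currently I have to replace the images by hand in Notepad++, using base64 code generated by an online site. I'm looking for an alternative way to do this in a batch script or PowerShell. I found two Stack Overflow questions, {q1} and {q2}, and a {blog-post}, and came up with the following code: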

    $original_file = 'path\filename.html'
    $destination_file =  'path\filename.new.html'
    (Get-Content $original_file) | Foreach-IMG-SELECTOR-Object {
        $path = $_ SOURCE-TAG-SELECTOR `
        -replace $path, [convert]::ToBase64String((get-content $path -encoding byte))
    } | Set-Content $destination_file

Inside Foreach-Object, perhaps the HTML img tags could be selected as objects? If so, the base64 conversion would be very simple!

The expression for converting to base64 is:     [convert]::ToBase64String((get-content $path -encoding byte))

where $path is the path of the image, which can be copied from the <img src=""> tag.
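As a minimal sketch of that conversion (not part of the original post), the hypothetical helper `ConvertTo-DataUri` below reads the raw file bytes and prefixes them with a MIME type chosen from the file extension. The function name and the extension table are assumptions made for illustration. It uses `[System.IO.File]::ReadAllBytes` rather than `Get-Content -Encoding Byte`, because the latter is not available in PowerShell 6 and later (which use `-AsByteStream` instead).

```powershell
# Hypothetical helper (an assumption, not from the original post):
# builds a data URI from an image file without re-encoding the image.
function ConvertTo-DataUri {
    param([Parameter(Mandatory)][string]$Path)

    # Map the file extension to a MIME type; extend the table as needed.
    $mimeTypes = @{
        '.png'  = 'image/png'
        '.gif'  = 'image/gif'
        '.jpg'  = 'image/jpeg'
        '.jpeg' = 'image/jpeg'
    }
    $extension = [System.IO.Path]::GetExtension($Path).ToLowerInvariant()
    if (-not $mimeTypes.ContainsKey($extension)) {
        throw "Unsupported image type: $extension"
    }

    # Resolve to a full path first: .NET methods use the process working
    # directory, which can differ from the current PowerShell location.
    $fullPath = (Resolve-Path -LiteralPath $Path -ErrorAction Stop).ProviderPath

    # ReadAllBytes returns the raw file contents, so the bytes are embedded as-is.
    $bytes = [System.IO.File]::ReadAllBytes($fullPath)
    'data:{0};base64,{1}' -f $mimeTypes[$extension], [System.Convert]::ToBase64String($bytes)
}
```

Calling `ConvertTo-DataUri -Path $path` then yields a string that can be pasted directly into the `src` attribute.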

I just read that Windows 10 ships with PowerShell 5.0, so I figured I could create a batch file to do this.

So, if the img tags and their src attributes can be selected, they just need to be replaced with their base64 data.
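For illustration only, here is one way that idea could be sketched without any external library, using a regular expression to find `src` values and a callback to swap in the data URI. This is an assumption-laden sketch rather than a robust solution: the function name `Convert-HtmlImagesToInline`, its parameters, and the extension table are hypothetical, and a regex is less reliable than a real HTML parser (for example, it also matches attributes such as `data-src`). Remote sources, unknown extensions, and missing files are left unchanged.

```powershell
# Hypothetical function (an assumption, not from the original post):
# inlines local <img> sources as data URIs using a regex callback.
function Convert-HtmlImagesToInline {
    param(
        [Parameter(Mandatory)][string]$Html,      # HTML text to process
        [Parameter(Mandatory)][string]$BasePath   # folder that relative src values are resolved against
    )

    $mimeTypes = @{ '.png' = 'image/png'; '.gif' = 'image/gif'; '.jpg' = 'image/jpeg'; '.jpeg' = 'image/jpeg' }

    # Group 1: tag text up to the src value, group 2: the quote character, group 3: the src value.
    $pattern = '(<img\b[^>]*?\bsrc\s*=\s*)(["''])([^"'']+)\2'

    $evaluator = [System.Text.RegularExpressions.MatchEvaluator] {
        param($m)
        $src = $m.Groups[3].Value

        # Leave remote and already-inlined sources untouched.
        if ($src -match '^(data:|https?:|//)') { return $m.Value }

        try {
            # Saved pages may percent-encode spaces, so decode before touching the disk.
            $file = Join-Path $BasePath ([System.Uri]::UnescapeDataString($src))
            $ext  = [System.IO.Path]::GetExtension($file).ToLowerInvariant()
            if (-not $mimeTypes.ContainsKey($ext) -or -not (Test-Path -LiteralPath $file -PathType Leaf)) {
                return $m.Value
            }
            $fullPath = (Resolve-Path -LiteralPath $file).ProviderPath
            $base64   = [System.Convert]::ToBase64String([System.IO.File]::ReadAllBytes($fullPath))
            return $m.Groups[1].Value + $m.Groups[2].Value + "data:$($mimeTypes[$ext]);base64,$base64" + $m.Groups[2].Value
        }
        catch {
            return $m.Value   # on any path or I/O error, keep the original tag
        }
    }

    [regex]::Replace($Html, $pattern, $evaluator, [System.Text.RegularExpressions.RegexOptions]::IgnoreCase)
}

# Example usage (placeholder paths, adjust to your own files):
# $html = [System.IO.File]::ReadAllText('C:\path\filename.html')
# Convert-HtmlImagesToInline -Html $html -BasePath 'C:\path' | Set-Content 'C:\path\filename.new.html'
```

Reading the whole file as one string, rather than line by line, lets the pattern match tags that span several lines.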

Answer

Modified version of Alexander's answer

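The answer provided by Alexander did not work as-is because, inside the loop, the attribute value was being set on #document when it should have been set on the current node. After searching online and reading the output in the PowerShell console, I found that this can be solved by selecting the current node via its XPath. Here is the modified answer: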

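Add-Type -AssemblyName System.Drawing # Needed for [System.Drawing.Image]; not loaded by default in the console host
Import-Module -Name "C:\HtmlAgilityPack.1.4.6\Net40\HtmlAgilityPack.dll" # Change to your actual path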

function Convert_to_Base64 ($sImgFile)
{
#$sImgFile = "C:\image.jpg" # Change to your actual path
$oImgFormat = [System.Drawing.Imaging.ImageFormat]::Png # Output format; keep it in sync with the MIME type used in the data URI ("image/png")

$oImage = [System.Drawing.Image]::FromFile($sImgFile)
$oMemoryStream = New-Object -TypeName System.IO.MemoryStream
$oImage.Save($oMemoryStream, $oImgFormat)
$cImgBytes = [Byte[]]($oMemoryStream.ToArray())
$sBase64 = [System.Convert]::ToBase64String($cImgBytes)

$sBase64
}


$sInFile = "C:\Users\USER\Desktop\BelarcAdvisor win10\Belarc Advisor Computer Profile.html" # Change to your actual path
$sOutFile = "D:\Win10-Belarc.html" # Change to your actual path
$sPathBase = "C:\Users\USER\Desktop\BelarcAdvisor win10\"

$sXpath = "//img"
$sAttributeName = "src"

$oHtmlDocument = New-Object -TypeName HtmlAgilityPack.HtmlDocument
$oHtmlDocument.Load($sInFile)
$oHtmlDocument.DocumentNode.SelectNodes($sXpath) | ForEach-Object {
    # If you need to download the image, here's how you can extract the image
    # URI (note that it may be relative, not absolute):

    $sVarXPath = $_ # Keep the current node so its attributes and XPath can be read later.

    #$sVarXPath.XPath

    $sSrcPath = $sVarXPath.get_Attributes() `
        | Where-Object { $_.Name -eq $sAttributeName } `
        | Select-Object -ExpandProperty "Value"
    # Assembling absolute URI:
    $sUri = Join-Path -Path $sPathBase -ChildPath $sSrcPath.Substring(2) # Substring(2) strips the leading "./" from src values that point into the subfolder.
    #$sUri
    # Now you can d/l the image: Invoke-WebRequest -Uri $sUri
    #[System.Drawing.Image]::FromFile($sUri)

    # Put your Base64 conversion code here.
    $sBase64 = Convert_to_Base64 $sUri # PowerShell functions take space-separated arguments, not parentheses

    $sSrcValue = "data:image/png;base64," + $sBase64
    $oHtmlDocument.DocumentNode.SelectNodes($sVarXPath.XPath).SetAttributeValue($sAttributeName, $sSrcValue)
    #$oHtmlDocument.DocumentNode.SelectNodes($sVarXPath.XPath).GetAttributeValue($sAttributeName, "")
}

#$oHtmlDocument.DocumentNode.SelectNodes($sXpath) | foreach-object { write-output $_ }

$oHtmlDocument.Save($sOutFile)

2 answers:

Answer 0 (score: 4)

This is easy. You can use HtmlAgilityPack to parse the HTML:

Import-Module -Name "C:\HtmlAgilityPack.dll" # Change to your actual path

$sInFile = "E:\Temp\test.html" # Change to your actual path
$sOutFile = "E:\temp\test1.html" # Change to your actual path
$sUriBase = "http://example.com/" # Change to your actual URI base

$sXpath = "//img"
$sAttributeName = "src"

$oHtmlDocument = New-Object -TypeName HtmlAgilityPack.HtmlDocument
$oHtmlDocument.Load($sInFile)
$oHtmlDocument.DocumentNode.SelectNodes($sXpath) | ForEach-Object {
    # If you need to download the image, here's how you can extract the image
    # URI (note that it may be relative, not absolute):
    $sSrcPath = $_.get_Attributes() `
        | Where-Object { $_.Name -eq $sAttributeName } `
        | Select-Object -ExpandProperty "Value"
    # Assembling absolute URI:
    $sUri = $sUriBase + $sSrcPath
    # Now you can d/l the image: Invoke-WebRequest -Uri $sUri


    # Put your Base64 conversion code here.
    $sBase64 = ...

    $sSrcValue = "data:image/png;base64," + $sBase64
    $_.SetAttributeValue($sAttributeName, $sSrcValue)
}

$oHtmlDocument.Save($sOutFile)

Converting image file to Base64 string

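Add-Type -AssemblyName System.Drawing # Needed for [System.Drawing.Image]; not loaded by default in the console host
$sImgFile = "C:\image.jpg" # Change to your actual path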
$oImgFormat = [System.Drawing.Imaging.ImageFormat]::Jpeg # Change to your format

$oImage = [System.Drawing.Image]::FromFile($sImgFile)
$oMemoryStream = New-Object -TypeName System.IO.MemoryStream
$oImage.Save($oMemoryStream, $oImgFormat)
$cImgBytes = [Byte[]]($oMemoryStream.ToArray())
$sBase64 = [System.Convert]::ToBase64String($cImgBytes)

Answer 1 (score: 1)

This is only a partial answer, but I was able to convert a single image file into an output text file, data-URI encoded, in just one line:

"data:image/png;base64," + [convert]::tobase64string([io.file]::readallbytes(($pwd).path + "\\image.png")) | set-content -encoding ascii "image.txt"

(Note that the output file encoding seems to matter.)
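A likely reason the encoding matters: in Windows PowerShell, `>` and `Out-File` write UTF-16LE with a byte-order mark by default, so every character takes two bytes, while a data URI contains only ASCII characters. The sketch below, which uses a hypothetical temp-file location and three dummy bytes in place of a real image, writes a data URI with `-Encoding ascii` and checks that the file starts directly with `data:`.

```powershell
# Dummy payload (three bytes) standing in for a real image file.
$dataUri = 'data:image/png;base64,' + [System.Convert]::ToBase64String([byte[]](1, 2, 3))

# Hypothetical output location (an assumption for this sketch).
$outFile = Join-Path ([System.IO.Path]::GetTempPath()) 'image.txt'

# A data URI is pure ASCII, so ASCII output loses nothing and adds no byte-order mark.
Set-Content -LiteralPath $outFile -Value $dataUri -Encoding ascii

# Inspect the first bytes of the file to confirm it begins with the text itself.
$bytes     = [System.IO.File]::ReadAllBytes($outFile)
$firstFive = [System.Text.Encoding]::ASCII.GetString($bytes, 0, 5)
$firstFive
```

If the first bytes were `FF FE` instead, the file would have been written as UTF-16LE, which would not paste cleanly into a byte-oriented tool.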

I'm mainly posting this because this page is what came up in my web search, and it also simplifies the conversion in Alexander's answer.