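Assign the text between bold tags to a variable

Date: 2018-04-05 18:12:49

Tags: powershell

A variable has been assigned a line of HTML. What is the most efficient way to extract the text between the bold tags and assign that text to a second variable? For example: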

<TR><TD width="30%" colSpan=2><B>Cyan Cartridge</B></TD> <TD width="25%"> <TABLE borderColor=#000000 cellSpacing=0 width="100%" border=1>

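I want to assign "Cyan Cartridge" to $x.

3 answers:

Answer 0 (score: 1)

The preferred option is to use PowerShell's HTML parsing, as described in "Parse local HTML file".

However, if that does not work (the answers there suggest it has problems), your next best option is a regular expression: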

Select-String can run against a local file or against a string holding the file's contents. It is also case-insensitive by default, so a pattern written with <b> will also find <B>.

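In general, regular expressions are a poor way to parse HTML. Use them only for one-off or personal scripts, not for anything that has to keep working when the markup changes.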
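For a one-off script of that kind, the regex fallback can be sketched with the -match operator and a capture group. This is only an illustration, not the code from the original answer: the variable name $html is an assumption, and the pattern assumes the line contains a single, non-nested pair of bold tags.

```powershell
# The HTML line from the question, held in a (hypothetically named) variable.
$html = '<TR><TD width="30%" colSpan=2><B>Cyan Cartridge</B></TD> <TD width="25%"> <TABLE borderColor=#000000 cellSpacing=0 width="100%" border=1>'

$x = $null

# -match is case-insensitive by default, so <b> also matches <B>.
# (?s) lets . span newlines; .*? is lazy, so it stops at the first </b>.
if ($html -match '(?s)<b>(.*?)</b>') {
    # $Matches[1] holds the text captured by the first group.
    $x = $Matches[1]
}

$x
```

If the line has no bold tags at all, $x simply stays $null, which is easy to test for before using the value.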

Answer 1 (score: 1)

You can use a regex, though with caution, because regex and HTML can be a flaky combination.

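$x = '<TR><TD width="30%" colSpan=2><B>Cyan Cartridge</B></TD> <TD width="25%"> <TABLE borderColor=#000000 cellSpacing=0 width="100%" border=1>'
# Lookbehind for <b>, lazy match, lookahead for </b>. Select-String is case-insensitive by default, so <B> matches too.
$y = $x | Select-String -Pattern '(?<=<b>)(.|\n)*?(?=</b>)' | ForEach-Object { $_.Matches } | ForEach-Object { $_.Value }
$y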

Output:

Cyan Cartridge

For an explanation of the regex: https://regex101.com/r/Ub7LsG/1
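If a line can contain more than one bold element, a single match is not enough. The sketch below is one possible way to collect all of them with [regex]::Matches. The sample line (including "Magenta Cartridge") and the variable names are made up for illustration and do not come from the question.

```powershell
# Hypothetical sample line with two bold elements in mixed case.
$line = '<TD><B>Cyan Cartridge</B></TD><TD><b>Magenta Cartridge</b></TD>'

# IgnoreCase matches both <b> and <B>; Singleline lets . span newlines.
$options = [System.Text.RegularExpressions.RegexOptions]'IgnoreCase, Singleline'

# Each match's first capture group is the text between one pair of tags.
$names = @([regex]::Matches($line, '<b>(.*?)</b>', $options) |
    ForEach-Object { $_.Groups[1].Value })

$names
```

Wrapping the pipeline in @() keeps $names an array even when only one bold element is found.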

Answer 2 (score: 1)

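# Load the fragment into an in-memory HTML document (COM object, Windows only).
$html = New-Object -ComObject "HTMLFile"
$source = '<TR><TD width="30%" colSpan=2><B>Cyan Cartridge</B></TD> <TD width="25%"> <TABLE borderColor=#000000 cellSpacing=0 width="100%" border=1>'
$html.IHTMLDocument2_write($source)

# Walk the top-level nodes of the body and overwrite the content of any B element.
foreach ($node in $html.body.childNodes)
{
    if ($node.tagName -eq "b")
    {
        $node.innerHTML = "test"
    }
}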

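Something along these lines should work. My final HTML value looked a bit off, but it does seem to correctly update the value between the B tags. I may have been looking at the wrong property of the final body.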