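I have a line of HTML assigned to a variable. What is the most efficient way to extract the text between the bold tags and assign that text to a second variable? For example: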
<TR><TD width="30%" colSpan=2><B>Cyan Cartridge</B></TD> <TD width="25%"> <TABLE borderColor=#000000 cellSpacing=0 width="100%" border=1>
I want to assign "Cyan Cartridge" to $x.
Answer 0 (score: 1)
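The preferred option is to use PowerShell's HTML parsing, as described in "Parse local HTML file".
However, if that does not work (the answers there suggest it has problems), the next best option is a regular expression, used with Select-String and a pattern built around <b>.
Select-String can be run on a local file or on a string holding the file's contents. It is also case-insensitive by default, so a pattern written with <b> will also find <B>.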
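In general, regular expressions are a poor way to parse HTML. Use them only for one-off or personal scripts, not for anything you need to rely on not breaking.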
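As a minimal sketch of the regex approach described in this answer, the snippet below pipes the question's HTML line to Select-String and reads the first capture group. The variable names `$html` and `$match` and the pattern `<b>(.*?)</b>` are assumptions chosen for illustration, not something given in the answer.

```powershell
# Hypothetical variable names; the HTML line is the one from the question.
$html = '<TR><TD width="30%" colSpan=2><B>Cyan Cartridge</B></TD> <TD width="25%"> <TABLE borderColor=#000000 cellSpacing=0 width="100%" border=1>'

# Select-String matches case-insensitively by default, so <b> also matches <B>.
$match = $html | Select-String -Pattern '<b>(.*?)</b>'

# Group 1 holds the text between the tags; $x stays $null if nothing matched.
$x = if ($match) { $match.Matches[0].Groups[1].Value } else { $null }
$x
```

The lazy `.*?` stops at the first closing tag, which matters if the line ever contains more than one bold run.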
Answer 1 (score: 1)
You can use a regular expression, though with caution, since regex and HTML can be a flaky combination.
$x = '<TR><TD width="30%" colSpan=2><B>Cyan Cartridge</B></TD> <TD width="25%"> <TABLE borderColor=#000000 cellSpacing=0 width="100%" border=1>'
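# The lookarounds keep the tags out of the match; Select-String is case-insensitive by default, so <b> also matches <B>.
$y = $x | Select-String -Pattern '(?<=<b>)(.|\n)*?(?=</b>)' | % { $_.Matches } | % { $_.Value }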
$y
Output
Cyan Cartridge
For an explanation of the regular expression, see https://regex101.com/r/Ub7LsG/1
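If a line can contain more than one bold run, the same lookaround idea can be sketched with the .NET `[regex]::Matches` method, which returns every match. The input string `$line` and the names `$options` and `$found` are hypothetical, made up for this illustration.

```powershell
# Hypothetical input with two bold runs in mixed case.
$line = '<TD><B>Cyan Cartridge</B></TD><TD><b>Black Cartridge</b></TD>'

# IgnoreCase covers both <B> and <b>; Singleline lets . span newlines.
$options = [System.Text.RegularExpressions.RegexOptions]'IgnoreCase, Singleline'

# Lookbehind and lookahead exclude the tags, and the lazy .*? stops at the nearest </b>.
$found = [regex]::Matches($line, '(?<=<b>).*?(?=</b>)', $options) |
    ForEach-Object { $_.Value }

$found
```

Requiring the full `</b>` in the lookahead keeps text that itself contains the letter "b" from being cut short.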
Answer 2 (score: 1)
$html = New-Object -ComObject "HTMLFile"
$source = '<TR><TD width="30%" colSpan=2><B>Cyan Cartridge</B></TD> <TD width="25%"> <TABLE borderColor=#000000 cellSpacing=0 width="100%" border=1>'
$html.IHTMLDocument2_write($source)
foreach($node in $html.body.childNodes)
{
if($node.tagname -eq "b")
{
$node.innerHTML = "test"
}
}
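Something along these lines should work. My final HTML value looked a bit off, but it did seem to update the value between the B tags correctly. I may have been looking at the wrong property on the final body.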