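I need to change the font family and font size of a given piece of HTML to one specific font family and size (for example: Times New Roman, size 12). Do you know how to do this with HtmlAgilityPack?
The font size can be defined in several ways in the given HTML: with a <FONT size=""> tag, implicitly through a heading such as <H3>, or with a style attribute. So I need to change all of them to one specific font size.
Here is the sample HTML: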
<html><H3 style="MARGIN: 0in 0in 0pt 0.5in"><SPAN style="mso-bidi-font-family: 'Tw Cen MT Condensed Extra Bold'; mso-fareast-font-family: 'Tw Cen MT Condensed Extra Bold'"><SPAN style="mso-list: Ignore"><FONT size="5" face="Tw Cen MT Condensed Extra Bold">1.1.1</FONT><SPAN style="FONT: 7pt 'Times New Roman'"> </SPAN></SPAN></SPAN><FONT size="5" face="Tw Cen MT Condensed Extra Bold">Sample text1: The following code iterates through all the items in the ListBox and addsPictureBoxes dynamically to a FlowLayoutPanel using the image sources retrieved in the previous step.</FONT></H3>
<P style="MARGIN: 0in 0in 0pt" class="MsoNormal"><?xml:namespace prefix = o ns = "urn:schemas-microsoft-com:office:office" /?></P>
<H3 style="MARGIN: 0in 0in 0pt 0.5in"><SPAN style="mso-bidi-font-family: 'Tw Cen MT Condensed Extra Bold'; mso-fareast-font-family: 'Tw Cen MT Condensed Extra Bold'"><SPAN style="mso-list: Ignore"><FONT size="5" face="Tw Cen MT Condensed Extra Bold">1.1.2</FONT><SPAN style="FONT: 7pt 'Times New Roman'"> </SPAN></SPAN></SPAN><FONT size="5" face="Tw Cen MT Condensed Extra Bold">Sample text 2: The following code iterates through all the items in the ListBox and addsPictureBoxes dynamically to a FlowLayoutPanel using the image sources retrieved in the previous step.</FONT></H3>
<P style="TEXT-INDENT: -0.25in; MARGIN: 0in 0in 0pt 0.5in; mso-list: l0 level1 lfo2; tab-stops: list .5in" class="MsoNormal"><SPAN style="FONT-FAMILY: 'Bauhaus 93'; FONT-SIZE: 20pt; mso-bidi-font-family: 'Bauhaus 93'; mso-fareast-font-family: 'Bauhaus 93'"><SPAN style="mso-list: Ignore">a.<SPAN style="FONT: 7pt 'Times New Roman'"> </SPAN></SPAN></SPAN><SPAN style="FONT-FAMILY: 'Bauhaus 93'; FONT-SIZE: 20pt">Sample text 3: The following code iterates through all the items in the ListBox and addsPictureBoxes dynamically to a FlowLayoutPanel using the image sources retrieved in the previous step.</SPAN></P>
<P style="MARGIN: 0in 0in 0pt 0.25in" class="MsoNormal"><SPAN style="FONT-FAMILY: 'Bauhaus 93'; FONT-SIZE: 20pt"></SPAN></P>
<P style="MARGIN: 0in 0in 0pt 0.5in" class="MsoNormal"><SPAN style="FONT-FAMILY: 'Bradley Hand ITC'; FONT-SIZE: 18pt">Sample Text 4: The following code iterates through all the items in the ListBox and addsPictureBoxes dynamically to a FlowLayoutPanel using the image sources retrieved in the previous step.</SPAN></P></html>
Answer 0 (score: 1)
I got this done with HtmlAgilityPack. Below is the code I developed.
    using System;
    using System.Text;
    using HtmlAgilityPack;

    public static class HtmlFontNormalizer
    {
        // Font declarations forced onto every matched element.
        private const string FontStyle = "font-family:Arial; font-size:10pt;";

        // Rewrites the style attribute of every <tagName> element in inputHtml.
        // tagName is the element name to process, e.g. "span" or "p".
        public static string SetFont(string inputHtml, string tagName)
        {
            var doc = new HtmlDocument();
            doc.LoadHtml(inputHtml);

            // SelectNodes returns null, not an empty collection, when nothing matches.
            var elements = doc.DocumentNode.SelectNodes(string.Concat("//", tagName));
            if (elements == null)
            {
                return inputHtml;
            }

            foreach (var element in elements)
            {
                // Remove the class attribute so class-based rules cannot override the font.
                var classValue = element.GetAttributeValue("class", null);
                if (!string.IsNullOrWhiteSpace(classValue))
                {
                    element.Attributes["class"].Remove();
                }

                var styles = element.GetAttributeValue("style", null);
                var sbStyles = new StringBuilder(FontStyle);
                if (!string.IsNullOrWhiteSpace(styles))
                {
                    foreach (var item in styles.Split(new[] { ';' }, StringSplitOptions.RemoveEmptyEntries))
                    {
                        // Keep existing declarations except those that set the font.
                        if (!string.IsNullOrWhiteSpace(item) && !IsFontDeclaration(item))
                        {
                            sbStyles.Append(item.Trim());
                            sbStyles.Append(";");
                        }
                    }
                }

                // SetAttributeValue overwrites an existing style attribute or adds a new one.
                element.SetAttributeValue("style", sbStyles.ToString());
            }

            return doc.DocumentNode.InnerHtml;
        }

        // True for font-family, font-size and the font shorthand, including mso-* variants.
        // Comparing the property name also catches "FONT-SIZE : 20pt" (space before the colon).
        private static bool IsFontDeclaration(string declaration)
        {
            var property = declaration.Split(':')[0].Trim();
            return property.EndsWith("font-family", StringComparison.OrdinalIgnoreCase)
                || property.EndsWith("font-size", StringComparison.OrdinalIgnoreCase)
                || property.EndsWith("font", StringComparison.OrdinalIgnoreCase);
        }
    }
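
The answer hard-codes Arial 10pt, while the question asks for Times New Roman at size 12. As a rough sketch, the style-merging step could be pulled into a standalone helper that takes the font as parameters. StyleRewriter and WithFont are hypothetical names that do not appear in the original answer, and the helper uses only the .NET base class library, so it makes no claim about the HtmlAgilityPack API. It assumes that declarations are separated by semicolons and that no value contains a semicolon itself.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class StyleRewriter
{
    // True when a declaration such as "FONT-SIZE: 20pt" sets a font family,
    // a font size, or the font shorthand (mso-* variants included).
    private static bool IsFontDeclaration(string declaration)
    {
        string property = declaration.Split(':')[0].Trim();
        return property.EndsWith("font-family", StringComparison.OrdinalIgnoreCase)
            || property.EndsWith("font-size", StringComparison.OrdinalIgnoreCase)
            || property.EndsWith("font", StringComparison.OrdinalIgnoreCase);
    }

    // Hypothetical helper: returns a style value that starts with the requested
    // font and keeps every non-font declaration from the existing value.
    public static string WithFont(string existingStyle, string family, int sizeInPoints)
    {
        var declarations = new List<string>
        {
            "font-family:'" + family + "'",
            "font-size:" + sizeInPoints + "pt"
        };

        declarations.AddRange(
            (existingStyle ?? string.Empty)
                .Split(new[] { ';' }, StringSplitOptions.RemoveEmptyEntries)
                .Select(d => d.Trim())
                .Where(d => d.Length > 0 && !IsFontDeclaration(d)));

        return string.Join("; ", declarations) + ";";
    }
}
```

Inside the loop of the answer, the result of StyleRewriter.WithFont(styles, "Times New Roman", 12) could then be passed to element.SetAttributeValue("style", ...). Note that the size and face attributes on <FONT> elements are not style declarations, so neither this helper nor the answer's loop changes them; they would have to be removed separately.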