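I need to convert HTML data containing <h2>..</h2>, <p>..</p>, and <a href=".."><img ..></a> elements into a properly formatted attributed string. I want to map <h2> to UIFontTextStyleHeadline1 and <p> to UIFontTextStyleBody, and store the image links. The output should be an attributed string containing only the heading and body elements; I will handle the images separately.
So far, I have this code: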
NSMutableAttributedString *content = [[NSMutableAttributedString alloc]
    initWithData:[[post objectForKey:@"content"]
                     dataUsingEncoding:NSUTF8StringEncoding]
         options:@{NSDocumentTypeDocumentAttribute: NSHTMLTextDocumentType,
                   NSCharacterEncodingDocumentAttribute: [NSNumber numberWithInt:NSUTF8StringEncoding]}
    documentAttributes:nil error:nil];
It produces output like this:
Heading
{
NSColor = "UIDeviceRGBColorSpace 0 0 0 1";
NSFont = "<UICTFont: 0xd47bc00> font-family: \"TimesNewRomanPS-BoldMT\"; font-weight: bold; font-style: normal; font-size: 18.00pt";
NSKern = 0;
NSParagraphStyle = "Alignment 4, LineSpacing 0, ParagraphSpacing 14.94, ParagraphSpacingBefore 0, HeadIndent 0, TailIndent 0, FirstLineHeadIndent 0, LineHeight 0/0, LineHeightMultiple 0, LineBreakMode 0, Tabs (\n), DefaultTabInterval 36, Blocks (null), Lists (null), BaseWritingDirection 0, HyphenationFactor 0, TighteningFactor 0, HeaderLevel 2";
NSStrokeColor = "UIDeviceRGBColorSpace 0 0 0 1";
NSStrokeWidth = 0;
}{
NSAttachment = "<NSTextAttachment: 0xd486590>";
NSColor = "UIDeviceRGBColorSpace 0 0 0.933333 1";
NSFont = "<UICTFont: 0xd47cdb0> font-family: \"Times New Roman\"; font-weight: normal; font-style: normal; font-size: 12.00pt";
NSKern = 0;
NSLink = "http://www.placeholder.com/image.jpg";
NSParagraphStyle = "Alignment 4, LineSpacing 0, ParagraphSpacing 12, ParagraphSpacingBefore 0, HeadIndent 0, TailIndent 0, FirstLineHeadIndent 0, LineHeight 0/0, LineHeightMultiple 0, LineBreakMode 0, Tabs (\n), DefaultTabInterval 36, Blocks (null), Lists (null), BaseWritingDirection 0, HyphenationFactor 0, TighteningFactor 0, HeaderLevel 0";
NSStrokeColor = "UIDeviceRGBColorSpace 0 0 0.933333 1";
NSStrokeWidth = 0;
}
Body text, body text, body text. Body text, body text, body text.
{
NSColor = "UIDeviceRGBColorSpace 0 0 0 1";
NSFont = "<UICTFont: 0xd47cdb0> font-family: \"Times New Roman\"; font-weight: normal; font-style: normal; font-size: 12.00pt";
NSKern = 0;
NSParagraphStyle = "Alignment 4, LineSpacing 0, ParagraphSpacing 12, ParagraphSpacingBefore 0, HeadIndent 0, TailIndent 0, FirstLineHeadIndent 0, LineHeight 0/0, LineHeightMultiple 0, LineBreakMode 0, Tabs (\n), DefaultTabInterval 36, Blocks (null), Lists (null), BaseWritingDirection 0, HyphenationFactor 0, TighteningFactor 0, HeaderLevel 0";
NSStrokeColor = "UIDeviceRGBColorSpace 0 0 0 1";
NSStrokeWidth = 0;
}
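I'm new to attributed strings and am looking for an efficient way to convert these attributes to the standard fonts mentioned above. Thanks.
Answer 0 (score: 0)
In case anyone is looking for something similar: I ended up using the TFHpple library to separate the images from the text elements in the HTML data, and then changed the font attributes of the attributed string as follows: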
NSString *contentString = [self parseHTMLdata:bodyString];
NSData *contentData = [contentString dataUsingEncoding:NSUTF8StringEncoding];
NSMutableAttributedString *content = [[NSMutableAttributedString alloc]
    initWithData:contentData
         options:@{NSDocumentTypeDocumentAttribute: NSHTMLTextDocumentType,
                   NSCharacterEncodingDocumentAttribute: @(NSUTF8StringEncoding)}
    documentAttributes:nil error:nil];

// Walk the attribute runs and swap each font for the headline or body font.
NSRange effectiveRange = NSMakeRange(0, 0);
while (NSMaxRange(effectiveRange) < [content length]) {
    NSDictionary *attributes = [content attributesAtIndex:NSMaxRange(effectiveRange)
                                           effectiveRange:&effectiveRange];
    // Use the attribute-name constant rather than the literal @"NSFont".
    UIFont *font = attributes[NSFontAttributeName];
    // The HTML importer renders <h2> at 18 pt and <p> at 12 pt (see the dump in the question).
    if (font.pointSize == 18.0f) {
        [content addAttribute:NSFontAttributeName value:self.headlineFont range:effectiveRange];
    } else {
        [content addAttribute:NSFontAttributeName value:self.bodyFont range:effectiveRange];
    }
}
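For what it's worth, the same run-by-run pass can probably be written with enumerateAttribute:inRange:options:usingBlock:, which takes care of the range bookkeeping. This is only a sketch: RestyleFonts and its headingPointSize parameter are hypothetical names of mine, not part of the code above, and the 0.5 pt tolerance is an arbitrary choice to avoid comparing floating-point sizes with ==.

```objc
#import <UIKit/UIKit.h>

// Hypothetical helper (not from the original answer): replaces the font of
// every run in `content` with `headlineFont` or `bodyFont`.
static void RestyleFonts(NSMutableAttributedString *content,
                         UIFont *headlineFont,
                         UIFont *bodyFont,
                         CGFloat headingPointSize) {
    [content enumerateAttribute:NSFontAttributeName
                        inRange:NSMakeRange(0, content.length)
                        options:0
                     usingBlock:^(id value, NSRange range, BOOL *stop) {
        UIFont *font = value; // nil for runs that carry no font attribute
        // Treat a run as a heading when its size is within 0.5 pt of the
        // size the HTML importer used for <h2> (assumed tolerance).
        BOOL isHeading = (font != nil) &&
                         (fabs(font.pointSize - headingPointSize) < 0.5);
        // Changing attributes inside the enumerated range is permitted
        // for NSMutableAttributedString.
        [content addAttribute:NSFontAttributeName
                        value:(isHeading ? headlineFont : bodyFont)
                        range:range];
    }];
}

// Possible usage, assuming the dynamic type styles are what you want:
// RestyleFonts(content,
//              [UIFont preferredFontForTextStyle:UIFontTextStyleHeadline],
//              [UIFont preferredFontForTextStyle:UIFontTextStyleBody],
//              18.0);
```

Passing the fonts in as arguments keeps the helper independent of the self.headlineFont and self.bodyFont properties used above.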
And the hpple part:
- (NSString *)parseHTMLdata:(NSString *)content
{
    NSData *data = [content dataUsingEncoding:NSUTF8StringEncoding];
    TFHpple *parser = [[TFHpple alloc] initWithHTMLData:data];
    NSString *xpathQueryString = @"//body";
    NSArray *elements = [[[parser searchWithXPathQuery:xpathQueryString] firstObject] children];
    NSMutableString *textContent = [[NSMutableString alloc] init];
    for (TFHppleElement *element in elements) {
        if ([[element tagName] isEqualToString:@"h2"] || [[element tagName] isEqualToString:@"p"]) {
            if ([[[element firstChild] tagName] isEqualToString:@"a"]) {
                // image element, just save it in array
            } else {
                // pure h2 or p element: keep its raw HTML
                [textContent appendString:[element raw]];
            }
        }
    }
    return textContent;
}
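The image branch above is only a comment. If you would rather not pre-filter the HTML at all, one alternative (a sketch, not what I actually shipped) is to harvest the links from the attributed string itself, since the dump in the question shows the image run carrying both NSAttachment and NSLink. ExtractImageLinks is a hypothetical name, and I am assuming every image you care about is wrapped in a link, as in the question's markup.

```objc
#import <UIKit/UIKit.h>

// Hypothetical helper: collects the link of every attachment run, removes
// those runs from `content`, and returns the links as NSURL objects.
static NSArray *ExtractImageLinks(NSMutableAttributedString *content) {
    NSMutableArray *links = [NSMutableArray array];
    NSMutableArray *ranges = [NSMutableArray array];

    [content enumerateAttribute:NSAttachmentAttributeName
                        inRange:NSMakeRange(0, content.length)
                        options:0
                     usingBlock:^(id value, NSRange range, BOOL *stop) {
        if (value == nil) {
            return; // not an attachment run
        }
        id link = [content attribute:NSLinkAttributeName
                             atIndex:range.location
                      effectiveRange:NULL];
        // The link value may be an NSURL or an NSString; normalise to NSURL.
        if ([link isKindOfClass:[NSString class]]) {
            link = [NSURL URLWithString:link];
        }
        if ([link isKindOfClass:[NSURL class]]) {
            [links addObject:link];
        }
        // Attachments without a link are still removed, just not recorded.
        [ranges addObject:[NSValue valueWithRange:range]];
    }];

    // Delete back to front so earlier ranges stay valid.
    for (NSValue *boxedRange in [ranges reverseObjectEnumerator]) {
        [content deleteCharactersInRange:boxedRange.rangeValue];
    }
    return links;
}
```

The deletion happens after the enumeration on purpose, so the string is never shortened while it is being walked.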
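Checking the font size in the attributes may look fragile; if it causes problems, I can dig deeper into the paragraph style, which carries the heading/body marker (the HeaderLevel value in the dump above).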
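A rough sketch of that paragraph-style route follows. Big caveat: as far as I can tell, headerLevel is a documented NSParagraphStyle property in AppKit but is not declared in UIKit's public header, so on iOS this reads it through key-value coding and leans on undocumented behaviour. The respondsToSelector: guard makes it fall back to "not a heading" if the accessor is missing. HeaderLevelOfStyle and RestyleByHeaderLevel are hypothetical names.

```objc
#import <UIKit/UIKit.h>

// Returns the HTML header level stored in `style`, or 0 when it is
// unavailable. ASSUMPTION: the accessor exists at runtime on iOS even
// though it is not part of the public UIKit header.
static NSInteger HeaderLevelOfStyle(NSParagraphStyle *style) {
    SEL selector = NSSelectorFromString(@"headerLevel");
    if (style == nil || ![style respondsToSelector:selector]) {
        return 0;
    }
    return [[style valueForKey:@"headerLevel"] integerValue];
}

// Applies `headlineFont` to runs whose paragraph style has a header level
// above 0 and `bodyFont` to everything else.
static void RestyleByHeaderLevel(NSMutableAttributedString *content,
                                 UIFont *headlineFont,
                                 UIFont *bodyFont) {
    [content enumerateAttribute:NSParagraphStyleAttributeName
                        inRange:NSMakeRange(0, content.length)
                        options:0
                     usingBlock:^(id value, NSRange range, BOOL *stop) {
        UIFont *font = (HeaderLevelOfStyle(value) > 0) ? headlineFont : bodyFont;
        [content addAttribute:NSFontAttributeName value:font range:range];
    }];
}
```

Since this touches a selector that is not public on iOS, I would check whether that is acceptable for your distribution channel before relying on it; the font-size check stays the conservative choice.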