Open a .webarchive, modify it, and save it

Time: 2011-10-27 14:00:26

Tags: objective-c cocoa webview webarchive

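I'm developing an application for Lion. What I want to do is open a .webarchive file, modify pieces of the DOM, and then write the modified DOM back out to the same file.

Here is my code so far. It opens the webarchive, modifies it, and then saves it back to a file.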

    NSString *archivePath = @"/Users/tigger/Library/Mail/V2/MailData/Signatures/1216DD8D-C7E2-4DE1-9FCD-0A9A3412C788.webarchive";
    NSData *plistData = [NSData dataWithContentsOfFile:archivePath];
    NSString *error;
    NSPropertyListFormat format;
    NSMutableDictionary *plist;

    plist = (NSMutableDictionary *)[NSPropertyListSerialization propertyListFromData:plistData
                                             mutabilityOption:NSPropertyListMutableContainersAndLeaves
                                                       format:&format
                                             errorDescription:&error];
    if(!plist){
        printf("no plist");
        [error release];
    }else{
        NSString *s = [NSString stringWithUTF8String:[[[plist objectForKey:@"WebMainResource"] objectForKey:@"WebResourceData"] bytes]];
        NSString *new = [s stringByReplacingOccurrencesOfString:@"</body>" withString:@"hey there!</body>"];

        [[plist objectForKey:@"WebMainResource"] setObject:new forKey:@"WebResourceData"];
        printf("Archive: %s", [[plist description] UTF8String]);       
        NSData *data = [NSPropertyListSerialization dataFromPropertyList:plist format:NSPropertyListBinaryFormat_v1_0 errorDescription:nil];
        [data writeToURL:[NSURL fileURLWithPath:@"/Users/tigger/Library/Mail/V2/MailData/Signatures/test.webarchive"] atomically:YES];

    }

The problem is that the resulting webarchive is invalid. The original looks like this:

bplist00—_WebMainResource’  
_WebResourceTextEncodingName_WebResourceFrameName^WebResourceURL_WebResourceData_WebResourceMIMETypeUUTF-8PUdata:O<span class="Apple-style-span" style="border-collapse: separate; color: rgb(0, 0, 0); font-family: Helvetica; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: -webkit-auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-border-horizontal-spacing: 0px; -webkit-border-vertical-spacing: 0px; -webkit-text-decorations-in-effect: none; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; font-size: medium; "><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; "><div>Dan Shipper</div><div>dshipper@gmail.com</div><div><br></div></body></span><br class="Apple-interchange-newline">Ytext/html(F]l~îöõ°™
¥

The generated webarchive, on the other hand, looks like this:

bplist00—_WebMainResource’  
^WebResourceURL_WebResourceFrameName_WebResourceMIMEType_WebResourceData_WebResourceTextEncodingNameUdata:PYtext/html_<span class="Apple-style-span" style="border-collapse: separate; color: rgb(0, 0, 0); font-family: Helvetica; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: -webkit-auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-border-horizontal-spacing: 0px; -webkit-border-vertical-spacing: 0px; -webkit-text-decorations-in-effect: none; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; font-size: medium; "><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; "><div>Dan Shipper</div><div>dshipper@gmail.com</div><div><br></div>hey there!</body></span><br class="Apple-interchange-newline">UUTF-8(7Ndvîöõ•∏
æ

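Does anyone have any idea why it is invalid or how to fix it? Thank you very much for your help!

I also tried generating the webarchive with the textutil convert command, but that doesn't work, because my original HTML file contains an image like this: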

<img src="http://www.domainpolish.com/images/crowd.png">

But when I use textutil, it downloads the image and saves the reference as:

<img src="file:///1.png">

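I don't want it to download the image or change the URL. I tried the noload, nostore, and baseurl options, to no avail.

EDIT: Fixed!! The problem was that when I replaced the HTML, I was inserting it as an NSString instead of NSData. The corrected code: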

    NSString *s = [NSString stringWithUTF8String:[[[plist objectForKey:@"WebMainResource"] objectForKey:@"WebResourceData"] bytes]];
    NSString *new = [s stringByReplacingOccurrencesOfString:@"</body>" withString:@"hi there!</body>"];
    NSData *sourceData = [new dataUsingEncoding:NSUTF8StringEncoding];
    [[plist objectForKey:@"WebMainResource"] setObject:sourceData forKey:@"WebResourceData"];

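2 Answers:

Answer 0 (score: 3)

Update: I just re-read the question and saw the solution...

You are replacing the main resource data with the wrong kind of object on this line: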

    [[plist objectForKey:@"WebMainResource"] setObject:new forKey:@"WebResourceData"];

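The variable new is an NSString, but the value stored under WebResourceData should be an NSData object. After the replacement, convert the string content to binary data: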

    [[plist objectForKey:@"WebMainResource"] setObject:[new dataUsingEncoding:NSUTF8StringEncoding] forKey:@"WebResourceData"];
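For reference, the whole read, modify, and write cycle can be sketched as one function. This is only a sketch under a few assumptions: it is compiled with ARC, the main resource is UTF-8 (as the WebResourceTextEncodingName in the question's dump suggests), and the helper name AppendToWebArchiveBody is made up for illustration. It uses the NSError-based NSPropertyListSerialization methods instead of the older errorDescription: variants, and it decodes the HTML with initWithData:encoding: because the bytes of an NSData are not guaranteed to be NUL-terminated.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: inserts `insertion` before </body> in the main resource
// of the webarchive at inPath and writes the result to outPath.
// Returns NO on failure; *error is set when Foundation reports one.
static BOOL AppendToWebArchiveBody(NSString *inPath, NSString *outPath,
                                   NSString *insertion, NSError **error) {
    NSData *plistData = [NSData dataWithContentsOfFile:inPath options:0 error:error];
    if (!plistData) return NO;

    // A webarchive is a property list; ask for mutable containers so it can be edited.
    NSMutableDictionary *plist =
        [NSPropertyListSerialization propertyListWithData:plistData
                                                  options:NSPropertyListMutableContainersAndLeaves
                                                   format:NULL
                                                    error:error];
    if (![plist isKindOfClass:[NSDictionary class]]) return NO;

    NSMutableDictionary *mainResource = [plist objectForKey:@"WebMainResource"];
    NSData *htmlData = [mainResource objectForKey:@"WebResourceData"];
    if (![htmlData isKindOfClass:[NSData class]]) return NO;

    // Assumption: the resource is UTF-8 encoded.
    NSString *html = [[NSString alloc] initWithData:htmlData
                                           encoding:NSUTF8StringEncoding];
    if (!html) return NO;

    NSString *replacement = [insertion stringByAppendingString:@"</body>"];
    NSString *newHTML = [html stringByReplacingOccurrencesOfString:@"</body>"
                                                        withString:replacement];

    // The key point from the answer: store NSData, not NSString.
    [mainResource setObject:[newHTML dataUsingEncoding:NSUTF8StringEncoding]
                     forKey:@"WebResourceData"];

    NSData *outData =
        [NSPropertyListSerialization dataWithPropertyList:plist
                                                   format:NSPropertyListBinaryFormat_v1_0
                                                  options:0
                                                    error:error];
    if (!outData) return NO;
    return [outData writeToFile:outPath options:NSDataWritingAtomic error:error];
}

int main(int argc, const char *argv[]) {
    @autoreleasepool {
        if (argc != 3) {
            fprintf(stderr, "usage: %s in.webarchive out.webarchive\n", argv[0]);
            return 2;
        }
        NSError *error = nil;
        BOOL ok = AppendToWebArchiveBody([NSString stringWithUTF8String:argv[1]],
                                         [NSString stringWithUTF8String:argv[2]],
                                         @"hey there!", &error);
        if (!ok) NSLog(@"Failed: %@", error);
        return ok ? 0 : 1;
    }
}
```

Writing to a separate output path, as the question does with test.webarchive, keeps the original signature intact while experimenting.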

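Answer 1 (score: 1)

From Wikipedia:

  The webarchive format is a concatenation of source files with filenames, saved in the binary plist format using NSKeyedEncoder.

With that in mind, you could use NSKeyedEncoder to find the list of files, then split the file with NSData and find the HTML you are looking for.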
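As the code in the question shows, the archive can be read as a plain property list with NSPropertyListSerialization, so the list of contained files can likely be found without any manual splitting. The following is a rough sketch under assumptions: it is compiled with ARC, the helper LogResource is hypothetical, and the WebSubresources key is what I believe WebKit uses for additional files (it does not appear in the question's dump, which contains only WebMainResource).

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: logs the URL, MIME type, and size of one resource dictionary.
static void LogResource(NSDictionary *resource) {
    NSData *data = [resource objectForKey:@"WebResourceData"];
    NSLog(@"%@ (%@), %lu bytes",
          [resource objectForKey:@"WebResourceURL"],
          [resource objectForKey:@"WebResourceMIMEType"],
          (unsigned long)[data length]);
}

int main(int argc, const char *argv[]) {
    @autoreleasepool {
        if (argc != 2) {
            fprintf(stderr, "usage: %s file.webarchive\n", argv[0]);
            return 2;
        }
        NSString *path = [NSString stringWithUTF8String:argv[1]];
        NSData *plistData = [NSData dataWithContentsOfFile:path];
        NSError *error = nil;
        NSDictionary *plist = nil;
        if (plistData) {
            plist = [NSPropertyListSerialization propertyListWithData:plistData
                                                              options:NSPropertyListImmutable
                                                               format:NULL
                                                                error:&error];
        }
        if (![plist isKindOfClass:[NSDictionary class]]) {
            NSLog(@"Could not read a plist dictionary: %@", error);
            return 1;
        }

        // The main HTML document.
        LogResource([plist objectForKey:@"WebMainResource"]);

        // Assumed key for images, stylesheets, and so on; absent in simple archives.
        for (NSDictionary *sub in [plist objectForKey:@"WebSubresources"]) {
            LogResource(sub);
        }
    }
    return 0;
}
```

The HTML itself is then the WebResourceData of the main resource, which is the same value the question's code modifies.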