How do I get correctly formatted quotes back when parsing HTML in AppleScript?

Asked: 2012-11-20 21:55:09

Tags: applescript

I'm using this code to extract a description from a website:

set astid to AppleScript's text item delimiters

set startHere to "<div id=\"doc-original-text\" itemprop=\"description\">"
set stopHere to "</div>"
set mysource_html to do shell script "curl https://play.google.com/store/movies/details?id=H9EKG4-JHSw"
set AppleScript's text item delimiters to startHere
set blurb1 to text item 2 of mysource_html
set AppleScript's text item delimiters to stopHere
set blurb2 to text item 1 of blurb1

set AppleScript's text item delimiters to astid
blurb2

This is what it returns:

"Liam Neeson stars in producer/director Joe Carnahan&#39;s tense adventure thriller
about a group of tough-as-nails oil rig workers who must fight for their lives in the
Alaskan wilderness after their airplane crashes miles from civilization. With supplies 
running short and hungry wolves closing in, the shaken survivors face a fate worse than 
death if they don&#39;t act fast. Dermot Mulroney, Dallas Roberts, and Frank Grillo co-
star."

How can I get the quotes in the result to come back as ' instead of &#39;?

1 answer:

Answer 0: (score: 1)

ASObjC Runner has some useful commands for encoding and decoding text.

set astid to AppleScript's text item delimiters

set startHere to "<div id=\"doc-original-text\" itemprop=\"description\">"
set stopHere to "</div>"
set mysource_html to do shell script "curl " & quoted form of "https://play.google.com/store/movies/details?id=H9EKG4-JHSw"
set AppleScript's text item delimiters to startHere
set blurb1 to text item 2 of mysource_html
set AppleScript's text item delimiters to stopHere
set blurb2 to text item 1 of blurb1

set AppleScript's text item delimiters to astid
tell application "ASObjC Runner" to set blurb3 to modify string blurb2 so it is unencoded for XML

Or you can keep using text item delimiters: split the text on &#39; and join the pieces back together with ':

set astid to AppleScript's text item delimiters

set startHere to "<div id=\"doc-original-text\" itemprop=\"description\">"
set stopHere to "</div>"
set mysource_html to do shell script "curl " & quoted form of "https://play.google.com/store/movies/details?id=H9EKG4-JHSw"
set AppleScript's text item delimiters to startHere
set blurb1 to text item 2 of mysource_html
set AppleScript's text item delimiters to stopHere
set blurb2 to text item 1 of blurb1
set AppleScript's text item delimiters to "&#39;"
set blurb2 to text items of blurb2
set AppleScript's text item delimiters to "'"
set blurb2 to blurb2 as text
set AppleScript's text item delimiters to astid
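
If the page contains other entities besides &#39;, the same split-and-join trick can be wrapped in a handler and applied once per entity. The following is only a sketch: the handler names replaceText and decodeEntities are made up for illustration, and the entity list is a small hand-picked set, not a complete HTML decoder.

```applescript
-- Hypothetical helper: replace every occurrence of searchText in sourceText
-- with replacementText, restoring the previous delimiters afterwards.
on replaceText(sourceText, searchText, replacementText)
	set astid to AppleScript's text item delimiters
	set AppleScript's text item delimiters to searchText
	set parts to text items of sourceText
	set AppleScript's text item delimiters to replacementText
	set joinedText to parts as text
	set AppleScript's text item delimiters to astid
	return joinedText
end replaceText

-- Hypothetical helper: decode a few common entities.
-- The list is an assumption; extend it as needed.
-- "&amp;" comes last so that text such as "&amp;lt;" is not decoded twice.
on decodeEntities(htmlText)
	set entityPairs to {{"&#39;", "'"}, {"&quot;", "\""}, {"&lt;", "<"}, {"&gt;", ">"}, {"&amp;", "&"}}
	repeat with aPair in entityPairs
		set htmlText to my replaceText(htmlText, item 1 of aPair, item 2 of aPair)
	end repeat
	return htmlText
end decodeEntities

decodeEntities("Joe Carnahan&#39;s tense adventure thriller")
```

With the handlers in the same script, blurb2 could be passed through decodeEntities instead of swapping the delimiters inline. Numeric entities other than &#39; are not handled by this sketch.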