So I have this code in a string:
<html class="
no - js ie ie6 lt - ie9 lt - ie8 lt - ie7 search - context en_US " lang="
en - US "> <![endif]--><!--[if IE 7]> <html class="
no - js ie ie7 lt - ie9 lt - ie8 search - context en_US " lang="
en - US "> <![endif]--><!--[if IE 8]> <html class="
no - js ie ie8 lt - ie9 search - context en_US " lang="
en - US "> <![endif]--><!--[if IE 9]> <html class="
no - js ie9 search - context en_US " lang="
en - US "> <![endif]--><!--[if gt IE 9]><!--> <html class="
no - js search - context en_US " lang="
en - US "> <!--<![endif]--> <head><title>art anime | Tumblr</title><!--[if ie]><meta http-equiv="
X - UA - Compatible " content="
IE = Edge, chrome = 1 "/><![endif]--><meta http-equiv="
Content - Type " content="
text / html;
charset = utf - 8 "> <meta http-equiv="
x - dns - prefetch - control " content="
off "> <meta name="
application - name " content="
Tumblr "> <meta name="
msapplication - TileColor " content="
#3645d"> <meta name= "msapplication-TileImage"
content = "https://assets.tumblr.com/images/msfavicon.png?_v=245323c5cb69e705ea213d9ed60e543a" > < link rel = "shortcut icon"
href = "https://assets.tumblr.com/images/favicons/favicon.ico?_v=8bfa6dd3e1249cd567350c606f8574dc"
type = "image/png" > < meta name = "p:domain_verify"
content = "d06c4fa470a9a6935c9a7b43d57eb7d2" > < link rel = "apple-touch-icon"
sizes = "57x57"
href = "https://assets.tumblr.com/images/apple-touch-icon-57x57.png?_v=81406f92242ce0166328bc17e4473e6e"
type = "image/png" > < link rel = "apple-touch-icon"
sizes = "60x60"
href = "https://assets.tumblr.com/images/apple-touch-icon-60x60.png?_v=20e3957f7027b72d5aa60085204ae63c"
type = "image/png" > < link rel = "apple-touch-icon"
sizes = "72x72"
href = "https://assets.tumblr.com/images/apple-touch-icon-72x72.png?_v=8df24181ba31066d8710f67c9a287241"
type = "image/png" > < link rel = "apple-touch-icon"
sizes = "76x76"
href = "https://assets.tumblr.com/images/apple-touch-icon-76x76.png?_v=455617cae13eff40acffac5e489bde50"
type = "image/png" > < link rel = "apple-touch-icon"
sizes = "120x120"
href = "https://assets.tumblr.com/images/apple-touch-icon-120x120.png?_v=5604f95b165810101ea055f2cb5206b9"
type = "image/png" > < link rel = "apple-touch-icon"
sizes = "128x128"
href = "https://assets.tumblr.com/images/apple-touch-icon-128x128.png?_v=fd39307925fa7f1ada28b08d67e93da1"
type = "image/png" > < link rel = "apple-touch-icon"
sizes = "144x144"
href = "https://assets.tumblr.com/images/apple-touch-icon-144x144.png?_v=dfd5b392d423c5f0278d9f498abda2fa"
type = "image/png" > < link rel = "apple-touch-icon"
sizes = "152x152"
href = "https://assets.tumblr.com/images/apple-touch-icon-152x152.png?_v=73c2019bf6a75f7e476e01ba136cebec"
type = "image/png" > < link rel = "apple-touch-icon"
sizes = "180x180"
href = "https://assets.tumblr.com/images/apple-touch-icon-180x180.png?_v=00127c0342d97d5f36cfd8aa6439ca10"
type = "image/png" > < link rel = "apple-touch-icon"
sizes = "195x195"
href = "https://assets.tumblr.com/images/apple-touch-icon-195x194.png?_v=0"
type = "image/png" > < link rel = "apple-touch-icon"
sizes = "196x196"
href = "https://assets.tumblr.com/images/apple-touch-icon-196x196.png?_v=bb4b7eef0ef8e28101acdf2f0c265cc7"
type = "image/png" > < link rel = "apple-touch-icon"
sizes = "228x228"
href = "https://assets.tumblr.com/images/apple-touch-icon-228x228.png?_v=0c7874da12e347c2bdf95e5baa3f396a"
type = "image/png" > < link rel = "canonical"
href = "https://www.tumblr.com/search/art%20anime/recent" > < meta name = "robots"
id = "robots"
content = "noodp,noydir" > < meta name = "description"
id = "description"
content = "Tumblr is a place to express yourself, discover yourself, and bond over the stuff you love. It's where your interests connect you with your people." > < meta name = "keywords"
id = "keywords"
content = "tumblelog, blog, tumblog, tumbler, tumblr, tlog, microblog" > < meta name = "viewport"
id = "viewport"
content = "width=960" > < meta name = "tumblr-form-key"
id = "tumblr_form_key"
content = "!1231553203788|mOZjoWSzPU6eNRkpuhEtzoBJmdA" > < meta name = "tumblr-gpop"
id = "tumblr_gpop"
content = "Tumblr" > < meta name = "og:title"
id = "og_title"
content = "art anime | Tumblr" > < meta name = "og:image"
id = "og_image"
content = "https://66.media.tumblr.com/06afadc2cafb6945065a4b10d61f3b45/tumblr_poqjiuUUGO1tvlw71o1_r1_500.jpg" > < script type = "application/ld+json" >
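I want to get the content="https://66.media.tumblr.com ... (and so on) link from that string. Note that the string is updated every hour.
I tried it this way in Google Apps Script (JavaScript), but it does not work: the results are -1 and 0.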
function urll() {
  var response = UrlFetchApp.fetch("https://www.tumblr.com/search/art+anime/recent");
  var str = response.getContentText();
  var m = str.search('is="og_image" content="(^"*)');
  Logger.log(m);
}
Answer 0 (score: 2)
Use jQuery to grab it. In the example above, the id og_image is unique, so you can use it to select the meta element and then narrow down to its content attribute.
$("meta#og_image").attr("content");
Assuming you already have the whole mess in a single variable, myString:
var regex = /66\.[^"]*/;
var myLink = myString.match(regex);
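myLink[0] will be the URL (starting at 66., so without the https:// prefix). Note that match returns null when nothing matches.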
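Because the page changes every hour, the leading 66 in the host name may not stay the same. The following is a minimal sketch under the assumption that the image host always looks like <digits>.media.tumblr.com; that pattern and the helper name findMediaUrl are illustrative assumptions, not something the page guarantees.

```javascript
// Hypothetical helper: returns the first Tumblr media URL in the string, or null.
// Assumes the host has the form "<digits>.media.tumblr.com"; adjust if it does not.
function findMediaUrl(myString) {
  var regex = /https?:\/\/\d+\.media\.tumblr\.com\/[^"\s]*/;
  var match = myString.match(regex); // null when nothing matches
  return match ? match[0] : null;
}

// Made-up sample input, shaped like the meta tag in the question.
var sampleMeta = '<meta name="og:image" id="og_image" ' +
  'content="https://66.media.tumblr.com/abc/tumblr_x_500.jpg">';
console.log(findMediaUrl(sampleMeta));
```

Unlike the 66\. pattern, this keeps the scheme in the result and returns null instead of throwing when the tag is missing.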
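To make this answer more useful for similar cases (and not just this one small problem): if you are trying to pull links out of a long string that contains many other things, you can use: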
var regex = /http[^"]*/g;
var links = myString.match(regex);
This gives you an array of all the links.
links[links.length-1];
will be the last one, which in this case is the one you want.
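The "collect every link, then take the last one" approach can be sketched as below. The helper name lastLink and the sample string are hypothetical. The sketch also guards against the case where match finds nothing, since match with the g flag returns null rather than an empty array.

```javascript
// Hypothetical helper: returns the last http(s) link found in the string, or null.
function lastLink(myString) {
  var links = myString.match(/https?:\/\/[^"\s]*/g); // null when there are no links
  return links ? links[links.length - 1] : null;
}

// Made-up sample input with two links; the og_image one comes last.
var sampleHead =
  '<link rel="canonical" href="https://www.tumblr.com/search/art%20anime/recent">' +
  '<meta id="og_image" content="https://66.media.tumblr.com/abc/tumblr_x_500.jpg">';
console.log(lastLink(sampleHead));
```

This only returns the image link while it really is the last URL in the string, so it is best suited to input that has been cut off right after the og_image tag, as in the question.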
Answer 1 (score: 0)
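You can use a regular expression such as content="([^"]*)", which captures every content URL. If you only need the last one (the og_image entry), also match on the id attribute: id="og_image"\s+content="([^"]*)".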
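A minimal sketch of the id-anchored pattern, assuming the id attribute appears immediately before content and both use double quotes, as in the string from the question. The helper name extractOgImage is hypothetical.

```javascript
// Hypothetical helper: returns the content value of the og_image meta tag, or null.
// Assumes id="og_image" is immediately followed by the content attribute.
function extractOgImage(html) {
  // \s* and \s+ tolerate spaces or line breaks around "=" and between attributes.
  var regex = /id\s*=\s*"og_image"\s+content\s*=\s*"([^"]*)"/;
  var match = regex.exec(html);
  return match ? match[1] : null; // match[1] is the captured URL
}

// Made-up sample input, shaped like the meta tag in the question.
var sampleTag = '<meta name="og:image" id="og_image" ' +
  'content="https://66.media.tumblr.com/abc/tumblr_x_500.jpg">';
console.log(extractOgImage(sampleTag));
```

In the Apps Script function from the question, html would be the result of response.getContentText(). Using exec with a capture group returns the URL itself, whereas search only returns the index of the match, or -1 if there is none.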