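I need to extract image sources, together with the links that wrap those images, from post content in WordPress. I know a regular expression can do this, but I'm not very good with them. The post content may contain a lot of text and multiple images, and some of the images may be wrapped in links that point either to the image source or to some other URL. I need to get each image source along with its associated link address. Can you help?
The content might look like this: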
<a href="http://www.example.com/a.html"><img src="imge1.jpg"/></a>
<p>Nymphs blitz quick vex dwarf jog. DJs flock by when MTV ax quiz prog.</p>
<p>Big fjords vex quick waltz nymph. Bawds jog, flick quartz, vex nymph. Waltz job vexed quick frog nymphs.</p><a href="http://www.example.com/a.html"><img src="imge1.jpg"/></a>
<p>Junk MTV quiz graced by fox whelps. Bawds jog, flick quartz, vex nymphs. Waltz, bad nymph, for quick jigs vex! Fox nymphs grab quick-jived waltz.</p><a href="http://www.example.com/a.html"><img src="imge1.jpg"/></a>
<p>Brick quiz whangs jumpy veldt fox. Glib jocks quiz nymph to vex dwarf. Bright vixens jump; dozy fowl quack. Vexed nymphs go for quick waltz job. Quick wafting zephyrs vex bold Jim.</p>
<p>Quick zephyrs blow, vexing daft Jim. Quick blowing zephyrs vex daft Jim. Sphinx of black quartz, judge my vow. Sex-charged fop blew my junk TV quiz. Both fickle dwarves jinx my pig quiz. Fat hag dwarves quickly zap jinx mob.</p><a href="http://www.example.com/a.html"><img src="imge1.jpg"/></a>
<p>Hick dwarves jam blitzing foxy quip. Fox dwarves chop my talking quiz job. Public junk dwarves quiz mighty fox. Jack fox bids ivy-strewn phlegm quiz. How quickly daft jumping zebras vex. Two driven jocks help fax my big quiz. “Now fax quiz Jack!” my brave ghost pled.</p>
<p>Jack, love my big wad of sphinx quartz! Fickle jinx bog dwarves spy math quiz. Big dwarves heckle my top quiz of jinx. Fickle bog dwarves jinx empathy quiz. Public junk dwarves hug my quartz fox. Jumping hay dwarves flock quartz box. Five jumping wizards hex bolty quick. Five hexing wizard bots jump quickly.</p><a href="http://www.example.com/a.html"><img src="imge1.jpg"/></a>
<p>Vamp fox held quartz duck just by wing. Five quacking zephyrs jolt my wax bed. The five boxing wizards jump quickly. Jackdaws love my big sphinx of quartz. My jocks box, get hard, unzip, quiver, flow. Kvetching, flummoxed by job, W. zaps Iraq. My ex pub quiz crowd gave joyful thanks. Cozy sphinx waves quart jug of bad milk. A very bad quack might jinx zippy fowls.</p><a href="http://www.example.com/a.html"><img src="imge1.jpg"/></a>
<p>Pack my box with five dozen liquor jugs. Few quips galvanized the mock jury box. Quick brown fox jumps over the lazy dog. Jumpy halfling dwarves pick quartz box. Vex quest wizard, judge my backflop hand. The jay, pig, fox, zebra and my wolves quack! Blowzy red vixens fight for a quick jump. Sex prof gives back no quiz with mild joy. The quick brown fox jumps over a lazy dog.</p>
<a href="http://www.example.com/a.html"><img src="imge1.jpg"/></a>
<p>A quick brown fox jumps over the lazy dog. Quest judge wizard bonks foxy chimp love. Boxers had zap of gay jock love, quit women. Joaquin Phoenix was gazed by MTV for luck. JCVD might pique a sleazy boxer with funk.[2] Quizzical twins proved my hijack-bug fix. The quick brown fox jumps over the lazy dog. Waxy and quivering, jocks fumble the pizza. When zombies arrive, quickly fax judge Pat. Heavy boxes perform quick waltzes and jigs.</p>
<a href="http://www.example.com/a.html"><img src="imge1.jpg"/></a>
<p>A quick chop jolted my big sexy frozen wives. A wizard’s job is to vex chumps quickly in fog. Sympathizing would fix Quaker objectives. Pack my red box with five dozen quality jugs. Quads of blowzy fjord ignite map vex’d chicks. Fake bugs put in wax jonquils drive him crazy. Watch “Jeopardy!”, Alex Trebek’s fun TV quiz game. GQ jock wears vinyl tuxedo for showbiz promo. The quick brown fox jumped over the lazy dogs. Woven silk pyjamas exchanged for blue quartz. Brawny gods just flocked up to quiz and vex him.</p>
<a href="http://www.example.com/a.html"><img src="imge1.jpg"/></a>
Answer 0 (score: 1)
This should do it:
<a\s[^>]*href=['"](.*?)['"][^>]*>\s*<img\s[^>]*src=['"](.*?)['"][^>]*>\s*</a>
Explanation:
<a # Matches the start of the link tag
\s # The tag name must be followed by a whitespace character
[^>]* # Matches anything except '>', which would close the tag (allows other attributes such as class)
href= # Matches the href attribute
['"] # Matches either ' or " (needed because you can't be sure which one will appear)
(.*?) # Captures the link address (non-greedy, so it takes the shortest possible match)
['"] # Closing quote
[^>]* # Any remaining attributes
> # Matches the '>' that ends the opening tag
\s* # Zero or more whitespace characters (in case there is a line break after the first tag)
<img # Start of the img tag
\s # One whitespace character
[^>]* # Any attributes before src
src= # Matches the src attribute
['"] # Opening quote
(.*?) # Captures the image source (non-greedy, so it takes the shortest possible match)
['"] # Closing quote
[^>]* # Any remaining attributes, including the '/' of a self-closing tag
> # End of the img tag
\s* # Zero or more whitespace characters
</a> # Closing tag of the link
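To see the pattern at work, here is a minimal sketch in Python, assuming the post content is already available as a string. The function name `extract_linked_images` and the sample markup are hypothetical and only illustrate the idea; in WordPress itself the same pattern could be passed to PHP's `preg_match_all`, wrapped in delimiters. Case-insensitive matching is an added assumption so that upper-case tags are also caught.

```python
import re

# The pattern from the answer, unchanged.
# Group 1 captures the href, group 2 captures the src.
LINKED_IMAGE = re.compile(
    r"""<a\s[^>]*href=['"](.*?)['"][^>]*>\s*<img\s[^>]*src=['"](.*?)['"][^>]*>\s*</a>""",
    re.IGNORECASE,  # assumption: tag names may appear in upper case
)


def extract_linked_images(content):
    """Return (href, src) pairs for every <img> wrapped directly in an <a>."""
    return LINKED_IMAGE.findall(content)


# Hypothetical sample shaped like the post content in the question.
sample = (
    '<a href="http://www.example.com/a.html"><img src="imge1.jpg"/></a>\n'
    "<p>Nymphs blitz quick vex dwarf jog.</p>\n"
    '<p>Some text.</p><a class="x" href=\'http://www.example.com/b.html\'>\n'
    '  <img alt="" src="imge2.jpg"/>\n'
    "</a>\n"
    '<img src="unlinked.jpg"/>'
)

pairs = extract_linked_images(sample)
for href, src in pairs:
    print(href, "->", src)
```

Images that are not wrapped in a link, such as the last one in the sample, are skipped. The pattern also expects the `<img>` to be the only thing inside the `<a>` apart from whitespace; markup that nests other elements or text inside the link would likely need an HTML parser instead.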