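I am trying to match an entire JSON-LD entry, regardless of the specific markup, line breaks, and so on.
Why doesn't something as simple as this work: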
\<script type\=\"application\/ld\+json\"\>(.*?)\<\/script\>
<script type="application/ld+json">
{
"@context": "http://schema.org",
"@type": "Recipe",
"author": "John Smith",
"cookTime": "PT1H",
"datePublished": "2009-05-08",
"description": "This classic banana bread recipe comes from my mom -- the walnuts add a nice texture and flavor to the banana bread.",
"image": "bananabread.jpg",
"recipeIngredient": [
"3 or 4 ripe bananas, smashed",
"1 egg",
"3/4 cup of sugar"
],
"interactionStatistic": {
"@type": "InteractionCounter",
"interactionType": "http://schema.org/Comment",
"userInteractionCount": "140"
},
"name": "Mom's World Famous Banana Bread",
"nutrition": {
"@type": "NutritionInformation",
"calories": "240 calories",
"fatContent": "9 grams fat"
},
"prepTime": "PT15M",
"recipeInstructions": "Preheat the oven to 350 degrees. Mix in the ingredients in a bowl. Add the flour last. Pour the mixture into a loaf pan and bake for one hour.",
"recipeYield": "1 loaf",
"suitableForDiet": "http://schema.org/LowFatDiet"
}
</script>
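I want the output to be everything inside the tag.
Answer 0 (score: 0)
Here, we might want to bound the expression with the opening JSON-LD script tag as the left boundary, then collect all characters, including newlines, and finally add the closing script tag as the right boundary, perhaps with something like: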
(<script type="application\/ld\+json">)([\s\S]*)(<\/script>)
or
^(<script type="application\/ld\+json">)([\w\W]*)(<\/script>)$
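As for why the pattern in the question returns nothing: in JavaScript, `.` does not match line terminators unless the `s` (dotAll) flag is set, so `(.*?)` cannot cross the line breaks inside the tag. The following is a minimal sketch of the difference; the short `html` string is a hypothetical stand-in for the full recipe markup, and the variable names are illustrative only.

```javascript
// Hypothetical shortened sample; the real input is the full recipe markup.
const html = '<script type="application/ld+json">\n{"@type": "Recipe"}\n</script>';

// `.` stops at line terminators, so this pattern cannot span the newlines.
const dotOnly = /<script type="application\/ld\+json">(.*?)<\/script>/;

// [\s\S] matches any character at all, including newlines.
const anyChar = /<script type="application\/ld\+json">([\s\S]*?)<\/script>/;

// The `s` (dotAll) flag, available since ES2018, lets `.` match newlines too.
const dotAll = /<script type="application\/ld\+json">(.*?)<\/script>/s;

const dotOnlyMatch = dotOnly.exec(html);
const anyCharMatch = anyChar.exec(html);
const dotAllMatch = dotAll.exec(html);

console.log(dotOnlyMatch);           // no match across the line breaks
console.log(anyCharMatch[1].trim()); // captured tag content
console.log(dotAllMatch[1].trim());  // same content via the dotAll flag
```

Either `[\s\S]`, `[\w\W]`, or the `s` flag should work here; the character-class forms have the advantage of also running in engines that predate ES2018.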
However, regular expressions may not be the best tool for this task, and there are likely much easier ways to do it, such as using an HTML parser.
const regex = /^(<script type="application\/ld\+json">)([\w\W]*)(<\/script>)$/gm;
const str = `<script type="application/ld+json">
{
"@context": "http://schema.org",
"@type": "Recipe",
"author": "John Smith",
"cookTime": "PT1H",
"datePublished": "2009-05-08",
"description": "This classic banana bread recipe comes from my mom -- the walnuts add a nice texture and flavor to the banana bread.",
"image": "bananabread.jpg",
"recipeIngredient": [
"3 or 4 ripe bananas, smashed",
"1 egg",
"3/4 cup of sugar"
],
"interactionStatistic": {
"@type": "InteractionCounter",
"interactionType": "http://schema.org/Comment",
"userInteractionCount": "140"
},
"name": "Mom's World Famous Banana Bread",
"nutrition": {
"@type": "NutritionInformation",
"calories": "240 calories",
"fatContent": "9 grams fat"
},
"prepTime": "PT15M",
"recipeInstructions": "Preheat the oven to 350 degrees. Mix in the ingredients in a bowl. Add the flour last. Pour the mixture into a loaf pan and bake for one hour.",
"recipeYield": "1 loaf",
"suitableForDiet": "http://schema.org/LowFatDiet"
}
</script>`;
const subst = `$2`;
// The substituted value will be contained in the result variable
const result = str.replace(regex, subst);
console.log('Substitution result: ', result);
If this expression is not what you need, it can be modified or changed at regex101.com.
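One modification that may be worth considering: a greedy `[\s\S]*` runs to the last closing script tag in the input, so a page with several script elements would be captured as one block. The sketch below uses a lazy quantifier with `String.prototype.matchAll` and then parses each capture with `JSON.parse`. The function name `extractJsonLd` and the two-block `page` sample are assumptions made for illustration, not part of the original question.

```javascript
// Hypothetical helper: returns the parsed JSON-LD objects found in an HTML string.
function extractJsonLd(html) {
  // Lazy [\s\S]*? stops at the first closing tag; `g` finds every block,
  // and `i` tolerates differences in tag case.
  const pattern = /<script type="application\/ld\+json">([\s\S]*?)<\/script>/gi;
  const items = [];
  for (const match of html.matchAll(pattern)) {
    try {
      items.push(JSON.parse(match[1]));
    } catch (err) {
      // Skip blocks whose content is not valid JSON.
    }
  }
  return items;
}

// Hypothetical page containing two JSON-LD blocks separated by other markup.
const page = [
  '<script type="application/ld+json">',
  '{ "@type": "Recipe", "name": "Banana Bread" }',
  '</script>',
  '<p>unrelated markup</p>',
  '<script type="application/ld+json">',
  '{ "@type": "Person", "name": "John Smith" }',
  '</script>',
].join('\n');

const found = extractJsonLd(page);
console.log(found.map((item) => item['@type']));
```

This still assumes the opening tag is written exactly as `<script type="application/ld+json">`; markup with extra attributes or different quoting would need a looser pattern or a real HTML parser.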
The expression can also be visualized at jex.im.