How to scrape data from a <script> element with Scrapy

Time: 2019-09-29 11:01:03

Tags: python python-3.x scrapy

I'm trying to scrape image URLs from a site running Magento. The product photo URLs are listed in:

<script type="text/x-magento-init">
    {
        "[data-gallery-role=gallery-placeholder]": {
            "mage/gallery/gallery": {
                "mixins":["magnifier/magnify"],
                "magnifierOpts": {"fullscreenzoom":"20","top":"","left":"","width":"","height":"","eventType":"hover","enabled":false},
                "data": [{"thumb":"https:\/\/example.com.com\/media\/catalog\/product\/cache\/7298259c5e8adb86380aac\/m\/a\/product-image-1.jpg",
                "img":"https:\/\/example.com.com\/media\/catalog\/product\/cache\/d86efc38c3706eb137091cd\/m\/a\/product-image-2.jpg",
                "full":"https:\/\/example.com.com\/media\/catalog\/product\/cache\/9222bce87e716be292\/m\/a\/product-image-3.jpg",
                "caption":"Product Title","position":"1","isMain":true,"type":"image","videoUrl":null},
                {"thumb":"https:\/\/example.com.com\/media\/catalog\/product\/cache\/7298259c41aa66380aac\/m\/a\/product-image-1.jpg",
                "img":"https:\/\/example.com.com\/media\/catalog\/product\/cache\/d86efc38c3706eb13bb117091cd\/m\/a\/product-image-2.jpg","full":"https:\/\/example.com.com\/media\/catalog\/product\/cache\/9222bce87e78616be292\/m\/a\/product-image-3.jpg",
                "caption":"Product 2 Title","position":"2","isMain":false,"type":"image","videoUrl":null},
...

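What I want are the "full" values. I'm not a Python expert, but I was able to load the contents of the script element into a dict object. I don't know whether that is the right step, or how to continue from there.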

Any hints?
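
For reference, here is a minimal sketch of one way to continue from the dict, assuming the script body is valid JSON shaped like the excerpt above (the excerpt is truncated, so the full structure is an assumption). The helper name full_urls_from_scripts, the two constants, and the sample URLs are hypothetical and exist only for illustration. The function takes the raw text of each script element, parses it with json.loads, walks down to the "data" list, and collects each "full" value.

```python
import json

# Keys copied from the script excerpt in the question.
GALLERY_SELECTOR = "[data-gallery-role=gallery-placeholder]"
GALLERY_COMPONENT = "mage/gallery/gallery"


def full_urls_from_scripts(script_texts):
    """Collect every "full" image URL from x-magento-init script bodies.

    Hypothetical helper: script_texts is an iterable of raw script strings.
    """
    urls = []
    for text in script_texts:
        try:
            config = json.loads(text)  # also turns "\/" back into "/"
        except json.JSONDecodeError:
            continue  # this script block is not valid JSON, skip it
        if not isinstance(config, dict):
            continue
        placeholder = config.get(GALLERY_SELECTOR)
        if not isinstance(placeholder, dict):
            continue  # a different x-magento-init block, not the gallery
        gallery = placeholder.get(GALLERY_COMPONENT)
        if not isinstance(gallery, dict):
            continue
        for image in gallery.get("data", []):
            if isinstance(image, dict) and image.get("full"):
                urls.append(image["full"])
    return urls


# Made-up sample with the same shape as the excerpt (URLs are placeholders).
sample = r'''
{
  "[data-gallery-role=gallery-placeholder]": {
    "mage/gallery/gallery": {
      "data": [
        {"thumb": "https:\/\/example.com\/t1.jpg", "full": "https:\/\/example.com\/f1.jpg"},
        {"thumb": "https:\/\/example.com\/t2.jpg", "full": "https:\/\/example.com\/f2.jpg"}
      ]
    }
  }
}
'''

print(full_urls_from_scripts([sample]))
```

In a Scrapy callback, the input could come from something like response.xpath('//script[@type="text/x-magento-init"]/text()').getall(), since a Magento page usually carries several such script elements and only one of them holds the gallery. Whether the real page matches this layout exactly would need to be checked against its actual HTML.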

0 answers:

No answers