How to parse a website and insert the data into Firebase using Firebase functions?

Time: 2018-03-27 12:45:26

Tags: firebase parsing firebase-realtime-database web google-cloud-functions

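Can we write a function on Firebase that triggers every hour, parses a page of a given website into XML, and inserts that data into the Firebase database? If this is possible, some help would be useful to me.

Thanks in advance!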

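2 answers:

Answer 0: (score: 2)

Yes, you can do it. Use cron to trigger the function. Inside the function, you will have the logic that fetches the data from the website and saves it in the database.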
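As a rough illustration of the "fetch, then save" logic described above, here is a minimal sketch. The names runScrapeJob, fetchPage, parsePage and saveRecord are hypothetical and not part of any Firebase API; the I/O steps are passed in as functions so that the same logic can run under whichever trigger you choose. The commented-out wiring assumes a firebase-functions version that supports scheduled functions.

```javascript
// Hypothetical sketch: the hourly job with its I/O passed in as functions,
// so it can run under any trigger and be exercised without network access.
async function runScrapeJob({ fetchPage, parsePage, saveRecord, now = () => new Date() }) {
    const html = await fetchPage();   // e.g. an HTTP GET of the target page
    const data = parsePage(html);     // e.g. extraction with an HTML parser
    const record = { ...data, fetchedAt: now().toISOString() };
    await saveRecord(record);         // e.g. a database write via firebase-admin
    return record;
}

// Possible wiring (assumption: scheduled functions are available in your setup):
// exports.hourlyScrape = functions.pubsub.schedule('every 60 minutes')
//     .onRun(() => runScrapeJob({ fetchPage, parsePage, saveRecord }));
```

If scheduled functions are not available, an external cron service calling an HTTPS function could invoke the same runScrapeJob logic.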

Answer 1: (score: 0)

For anyone else who finds a similar question:

lgvalle posted a useful example of how to scrape a website in Cloud Functions:

const rp = require('request-promise');
const cheerio = require('cheerio');

const functions = require('firebase-functions');
const admin = require('firebase-admin');
admin.initializeApp();

// Firestore handle, available for persisting the scraped values
const db = admin.firestore();

exports.allyPallyFarmersMarket = functions.https.onRequest((request, response) => {
    const topic = 'allyPallyFarmersMarket';
    const url = 'https://weareccfm.com/city-country-farmers-markets/market-profiles/alexandra-palace-market/';
    const options = {
        uri: url,
        headers: { 'User-Agent': 'test' },
        // Hand the body to cheerio so that .then() receives a queryable document
        transform: (body) => cheerio.load(body)
    };
    return rp(options)
        .then(($) => {
            // Text of all <strong> elements, expected as "location – date – address"
            const scrap = $('strong').text();
            const [location, date, address] = scrap.split('–');

            // EDIT BY neogucky:
            // The scraped values location, date and address are available here.

            // Send a response so the HTTP function terminates instead of timing out
            return response.status(200).send({ location, date, address });
        })
        .catch((err) => response.status(400).send(err));
});

https://gist.github.com/lgvalle/df2a0a7ee10266ca8056fa15654307d8
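The split step in the snippet above leaves surrounding whitespace in each part and yields undefined when a separator is missing. A minimal sketch of a more defensive version follows; parseMarketInfo is a hypothetical helper, and the assumption that the scraped text has the shape "location – date – address" is taken from the destructuring in the snippet, not verified against the live page.

```javascript
// Hypothetical helper: split scraped text on the en dash used by the page,
// trim each part, and fall back to '' for parts that are missing.
function parseMarketInfo(text) {
    const [location = '', date = '', address = ''] = text
        .split('–')
        .map((part) => part.trim());
    return { location, date, address };
}
```

Inside the .then() callback it could be called as parseMarketInfo($('strong').text()).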

Add the required dependencies; your package.json should look like this:

"dependencies": {
    "firebase-admin": "~6.0.0",
    "firebase-functions": "^2.0.3",
    "request-promise": "~4.2.2",
    "cheerio": "~0.22.0"
},

If you send JSON data such as {website: 'https://myurl.org'} in the request body, you can access it with:

request.body.website
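Since that value comes from the caller, it may be worth checking it before fetching anything. A minimal sketch, assuming only http and https URLs should be accepted; getWebsiteUrl is a hypothetical helper name, and it relies on the URL class built into Node.js.

```javascript
// Hypothetical helper: return a normalized http(s) URL from the request body,
// or null when the `website` field is missing or malformed.
function getWebsiteUrl(body) {
    const raw = body && body.website;
    if (typeof raw !== 'string') {
        return null;
    }
    let parsed;
    try {
        parsed = new URL(raw); // throws on input that is not an absolute URL
    } catch (err) {
        return null;
    }
    if (parsed.protocol !== 'http:' && parsed.protocol !== 'https:') {
        return null;
    }
    return parsed.href;
}
```

In the HTTP function this could be used as getWebsiteUrl(request.body), responding with an error status when it returns null.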