Trying to retrieve URL metadata in Express / NodeJS

Time: 2019-04-10 15:52:18

Tags: node.js express meta-tags

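I have a resource that includes a user-entered URL field. I am trying to use the following package: https://github.com/mozilla/page-metadata-parser to retrieve the title and description associated with the URL and save them to the database on creation.

I added code modeled on the package documentation to the POST request handler in Express. There are no errors and the new bookmark is created, but the metadata values are not returned.

Here is my model: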

const mongoose = require('mongoose');

const { Schema } = mongoose;

const BookmarksSchema = new Schema({
  userId: {
    type: Schema.Types.ObjectId,
    required: true
  },
  url: {
    type: String,
    trim: true,
    required: true
  },

...

  title: {
    type: String,
    trim: true,
    required: false
  },
  description: {
    type: String,
    trim: true,
    required: false
  }
});

mongoose.model('Bookmarks', BookmarksSchema);

My create method:

const mongoose = require('mongoose');
const passport = require('passport');
const router = require('express').Router();
const auth = require('../auth');
const Bookmarks = mongoose.model('Bookmarks');

router.post('/', auth.required, (req, res, next) => {
  const userId = req.user.id;
  const bookmark = req.body.bookmark;

  if (!bookmark.url) {
    return res.status(422).json({
      errors: {
        url: 'is required',
      },
    });
  }

  const { getMetadata } = require('page-metadata-parser');
  const domino = require('domino');

  const url = bookmark.url;
  const response = fetch(url);
  const html = response.text();
  const doc = domino.createWindow(html).document;
  const metadata = getMetadata(doc, url);

  bookmark.userId = userId;
  bookmark.title = metadata.title;
  bookmark.description = metadata.description;

  const finalBookmark = new Bookmarks(bookmark);

  return finalBookmark.save()
    .then(() => res.json({ bookmark: finalBookmark }));
});

And package.json:

{
  "name": "server",
  "version": "1.0.0",
  "description": "",
  "main": "app.js",
  "dependencies": {
    "body-parser": "^1.18.3",
    "cors": "^2.8.5",
    "domino": "^2.1.3",
    "errorhandler": "^1.5.0",
    "express": "^4.16.4",
    "express-jwt": "^5.3.1",
    "express-session": "^1.15.6",
    "jsonwebtoken": "^8.5.1",
    "mongoose": "^5.4.20",
    "morgan": "^1.9.1",
    "page-metadata-parser": "^1.1.3",
    "passport": "^0.4.0",
    "passport-local": "^1.0.0",
    "path": "^0.12.7"
  },
  "devDependencies": {},
  "scripts": {
    "test": "echo \"Error: no test specified\" && exit 1",
    "dev": "nodemon app"
  },
  "author": "",
  "license": "ISC"
}

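2 Answers:

Answer 0: (score: 0)

Posting the answer here so that it can be marked as correct.

The error occurs because the call to fetch() is asynchronous and the await keyword was not used. The example on the NPM site is at:

https://www.npmjs.com/package/page-metadata-parser

It shows await being used on the fetch() call. To use await inside the anonymous callback, the function beginning with (req, res, next) must be preceded by the async keyword. The call should look like this: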

router.post('/', auth.required, async (req, res, next) => {
  // Do your stuff here as before.
  const url = bookmark.url;
  // fetch() and response.text() both return Promises, so both need await.
  const response = await fetch(url);
  const html = await response.text();
  const doc = domino.createWindow(html).document;
  const metadata = getMetadata(doc, url);
  // Finish stuff here.
});

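Now the response is populated: the program waits for the fetch call to complete before moving on, so the remaining variables are filled in and the metadata can be retrieved.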
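
To see why each await matters, here is a minimal sketch that runs without network access or extra packages. Note that fakeFetch and extractTitle are hypothetical stand-ins for fetch() and getMetadata(), written only for this illustration. They mimic one fact about the real Fetch API: both fetch() and Response.text() return Promises.

```javascript
// Hypothetical stand-in for fetch(): resolves to an object whose text()
// method is itself asynchronous, as with a real Fetch API Response.
function fakeFetch(url) {
  const html = `<html><head><title>Example for ${url}</title></head></html>`;
  return Promise.resolve({ text: () => Promise.resolve(html) });
}

// Hypothetical stand-in for getMetadata(): pulls the <title> out of an HTML string.
function extractTitle(html) {
  const match = /<title>([^<]*)<\/title>/i.exec(String(html));
  return match ? match[1] : undefined;
}

// Missing the first await: `response` is a Promise, which has no text() method.
function inspectWithoutAwait(url) {
  const response = fakeFetch(url);
  return {
    isPromise: response instanceof Promise,
    textType: typeof response.text,
  };
}

// Missing only the second await: `html` is a Promise rather than a string,
// so no title can be extracted from it.
async function titleWithOneAwait(url) {
  const response = await fakeFetch(url);
  const html = response.text();
  return extractTitle(html);
}

// Both awaits in place: `html` is the resolved string.
async function titleWithBothAwaits(url) {
  const response = await fakeFetch(url);
  const html = await response.text();
  return extractTitle(html);
}
```

Calling inspectWithoutAwait shows that the un-awaited result is a Promise with no usable text property. titleWithOneAwait resolves to undefined, while titleWithBothAwaits resolves to the page title. The same reasoning applies to the route handler: anything that reads the HTML has to run after both Promises have resolved.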

Answer 1: (score: 0)

Check out the following Node.js module. I use it in my mobile app to display various metadata about the links that users post in the app.

https://metascraper.js.org/#/?id=usage