Scraping data from another website without using a module

Time: 2019-02-09 11:29:02

Tags: javascript node.js web-scraping

I'm trying to scrape data from another website using node.js and wix-code, with this backend code:

import { fetch } from 'wix-fetch';

export function fetchData() {
    let url = 'https://www.brainyquote.com/topics/hacker';

    let option = {
        "method": "GET"
    };
    return fetch(url, option)
        .then(result => {
            return result.text();
        }).catch(reason => {
            return reason;
        });
}

and this client-side code:

fetchData().then(function (result) {
    console.log(result);
})

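I can get the full response from this website, but what I want is to get only the quotes and then add them to my database collection, without using a module such as Cheerio!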

3 answers:

Answer 0 (score: 1)

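Without a helper library, you'll have to parse the HTML yourself, which will be painful. You'll need to study the HTML response, load it into a string, and then pull out the parts you need with regular expressions or some other method.

Here are some examples that use regular expressions: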

https://www.javamex.com/tutorials/regular_expressions/example_scraping_html.shtml
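
As a rough illustration of the regex approach described above, here is a minimal sketch. The HTML fragment and the helper name `extractLinkText` are made up for the example; a real page will use different markup, so the pattern would need adjusting after inspecting the actual response.

```javascript
// Hypothetical HTML fragment standing in for a fetched page.
const html = `
  <a href="/quotes/1" title="view quote">First quote</a>
  <a href="/authors/a" title="view author">Author A</a>
  <a href="/quotes/2" title="view quote">Second quote</a>
  <a href="/authors/b" title="view author">Author B</a>
`;

// Returns the text of every <a> whose title attribute is immediately
// followed by ">" (i.e. title is the last attribute of the tag).
// `title` is assumed to contain no regex metacharacters.
function extractLinkText(source, title) {
  const pattern = new RegExp(`title="${title}">(.*?)</a>`, 'g');
  // matchAll yields one match object per hit; index 1 is the capture group.
  return Array.from(source.matchAll(pattern), (m) => m[1]);
}

const quotes = extractLinkText(html, 'view quote');
const authors = extractLinkText(html, 'view author');
console.log(quotes, authors);
```

A capture group is used here instead of lookbehind, which keeps the pattern usable on runtimes without lookbehind support. Either way, this kind of matching is brittle and will break if the site changes its markup.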

Answer 1 (score: 0)

Wix solution:

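import { fetch } from 'wix-fetch';

async function getQuotes() {
  const res = await fetch('https://www.brainyquote.com/topics/hacker');
  const text = await res.text();
  // Returns an array of matched strings, or null if nothing matches.
  return text.match(/(?<=title="(view quote|view author)">)(.*?)(?=<\/a>)/g);
}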

More information:

Use a regular expression to capture all the text between title="view quote"> (or title="view author">) and </a>.

const https = require('https');

https.get('https://www.brainyquote.com/topics/hacker', (res) => {
  console.log('statusCode:', res.statusCode);
  console.log('headers:', res.headers);

  const data = [];
  res.on('data', (d) => {
    data.push(d);
  });

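  res.on('end', () => {
    // Concatenate the Buffer chunks before decoding, so that multi-byte
    // characters split across chunks are decoded correctly.
    const result = Buffer.concat(data)
      .toString('utf8')
      .match(/(?<=title="(view quote|view author)">)(.*?)(?=<\/a>)/g);

    console.log(result);
  });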

}).on('error', (e) => {
  console.error(e);
});

This returns:

[
  'Very smart people are often tricked by hackers, by phishing. I don&#39;t exclude myself from that. It&#39;s about being smarter than a hacker. Not about being smart.',
    'Harper Reed',
    'I&#39;m a hacker, but I&#39;m the good kind of hackers. And I&#39;ve never been a criminal.',
    'Mikko Hypponen',
    'At the end of the day, my goal was to be the best hacker.',
    'Kevin Mitnick',
    'Software Engineering might be science; but that&#39;s not what I do. I&#39;m a hacker, not an engineer.',
    'Jamie Zawinski',
    'If you give a hacker a new toy, the first thing he&#39;ll do is take it apart to figure out how it works.',
    'Jamie Zawinski',
    'Social engineering has become about 75% of an average hacker&#39;s toolkit, and for the most successful hackers, it reaches 90% or more.',
    'John McAfee',
    'I&#39;m a really good hacker, but I&#39;m not a sensible person.',
    'Richard D. James',
    'A hacker is someone who uses a combination of high-tech cybertools and social engineering to gain illicit access to someone else&#39;s data.',
    'John McAfee',
    'The hacker mindset doesn&#39;t actually see what happens on the other side, to the victim.',
    'Kevin Mitnick',
    'I look like a geeky hacker, but I don&#39;t know anything about computers.',
    'Justin Long',
    'The hacker community may be small, but it possesses the skills that are driving the global economies of the future.',
    'Heather Brooke',
    'I&#39;m a bit of a hacker fanatic and know a fair bit about that industry and cyber crime and cyber warfare.',
    'Seth Gordon',
    'It&#39;s true, I had hacked into a lot of companies, and took copies of the source code to analyze it for security bugs. If I could locate security bugs, I could become better at hacking into their systems. It was all towards becoming a better hacker.',
    'Kevin Mitnick',
    'It&#39;s not enough to have a hacker culture anymore. You have to have a design culture, too.',
    'Robert Scoble',
    'If you go to a coffee shop or at the airport, and you&#39;re using open wireless, I would use a VPN service that you could subscribe for 10 bucks a month. Everything is encrypted in an encryption tunnel, so a hacker cannot tamper with your connection.',
    'Kevin Mitnick',
    'I wasn&#39;t a hacker for the money, and it wasn&#39;t to cause damage.',
    'Kevin Mitnick',
    'I&#39;m not an economist; I&#39;m a hacker who has spent his career exploring and repairing large networks.',
    'Dan Kaminsky',
    'In the &#39;80s, society created a caricature of what a hacker or a programmer looked like: a guy wearing a hoodie, drinking energy drinks, sitting in a basement somewhere coding. Today, programmers look like the men we see in the show &#39;Silicon Valley&#39; on HBO. If you look at the message girls are getting, it&#39;s saying, &#39;This is not for you.&#39;',
    'Reshma Saujani',
    'I don&#39;t condone anyone causing damage in my name, or doing anything malicious in support of my plight. There are more productive ways to help me. As a hacker myself, I never intentionally damaged anything.',
    'Kevin Mitnick',
    'I think Linux is a great thing, in the big picture. It&#39;s a great hacker&#39;s tool, and it has a lot of potential to become something more.',
    'Jamie Zawinski',
    'Bitcoin is here to stay. There would be a hacker uproar to anyone who attempted to take credit for the patent of cryptocurrency. And I wouldn&#39;t want to be on the receiving end of hacker fury.',
    'Adam Draper',
    'It was on a bulletin board that I first learned about hacker culture, the &#39;Let&#39;s just break through this wall and see what&#39;s on the other side&#39; mentality.',
    'Harper Reed',
    'Everything about Mark Zuckerberg is pure hacker. Hackers don&#39;t take realities of the world for granted; they seek to break and rebuild what they don&#39;t like. They seek to outsmart the world.',
    'Sarah Lacy',
    'If you&#39;re a juvenile delinquent today, you&#39;re a hacker. You live in your parent&#39;s house; they haven&#39;t seen you for two months. They put food outside your door, and you&#39;re shutting down a government of a foreign country from your computer.',
    'John Waters',
    'The key to social engineering is influencing a person to do something that allows the hacker to gain access to information or your network.',
    'Kevin Mitnick',
    'A smartphone links patients&#39; bodies and doctors&#39; computers, which in turn are connected to the Internet, which in turn is connected to any smartphone anywhere. The new devices could put the management of an individual&#39;s internal organs in the hands of every hacker, online scammer, and digital vandal on Earth.',
    'Charles C. Mann'
]
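
Note that the strings in the output above still contain HTML entities such as &#39;. If you want plain text before storing it, a small decoder can be written without any module. This is only a sketch: the function name `decodeEntities` is made up, and the table covers just a handful of named entities plus numeric ones, not the full HTML entity list.

```javascript
// Small table of named entities; the full HTML list is much longer.
const NAMED_ENTITIES = { amp: '&', lt: '<', gt: '>', quot: '"', apos: "'" };

function decodeEntities(text) {
  return text.replace(/&(#x[0-9a-fA-F]+|#[0-9]+|[a-zA-Z]+);/g, (match, body) => {
    if (body[0] === '#') {
      // Numeric reference: hexadecimal (&#x27;) or decimal (&#39;).
      const code = body[1] === 'x'
        ? parseInt(body.slice(2), 16)
        : parseInt(body.slice(1), 10);
      // Leave out-of-range code points untouched instead of throwing.
      return code <= 0x10ffff ? String.fromCodePoint(code) : match;
    }
    // Named reference: decode only the ones we know, keep the rest as-is.
    return Object.prototype.hasOwnProperty.call(NAMED_ENTITIES, body)
      ? NAMED_ENTITIES[body]
      : match;
  });
}

console.log(decodeEntities('I&#39;m a hacker, but I&#39;m the good kind of hackers.'));
```

You could apply it to each element of the result, for example with result.map(decodeEntities).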

If you want to convert the output above into objects and put them in a database, you can do something like this:

const data = [
  'Very smart people are often tricked by hackers, by phishing. I don&#39;t exclude myself from that. It&#39;s about being smarter than a hacker. Not about being smart.',
  'Harper Reed',
  'I&#39;m a hacker, but I&#39;m the good kind of hackers. And I&#39;ve never been a criminal.',
  'Mikko Hypponen',
  'At the end of the day, my goal was to be the best hacker.',
  'Kevin Mitnick',
  // ...the remaining quote/author pairs from the output above
];

const res = [];
for(let i = 0; i < data.length; i+=2){
  res.push({quote: data[i], author: data[i+1]});
}
console.log(res);
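
To actually store the objects, one possible shape is sketched below. Everything here is an assumption made for illustration: `saveQuotes` is a made-up helper, 'Quotes' is a hypothetical collection name, and `store` is any object with an async bulkInsert(collectionName, items) method. In a Wix backend you might try passing the wix-data module as the store, but check its documentation first; the in-memory `fakeStore` is only there so the sketch runs on its own.

```javascript
// Saves only complete records and reports how many were written.
// `store` must expose an async bulkInsert(collectionName, items) method
// (an assumption of this sketch, not something taken from the thread).
async function saveQuotes(store, records, collectionName = 'Quotes') {
  // Skip entries where the regex produced a quote without an author.
  const valid = records.filter((r) => r.quote && r.author);
  if (valid.length === 0) {
    return 0;
  }
  await store.bulkInsert(collectionName, valid);
  return valid.length;
}

// In-memory stand-in for a real database client.
const fakeStore = {
  rows: [],
  async bulkInsert(collectionName, items) {
    this.rows.push(...items);
  },
};

saveQuotes(fakeStore, [
  { quote: 'At the end of the day, my goal was to be the best hacker.', author: 'Kevin Mitnick' },
  { quote: 'A quote whose author is missing', author: undefined },
]).then((count) => console.log('saved', count));
```

Filtering first matters because the pairing loop above produces an undefined author whenever the matched array has an odd length.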

Answer 2 (score: -1)

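Scraping is a bad thing, because you are basically stealing content that belongs to other authors, so maybe you should look for an API that provides similar content instead.

But if you really want to scrape, here is a short guide.

First: anything you do on the backend you could also do on the frontend. In practice, though, the frontend doesn't need to do any of this. It should only receive quotes from the backend. Crawling and saving to the database should happen on the backend only.

A cron job fires the scraper -> the scraper does its job and saves the stolen content to the DB -> the server serves the content through an endpoint

You really do need something like cheerio or phantom.js for scraping. Don't worry, they are very simple tools.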

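So the plan:

  • Create a scraper using any tool that makes it easy to extract HTML elements from a whole page. The script should connect to your database and save the items there.
  • Run the scraper every N hours/minutes using node-cron.
  • Create an endpoint on the server that serves these quotes.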

Your frontend should not be involved in any kind of scraping or in triggering the scraper. It should only display the data.
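
The plan above could be sketched roughly as follows, using only Node's built-in modules. Several things are swapped in or assumed for the sake of a self-contained example: setInterval stands in for node-cron, an in-memory array stands in for the database, the /quotes path is hypothetical, and the `scrape` function is supplied by the caller.

```javascript
const http = require('http');

// In-memory stand-in for the database collection.
const quoteStore = [];

// Runs the caller-supplied scrape function and replaces the stored quotes.
async function runScraper(scrape) {
  const records = await scrape();
  quoteStore.splice(0, quoteStore.length, ...records);
  return quoteStore.length;
}

// Serves the stored quotes as JSON; never scrapes on request.
function handleRequest(req, res) {
  if (req.method === 'GET' && req.url === '/quotes') {
    res.writeHead(200, { 'Content-Type': 'application/json' });
    res.end(JSON.stringify(quoteStore));
  } else {
    res.writeHead(404);
    res.end();
  }
}

// Wires everything together: scrape once, repeat on a timer, start the server.
function start(scrape, intervalMs, port) {
  const tick = () => runScraper(scrape).catch((err) => console.error(err));
  tick();
  const timer = setInterval(tick, intervalMs);
  const server = http.createServer(handleRequest).listen(port);
  return { timer, server };
}
```

Calling something like start(myScrapeFunction, 60 * 60 * 1000, 3000) would refresh the quotes hourly and serve them on port 3000. The frontend then only ever talks to the /quotes endpoint.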