Apache Nutch 2.3: won't inject URLs (hangs) & hadoop log shows a warning

Time: 2017-06-22 19:56:16

Tags: hadoop nutch

I am trying to set up Nutch 2.3 with Elasticsearch 5.4. The problem is on the Nutch side: I cannot inject my URLs. The Hadoop log shows the following warning:

Console:

aurora apache-nutch-2.3.1 # cat runtime/local/logs/hadoop.log 
2017-06-14 17:08:28,339 INFO  crawl.InjectorJob - InjectorJob: starting at 2017-06-14 17:08:28
2017-06-14 17:08:28,340 INFO  crawl.InjectorJob - InjectorJob: Injecting urlDir: urls/seed.txt
2017-06-14 17:08:28,992 WARN  util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

**It hangs here**

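I tried setting my Hadoop environment variables as described in this thread (Hadoop "Unable to load native-hadoop library for your platform" warning), but I still get the same warning and the inject step still hangs.

Any ideas?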

1 answer:

Answer 0 (score: 0)

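  1. Don't worry about the warning. It only means Hadoop is falling back to its built-in Java classes instead of the native library; I assume you are running on a Linux distribution.
  2. Nutch 2.3 is not compatible with ES 5.x. I wrote a custom IndexWriter that calls Logstash on a given port, and Logstash in turn calls Elasticsearch. You could try this approach or something along those lines.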
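
The answer does not include its IndexWriter, so the following is only a rough sketch of the Logstash-facing half of that idea, not the answerer's code. It assumes Logstash is configured with a `tcp` input using the `json_lines` codec. The class name `LogstashForwarder`, the host, the port 5000, and the field names are all hypothetical. A real Nutch plugin would call something like `send` from its own write method; that wiring is left out here.

```java
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: forwards one document to a Logstash TCP input
// as a single newline-terminated JSON object.
public class LogstashForwarder {
    private final String host;
    private final int port;

    public LogstashForwarder(String host, int port) {
        this.host = host;
        this.port = port;
    }

    // Escape the characters that may not appear raw inside a JSON string.
    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n");  break;
                case '\r': sb.append("\\r");  break;
                case '\t': sb.append("\\t");  break;
                default:
                    if (c < 0x20) {
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.toString();
    }

    // Serialize flat string fields to one JSON object followed by '\n'.
    public static String toJsonLine(Map<String, String> fields) {
        StringBuilder sb = new StringBuilder("{");
        boolean first = true;
        for (Map.Entry<String, String> e : fields.entrySet()) {
            if (!first) {
                sb.append(',');
            }
            first = false;
            sb.append('"').append(escape(e.getKey())).append("\":\"")
              .append(escape(e.getValue())).append('"');
        }
        return sb.append("}\n").toString();
    }

    // Open a connection, write the JSON line, and close the connection.
    public void send(Map<String, String> fields) throws IOException {
        try (Socket socket = new Socket(host, port);
             Writer out = new OutputStreamWriter(
                     socket.getOutputStream(), StandardCharsets.UTF_8)) {
            out.write(toJsonLine(fields));
            out.flush();
        }
    }

    public static void main(String[] args) throws IOException {
        Map<String, String> doc = new LinkedHashMap<>();
        doc.put("url", "http://example.com/");
        doc.put("title", "Example page");
        System.out.print(toJsonLine(doc));
        // Only contact Logstash when a host and port are given explicitly.
        if (args.length == 2) {
            new LogstashForwarder(args[0], Integer.parseInt(args[1])).send(doc);
        }
    }
}
```

Opening a new socket per document keeps the sketch short. A crawl of any real size would probably want to reuse one connection and batch its writes.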