SOLR: Ignore erroneous rows while indexing CSV using POST

Time: 2017-01-29 07:32:00

Tags: csv unix post solr lucene

I am trying to index a large file (1.5 GB, 6 million rows) into solr-6.3.0 using the POST tool. With the following command I get an IO error on a particular row. I am using Ubuntu 16.04.

java -classpath /opt/solr-6.3.0/dist/solr-core-6.3.0.jar -Dauto=yes -Dc=mycore3 -Ddata=files org.apache.solr.util.SimplePostTool /home/shubham/combined.csv
SimplePostTool version 5.0.0
Posting files to [base] url http://localhost:8983/solr/mycore3/update...
Entering auto mode. File endings considered are xml,json,jsonl,csv,pdf,doc,docx,ppt,pptx,xls,xlsx,odt,odp,ods,ott,otp,ots,rtf,htm,html,txt,log
POSTing file combined.csv (text/csv) to [base]
SimplePostTool: WARNING: Solr returned an error #400 (Bad Request) for url: http://localhost:8983/solr/mycore3/update
SimplePostTool: WARNING: Response: <?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader"><int name="status">400</int><int name="QTime">6977</int></lst><lst name="error"><lst name="metadata"><str name="error-class">org.apache.solr.common.SolrException</str><str name="root-error-class">org.apache.solr.common.SolrException</str></lst><str name="msg">CSVLoader: input=null, line=94221,expected 14 values but got 13
    values={'Plot No-941','SS II Room','R.F.Naik High School, Junior and Senior College','','400709','','','Sector 8','','','Nueva Bombay','Bombay/Bombaim','IND',}</str><int name="code">400</int></lst>
</response>
SimplePostTool: WARNING: IOException while reading response: java.io.IOException: Server returned HTTP response code: 400 for URL: http://localhost:8983/solr/mycore3/update
1 files indexed.
COMMITting Solr index changes to http://localhost:8983/solr/mycore3/update...
Time spent: 0:01:11.862

I want to skip such rows and continue indexing the rest. Please help. The error output is shown above; the core is mycore3.

HouseNumber,BuildingName,Landmark,Street,Postcode,Block,DependentLocality,SubLocality,Locality,SubMunicipality,Town,District,State,Country
40E,,,,641006,,,,Poosari Palayam,Poosari Palayam,Coimbatore,Coimbatore,Tamil Nadu,IND
45,Ganesh Illam,,,641027,,,,Sanganur,Sanganur,Coimbatore,Coimbatore,Tamil Nadu,IND
131,,,,641037,,Jayasimma Puram,,Papanaicker Palyam,Papanaicker Palyam,Coimbatore,Coimbatore,Tamil Nadu,IND
4,Ratna Niwas,,,641670,,Rajiv Gandhi Nagar,,Uppili Palayam,Uppili Palayam,Coimbatore,Coimbatore,Tamil Nadu,IND
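One possible workaround, sketched here rather than taken from the question, is to pre-filter the file before posting it: rows whose field count matches the 14-column header go to one file, everything else goes to a second file for later inspection. The function names `split_rows` and `filter_file` and the output file names are hypothetical, UTF-8 encoding is an assumption, and Python's `csv` module may split quoted fields slightly differently from Solr's CSV loader, so treat this as a rough first pass.

```python
import csv

EXPECTED_COLUMNS = 14  # field count of the header row in combined.csv


def split_rows(src, good, bad, expected=EXPECTED_COLUMNS):
    """Copy rows with exactly `expected` fields to `good`, all others to `bad`.

    `src`, `good` and `bad` are open text streams.
    Returns (kept, rejected); the header row counts as a kept row.
    """
    good_writer = csv.writer(good, lineterminator="\n")
    bad_writer = csv.writer(bad, lineterminator="\n")
    kept = rejected = 0
    for row in csv.reader(src):
        if len(row) == expected:
            good_writer.writerow(row)
            kept += 1
        else:
            # Wrong field count (including blank lines): set aside for review.
            bad_writer.writerow(row)
            rejected += 1
    return kept, rejected


def filter_file(in_path, good_path, bad_path, expected=EXPECTED_COLUMNS):
    """File-based wrapper around split_rows (assumes UTF-8 input)."""
    with open(in_path, newline="", encoding="utf-8") as src, \
            open(good_path, "w", newline="", encoding="utf-8") as good, \
            open(bad_path, "w", newline="", encoding="utf-8") as bad:
        return split_rows(src, good, bad, expected)
```

For example, `filter_file("/home/shubham/combined.csv", "combined.good.csv", "combined.bad.csv")` would leave a file that can be posted with the same SimplePostTool command, while the rejected rows (such as line 94221 from the error message) stay available for manual repair.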

Out of 1.1 GB, only 26 MB had been indexed before the error occurred.

Solr admin dashboard for mycore3

As requested, the first few rows of the CSV file are the header and four data rows listed above.

0 answers:

No answers