Problem running a Gobblin job

Time: 2018-06-25 06:46:01

Tags: json csv exception

I am new to Gobblin and am trying to run a simple job in standalone mode, but I get the following error:

Task failed due to "com.google.gson.JsonSyntaxException:
com.google.gson.stream.MalformedJsonException: Expected name at line 1 
column 72 path $.fields."

My job file is:

job.name=MRJob1
job.group=MapReduce
job.description=A getting started job for MapReduce
source.class=gobblin.source.extractor.filebased.TextFileBasedSource
source.filebased.downloader.class=gobblin.source.extractor.filebased.CsvFileDownloader
converter.classes=gobblin.converter.csv.CsvToJsonConverterV2,gobblin.converter.avro.JsonIntermediateToAvroConverter
writer.builder.class=gobblin.writer.AvroDataWriterBuilder
source.filebased.fs.uri=file:///
source.filebased.data.directory=/home/sahil97/Downloads/gobblin-dist/CsvSource/
source.schema=[{"ColumnName":"FIRST_NAME","comment": "","isNullable": "true","dataType":{"type":"string"}}{"ColumnName":"LAST_NAME","comment": "","isNullable": "true","dataType":{"type":"string"}},{"ColumnName":"GENDER","comment": "","isNullable": "true","dataType":{"type":"string"}},{"ColumnName":"AGE","comment": "","isNullable": "true","dataType":{"type":"int"}}]
source.skip.first.record=true
source.csv_file.delimiter=,
converter.csv.to.json.delimiter=,
extract.table.type=append_only
extract.table.name=CsvToAvro
extract.namespace=MapReduce
converter.classes=gobblin.converter.csv.CsvToJsonConverterV2,gobblin.converter.avro.JsonIntermediateToAvroConverter
writer.destination.type=HDFS
writer.output.format=AVRO
data.publisher.type=gobblin.publisher.BaseDataPublisher

My CSV file (Repo.txt) is:

FIRST_NAME,LAST_NAME,GENDER,AGE
Sahil,Gaur,Male,22
Sagar,Gaur,Male,21
Dushyant,Saini,Male,23
Devyani,Kaulas,Female,21
Sanchi,Theraja,Female,22
Shreya,Gupta,Female,21
Chirag,Thakur,Male,22
Manish,Sharma,Male,23
Abhishek,Soni,Male,24
Varnita,Sachdeva,Female,22
Deepam,Chaurishi,Male,22

1 answer:

Answer 0: (score: 0)

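If this is the actual JSON, you are missing a comma in source.schema between the first two objects of the array, right after the FIRST_NAME entry. The error indicates invalid JSON syntax, so the schema is one of the first places to look.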

{
  "ColumnName": "FIRST_NAME",
  "comment": "",
  "isNullable": "true",
  "dataType": {
    "type": "string"
  }
},                  // add this comma; it is missing in your source.schema
{
  "ColumnName": "LAST_NAME",
  "comment": "",
  "isNullable": "true",
  "dataType": {
    "type": "string"
  }
},
{
  "ColumnName": "GENDER",
  "comment": "",
  "isNullable": "true",
  "dataType": {
    "type": "string"
  }
},
{
  "ColumnName": "AGE",
  "comment": "",
  "isNullable": "true",
  "dataType": {
    "type": "int"
  }
}
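
One quick way to confirm this outside of Gobblin is to run the source.schema value through any JSON parser. Below is a minimal sketch using Python's standard json module. The helper name find_json_error is made up for illustration, and the offset Python reports will not necessarily match the column in Gson's message, since Gobblin may wrap the schema before parsing it.

```python
import json

# source.schema exactly as posted in the question: there is no comma
# between the FIRST_NAME object and the LAST_NAME object.
BROKEN_SCHEMA = (
    '[{"ColumnName":"FIRST_NAME","comment": "","isNullable": "true","dataType":{"type":"string"}}'
    '{"ColumnName":"LAST_NAME","comment": "","isNullable": "true","dataType":{"type":"string"}},'
    '{"ColumnName":"GENDER","comment": "","isNullable": "true","dataType":{"type":"string"}},'
    '{"ColumnName":"AGE","comment": "","isNullable": "true","dataType":{"type":"int"}}]'
)

# Same value with the missing comma added.
FIXED_SCHEMA = BROKEN_SCHEMA.replace("}}{", "}},{")


def find_json_error(text):
    """Return None if text is valid JSON, else (message, offset) of the first error."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        return err.msg, err.pos
    return None


if __name__ == "__main__":
    for name, schema in (("broken", BROKEN_SCHEMA), ("fixed", FIXED_SCHEMA)):
        problem = find_json_error(schema)
        if problem is None:
            print(name, "-> valid JSON")
        else:
            message, offset = problem
            # Show a little context around the offending character.
            context = schema[max(0, offset - 10):offset + 10]
            print(name, "->", message, "near", repr(context))
```

If the fixed value parses cleanly, paste it back into the job file as a single line, since source.schema is read as one property value.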