I am trying to run a select * query with Apache Drill 1.11.0 on the following CSV file.
id,email,first_name,last_name,middle_name,suffix,work_phone,mobile_phone,gender,picture,speciality,taxonomy_code,education_details,experience_details,keywords,doctor_npi,wait_time,created_tstamp,created_by,last_updated_tstamp,last_updated_by,is_deleted
1,xxxx@gmail.com,XXXXX,XXXX,,Dr,912225711234,,M,assets/images/doctorIcon.png,Primary Care Physician,Primary Care Doctor,M.D,3 years,Primary Care Doctor,1043259765,10,2015-04-22 17:20:48.0,,2015-12-16 12:06:27.0,,N
2,xxxx@gmail.com,XXXX,XXXX,,Dr,913375311234,,M,assets/images/doctorIcon.png,Eye Doctor,EYE Care Doctor,MD,5 years,,1619932076,20,2015-04-30 11:07:57.0,,2015-11-07 08:49:57.0,,N
I get this error:
org.apache.drill.common.exceptions.UserRemoteException: DATA_READ ERROR: Error processing input: , line=1, char=292. Content parsed: [ ] Failure while reading file file:/..... Happened at or shortly before byte position 292. Fragment 0:0 [Error Id: 1ce7d94a-c06e-4633-af97-f3eceb1b5350 on 172.16.16.57:31010]
What is the problem here?
Answer 0 (score: 1)
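This looks like a bug in Apache Drill, but Praveen is right that the problem is related to the suffix column. suffix is one of the four implicit columns in Drill (filename, suffix, fqn, filepath) [1]. Even so, the expected behavior here would be for the implicit suffix column to be output (i.e. csv) rather than for the query to fail. I will file a Jira for this.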
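If a column in your file has the same name as an implicit column, you can change the default implicit column name with the ALTER SYSTEM|SESSION SET
command.
For example: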
ALTER session SET `drill.exec.storage.implicit.suffix.column.label` = 'appendix';
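As a rough sketch of the full workaround, the rename would be followed by re-running the original query in the same session. The file path below is hypothetical (the real path is elided in the error message), and the sketch assumes the file is queried through the dfs storage plugin with header extraction enabled for CSV:

```sql
-- Rename the implicit suffix column for this session only, so it no
-- longer shares a name with the "suffix" header in the CSV file.
ALTER SESSION SET `drill.exec.storage.implicit.suffix.column.label` = 'appendix';

-- Hypothetical path: substitute the actual location of the CSV file.
SELECT * FROM dfs.`/path/to/doctors.csv`;
```

Using SESSION rather than SYSTEM keeps the change local to the current connection, so other users' queries that rely on the default implicit column name should not be affected.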
[1] https://drill.apache.org/docs/querying-a-file-system-introduction/
Answer 1 (score: 0)
Somehow the column header name "suffix" does not work. If I use any other header name, it works.