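I am trying to create a Hive table on top of the following CSV dataset using OpenCSVSerde
WITH SERDEPROPERTIES ("quoteChar"='\"', "separatorChar"=',')
but the Hive table loses the £ symbol and shows the replacement character � instead.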
FWID,GENDER,Ethnicity,AgeAtPeriodEnd,RC_UnitCost,QUANTITY,ElemTypeDesc
2100001,F,White,WEEK,"£2,027.07",3455,AA - Community Meals
2100011,F,White,YEAR,"£75.00,488776",AA - Community Meals
2100044,M,White,WEEK,"£5.40,39.0",123,Ld-ExtDc - Day
2100044,M,White,WEEK,£5.40,9856,FF - Community Meals
2100044,M,White,WEEK,£5.40,"789,193",FF - Community Meals
2100044,M,White,WEEK,£5.40,"876,241",FE - Community Meals
2100044,M,White,WEEK,£5.40,3888,"Community Meals,ExtDc - Day"
2100044,M,White,WEEK,£5.40,235,Ld-ExtDc - Day
2100044,M,White,WEEK,£5.40,8789,FE - Community Meals
2100044,M,White,WEEK,"£10.07,027.7",16478,FE - Community Meals
2100051,F,White,WEEK,£470.00,12375,RG - Community Meals
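I also tried creating the table with LazySimpleSerDe
WITH SERDEPROPERTIES ( 'escape.delim'='\"', 'field.delim'=',', 'line.delim'='\n', 'serialization.encoding'='windows-1252')
In this case the data is parsed correctly with the £ symbol, but because there is no quoteChar setting such as \", the values do not line up with their columns.
Please suggest a way to solve this problem.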
Answer 0 (score: 0)
Here is one way to do it:
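A minimal sketch of one possible approach, not necessarily the one this answer originally described. Since LazySimpleSerDe only shows £ correctly with 'serialization.encoding'='windows-1252', the file is presumably windows-1252 encoded, and the � suggests OpenCSVSerde is decoding it as UTF-8. Under that assumption, re-encoding the file as UTF-8 before loading it would let a quote-aware parser keep both the £ and the quoted commas. The Python below illustrates the idea on one row from the question; transcode_to_utf8 is a hypothetical helper name, and Python's csv module stands in for quote-aware parsing in general, not for Hive itself.

```python
import csv
import io


def transcode_to_utf8(raw: bytes, source_encoding: str = "cp1252") -> bytes:
    """Decode bytes from the assumed source encoding and re-encode them as UTF-8."""
    return raw.decode(source_encoding).encode("utf-8")


# One row from the question, as it would be stored in a windows-1252 file:
# there the pound sign is the single byte 0xA3.
raw_row = '2100001,F,White,WEEK,"£2,027.07",3455,AA - Community Meals\n'.encode("cp1252")

# Reading those bytes as UTF-8 fails on 0xA3 and yields the replacement character.
as_utf8_reader_sees_it = raw_row.decode("utf-8", errors="replace")

# After transcoding, a UTF-8 reader sees the pound sign, and a quote-aware
# CSV parser keeps the quoted field (with its embedded comma) in one column.
utf8_row = transcode_to_utf8(raw_row)
fields = next(csv.reader(io.StringIO(utf8_row.decode("utf-8"))))

print(as_utf8_reader_sees_it)
print(fields)
```

The same conversion could be applied to the whole file (for example with a tool such as iconv) before placing it in the table's location, so that the OpenCSVSerde table definition from the question can stay as it is. Whether that fits depends on how the data is ingested.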