I want to split a string so that I can convert an area value. I have data like this:
(149Sq.Yards)
(151Sq.Yards)
(190Sq.Yards)
(190Sq.Yards)
I want to split the data above like this:
149 sq.yards
151 sq.yards
I tried the following code:
a = LOAD '/user/ahmedabad/Makkan_PropertyDetails_Apartment_Ahmedabad.csv' using PigStorage('\t') as (SourceWebSite:chararray,PropertyID:chararray,ListedOn:chararray,ContactName:chararray,TotalViews:int,Price:chararray,PriceperArea:chararray,NoOfBedRooms:int,NoOfBathRooms:int,FloorNoOfProperty:chararray,TotalFloors:int,Possession:chararray,BuiltUpArea:chararray,Furnished:chararray,Ownership:chararray,NewResale:chararray,Facing:chararray,title:chararray,PropertyAddress:chararray,NearByFacilities:chararray,PropertyFeatures:chararray,Sellerinfo:chararray,Description:chararray);
b = FOREACH a GENERATE BuiltUpArea;
c = FILTER b BY (BuiltUpArea matches '.*Sq.Yards.*');
d = FOREACH c GENERATE (bigdecimal) REGEX_EXTRACT(BuiltUpArea,'(.*)', 1) * 9;
However, when I run dump d, it prints null.
Answer 0: (score: 0)
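The regular expression you used, '(.*)', matches every character, so the expression tries to multiply the whole string, as in (149Sq.Yards * 9). That string cannot be cast to a number, which is why the output is null.
The following regular expression extracts only the leading digits from the input, so the multiplication becomes (149 * 9).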
d = FOREACH c GENERATE (bigdecimal) REGEX_EXTRACT(BuiltUpArea,'(^[0-9]+)', 1) * 9;
dump d;
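
The difference between the two patterns can be checked outside Pig. The following is a minimal Python sketch, not Pig code: the helper names extract, to_decimal and sq_yards_to_sq_feet are hypothetical, and it assumes that REGEX_EXTRACT returns the first capture group of the first match and that a failed (bigdecimal) cast yields null, as described above.

```python
import re
from decimal import Decimal, InvalidOperation


def extract(pattern, text):
    """Stand-in for REGEX_EXTRACT(text, pattern, 1): capture group 1 of the
    first match, or None when nothing matches (assumed behaviour)."""
    match = re.search(pattern, text)
    return match.group(1) if match else None


def to_decimal(value):
    """Stand-in for a (bigdecimal) cast: None when the string is not numeric."""
    if value is None:
        return None
    try:
        return Decimal(value)
    except InvalidOperation:
        return None


def sq_yards_to_sq_feet(text, pattern):
    """Extract the number with the given pattern and multiply it by 9."""
    number = to_decimal(extract(pattern, text))
    return None if number is None else number * 9


# '(.*)' captures the whole string, which is not numeric, so the result is None.
print(sq_yards_to_sq_feet("149Sq.Yards", r"(.*)"))
# '(^[0-9]+)' captures only the leading digits, so the multiplication succeeds.
print(sq_yards_to_sq_feet("149Sq.Yards", r"(^[0-9]+)"))
```

If the field literally contains the surrounding parentheses, as in (149Sq.Yards), rather than the parentheses coming from how dump prints tuples, the ^ anchor would stop the pattern from matching. Under that assumption an unanchored pattern such as '([0-9]+)' would still find the digits.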