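I am using jq 1.5 in a Windows 10 PowerShell environment to transform a JSON file and import it into an MS SQL database. The original JSON file is about 1.1 MB. I have stored the file here: Json origin file. I use the following jq command to transform the data: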
[.legs[] | {Legid: .legId, Farecode: .fareBasisCode, Travelduration: .travelDuration, Traveldistance: .totalTravelDistance, Distanceunit: .totalTravelDistanceUnits, Refundable: .isRefundable , Nonstop: .isNonStop, Departure_Airport: .segments[].departureAirportName, Departure_Code: .segments[].departureAirportCode, Arrival_Airport: .segments[].arrivalAirportName, Arrival_Code: .segments[].arrivalAirportCode, Departure_Time: .segments[].departureTimeEpochSeconds, Arrival_Time: .segments[].arrivalTimeEpochSeconds, Airline: .segments[].airlineName, Airline_Code: .segments[].airlineCode, Flight_Number: .segments[].flightNumber, Equipment: .segments[].equipmentDescription}]
This command produces the following file: transformed file. Now I need to convert the UNIX timestamps to dates, so I modified the command:
[.legs[] | {Legid: .legId, Farecode: .fareBasisCode, Travelduration: .travelDuration, Traveldistance: .totalTravelDistance, Distanceunit: .totalTravelDistanceUnits, Refundable: .isRefundable , Nonstop: .isNonStop, Departure_Airport: .segments[].departureAirportName, Departure_Code: .segments[].departureAirportCode, Arrival_Airport: .segments[].arrivalAirportName, Arrival_Code: .segments[].arrivalAirportCode, Departure_Time: .segments[].departureTimeEpochSeconds, Arrival_Time: .segments[].arrivalTimeEpochSeconds, Airline: .segments[].airlineName, Airline_Code: .segments[].airlineCode, Flight_Number: .segments[].flightNumber, Equipment: .segments[].equipmentDescription}] | .[].Departure_Time |= todate | .[].Arrival_Time |= todate
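Without the date conversion, the transformed file is about 3 MB. After converting the dates, the file size is about 40 MB. I think there is a logic error in my command, but I cannot find it. Any hints?
Regards, Timo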
Answer 0 (score: 0)
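Your use of iteration (.segments[]) results in multiplicative behavior: each .segments[] inside the object construction iterates independently, so the filter emits one object for every combination of segments. Your filter contains ten such iterations, and in your case .segments|length is 2 in four cases, so you get a 2^10-fold expansion locally, four times over.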
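In a case like this, it makes sense to check your code using a small but carefully chosen subset of the data (or, easier still, an artificial dataset).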
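As a rough illustration of such a check, the sketch below runs a cut-down version of the filter on an artificial input with one leg and two segments. The field names come from the question; the values are made up, and the commands are written for a POSIX shell (in PowerShell the quoting would need adjusting).

```shell
# Artificial input: one leg with two segments (all values are made up).
data='{"legs":[{"legId":"L1","segments":[
  {"departureAirportCode":"AAA","arrivalAirportCode":"BBB","airlineCode":"XX"},
  {"departureAirportCode":"BBB","arrivalAirportCode":"CCC","airlineCode":"XX"}]}]}'

# Two independent .segments[] iterations: 2 * 2 objects for this leg.
two=$(printf '%s' "$data" | jq '[.legs[] | {Legid: .legId, Departure_Code: .segments[].departureAirportCode, Arrival_Code: .segments[].arrivalAirportCode}] | length')

# A third iteration doubles the count again: 2 * 2 * 2 objects.
three=$(printf '%s' "$data" | jq '[.legs[] | {Legid: .legId, Departure_Code: .segments[].departureAirportCode, Arrival_Code: .segments[].arrivalAirportCode, Airline_Code: .segments[].airlineCode}] | length')

echo "two iterations: $two, three iterations: $three"
```

The count doubles with every additional .segments[], which is how ten iterations turn a two-segment leg into 1024 objects.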
Perhaps what you want is something more like:
[ .legs[] | range(0; .segments|length) as $i | .... ]
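One way the elided part might be filled in, as a sketch only: every segment field is read through the same index $i, so each leg yields exactly one object per segment. The field list is copied from the question; converting the timestamps inline with todate (rather than in a trailing |= step) is a choice made here, and the sample input and the shell variable names are hypothetical.

```shell
# Filter kept in a shell variable; it could equally be saved to a file and run with jq -f.
filter='
[ .legs[]
  | range(0; .segments|length) as $i        # one pass per segment index
  | { Legid: .legId,
      Farecode: .fareBasisCode,
      Travelduration: .travelDuration,
      Traveldistance: .totalTravelDistance,
      Distanceunit: .totalTravelDistanceUnits,
      Refundable: .isRefundable,
      Nonstop: .isNonStop,
      Departure_Airport: .segments[$i].departureAirportName,
      Departure_Code: .segments[$i].departureAirportCode,
      Arrival_Airport: .segments[$i].arrivalAirportName,
      Arrival_Code: .segments[$i].arrivalAirportCode,
      Departure_Time: (.segments[$i].departureTimeEpochSeconds | todate),
      Arrival_Time: (.segments[$i].arrivalTimeEpochSeconds | todate),
      Airline: .segments[$i].airlineName,
      Airline_Code: .segments[$i].airlineCode,
      Flight_Number: .segments[$i].flightNumber,
      Equipment: .segments[$i].equipmentDescription }
]'

# Hypothetical sample: one leg, two segments; fields left out simply come back as null.
sample='{"legs":[{"legId":"L1","segments":[
  {"departureTimeEpochSeconds":0,"arrivalTimeEpochSeconds":3600},
  {"departureTimeEpochSeconds":7200,"arrivalTimeEpochSeconds":10800}]}]}'

rows=$(printf '%s' "$sample" | jq "$filter | length")
first_departure=$(printf '%s' "$sample" | jq -r "$filter | .[0].Departure_Time")
echo "rows=$rows first_departure=$first_departure"
```

With this shape the output grows linearly with the number of segments, so a leg with two segments contributes two rows rather than 1024.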