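I'm trying to convert a CSV into a specific format. I've managed to get the structure I need, but I've now realised that every CSV value is being treated as a string, when some of them should be floats or integers.
Is there a way to get the column values typed correctly for the JSON dump? I'm also sure there is a simpler way to do all of this.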
import csv
import json

j = []
with open("output.csv") as f:
    reader = csv.DictReader(f, delimiter=",")
    for row in reader:
        # Copy values across to the new key names.
        row["id"] = row["Id2"]
        row["ideca"] = row["schemaV"]
        row["schem"] = row["schema"]
        row["lore"] = row["serialNumber"]
        row["tore"] = row["msgschem"]
        row["fore"] = row["dschema"]
        row["created"] = row["created"]
        row["loaded"] = row["created"]
        # Build the nested structures.
        row["geometry1"] = {
            "type": "point",
            "coordinates": row["latlong"]
        }
        row["location1"] = {
            "timestamp": row["timestamp"],
            "geometry": row["geometry1"]
        }
        row["ignition1"] = {
            "timestamp": row["timestamp"],
            "state": row["Value"]
        }
        row["data"] = {
            "engine_status": row["Value"],
            "state": row["Value"],
            "switch_status": row["Value"],
            "latitude": row["lat"],
            "longtitude": row["long"],
            "plantNumber": row["plantno"],
            "ignition": row["ignition1"],
            "location": row["location1"]
        }
        # Remove the original columns and the temporary keys.
        del(row[""])
        del(row["Epoch"])
        del(row["AssetId"])
        del(row["Value"])
        del(row["lat"])
        del(row["long"])
        del(row["serialNumber"])
        del(row["plantno"])
        del(row["msgschem"])
        del(row["dschema"])
        del(row["created"])
        del(row["intEpoch"])
        del(row["timestamp"])
        del(row["latlong"])
        del(row["Id"])
        del(row["Id2"])
        del(row["ignition1"])
        del(row["geometry1"])
        del(row["location1"])
        # Collect the changed row in the list of rows.
        j.append(row)
print(json.dumps(j, indent=4))
A sample of the data looks like this:
Id Epoch AssetId Value lat long serialNumber plantno Id2 schemaV schema msgschem dschema created intEpoch timestamp latlong
0 1538317366 875 0 -1.6478 1.9428 1688889 1042225 168888;1538317366 1 d2xxxage mxxx;v1 Sxxxxs 154900000 15900000 30/09/2018 2:22:46 PM [ -1.647766499999996, 1.9428143 ]
Answer 0 (score: 1)
@Ari is right. Simply:
"state": float(row["Value"])
solves the problem.
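If several columns need converting, the same idea can be wrapped in a small helper that tries int, then float, and otherwise leaves the string alone. This is only a sketch: the name coerce and the three-column sample row are made up for illustration and are not part of the question's code. Note that float() also accepts strings such as "nan" and "inf", which may not be what you want in the JSON output.

```python
def coerce(text):
    """Return text as an int or float if it parses as one, else unchanged."""
    for cast in (int, float):
        try:
            return cast(text)
        except (ValueError, TypeError):
            # Not parseable with this type; try the next one.
            continue
    return text


# Hypothetical row, using a few values from the sample data.
row = {"Value": "0", "lat": "-1.6478", "schema": "d2xxxage"}
converted = {key: coerce(value) for key, value in row.items()}
print(converted)
```

Applying it to every column is convenient, but it will also turn ID-like columns such as serialNumber into numbers, so converting only the columns you name may be safer.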
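Answer 1 (score: 1)
Your code is complicated by putting things into the row dictionary and then having to clean them out again afterwards. I'd do something like: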
import csv
import json

results = []
# "output.csv" is the file name used in the question
with open("output.csv", newline="") as fileobj:
    for row in csv.DictReader(fileobj):
        # parse non-string columns
        value = float(row["Value"])
        latlong = json.loads(row['latlong'])
        # create nested structures
        ignition = {
            "timestamp": row["timestamp"],
            "state": value,
        }
        location = {
            "timestamp": row["timestamp"],
            "geometry": {
                "type": "point",
                "coordinates": latlong,
            }
        }
        # create dict and append to results
        results.append({
            'id': row['Id2'],
            'ideca': row["schemaV"],
            'schem': row["schema"],
            'lore': row["serialNumber"],
            'tore': row["msgschem"],
            'fore': row["dschema"],
            'loaded': row["created"],
            'engine_status': value,
            'state': value,
            'switch_status': value,
            'latitude': row["lat"],
            'longtitude': row["long"],
            'plantNumber': row["plantno"],
            'ignition': ignition,
            'location': location,
        })
print(json.dumps(results, indent=4))
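Note that I'm also parsing the latlong column, which looks like JSON, in the hope that's useful. It's a bit awkward having related things spread through the code, so stuff could get moved around, but hopefully this way it's easier to see what's going on.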
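If more columns turn out to need parsing (lat and long are still strings above), one option is to keep a mapping from column name to parser and apply it to each row. A rough sketch, assuming the file is comma-delimited and the latlong field is quoted; the CONVERTERS name and the cut-down in-memory sample are my own, not from the question:

```python
import csv
import io
import json

# Hypothetical mapping from column name to parser; unlisted columns stay strings.
CONVERTERS = {
    "Value": float,
    "lat": float,
    "long": float,
    "latlong": json.loads,
}

# A tiny in-memory stand-in for output.csv, using values from the sample row.
sample = io.StringIO(
    'Value,lat,long,latlong,schema\n'
    '0,-1.6478,1.9428,"[ -1.647766499999996, 1.9428143 ]",d2xxxage\n'
)

parsed_rows = []
for row in csv.DictReader(sample):
    # Look up the parser for each column, defaulting to str (a no-op here).
    parsed_rows.append(
        {key: CONVERTERS.get(key, str)(value) for key, value in row.items()}
    )

print(json.dumps(parsed_rows, indent=4))
```

The parsed row can then be used in place of row when building the nested dicts, so the numeric values come out as JSON numbers rather than quoted strings.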