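I want to replace the values of nested fields in a JSON file using a Spark DataFrame and then save the result as another file. By that I mean the entire content together with the changed values, not only the changed values. I have tried several approaches; withColumn seems to be the right way, but I could not get it to work. I would appreciate help from the experts. Three fields need to change: feed.ip, ip_location.geo_point.lat and ip_location.geo_point.lon. A sample record: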
{
  "device": {
    "browser": "Chrome 62.0",
    "operatingsystemversion": "10"
  },
  "feed": {
    "environment": "prod",
    "ip": "106.223.93.50",
    "useragent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/62.0.3202.94 Safari/537.36"
  },
  "ip_location": {
    "country_name": "India",
    "geo_point": {
      "lat": 12.983,
      "lon": 77.583
    },
    "postal_code": "",
    "region_name": "Karnataka"
  },
  "tag": {
    "browser_timestamp": "1512118862384",
    "path": {
      "crid": "833575ed-06d6-466d-b72e-024fabd054cc",
      "truncated_url": "/document"
    },
    "prft": "4477",
    "ttfb": "2360",
    "url": {
      "crid": "833575ed-06d6-466d-b72e-024fabd054cc"
    },
    "urlref": {
      "crid": "e31d9e7f-1425-4809-ba64-131a686449db"
    },
    "user_color_depth": "24",
    "topicparentguid": ""
  }
}
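What I tried:
myjson = spark.read.json(local_path)
myjson.select("feed.ip").show()
# This is the call that does not work: withColumn expects a Column as its
# second argument (not a plain string), and its result is not assigned to anything.
myjson.withColumn("feed.ip", "YYY")
myjson.select("feed.ip").show()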