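My problem is that the line df_modified['lat'] = df.coordinates.apply(lambda x: x[0]) raises TypeError: 'float' object is not subscriptable. Since "coordinates" is already a list (see the JSON snippet below), I tried to use a lambda to extract element [0] into a new column named "lat" and element [1] into a new column named "long". Any help with this would be greatly appreciated. Thanks!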
import pandas as pd
import json
import requests
from pandas.io.json import json_normalize
# READS IN JSON
source = requests.get('www.url.com')
data = json.loads(source.text)
# Flattens the JSON data since it had nested dictionaries
df = pd.io.json.json_normalize(data)
# Renamed "lat_long.coordinates" because the "." was confusing .apply() function
df.rename(columns={'lat_long.coordinates': 'coordinates'}, inplace=True)
# Created a new data frame with selected columns
df_modified = df.loc[:, ['county_name', 'arrests', 'incident_count']]
# Attempt to create new columns "lat" and "long" and place the elements accordingly, e.g. [-75.802503, 41.820569]
df_modified['lat'] = df.coordinates.apply(lambda x: x[0])
df_modified['long'] = df.coordinates.apply(lambda x: x[1])
print(df_modified.head(30))
Sample JSON snippet
{
    ":@computed_region_amqz_jbr4": "587",
    ":@computed_region_d3gw_znnf": "18",
    ":@computed_region_nmsq_hqvv": "55",
    ":@computed_region_r6rf_p9et": "36",
    ":@computed_region_rayf_jjgk": "295",
    "arrests": "1",
    "county_code": "44",
    "county_code_text": "44",
    "county_name": "Mifflin",
    "fips_county_code": "087",
    "fips_state_code": "42",
    "incident_count": "1",
    "lat_long": {
        "type": "Point",
        "coordinates": [
            -77.620031,
            40.612749
        ]
    }
}
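For reference, here is a minimal sketch of how a record like this gets flattened. It assumes a pandas version where json_normalize is available at the top level as pd.json_normalize (pandas 1.0 or later), and the record is a trimmed copy of the snippet above, keeping only the fields used in the question.

```python
import pandas as pd

# Trimmed copy of the sample record (only the fields used in the question).
record = {
    "arrests": "1",
    "county_name": "Mifflin",
    "incident_count": "1",
    "lat_long": {"type": "Point", "coordinates": [-77.620031, 40.612749]},
}

# Nested dictionaries become dot-separated column names;
# the coordinates list itself stays a list inside a single cell.
df = pd.json_normalize([record])
print(sorted(df.columns))
print(df.loc[0, "lat_long.coordinates"])
```

This is where the "lat_long.coordinates" column name in the code above comes from.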
Answer 0 (score: 0)
You can take a different approach: create the "lat" and "long" columns before filtering the columns.
import pandas as pd
import json
# Load the sample record from a local file
with open('sample.json') as infile:
    data = json.load(infile)
# pd.json_normalize is the top-level name in pandas >= 1.0
# (formerly pd.io.json.json_normalize)
df = pd.json_normalize(data)
df.rename(columns={'lat_long.coordinates': 'coordinates'}, inplace=True)
# Add the new columns to the full frame first
df['lat'] = df['coordinates'].apply(lambda x: x[0])
df['long'] = df['coordinates'].apply(lambda x: x[1])
# Created a new data frame with selected columns
df_modified = df.loc[:, ['county_name', 'arrests', 'incident_count', 'lat', 'long']]
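The TypeError in the question is consistent with some records having no "lat_long" value: after flattening, those rows hold NaN (a float), and x[0] fails on a float. This is an assumption about the full dataset, which is not shown here. Below is a minimal sketch that guards against such rows; the two-record data list and the pick helper are hypothetical, made up only for illustration.

```python
import math
import pandas as pd

# Hypothetical data: the second record has no "lat_long" entry.
data = [
    {"county_name": "Mifflin", "arrests": "1", "incident_count": "1",
     "lat_long": {"type": "Point", "coordinates": [-77.620031, 40.612749]}},
    {"county_name": "Unknown", "arrests": "2", "incident_count": "3"},
]

df = pd.json_normalize(data)
df = df.rename(columns={"lat_long.coordinates": "coordinates"})

def pick(value, index):
    """Hypothetical helper: value[index] for list cells, NaN for missing ones."""
    if isinstance(value, (list, tuple)):
        return value[index]
    return math.nan

# Column names follow the thread's code (index 0 -> "lat", index 1 -> "long").
df["lat"] = df["coordinates"].apply(lambda x: pick(x, 0))
df["long"] = df["coordinates"].apply(lambda x: pick(x, 1))

df_modified = df.loc[:, ["county_name", "arrests", "incident_count", "lat", "long"]]
print(df_modified)
```

Note that GeoJSON Point coordinates are ordered [longitude, latitude], so index 0 in the sample (-77.620031) is the longitude. The sketch keeps the thread's column names; swap the indices if the names should match the values.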