One column of my DataFrame is
data['countries']
"[{'iso_3166_1': 'KR', 'name': 'South Korea'}]"
"[{'iso_3166_1': 'US', 'name': 'United States of America'}]"
How can I extract only the country names ('South Korea', 'United States of America', and so on)?
Answer 0 (score: 2)
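import json
# Skip missing values (NaN is a float, not a str), then swap the quote style so each string parses as JSON.
countries = [json.loads(c.replace("'", '"')) for c in data['countries'] if isinstance(c, str)]
# Take the name of the first entry in each parsed list.
country_names = [cn[0]['name'] for cn in countries]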
The output will be:
>>> ['South Korea', 'United States of America']
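The quote replacement above can break when a name itself contains an apostrophe, because the swapped string is then no longer valid JSON. A minimal sketch of an alternative, assuming every cell holds a Python-literal list of dicts with a 'name' key: parse with `ast.literal_eval` instead. The helper name `first_country_names` and the sample values are hypothetical, for illustration only.

```python
import ast

def first_country_names(cells):
    """Return the first country's name from each parsable cell."""
    names = []
    for cell in cells:
        # Missing values (None, float NaN) are not strings, so skip them.
        if not isinstance(cell, str):
            continue
        parsed = ast.literal_eval(cell)  # evaluates Python literals only
        if parsed:  # ignore empty lists
            names.append(parsed[0]["name"])
    return names

# Hypothetical sample cells, including a missing value and an apostrophe.
sample = [
    "[{'iso_3166_1': 'KR', 'name': 'South Korea'}]",
    float("nan"),
    "[{'iso_3166_1': 'CI', 'name': \"Cote d'Ivoire\"}]",
]
print(first_country_names(sample))
```

The same function should accept `data['countries']` directly, since it only iterates over the values.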
Answer 1 (score: 1)
If you don't want to modify the DataFrame and only need to parse the contents of the strings it holds, you can use split.
For example:
>>> a = "[{'iso_3166_1': 'KR', 'name': 'South Korea'}]"
>>> a.split("'name': ")[1].split("'")[1]
'South Korea'
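The split approach returns only the first name in the string. A small sketch of a variant that collects every name with a regular expression, assuming each name is wrapped in single quotes and contains no apostrophe; `NAME_PATTERN` and `country_names` are hypothetical names chosen for this example.

```python
import re

# Captures the single-quoted value that follows each 'name': key.
NAME_PATTERN = re.compile(r"'name':\s*'([^']*)'")

def country_names(text):
    """Return every single-quoted name found in the raw string."""
    return NAME_PATTERN.findall(text)

a = ("[{'iso_3166_1': 'KR', 'name': 'South Korea'}, "
     "{'iso_3166_1': 'US', 'name': 'United States of America'}]")
print(country_names(a))
```

Like split, this works on the raw text and never evaluates it, but it will miss names that are quoted differently.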
Answer 2 (score: 0)
This should work:
import ast
# ast.literal_eval parses the list literal without running arbitrary code, unlike eval.
data['countries'] = data['countries'].apply(ast.literal_eval)
data['countries'].apply(lambda x: x[0]['name'])
Output:
0 South Korea
1 United States of America
Name: countries, dtype: object
list(data['countries'].apply(lambda x: x[0]['name']))
Output:
['South Korea', 'United States of America']
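Indexing with `x[0]` keeps only the first country and raises an error on an empty list. A rough sketch of one way to keep all names per row, assuming pandas is installed and every cell is a string holding a list literal; the `country_names` column and the `names_in_cell` helper are hypothetical.

```python
import ast
import pandas as pd

# Hypothetical sample frame with one, two, and zero countries per row.
data = pd.DataFrame({
    "countries": [
        "[{'iso_3166_1': 'KR', 'name': 'South Korea'}]",
        "[{'iso_3166_1': 'US', 'name': 'United States of America'}, "
        "{'iso_3166_1': 'KR', 'name': 'South Korea'}]",
        "[]",
    ]
})

def names_in_cell(cell):
    """Parse one cell and return the names of all countries in it."""
    return [entry["name"] for entry in ast.literal_eval(cell)]

# Store the result in a new column so the original strings stay untouched.
data["country_names"] = data["countries"].apply(names_in_cell)
print(data["country_names"].tolist())
```

Each row then holds a list of names, which is empty when the cell had no countries.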