I have a dataframe that looks like this:
id genres
1 [{'id': 35, 'name': 'Comedy'}]
2 [{'id': 35, 'name': 'Comedy'}, {'id': 18, 'name': 'Drama'}, {'id': 10751, 'name': 'Family'}, {'id': 10749, 'name': 'Romance'}]
3 [{'id':31, 'name':'Romance'}]
I want to extract the genre names from each row and store them in a list. For example:
id genres
1 ['Comedy']
2 ['Comedy','Drama','Family','Romance']
3 ['Romance']
I tried:
[j['name'] for i in data['genres'] for j in i]
But this writes the names from all rows into a single list.
Answer 0: (score: 3)
Use apply. For example:
import pandas as pd
df = pd.DataFrame({"genres": [[{'id': 35, 'name': 'Comedy'}],[{'id': 35, 'name': 'Comedy'}, {'id': 18, 'name': 'Drama'}, {'id': 10751, 'name': 'Family'}, {'id': 10749, 'name': 'Romance'}],[{'id':31, 'name':'Comedy'}]]})
df["genres"] = df["genres"].apply(lambda x: [i["name"] for i in x])
print(df)
Output:
genres
0 [Comedy]
1 [Comedy, Drama, Family, Romance]
2 [Comedy]
Answer 1: (score: 1)
Use a nested list comprehension:
data['genres'] = [[j['name'] for j in i] for i in data['genres']]
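To see why the nesting matters, here is a small sketch that uses plain Python lists in place of the data['genres'] column (the rows variable is a made-up stand-in, not part of the original question). The flat comprehension from the question keeps both for clauses inside one pair of brackets, so it produces a single list for all rows; moving the inner loop into its own brackets produces one list per row.

```python
# Stand-in for data['genres']: one list of dicts per row (hypothetical sample).
rows = [
    [{'id': 35, 'name': 'Comedy'}],
    [{'id': 35, 'name': 'Comedy'}, {'id': 18, 'name': 'Drama'}],
    [{'id': 31, 'name': 'Romance'}],
]

# Flat: both loops share one pair of brackets, so every name lands in one list.
flat = [j['name'] for i in rows for j in i]

# Nested: the inner comprehension builds a separate list for each row.
nested = [[j['name'] for j in i] for i in rows]

print(flat)    # ['Comedy', 'Comedy', 'Drama', 'Romance']
print(nested)  # [['Comedy'], ['Comedy', 'Drama'], ['Romance']]
```

Because the nested result has exactly one entry per row, it can be assigned back to the column directly.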
For a more general solution, it is better to use get: if the name key is missing, it does not fail but returns None or another specified value:
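# Missing 'name' keys become None:
data['genres'] = [[j.get('name') for j in i] for i in data['genres']]
# Alternative (run instead of the line above): missing 'name' keys become 'missing':
data['genres'] = [[j.get('name', 'missing') for j in i] for i in data['genres']]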
print (data)
id genres
0 1 [Comedy]
1 2 [Comedy, Drama, Family, Romance]
2 3 [Romance]
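As a small sketch of how get differs from plain indexing, assume a row in which one dict lacks the name key (the sample row below is invented for illustration and does not come from the question's data):

```python
# Hypothetical row: the second dict has no 'name' key.
row = [{'id': 35, 'name': 'Comedy'}, {'id': 18}]

# dict.get returns None when the key is absent...
with_none = [d.get('name') for d in row]
# ...or the supplied default, if one is given.
with_default = [d.get('name', 'missing') for d in row]

# Plain indexing raises KeyError on the dict without 'name'.
try:
    [d['name'] for d in row]
    raised = False
except KeyError:
    raised = True

print(with_none)     # ['Comedy', None]
print(with_default)  # ['Comedy', 'missing']
print(raised)        # True
```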
Answer 2: (score: 0)
Another possible approach is to use apply():
df['genres'] = df['genres'].apply(lambda x: [d.get('name') for d in x])