Convert a list of words (inside a DataFrame) into a set of words

Time: 2018-12-23 19:53:37

Tags: python pandas dataframe

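In my DataFrame I have a column whose values look like [cell, protein, expression], and I want to convert each one into a plain sequence of words such as cell, protein, expression. The conversion should apply to the whole DataFrame column. Please suggest a workable approach.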

2 answers:

Answer 0 (score: 0)

The problem is that the values in df['Final_Text'] are strings, not lists. Try converting them with ast.literal_eval first:

import ast
from io import StringIO

import pandas as pd

# your sample df

s = """
,Final_Text
0,"['study', 'response', 'cell']"
1,"['cell', 'protein', 'effect']"
2,"['cell', 'patient', 'expression']"
3,"['patient', 'cell', 'study']"
4,"['study', 'cell', 'activity']"
"""

df = pd.read_csv(StringIO(s))

# convert each string representation of a list into an actual list
df['Final_Text'] = df['Final_Text'].apply(ast.literal_eval)

# use a lambda with join to turn each list into one comma-separated string
df['Final_Text'] = df['Final_Text'].apply(lambda x: ', '.join(x))

    Unnamed: 0      Final_Text
0      0            study, response, cell
1      1            cell, protein, effect
2      2            cell, patient, expression
3      3            patient, cell, study
4      4            study, cell, activity
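
As a possible alternative to `apply` with a lambda, the join step can also be written with the vectorized `Series.str.join`. This is a minimal sketch, assuming the column holds string representations of lists as in the sample above; the two rows are copied from that sample, and the intermediate name `parsed` is only illustrative.

```python
import ast

import pandas as pd

# Two rows copied from the sample above; each value is a string that looks like a list
df = pd.DataFrame({
    "Final_Text": [
        "['study', 'response', 'cell']",
        "['cell', 'protein', 'effect']",
    ]
})

# Parse each string into a real Python list
parsed = df["Final_Text"].apply(ast.literal_eval)

# Join the elements of each list into one comma-separated string
df["Final_Text"] = parsed.str.join(", ")

print(df["Final_Text"].tolist())
```

Both forms should give the same strings; `str.join` simply avoids writing the lambda by hand.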

Answer 1 (score: 0)

Try:

data['column_name'] = data['column_name'].apply(lambda x: ', '.join(x))
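
Note that this one-liner works only when each cell already holds a real list. If a cell is a string such as "['cell', 'protein']", `join` iterates over its characters instead of its words. The sketch below shows one way to handle both cases; the helper name `join_words` and the sample rows are hypothetical and not part of the original answers.

```python
import ast

import pandas as pd


def join_words(value):
    """Return the words in `value` as one comma-separated string.

    Hypothetical helper: accepts either a real list of words or the
    string representation of such a list.
    """
    if isinstance(value, str):
        # "['a', 'b']" -> ['a', 'b']
        value = ast.literal_eval(value)
    return ", ".join(value)


# Illustrative data: one real list and one list stored as a string
data = pd.DataFrame({
    "column_name": [
        ["cell", "protein", "expression"],
        "['patient', 'cell', 'study']",
    ]
})

data["column_name"] = data["column_name"].apply(join_words)
print(data["column_name"].tolist())
```

Checking the cell type first keeps the call safe when it is unclear whether the column was loaded from a CSV (strings) or built in memory (lists).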