I'm having some trouble with pandas.
I have the following DataFrame:
name random_words
Anne [hello, hi, bye]
John [red, blue, green, yellow, grey, black]
Marie [orange, lemon, pear, apple]
Mark [cat, dog]
I loaded the DataFrame with pd.read_csv(). The problem is that I need the values in the random_words column to be of type set.
I tried astype(), but it did not work.
Answer 0 (score: 1)
Convert the values to lists with strip and split, then convert each list to a set:
print (df)
name random_words
0 Anne [hello, hi, bye]
1 John [red, blue, green, yellow, grey, black]
2 Marie [orange, lemon, pear, apple]
3 Mark [cat, dog]
print (type(df.loc[0,'random_words']))
<class 'str'>
df['random_words'] = df['random_words'].str.strip('[]').str.split(', ').apply(set)
print (df)
name random_words
0 Anne {bye, hi, hello}
1 John {yellow, grey, blue, red, green, black}
2 Marie {pear, lemon, apple, orange}
3 Mark {dog, cat}
Or do the same in a custom lambda function:
df['random_words'] = df['random_words'].apply(lambda x: set(x.strip('[]').split(', ')))
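Since the data comes from pd.read_csv(), the same strip/split parsing could also be applied while loading, through the converters parameter. This is only a minimal sketch: the CSV text, the ';' separator and the parse_words helper are assumptions for illustration, not taken from the question.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for the real file; the ';' separator is an assumption.
csv_text = """name;random_words
Anne;[hello, hi, bye]
Mark;[cat, dog]
"""


def parse_words(cell):
    # Same strip/split steps as above, applied to one raw cell string.
    return set(cell.strip('[]').split(', '))


# converters maps a column name to a function that is called on each raw cell value.
df = pd.read_csv(io.StringIO(csv_text), sep=';',
                 converters={'random_words': parse_words})

print(type(df.loc[0, 'random_words']))
```

With this approach the column already holds set objects right after loading, so no separate conversion step is needed.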
If the words are wrapped in '' quotes (not the case in the sample data, but possible in the real data):
import ast
df['random_words'] = df['random_words'].apply(lambda x: set(ast.literal_eval(x)))
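To make the difference concrete, here is a small self-contained sketch of the ast.literal_eval variant. The sample frame is made up for illustration; it assumes every cell is a valid Python list literal with quoted words. Unquoted words such as [cat, dog] are not valid literals, so literal_eval rejects them.

```python
import ast

import pandas as pd

# Made-up sample: each cell is the string form of a Python list with quoted words.
df = pd.DataFrame({
    'name': ['Anne', 'Mark'],
    'random_words': ["['hello', 'hi', 'bye']", "['cat', 'dog']"],
})

# literal_eval safely parses the string into a real list, which set() then converts.
df['random_words'] = df['random_words'].apply(lambda x: set(ast.literal_eval(x)))

# Unquoted words are parsed as names, not literals, so this raises ValueError.
try:
    ast.literal_eval('[cat, dog]')
    unquoted_ok = True
except ValueError:
    unquoted_ok = False

print(df.loc[1, 'random_words'], unquoted_ok)
```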
If the values are already lists:
print (type(df.loc[0,'random_words']))
<class 'list'>
df['random_words'] = df['random_words'].apply(set)
Edit:
If you get the error below, the cause is most likely missing values:
print (df)
name random_words
0 Anne NaN
1 John [red, blue, green, yellow, grey, black]
2 Marie [orange, lemon, pear, apple]
3 Mark [cat, dog]
df['random_words'] = df['random_words'].str.strip('[]').str.split(', ').apply(set)
print (df)
TypeError: 'float' object is not iterable
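You can cast the column to strings first, but the missing row then becomes a set containing the string repr of NaN, 'nan' (which may be perfectly fine, depending on what you need):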
df['random_words'] = df['random_words'].astype(str).str.strip('[]').str.split(', ').apply(set)
print (df)
name random_words
0 Anne {nan}
1 John {yellow, grey, blue, red, green, black}
2 Marie {pear, lemon, apple, orange}
3 Mark {dog, cat}
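If a {nan} entry is not wanted, one possible alternative is to map missing values to an empty set instead. The to_set helper below is hypothetical and only a sketch: it assumes every non-missing cell is a string like [cat, dog], and it also treats an empty [] cell as an empty set, since strip and split alone would turn it into a set holding one empty string.

```python
import numpy as np
import pandas as pd

# Made-up sample with one missing value, as in the example above.
df = pd.DataFrame({
    'name': ['Anne', 'Mark'],
    'random_words': [np.nan, '[cat, dog]'],
})


def to_set(value):
    # Hypothetical helper: anything that is not a string (e.g. NaN) becomes an empty set.
    if not isinstance(value, str):
        return set()
    inner = value.strip('[]')
    # Guard against '[]', which would otherwise split into [''].
    return set(inner.split(', ')) if inner else set()


df['random_words'] = df['random_words'].apply(to_set)
print(df)
```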
Answer 1 (score: 0)
import pandas as pd

df = pd.DataFrame({"name": ["Anne", "John", "Marie", "Mark"],
"random_words":[["hello", "hi", "bye"],
["red", "blue", "green", "yellow", "grey", "black"],
["orange", "lemon", "pear", "apple"],
["cat", "dog"]]})
df['random_words'] = df['random_words'].apply(set)
df
name random_words
0 Anne {hi, bye, hello}
1 John {blue, yellow, green, black, red, grey}
2 Marie {orange, pear, apple, lemon}
3 Mark {cat, dog}