Converting elements in a DataFrame to strings

Asked: 2018-08-04 12:34:01

Tags: python-3.x pandas numpy dataframe

I want to convert the bytes values in a DataFrame column to strings.

data['CleanedText'].head()
0    b'witti littl book make son laugh loud recit c...
1    b'grew read sendak book watch realli rosi movi...
2    b'fun way children learn month year learn poem...
3    b'great littl book read nice rhythm well good ...
4    b'book poetri month year goe month cute littl ...
Name: CleanedText, dtype: object

I am currently doing this with a plain for loop, but the conversion takes too long.

# Slow: decodes and writes one element at a time
for i, value in enumerate(data['CleanedText']):
    data.loc[i, 'newtext'] = value.decode('utf-8')

Since NumPy computations are fast, is it possible to use numpy to convert the bytes to strings?

2 answers:

Answer 0 (score: 0)

You can use apply() with a lambda function:

data['newtext'] = data['CleanedText'].apply(lambda x: x.decode('utf-8'))
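
For anyone who wants to try this in isolation, here is a minimal self-contained sketch of the apply() approach. The two-row DataFrame is made-up sample data standing in for the asker's real `data`; only the column names come from the question.

```python
import pandas as pd

# Hypothetical sample data standing in for the asker's DataFrame.
data = pd.DataFrame(
    {"CleanedText": [b"witti littl book", b"grew read sendak book"]}
)

# Decode each bytes value and assign the result as a whole new column,
# instead of writing one element at a time as in the original loop.
data["newtext"] = data["CleanedText"].apply(lambda x: x.decode("utf-8"))

print(data["newtext"].tolist())
```

apply() still calls the lambda once per row in Python, so it is not truly vectorized, but it avoids the repeated per-element assignment into the DataFrame.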

Answer 1 (score: 0)

You can use str.decode:

>>> df.CleanedText.str.decode('utf-8')

0    witti littl book make son laugh loud recit c...
1    grew read sendak book watch realli rosi movi...
2    fun way children learn month year learn poem...
3    great littl book read nice rhythm well good ...
4    book poetri month year goe month cute littl ...
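
To store the decoded values rather than just display them, the result of str.decode can be assigned back to a column. The sketch below also shows one possible NumPy route, since the question asked about it: np.char.decode works on fixed-width bytes arrays, so the object column is first cast with astype("S"). The sample DataFrame and the names `raw` and `decoded` are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data standing in for the asker's DataFrame.
data = pd.DataFrame(
    {"CleanedText": [b"witti littl book", b"grew read sendak book"]}
)

# pandas string accessor: decodes every bytes value in the column.
data["newtext"] = data["CleanedText"].str.decode("utf-8")

# NumPy route (an assumption about what the asker had in mind):
# np.char.decode expects a bytes-dtype array, not an object array,
# so cast the column to a fixed-width bytes array first.
raw = data["CleanedText"].to_numpy().astype("S")
decoded = np.char.decode(raw, "utf-8")

print(data["newtext"].tolist())
print(decoded.tolist())
```

Whether the NumPy version is actually faster depends on the data (the fixed-width cast pads every entry to the longest string), so it is worth timing both on the real column before choosing one.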