I have a column in a df that looks like this:
df = pd.DataFrame(["[u'one_element']", "[u'two_elememts', u'two_elements']", "[u'three_elements', u'three_elements', u'three_elements']"])
0
0 [u'one_element']
1 [u'two_elememts', u'two_elements']
2 [u'three_elements', u'three_elements', u'three_elements']
The elements are strings:
type(df[0].iloc[2]) == str
The end result I want is:
0
0 one_element
1 two_elememts, two_elements
2 three_elements, three_elements, three_elements
I tried:
df[column] = df[column].map(lambda x: x.lstrip('[u').rstrip(']').replace("u'","").replace("'",""))
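But this is obviously slow when you have a lot of rows.
Is there a better way? The df has many columns of different types: strings, integers, floats.
Thanks!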
Answer 0 (score: 3)
You can use a regular expression together with strip:
# regex=True is needed in pandas 2.0+, where str.replace defaults to regex=False
df[0] = df[0].str.strip("[]").str.replace("u'|'", '', regex=True)
0 one_element
1 two_elememts, two_elements
2 three_elements, three_elements, three_elements
Name: 0, dtype: object
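One caveat with the pattern above: `u'` also matches when an element happens to end in the letter u (for example `menu'`), so that trailing u gets dropped. Below is a minimal sketch of one way around this, assuming every `u` prefix starts at a word boundary. The sample data containing `menu` is made up for illustration.

```python
import pandas as pd

# Made-up sample; "menu" ends in "u", which trips up the plain "u'|'" pattern.
df = pd.DataFrame({0: ["[u'one_element']", "[u'menu', u'two_elements']"]})

stripped = df[0].str.strip("[]")

# Plain pattern: also removes the "u'" at the end of "menu'".
naive = stripped.str.replace(r"u'|'", "", regex=True)

# \b only lets "u'" match when the "u" starts a word, i.e. when it is a prefix.
cleaned = stripped.str.replace(r"\bu'|'", "", regex=True)

print(naive.tolist())
print(cleaned.tolist())
```

This still misfires if an element contains a standalone u right before a quote, so parsing the string as a literal is the safer route when the data is not under your control.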
Answer 1 (score: 1)
You don't need map; you can use the str accessor of a pandas Series:
(df[0].str.lstrip('[u')
.str.rstrip(']')
.str.replace("u'","")
.str.replace("'",""))
This gives the same result without using map:
0 one_element
1 two_elememts, two_elements
2 three_elements, three_elements, three_elements
Name: 0, dtype: object
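Worth noting about the chain above: `lstrip('[u')` treats its argument as a set of characters rather than a prefix, so it removes any leading `[` or `u` and stops at the first quote. That only takes care of the first `u` prefix; the later ones are left for the `replace` calls. A small sketch with made-up sample data:

```python
import pandas as pd

# Made-up sample; the second row has an element that itself starts with "u".
s = pd.Series(["[u'one_element']", "[u'unit', u'two']"])

# Strips leading '[' and 'u' characters, stopping at the first other character.
step1 = s.str.lstrip("[u")

# Strips trailing ']' characters.
step2 = step1.str.rstrip("]")

print(step1.tolist())
print(step2.tolist())
```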
Answer 2 (score: 1)
Use the ast module.
import pandas as pd
import ast
df = pd.DataFrame(["[u'one_element']", "[u'two_elememts', u'two_elements']", "[u'three_elements', u'three_elements', u'three_elements']"])
print(df[0].apply(lambda x: ", ".join(ast.literal_eval(x))))
Output:
0 one_element
1 two_elememts, two_elements
2 three_elements, three_elements, three_elements
Name: 0, dtype: object
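Since the question mentions a df with mixed column types, here is a sketch of the same idea that leaves missing or non-list values alone instead of raising. `join_list_literal` is a hypothetical helper name, and the sample rows are invented for illustration.

```python
import ast

import pandas as pd


def join_list_literal(value, sep=", "):
    """Hypothetical helper: parse a list literal string and join its items."""
    if not isinstance(value, str):
        return value  # leave missing and other non-string values untouched
    try:
        parsed = ast.literal_eval(value)
    except (ValueError, SyntaxError):
        return value  # not a valid Python literal; keep the original text
    if isinstance(parsed, (list, tuple)):
        return sep.join(str(item) for item in parsed)
    return value  # a valid literal, but not a list; keep the original text


df = pd.DataFrame(
    {0: ["[u'one_element']", "[u'menu', u'two_elements']", None, "not a list"]}
)
result = df[0].apply(join_list_literal)
print(result.tolist())
```

Because `ast.literal_eval` actually parses the string, elements that contain quotes or end in u come through intact, at the cost of a Python-level call per row.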