Question

I have a column named 'market_cap_(in_us_$)' whose values look like this:

$5.41 
$18,160.50 
$9,038.20 
$8,614.30 
$368.50 
$2,603.80 
$6,701.50 
$8,942.40

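My end goal is to be able to filter on a specific numeric value (for example, > 2000.00).

From reading other questions on this site, I followed these instructions: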

cleaned_data['market_cap_(in_us_$)'].replace( '$', '', regex = True ).astype(float)

However, I get the following error:

TypeError: replace() got an unexpected keyword argument 'regex'

If I remove "regex = True" from the replace arguments, I get:

ValueError: could not convert string to float: $5.41

So, what should I do?

Answer 1

Here is the correct regular expression to use, since you want to remove both $ and ,:

In [7]: df['market_cap_(in_us_$)'].replace(r'[\$,]', '', regex=True).astype(float)
Out[7]:
0        5.41
1    18160.50
2     9038.20
3     8614.30
4      368.50
5     2603.80
6     6701.50
7     8942.40
Name: market_cap_(in_us_$), dtype: float64

However, since you are getting the unexpected keyword argument 'regex' error, you must be using a very old version of pandas and should update.
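To tie this back to the stated goal of filtering on a numeric value, here is a minimal sketch that cleans the column and then applies a boolean mask. It assumes a reasonably recent pandas where Series.replace accepts regex=True; the DataFrame is rebuilt from the sample values in the question, and the name large_caps is only illustrative.

```python
import pandas as pd

col = "market_cap_(in_us_$)"

# Sample data mirroring the values shown in the question.
df = pd.DataFrame({
    col: [
        "$5.41", "$18,160.50", "$9,038.20", "$8,614.30",
        "$368.50", "$2,603.80", "$6,701.50", "$8,942.40",
    ]
})

# Strip "$" and "," with a character class, then convert to float.
df[col] = df[col].replace(r"[\$,]", "", regex=True).astype(float)

# Boolean-mask filter: keep only rows whose value exceeds 2000.00.
large_caps = df[df[col] > 2000.00]
print(large_caps)
```

Note that replace returns a new Series, so the result has to be assigned back to the column before filtering.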

Answer 2

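The problem is that $ is a special character in regular expressions: it anchors the end of the string, so replacing it matches only the empty position at the end of the string and removes nothing!

You have to escape it so that it is treated as a literal $ (and strip the , as well), using replace on the Series with regex=True:

In [11]: s.replace(r'\$|,', '', regex=True)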
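The anchor behaviour can be checked without pandas. The following sketch uses the standard library re module on one of the sample values; the variable names are illustrative only.

```python
import re

value = "$18,160.50"

# Unescaped "$" is an anchor: it matches the empty position at the end
# of the string, so substituting "" leaves the text unchanged.
unescaped = re.sub("$", "", value)

# Escaped "\$" matches the literal dollar sign.
escaped = re.sub(r"\$", "", value)

# Inside a character class "$" is literal, so "[$,]" strips both symbols.
cleaned = re.sub(r"[$,]", "", value)

print(unescaped)       # $18,160.50
print(escaped)         # 18,160.50
print(float(cleaned))  # 18160.5
```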
Out[11]:
0        5.41
1    18160.50
2     9038.20
3     8614.30
4      368.50
5     2603.80
6     6701.50
7     8942.40
dtype: object

In [12]: s.replace(r'\$|,', '', regex=True).astype('float64')
Out[12]:
0        5.41
1    18160.50
2     9038.20
3     8614.30
4      368.50
5     2603.80
6     6701.50
7     8942.40
dtype: float64

You may prefer to work in whole cents rather than floating-point dollars (by also removing the literal .):

In [13]: s.replace(r'\$|,|\.', '', regex=True).astype('int64')
Out[13]:
0        541
1    1816050
2     903820
3     861430
4      36850
5     260380
6     670150
7     894240
dtype: int64
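One caveat: deleting the decimal point yields cents only when every value has exactly two decimal places, as the sample data does. A more defensive sketch, assuming inputs may have zero, one, or two decimals, uses decimal.Decimal; the helper to_cents is hypothetical and not part of pandas.

```python
from decimal import Decimal


def to_cents(text):
    """Convert a string such as "$18,160.50" to an integer number of cents.

    Hypothetical helper for illustration; not a pandas function.
    """
    # Remove the currency symbol and thousands separators.
    cleaned = text.replace("$", "").replace(",", "")
    # Decimal keeps the value exact; multiply by 100 and round to an integer.
    return int((Decimal(cleaned) * 100).to_integral_value())


print(to_cents("$18,160.50"))  # 1816050
print(to_cents("$5.4"))        # 540
print(to_cents("$5"))          # 500
```

If this fits your data, it could presumably be applied to the column with s.map(to_cents).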

Error when trying to convert a string to float in Python Pandas
