I am trying to extract the Market Cap from this column as a float:
Company Info
Workhorse Group, Inc. (WKHS) Market Cap: $65.94M
Xencor, Inc. (XNCR) Market Cap: $1.99B
Zillow Group, Inc. (ZG) Market Cap: $10.28B
Zillow Group, Inc. (Z) Market Cap: $10.17B
Zogenix, Inc. (ZGNX) Market Cap: $1.99B
Desired output:
Market Cap
65940000.00
1990000000.00
10280000000.00
10170000000.00
1990000000.00
I can get the number with this (there is probably a better way):
df['market_cap'] = df['Company Info'].str.split('$').str.get(1).str[:-1]
market_cap
1.13B
283.56M
763.51M
231.31M
1.3B
But I need it as a float, scaled by a multiplier based on the M or B at the end of the Company Info column:
multiplier = {'M': 1e6, 'B': 1e9}
Answer 0 (score: 2)
Extract market_cap basically as you did, but also convert it to float:
df['market_cap'] = df['Company Info'].str.split('$').str.get(1).str[:-1].astype(float)
Use a regular expression to extract the multiplier suffix:
df['multiplier'] = df['Company Info'].str.extract(r'\d+\.\d+(\w)', expand=False)
Multiply your market cap by the multiplier, using the mapping you provided:
df['Market Cap'] = df.market_cap.mul(df['multiplier'].map({'M': 1e6, 'B': 1e9}))
>>> df['Market Cap']
0 6.594000e+07
1 1.990000e+09
2 1.028000e+10
3 1.017000e+10
4 1.990000e+09
Name: Market Cap, dtype: float64
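For anyone who wants to run the three steps above end to end, here is a minimal self-contained sketch. The DataFrame is rebuilt from the five rows shown in the question. The variable names `amount` and `suffix` are my own, and `expand=False` is added so that `str.extract` returns a Series rather than a one-column DataFrame.

```python
import pandas as pd

# Sample rows copied from the question.
df = pd.DataFrame({
    "Company Info": [
        "Workhorse Group, Inc. (WKHS) Market Cap: $65.94M",
        "Xencor, Inc. (XNCR) Market Cap: $1.99B",
        "Zillow Group, Inc. (ZG) Market Cap: $10.28B",
        "Zillow Group, Inc. (Z) Market Cap: $10.17B",
        "Zogenix, Inc. (ZGNX) Market Cap: $1.99B",
    ]
})

# Step 1: text after the "$", minus the trailing suffix letter, as a float.
amount = df["Company Info"].str.split("$").str.get(1).str[:-1].astype(float)

# Step 2: the suffix letter that follows the decimal number.
# expand=False makes str.extract return a Series for a single group.
suffix = df["Company Info"].str.extract(r"\d+\.\d+(\w)", expand=False)

# Step 3: scale each amount by the multiplier for its suffix.
df["Market Cap"] = amount.mul(suffix.map({"M": 1e6, "B": 1e9}))

print(df["Market Cap"])
```

A suffix that is missing from the mapping would map to NaN, so the resulting Market Cap for that row would be NaN as well.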
Here is the same thing as a one-liner:
df['Market Cap'] = (df['Company Info'].str.split('$')
                    .str.get(1).str[:-1]
                    .astype(float)
                    .mul(df['Company Info']
                         # expand=False returns a Series, which .map with a dict requires
                         .str.extract(r'\d+\.\d+(\w)', expand=False)
                         .map({'M': 1e6, 'B': 1e9})))
>>> df
Company Info Market Cap
0 Workhorse Group, Inc. (WKHS) Market Cap: $65.94M 6.594000e+07
1 Xencor, Inc. (XNCR) Market Cap: $1.99B 1.990000e+09
2 Zillow Group, Inc. (ZG) Market Cap: $10.28B 1.028000e+10
3 Zillow Group, Inc. (Z) Market Cap: $10.17B 1.017000e+10
4 Zogenix, Inc. (ZGNX) Market Cap: $1.99B 1.990000e+09
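One caveat: the pattern `\d+\.\d+` only matches amounts that contain a decimal point. Since the slicing with `[:-1]` already assumes the suffix is the last character, the suffix can also be read with `.str[-1]`, which avoids the regex entirely. The sketch below assumes that every row ends in M or B. The `Example Corp.` row is hypothetical and is not part of the question's data.

```python
import pandas as pd

df = pd.DataFrame({
    "Company Info": [
        "Workhorse Group, Inc. (WKHS) Market Cap: $65.94M",
        "Xencor, Inc. (XNCR) Market Cap: $1.99B",
        "Example Corp. (EXMP) Market Cap: $5B",  # hypothetical row, no decimal point
    ]
})

# Everything after the "$", e.g. "65.94M".
tail = df["Company Info"].str.split("$").str.get(1)

# Number part (all but the last character) times the multiplier
# looked up from the last character.
df["Market Cap"] = (tail.str[:-1].astype(float)
                    * tail.str[-1].map({"M": 1e6, "B": 1e9}))

print(df["Market Cap"])
```

This trades the regex for a positional assumption, so it would break if a row carried trailing whitespace or extra text after the suffix.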
Answer 1 (score: 1)
Use str.extract, replace, and prod:
(df['Company Info'].str.extract(r'\$([\d\.]+)([MB])')
    .replace({'M': 1e6, 'B': 1e9})
    .astype(float).prod(axis=1)
)
0 6.594000e+07
1 1.990000e+09
2 1.028000e+10
3 1.017000e+10
4 1.990000e+09
Name: 1, dtype: float64
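A possible variant of the same idea, sketched here with named capture groups and `Series.map` in place of `replace` and `prod`, keeps the intermediate columns readable. The group names `amount` and `unit` and the variable `parts` are my own choices, not part of the answer above.

```python
import pandas as pd

# Sample rows copied from the question.
df = pd.DataFrame({
    "Company Info": [
        "Workhorse Group, Inc. (WKHS) Market Cap: $65.94M",
        "Xencor, Inc. (XNCR) Market Cap: $1.99B",
        "Zillow Group, Inc. (ZG) Market Cap: $10.28B",
        "Zillow Group, Inc. (Z) Market Cap: $10.17B",
        "Zogenix, Inc. (ZGNX) Market Cap: $1.99B",
    ]
})

# Two named groups produce a DataFrame with columns "amount" and "unit".
parts = df["Company Info"].str.extract(r"\$(?P<amount>[\d.]+)(?P<unit>[MB])")

# Convert the number to float and scale it by the unit's multiplier.
df["Market Cap"] = (parts["amount"].astype(float)
                    * parts["unit"].map({"M": 1e6, "B": 1e9}))

print(df)
```

Rows that do not match the pattern come back as NaN in both columns of `parts`, so they end up as NaN in Market Cap instead of raising an error.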