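I am trying to sort stock symbols by market capitalization. I tried the code below, but the list is not sorted correctly. Is there a simple way to remove the $ and convert the M and B suffixes to numbers?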
library(TTR)
listings <- stockSymbols()
listings <- listings[order(as.numeric(listings$MarketCap),decreasing=TRUE),]
head(listings,20)
I would appreciate your help.
Answer 0 (score: 2)
There may be a package that provides a function for this conversion, e.g. from "$23.93M" to 23930000.00, but here is one way to do it using base R functions:
listings$MktCap <- as.numeric(
sub("\\$(\\d+(\\.\\d+)?)[A-Z]?", "\\1", listings$MarketCap)) *
ifelse(gsub("[^A-Z]", "", listings$MarketCap) == "M", 1e6,
ifelse(gsub("[^A-Z]", "", listings$MarketCap) == "B", 1e9, 1.0))
head(listings[order(listings$MktCap, decreasing = TRUE),], 5)
# Symbol Name LastSale MarketCap IPOyear Sector
#382 AAPL Apple Inc. 94.69 $525.02B 1980 Technology
#1637 GOOGL Alphabet Inc. 717.29 $493.72B NA Technology
#1636 GOOG Alphabet Inc. 695.85 $478.97B 2004 Technology
#2238 MSFT Microsoft Corporation 51.18 $404.8B 1986 Technology
#6664 XOM Exxon Mobil Corporation 81.23 $338.16B NA Energy
#
# Industry Exchange MktCap
#382 Computer Manufacturing NASDAQ 525020000000
#1637 Computer Software: Programming, Data Processing NASDAQ 493720000000
#1636 Computer Software: Programming, Data Processing NASDAQ 478970000000
#2238 Computer Software: Prepackaged Software NASDAQ 404800000000
#6664 Integrated oil Companies NYSE 338160000000
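In short:
sub("\\$(\\d+(\\.\\d+)?)[A-Z]?", "\\1", listings$MarketCap) extracts only the decimal number part of MarketCap, e.g. 525.02 from $525.02B; the result is passed to as.numeric.
gsub("[^A-Z]", "", listings$MarketCap) removes everything except uppercase letters, which, as far as I can tell, should leave only B or M.
In the ifelse statement, the result of the expression above maps "B" to 1e9, "M" to 1e6, and all other values to 1.
At this point, MktCap should be the correct numeric representation of MarketCap, and sorting is straightforward.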
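The sorting in the question fails because as.numeric() returns NA (with a warning) for strings such as "$525.02B", so order() has nothing meaningful to sort on. The sketch below wraps the same base R steps in a small helper so they can be tried without downloading any listings. The sample vector and the name parse_market_cap are made up for illustration and are not part of TTR; the pattern is also anchored with ^ and $, a small change from the code above, so that values that do not match the expected format come out as NA.

```r
# Hypothetical sample values in the same format as listings$MarketCap
caps <- c("$525.02B", "$23.93M", "$404.8B", "$950000", "n/a")

# Illustrative helper (the name is an assumption, not a TTR function)
parse_market_cap <- function(x) {
  # Numeric part: drop the leading "$" and an optional trailing unit letter.
  # Strings that do not match are left unchanged and become NA in as.numeric().
  value <- suppressWarnings(
    as.numeric(sub("^\\$(\\d+(\\.\\d+)?)[A-Z]?$", "\\1", x))
  )
  # Unit: keep only uppercase letters ("M", "B", or "" when there is none)
  unit <- gsub("[^A-Z]", "", x)
  # Map the unit to a multiplier; anything else is left unscaled
  multiplier <- ifelse(unit == "M", 1e6, ifelse(unit == "B", 1e9, 1))
  value * multiplier
}

mkt_cap <- parse_market_cap(caps)

# Sort the original strings by their numeric value; NA values go last
caps[order(mkt_cap, decreasing = TRUE)]
```

Applied to the real data, the same helper would be used as listings$MktCap <- parse_market_cap(listings$MarketCap), followed by the order() call shown earlier.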