R: TTR: Sort by market cap

Time: 2016-02-24 11:39:35

Tags: r

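I am trying to sort stock symbols by market cap. I tried the code below, but the list is not sorted correctly. Is there a simple way to remove the $ and convert the M and B suffixes to numbers?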

library(TTR)
listings <- stockSymbols() 
listings <- listings[order(as.numeric(listings$MarketCap),decreasing=TRUE),]
head(listings,20)

I would appreciate your help.

1 answer:

Answer 0 (score: 2)

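There may be a package that provides a function for this kind of conversion, e.g. "$23.93M" to 23930000.00, but here is one way to do it using base R functions: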

listings$MktCap <- as.numeric(
  sub("\\$(\\d+(\\.\\d+)?)[A-Z]?", "\\1", listings$MarketCap)) * 
  ifelse(gsub("[^A-Z]", "", listings$MarketCap) == "M", 1e6,
         ifelse(gsub("[^A-Z]", "", listings$MarketCap) == "B", 1e9, 1.0)) 

head(listings[order(listings$MktCap, decreasing = TRUE),], 5)
#     Symbol                    Name LastSale MarketCap IPOyear     Sector
#382    AAPL              Apple Inc.    94.69  $525.02B    1980 Technology
#1637  GOOGL           Alphabet Inc.   717.29  $493.72B      NA Technology
#1636   GOOG           Alphabet Inc.   695.85  $478.97B    2004 Technology
#2238   MSFT   Microsoft Corporation    51.18   $404.8B    1986 Technology
#6664    XOM Exxon Mobil Corporation    81.23  $338.16B      NA     Energy
#
#                                            Industry Exchange       MktCap
#382                           Computer Manufacturing   NASDAQ 525020000000
#1637 Computer Software: Programming, Data Processing   NASDAQ 493720000000
#1636 Computer Software: Programming, Data Processing   NASDAQ 478970000000
#2238         Computer Software: Prepackaged Software   NASDAQ 404800000000
#6664                        Integrated oil Companies     NYSE 338160000000

In short,

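  • sub("\\$(\\d+(\\.\\d+)?)[A-Z]?", "\\1", listings$MarketCap) extracts only the decimal number portion of MarketCap, e.g. 525.02 from $525.02B; the result is passed to as.numeric
  • gsub("[^A-Z]", "", listings$MarketCap) removes everything except uppercase letters, which, as far as I can tell, should only ever be B or M
  • Using nested ifelse statements, the result of the expression above maps "B" to 1e9, "M" to 1e6, and all other values to 1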
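
The three steps above can be checked on a small hand-made vector, without downloading anything. This is only a sketch: the values in `caps` are made-up strings in the same "$<number><suffix>" format, not real stockSymbols() output, and the variable names are chosen for illustration.

```r
# Hypothetical sample values in the same format as MarketCap
caps <- c("$525.02B", "$23.93M", "$404.8B", "$950")

# Direct coercion of such strings yields NA (with a warning)
direct <- suppressWarnings(as.numeric(caps))

# Step 1: drop the "$" and the optional suffix, keep the number
num <- as.numeric(sub("\\$(\\d+(\\.\\d+)?)[A-Z]?", "\\1", caps))

# Step 2: keep only uppercase letters (the suffix, or "" if there is none)
suffix <- gsub("[^A-Z]", "", caps)

# Step 3: map each suffix to its multiplier; anything else maps to 1
mult <- ifelse(suffix == "M", 1e6, ifelse(suffix == "B", 1e9, 1))

# Numeric market caps, ready for order()
mkt_cap <- num * mult
print(mkt_cap)
```

Storing the suffix once in `suffix` also avoids calling gsub twice, as the one-liner in the answer does.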

At this point, MktCap should be the correct numeric representation of MarketCap, and sorting is straightforward.
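
If the conversion is needed in more than one place, it could be wrapped in a small helper. The following is a minimal sketch, not part of TTR: `parse_market_cap` and the `toy` data frame are hypothetical names and data invented for this example, and it assumes the only suffixes are M and B, as in the answer above.

```r
# Hypothetical helper: convert "$525.02B"-style strings to plain numbers
parse_market_cap <- function(x) {
  x <- as.character(x)                   # also handles factor columns
  suffix <- gsub("[^A-Z]", "", x)        # "M", "B", or "" when absent
  mult <- ifelse(suffix == "M", 1e6,
                 ifelse(suffix == "B", 1e9, 1))
  value <- as.numeric(sub("\\$(\\d+(\\.\\d+)?)[A-Z]?", "\\1", x))
  value * mult
}

# Made-up data frame standing in for the stockSymbols() result
toy <- data.frame(
  Symbol = c("AAA", "BBB", "CCC", "DDD"),
  MarketCap = c("$23.93M", "$525.02B", NA, "$1.5B"),
  stringsAsFactors = FALSE
)

toy$MktCap <- parse_market_cap(toy$MarketCap)

# order() places NA last by default (na.last = TRUE)
toy_sorted <- toy[order(toy$MktCap, decreasing = TRUE), ]
print(toy_sorted$Symbol)
```

Rows whose MarketCap is missing end up with NA in MktCap and sort to the bottom, so they do not disturb the ranking of the rest.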