I want to get the data on this page: https://www.zacks.com/stock/research/STNG/earnings-announcements
I tried doing this with rvest, but maybe I have to use RSelenium? I'm not sure how to go about it. Can someone guide me?
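library(rvest)
# Assumption: specific_stocks_earnings is the parsed page (the original snippet did not show how it was created)
specific_stocks_earnings <- read_html("https://www.zacks.com/stock/research/STNG/earnings-announcements")
test <- specific_stocks_earnings %>%
  html_nodes("#earnings_announcements_tabs , .sorting_1") %>%
  html_text()
test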
Answer 0 (score: 4)
(a) I'm quite shocked that Zacks doesn't forbid or discourage scraping, but nothing in the legal mumbo-jumbo pages I scanned suggests it's not OK.
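(b) The data is there, just not in a nicely rendered form. It's stuck in a <script> tag that renders it dynamically. However, with a bit of elbow grease and the V8 package, we can get at it: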
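library(rvest)
library(stringi)
library(V8)
ctx <- v8()  # a fresh JavaScript context
pg <- read_html("https://www.zacks.com/stock/research/STNG/earnings-announcements")
# Find the <script> that defines obj_data and grab its source
html_nodes(pg, xpath=".//script[contains(., 'obj_data')]") %>%
  html_text() %>%
  # V8 has no DOM, so drop the 'document.' prefix before evaluating
  stri_replace_all_fixed('document.', '') %>%
  ctx$eval() -> ignore_the_blank_return_value
# Pull the JavaScript object back into R
dat <- ctx$get("obj_data")
str(dat)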
## List of 6
## $ earnings_announcements_earnings_table : chr [1:28, 1:7] "9/18/2017" "4/26/2017" "2/13/2017" "11/14/2016" ...
## $ earnings_announcements_webcasts_table : chr [1:13, 1:5] "2/13/2017" "11/14/2016" "7/28/2016" "4/27/2016" ...
## $ earnings_announcements_revisions_table: chr [1:4, 1:6] "7/21/2017" "7/21/2017" "7/21/2017" "7/21/2017" ...
## $ earnings_announcements_splits_table : list()
## $ earnings_announcements_dividends_table: chr [1:17, 1:4] "9/29/2017" "6/14/2017" "3/30/2017" "12/22/2016" ...
## $ earnings_announcements_guidance_table : chr [1:4, 1:3] "1/21/2015" "10/14/2014" "6/12/2014" "1/28/2013" ...
dat
## $earnings_announcements_earnings_table
## [,1] [,2] [,3] [,4] [,5]
## [1,] "9/18/2017" "6/2017" "-$0.05" "--" "--"
## [2,] "4/26/2017" "3/2017" "-$0.08" "-$0.07" "<div class=\"right pos positive pos_icon showinline up\">+0.01</div>"
## [3,] "2/13/2017" "12/2016" "-$0.2" "-$0.18" "<div class=\"right pos positive pos_icon showinline up\">+0.02</div>"
## [4,] "11/14/2016" "9/2016" "-$0.11" "-$0.11" "<div class=\"right pos_na showinline\">0.00</div>"
## [5,] "7/28/2016" "6/2016" "$0.03" "$0.04" "<div class=\"right pos positive pos_icon showinline up\">+0.01</div>"
## [6,] "4/27/2016" "3/2016" "$0.18" "$0.18" "<div class=\"right pos_na showinline\">0.00</div>"
## [7,] "2/29/2016" "12/2015" "$0.19" "$0.21" "<div class=\"right pos positive pos_icon showinline up\">+0.02</div>"
## [8,] "11/4/2015" "9/2015" "$0.43" "$0.46" "<div class=\"right pos positive pos_icon showinline up\">+0.03</div>"
## [9,] "7/29/2015" "6/2015" "$0.30" "$0.32" "<div class=\"right pos positive pos_icon showinline up\">+0.02</div>"
## [10,] "4/27/2015" "3/2015" "$0.26" "$0.24" "<div class=\"right neg negative neg_icon showinline down\">-0.02</div>"
## [11,] "3/2/2015" "12/2014" "$0.13" "$0.12" "<div class=\"right neg negative neg_icon showinline down\">-0.01</div>"
## [12,] "10/27/2014" "9/2014" "$0.02" "-$0.01" "<div class=\"right neg negative neg_icon showinline down\">-0.03</div>"
## [13,] "7/28/2014" "6/2014" "-$0.06" "-$0.06" "<div class=\"right pos_na showinline\">0.00</div>"
## [14,] "4/28/2014" "3/2014" "$0.04" "$0.01" "<div class=\"right neg negative neg_icon showinline down\">-0.03</div>"
## [15,] "2/24/2014" "12/2013" "$0.01" "-$0.08" "<div class=\"right neg negative neg_icon showinline down\">-0.09</div>"
## [16,] "10/28/2013" "9/2013" "$0.02" "$0.00" "<div class=\"right neg negative neg_icon showinline down\">-0.02</div>"
## [17,] "7/29/2013" "6/2013" "$0.05" "$0.03" "<div class=\"right neg negative neg_icon showinline down\">-0.02</div>"
## [18,] "4/29/2013" "3/2013" "$0.05" "$0.08" "<div class=\"right pos positive pos_icon showinline up\">+0.03</div>"
## [19,] "2/25/2013" "12/2012" "-$0.05" "-$0.08" "<div class=\"right neg negative neg_icon showinline down\">-0.03</div>"
## [20,] "10/29/2012" "9/2012" "-$0.15" "-$0.09" "<div class=\"right pos positive pos_icon showinline up\">+0.06</div>"
## [21,] "7/31/2012" "6/2012" "--" "--" "--"
## [22,] "5/3/2012" "3/2012" "--" "--" "--"
## [23,] "2/23/2012" "12/2011" "--" "-$2.21" "--"
## [24,] "11/14/2011" "9/2011" "--" "--" "--"
## [25,] "8/16/2011" "6/2011" "--" "--" "--"
## [26,] "5/10/2011" "3/2011" "--" "--" "--"
## [27,] "3/17/2011" "12/2010" "--" "--" "--"
## [28,] "11/15/2010" "9/2010" "--" "--" "--"
## [,6] [,7]
## [1,] "--" "Before Open"
## [2,] "<div class=\"right pos positive pos_icon showinline up\">+12.50%</div>" "Before Open"
## [3,] "<div class=\"right pos positive pos_icon showinline up\">+10.00%</div>" "Before Open"
## [4,] "<div class=\"right pos_na showinline\">0.00%</div>" "Before Open"
## [5,] "<div class=\"right pos positive pos_icon showinline up\">+33.33%</div>" "Before Open"
## [6,] "<div class=\"right pos_na showinline\">0.00%</div>" "Before Open"
## [7,] "<div class=\"right pos positive pos_icon showinline up\">+10.53%</div>" "Before Open"
## [8,] "<div class=\"right pos positive pos_icon showinline up\">+6.98%</div>" "Before Open"
## [9,] "<div class=\"right pos positive pos_icon showinline up\">+6.67%</div>" "Before Open"
## [10,] "<div class=\"right neg negative neg_icon showinline down\">-7.69%</div>" "Before Open"
## [11,] "<div class=\"right neg negative neg_icon showinline down\">-7.69%</div>" "Before Open"
## [12,] "<div class=\"right neg negative neg_icon showinline down\">-150.00%</div>" "Before Open"
## [13,] "<div class=\"right pos_na showinline\">0.00%</div>" "--"
## [14,] "<div class=\"right neg negative neg_icon showinline down\">-75.00%</div>" "Before Open"
## [15,] "<div class=\"right neg negative neg_icon showinline down\">-900.00%</div>" "Before Open"
## [16,] "<div class=\"right neg negative neg_icon showinline down\">-100.00%</div>" "Before Open"
## [17,] "<div class=\"right neg negative neg_icon showinline down\">-40.00%</div>" "Before Open"
## [18,] "<div class=\"right pos positive pos_icon showinline up\">+60.00%</div>" "Before Open"
## [19,] "<div class=\"right neg negative neg_icon showinline down\">-60.00%</div>" "After Close"
## [20,] "<div class=\"right pos positive pos_icon showinline up\">+40.00%</div>" "Before Open"
## [21,] "--" "Before Open"
## [22,] "--" "Before Open"
## [23,] "--" "--"
## [24,] "--" "Before Open"
## [25,] "--" "--"
## [26,] "--" "--"
## [27,] "--" "--"
## [28,] "--" "--"
##
## $earnings_announcements_webcasts_table
## [,1] [,2] [,3]
## [1,] "2/13/2017" "Q4 2016 Earnings Call" "--"
## [2,] "11/14/2016" "Q3 2016 Earnings Call" "--"
## [3,] "7/28/2016" "Q2 2016 Earnings Call" "--"
## [4,] "4/27/2016" "Q1 2016 Earnings Call" "--"
## [5,] "2/29/2016" "Q4 2015 Earnings Call" "--"
## [6,] "11/4/2015" "Q3 2015 Earnings Call" "--"
## [7,] "7/29/2015" "Q2 2015 Earnings Call" "--"
## [8,] "4/27/2015" "Q1 2015 Earnings Call" "--"
## [9,] "3/2/2015" "Q4 2014 Earnings Call" "--"
## [10,] "4/28/2014" "Q1 2014 Earnings Call" "--"
## [11,] "2/24/2014" "Q4 2013 Earnings Call" "--"
## [12,] "10/28/2013" "Q3 2013 Earnings Call" "--"
## [13,] "7/29/2013" "Q2 2013 Earnings Call" "--"
## [,4]
## [1,] "<a href=\"http://seekingalpha.com/article/4045508-scorpio-tankers-stng-ceo-emanuele-lauro-q4-2016-results-earnings-call-transcript?source=feed_tag_transcripts\" target = \"_blank\" ><img height=\"15\" width=\"15\" src=\"https://staticx.zacks.com/images/icons/general/transcripts.png\">Open</a>"
## [2,] "<a href=\"http://seekingalpha.com/article/4023325-scorpio-tankers-stng-ceo-emanuele-lauro-q3-2016-results-earnings-call-transcript?source=feed_tag_transcripts\" target = \"_blank\" ><img height=\"15\" width=\"15\" src=\"https://staticx.zacks.com/images/icons/general/transcripts.png\">Open</a>"
## [3,] "<a href=\"http://seekingalpha.com/article/3993429-scorpio-tankers-stng-management-q2-2016-results-earnings-call-transcript?source=feed_tag_transcripts\" target = \"_blank\" ><img height=\"15\" width=\"15\" src=\"https://staticx.zacks.com/images/icons/general/transcripts.png\">Open</a>"
## [4,] "<a href=\"http://seekingalpha.com/article/3968728-scorpio-tankers-stng-ceo-emanuele-lauro-q1-2016-results-earnings-call-transcript?source=feed_tag_transcripts\" target = \"_blank\" ><img height=\"15\" width=\"15\" src=\"https://staticx.zacks.com/images/icons/general/transcripts.png\">Open</a>"
## [5,] "<a href=\"http://seekingalpha.com/article/3941526-scorpio-tankers-stng-ceo-emanuele-lauro-q4-2015-results-earnings-call-transcript?source=feed_tag_transcripts\" target = \"_blank\" ><img height=\"15\" width=\"15\" src=\"https://staticx.zacks.com/images/icons/general/transcripts.png\">Open</a>"
## [6,] "<a href=\"http://seekingalpha.com/article/3646266-scorpio-tankers-stng-ceo-emanuele-lauro-on-q3-2015-results-earnings-call-transcript?source=feed_tag_transcripts\" target = \"_blank\" ><img height=\"15\" width=\"15\" src=\"https://staticx.zacks.com/images/icons/general/transcripts.png\">Open</a>"
## [7,] "<a href=\"http://seekingalpha.com/article/3387875-scorpio-tankers-stng-ceo-emanuele-lauro-on-q2-2015-results-earnings-call-transcript?source=feed_tag_transcripts\" target = \"_blank\" ><img height=\"15\" width=\"15\" src=\"https://staticx.zacks.com/images/icons/general/transcripts.png\">Open</a>"
## [8,] "<a href=\"http://seekingalpha.com/article/3106706-scorpio-tankers-stng-ceo-emanuele-lauro-on-q1-2015-results-earnings-call-transcript?source=feed_tag_transcripts\" target = \"_blank\" ><img height=\"15\" width=\"15\" src=\"https://staticx.zacks.com/images/icons/general/transcripts.png\">Open</a>"
## [9,] "<a href=\"http://seekingalpha.com/article/2966166-scorpio-tankers-stng-ceo-emanuele-lauro-on-q4-2014-results-earnings-call-transcript?source=feed_tag_transcripts\" target = \"_blank\" ><img height=\"15\" width=\"15\" src=\"https://staticx.zacks.com/images/icons/general/transcripts.png\">Open</a>"
## [10,] "<a href=\"http://seekingalpha.com/article/2169963-scorpio-tankers-ceo-discusses-q1-2014-results-earnings-call-transcript?source=feed\" target = \"_blank\" ><img height=\"15\" width=\"15\" src=\"https://staticx.zacks.com/images/icons/general/transcripts.png\">Open</a>"
## [11,] "<a href=\"http://seekingalpha.com/article/2044333-scorpio-tankers-management-discusses-q4-2013-results-earnings-call-transcript?source=feed\" target = \"_blank\" ><img height=\"15\" width=\"15\" src=\"https://staticx.zacks.com/images/icons/general/transcripts.png\">Open</a>"
## [12,] "<a href=\"http://seekingalpha.com/article/1779812-scorpio-tankers-ceo-discusses-q3-2013-results-earnings-call-transcript?source=feed\" target = \"_blank\" ><img height=\"15\" width=\"15\" src=\"https://staticx.zacks.com/images/icons/general/transcripts.png\">Open</a>"
## [13,] "<a href=\"http://seekingalpha.com/article/1582052-scorpio-tankers-ceo-discusses-q2-2013-results-earnings-call-transcript?source=feed\" target = \"_blank\" ><img height=\"15\" width=\"15\" src=\"https://staticx.zacks.com/images/icons/general/transcripts.png\">Open</a>"
## [,5]
## [1,] "9:00 AM"
## [2,] "10:30 AM"
## [3,] "11:00 AM"
## [4,] "10:30 AM"
## [5,] "10:30 AM"
## [6,] "10:00 AM"
## [7,] "10:00 AM"
## [8,] "9:30 AM"
## [9,] "10:00 AM"
## [10,] "11:00 AM"
## [11,] "11:00 AM"
## [12,] "12:00 PM"
## [13,] "2:30 PM"
##
## $earnings_announcements_revisions_table
## [,1] [,2] [,3] [,4]
## [1,] "7/21/2017" "<span class=\"hotspot\">Dec 2017 (Q) </span>" "$0.25" "<div class=\"down\">$0.03</div>"
## [2,] "7/21/2017" "<span class=\"hotspot\">Dec 2017 (FY) </span>" "$0.36" "<div class=\"down\">-$0.24</div>"
## [3,] "7/21/2017" "<span class=\"hotspot\">Sep 2017 (Q) </span>" "--" "<div>-$0.12</div>"
## [4,] "7/21/2017" "<span class=\"hotspot\">Jun 2017 (Q) </span>" "-$0.05" "<div class=\"down\">-$0.09</div>"
## [,5] [,6]
## [1,] "<span class=\"hotspot\">Mavrinac</span>" "<span title='Jefferies & Company' >Jefferies & Company</span>"
## [2,] "<span class=\"hotspot\">Mavrinac</span>" "<span title='Jefferies & Company' >Jefferies & Company</span>"
## [3,] "<span class=\"hotspot\">Mavrinac</span>" "<span title='Jefferies & Company' >Jefferies & Company</span>"
## [4,] "<span class=\"hotspot\">Mavrinac</span>" "<span title='Jefferies & Company' >Jefferies & Company</span>"
##
## $earnings_announcements_splits_table
## list()
##
## $earnings_announcements_dividends_table
## [,1] [,2] [,3] [,4]
## [1,] "9/29/2017" "$0.01" "9/14/2017" "9/22/2017"
## [2,] "6/14/2017" "$0.01" "4/27/2017" "5/9/2017"
## [3,] "3/30/2017" "$0.01" "2/14/2017" "2/21/2017"
## [4,] "12/22/2016" "$0.13" "11/14/2016" "11/22/2016"
## [5,] "9/29/2016" "$0.13" "7/28/2016" "9/13/2016"
## [6,] "3/30/2016" "$0.13" "2/29/2016" "3/8/2016"
## [7,] "12/11/2015" "$0.13" "11/4/2015" "11/20/2015"
## [8,] "9/4/2015" "$0.13" "7/29/2015" "8/12/2015"
## [9,] "6/10/2015" "$0.13" "4/27/2015" "5/19/2015"
## [10,] "3/30/2015" "$0.12" "3/2/2015" "3/11/2015"
## [11,] "12/12/2014" "$0.12" "11/11/2014" "11/21/2014"
## [12,] "9/10/2014" "$0.10" "7/28/2014" "8/20/2014"
## [13,] "6/12/2014" "$0.09" "4/28/2014" "5/22/2014"
## [14,] "3/26/2014" "$0.08" "2/24/2014" "3/7/2014"
## [15,] "12/18/2013" "$0.07" "10/28/2013" "11/29/2013"
## [16,] "9/25/2013" "$0.04" "7/29/2013" "9/6/2013"
## [17,] "6/25/2013" "$0.03" "4/15/2013" "6/7/2013"
##
## $earnings_announcements_guidance_table
## [,1] [,2] [,3]
## [1,] "1/21/2015" "$0.21" "$0.11 - $0.31"
## [2,] "10/14/2014" "$0.03" "-$0.03 - $0.07"
## [3,] "6/12/2014" "$0.01" "-$0.04 - $0.08"
## [4,] "1/28/2013" "-$0.02" "-$0.06 - $0.05"
You'll need to clean it up a bit, but it doesn't require using Selenium or Splash.
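As a rough idea of what that cleanup could look like, here is a minimal base-R sketch for the earnings table. It is not part of the code above. To keep it runnable offline it uses three rows copied from the output shown earlier instead of `dat$earnings_announcements_earnings_table`. The helper names (`strip_tags`, `to_number`) and the column names are my own assumptions: the scraped matrix has no headers, and I'm guessing the column meanings from the values.

```r
# Three rows copied from the printed earnings table, so this runs without network access
earnings <- matrix(c(
  "9/18/2017", "6/2017", "-$0.05", "--", "--", "--", "Before Open",
  "4/26/2017", "3/2017", "-$0.08", "-$0.07",
  "<div class=\"right pos positive pos_icon showinline up\">+0.01</div>",
  "<div class=\"right pos positive pos_icon showinline up\">+12.50%</div>",
  "Before Open",
  "4/27/2015", "3/2015", "$0.26", "$0.24",
  "<div class=\"right neg negative neg_icon showinline down\">-0.02</div>",
  "<div class=\"right neg negative neg_icon showinline down\">-7.69%</div>",
  "Before Open"
), ncol = 7, byrow = TRUE)

# Hypothetical helper: remove anything that looks like an HTML tag
strip_tags <- function(x) gsub("<[^>]+>", "", x)

# Hypothetical helper: strip tags, turn "--" into NA, drop $, %, + and commas
to_number <- function(x) {
  x <- strip_tags(x)
  x[x == "--"] <- NA
  as.numeric(gsub("[$%+,]", "", x))
}

# Column names are assumptions inferred from the values, not taken from the page
clean <- data.frame(
  date          = as.Date(earnings[, 1], format = "%m/%d/%Y"),
  period_ending = earnings[, 2],
  estimate      = to_number(earnings[, 3]),
  reported      = to_number(earnings[, 4]),
  surprise      = to_number(earnings[, 5]),
  surprise_pct  = to_number(earnings[, 6]),
  time          = ifelse(earnings[, 7] == "--", NA_character_, earnings[, 7]),
  stringsAsFactors = FALSE
)

str(clean)
```

With the real data you would swap `earnings` for `dat$earnings_announcements_earnings_table`. The regex-based tag stripping is good enough for these simple `<div>` wrappers; for the webcast links you would probably want to parse the `href` out properly instead.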