Question

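I have a question about my recent attempt at scraping with rvest.

I want to scrape this page (and the pages for some other stocks): http://www.finviz.com/quote.ashx?t=AA&ty=c&p=d&b=1

I need a list of market capitalizations; the market cap is the first box in the second row. The list should cover about 50-100 stocks.

I am using rvest.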

library(rvest)

html = read_html("http://www.finviz.com/quote.ashx?t=A")

cast = html_nodes(html, "table-dark-row")

The problem is that I can't get past html_nodes. Any ideas on how to find the right node to pass to html_nodes?

I am using Firebug / FireFinder to inspect the web page.

Answer 1

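Not sure if this is what you want, since I couldn't find the list of approx. 50-100 stocks.

But for what it's worth, using SelectorGadget I came up with this node, .table-dark-row:nth-child(2) .snapshot-td2:nth-child(2), to select the market cap (the first box in the second row of this page: http://www.finviz.com/quote.ashx?t=AA&ty=c&p=d&b=1).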

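A minimal sketch of how that selector might be passed to html_nodes. The helper name get_market_cap is hypothetical, and the selector assumes the page layout SelectorGadget reported at the time, which may have changed since. Note the leading dots: without them, "table-dark-row" is read as an element name rather than a class.

```r
library(rvest)

# CSS selector reported by SelectorGadget: second cell of the second
# ".table-dark-row" row. The leading "." marks a class selector.
selector <- ".table-dark-row:nth-child(2) .snapshot-td2:nth-child(2)"

# Hypothetical helper: return the text of every node matching the selector.
get_market_cap <- function(page) {
  html_text(html_nodes(page, selector))
}

# Usage (needs network access, so it is left commented out):
# page <- read_html("http://www.finviz.com/quote.ashx?t=AA&ty=c&p=d&b=1")
# get_market_cap(page)
```

The result is a character vector, so a value with a unit suffix still has to be converted before doing arithmetic on it.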

If this is not what you want, just use SelectorGadget to find the element you need.

Hope this helps.

Edit:

Here is the full solution:

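One way such a loop over several tickers could look, as a rough sketch rather than a definitive implementation. The ticker list, the helper names (parse_market_cap, get_market_cap, scrape_all), the one-second pause, and the assumption that values carry a K/M/B/T suffix (for example "11.50B") are all illustrative and not taken from the site's documentation.

```r
library(rvest)

# Hypothetical ticker list; replace with the 50-100 symbols you need.
tickers <- c("AA", "A", "AAPL")

selector <- ".table-dark-row:nth-child(2) .snapshot-td2:nth-child(2)"

# Convert strings such as "11.50B" or "950.3M" to plain numbers.
# Anything that cannot be parsed (e.g. "-") becomes NA.
parse_market_cap <- function(x) {
  multipliers <- c(K = 1e3, M = 1e6, B = 1e9, T = 1e12)
  suffix <- substr(x, nchar(x), nchar(x))
  value <- suppressWarnings(as.numeric(sub("[KMBT]$", "", x)))
  scale <- ifelse(suffix %in% names(multipliers), multipliers[suffix], 1)
  unname(value * scale)
}

# Fetch one quote page and return the raw market cap text (or NA).
get_market_cap <- function(ticker) {
  page <- read_html(paste0("http://www.finviz.com/quote.ashx?t=", ticker))
  text <- html_text(html_nodes(page, selector))
  if (length(text) == 0) NA_character_ else text[1]
}

# Scrape every ticker, tolerating failures on individual pages.
scrape_all <- function(tickers) {
  raw <- vapply(tickers, function(ticker) {
    Sys.sleep(1)  # pause between requests to avoid hammering the server
    tryCatch(get_market_cap(ticker), error = function(e) NA_character_)
  }, character(1))
  data.frame(ticker = tickers, market_cap = parse_market_cap(raw),
             row.names = NULL)
}

# caps <- scrape_all(tickers)  # needs network access
```

Wrapping each request in tryCatch keeps one failed page from aborting the whole run; failed tickers simply show up as NA in the resulting data frame.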