Question

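My question has several layers, so please bear with me:

I am trying to scrape data from a dynamic web page. This is the page

I am trying to scrape the number of traders who are long or short each currency pair, as shown in the image below:

Note that I need the number of traders, not the percentage.

Here is what I have done so far: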

# Load rvest package
library(rvest)

# Forex Factory

# EURUSD
# Read the url
eurusd <- read_html("http://www.forexfactory.com/#tradesPositions-sort=instruments&tradesPositions-sortOrder=asc&tradesPositions-details=0")

#Set up the nodes for EURUSD
#FF stands for 'Forex Factory'
FFeurusd<-eurusd%>%html_nodes(".label")%>%html_text()

#Scrape HTML page into a dataframe
shorteurusd_table<-data.frame(FFeurusd[1])
longeurusd_table<-data.frame(FFeurusd[2])

#This is where I attempt to merge the results from long and short traders and clean up the results (doesn't really work)
shorteurusd_table$FFeurusd.1.<-gsub("%","",shorteurusd_table$FFeurusd.1.)

#Adjust attributes of table
final<-merge(shorteurusd_table,longeurusd_table)
colnames(final) <-c("Short","Long")
rownames(final) <-c("EURUSD")

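This is the output of the code above.

Problem: I only want the numbers in each column, i.e. 263 and 315.

I know this is a very inefficient way to build a data frame, but my experience with R is limited.

I can't even get the other currency pairs to work, because I can't find the correct path. If I copy and paste the XPath, it never works.

I would also like to compute the ratio between long and short traders, store the data, normalize the scores, and model the distribution, so that the current ratio can be compared with historical data. Eventually I need to put the code on a server and have it scrape the information several times a day, so that the process is automated. I understand that this last task is a substantial one, but it is the goal I have set for myself.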
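For concreteness, the ratio and normalization step could be sketched roughly as follows. The counts in `history` are invented for illustration (they are not scraped values), and the normal approximation is only an assumption about how the ratios are distributed.

```r
# Hypothetical history of trader counts for one pair (values are invented)
history <- data.frame(
  long  = c(290, 340, 300, 310, 315),
  short = c(280, 250, 270, 265, 263)
)

# Long/short ratio for each observation
history$ratio <- history$long / history$short

# Treat the last row as the current reading and the rest as the reference sample
current   <- tail(history$ratio, 1)
reference <- head(history$ratio, -1)

# z-score of the current ratio relative to the historical mean and spread
z_score <- (current - mean(reference)) / sd(reference)

# Upper-tail probability under a normal ("bell curve") approximation
p_upper <- pnorm(z_score, lower.tail = FALSE)
```

With real data, `history` would be replaced by the stored scrape results, and the reference sample would need to be much larger than four rows for the z-score to mean anything.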

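I am confident with R when it comes to mathematical modelling and statistics, so once the scraper actually runs and stores these data points I should be fine. Sorry for the wall of text, but the detail was needed.

Thanks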

Answer 1

Here is a start.

You will need to do the data cleaning yourself; there is plenty of help for that online.

# Load rvest package
library(rvest)

url <- "http://www.forexfactory.com/?flexId=flex_trades/positions_tradesPositionsCopy1&more=1"
# html_session() was renamed to session() in rvest 1.0.0
pgsession <- html_session(url)

# Read the table inside each of the first ten div blocks and stack them row-wise
for (i in 1:10) {
  if (i == 1) {
    table <- html_table(html_nodes(read_html(pgsession), xpath = paste0('//*[@id="flexBox_flex_trades/positions_tradesPositionsCopy1"]/div[', i, ']/table')), fill = TRUE)[[1]]
  } else {
    table <- rbind(table, html_table(html_nodes(read_html(pgsession), xpath = paste0('//*[@id="flexBox_flex_trades/positions_tradesPositionsCopy1"]/div[', i, ']/table')), fill = TRUE)[[1]])
  }
}
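
As a possible starting point for the cleaning step, the sketch below pulls the trader counts out of the scraped text and drops the percentages. The strings in `raw` are an assumption about what the cells look like (the real page may format them differently), so the pattern will likely need adjusting.

```r
# Hypothetical raw cell text; the actual scraped strings may differ
raw <- c("45% 263 Traders", "55% 315 Traders")

# Keep only the integer that precedes the word "Traders", ignoring the percentage.
# The capture group (\\d+) holds the digits; "\\1" substitutes them for the whole string.
counts <- as.numeric(sub(".*?(\\d+)\\s*Traders.*", "\\1", raw, perl = TRUE))

# Assemble a one-row data frame in the layout used in the question
final <- data.frame(Short = counts[1], Long = counts[2])
rownames(final) <- "EURUSD"
```

If a string does not match the pattern, `sub()` returns it unchanged and `as.numeric()` yields `NA` with a warning, which makes unexpected formats easy to spot.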
