Pulling score text data into R

Time: 2016-07-21 17:18:36

Tags: r text import

I'm trying to figure out how to pull the following data into R:

http://masseyratings.com/scores.php?s=285971&sub=14342&all=1&mode=3&sch=on&format=0

This almost works, but I'd like to strip the junk at the top and bottom and then get the scores.

read.fwf('http://masseyratings.com/scores.php?s=285971&sub=14342&all=1&mode=3&sch=on&format=0', 
         widths=c(11,26,3,26,3,4,21),  
         skip = 8) 

1 Answer:

Answer 0: (score: 0)

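First, welcome to Stack Exchange! I changed a few things in your code. You only need 6 widths; you had an extra column, so I got rid of it. When I pulled the data from the web, I noticed that the first row came through very oddly, so I dropped it entirely and then added it back by hand.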

data <- read.fwf('http://masseyratings.com/scores.php?s=285971&sub=14342&all=1&mode=3&sch=on&format=0',widths=c(10,26,3,26,3,4), sep = "\t", header = FALSE, skip = 8)
# This line subsets the data so you don't have that "junk" at the bottom and deletes the row
# with the html tagging. 

data <- data[2:2424,]
data <- data.frame(data)

# Create a vector that has the column headers
names <- c("date", "Team1","Runs", "Team 2","Runs","Something")
colnames(data) <- names

# Create the first row of data that we previously deleted.

firstrow = data.frame("2016-04-03", "@Pirates", 4, "Cardinals",1,"")
colnames(firstrow) <- names

finaldata <- rbind.data.frame(firstrow,data)
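
As an alternative to hard-coding the row range (2:2424) and retyping the first row, one could filter the lines by pattern before parsing. The sketch below is only an illustration: the sample lines, the column layout, and the assumption that the first data row is prefixed by an HTML tag are all made up for the demo, not taken from the real page. With the real URL you would presumably start from `raw <- readLines(url)` and use the widths that match that page.

```r
# Made-up sample standing in for the downloaded page (an assumption, not the
# real Massey output): header junk, a first row prefixed by a tag, footer junk.
raw <- c(
  "<html><body>",
  "Some header text",
  sprintf("<pre>%-10s %-10s%3d %-10s%3d", "2016-04-03", "@Pirates", 4L, "Cardinals", 1L),
  sprintf("%-10s %-10s%3d %-10s%3d", "2016-04-04", "@Rays", 3L, "Blue Jays", 5L),
  "",
  "Games: 2",
  "</pre></body></html>"
)

# Drop any HTML tags stuck to the start of a line, then keep only the lines
# that begin with a YYYY-MM-DD date.
clean <- sub("^(<[^>]*>)+", "", raw)
rows  <- clean[grepl("^[0-9]{4}-[0-9]{2}-[0-9]{2}", clean)]

# Parse the surviving lines as fixed-width fields; negative widths skip the
# single-space gaps in this made-up layout.
scores <- read.fwf(textConnection(rows),
                   widths = c(10, -1, 10, 3, -1, 10, 3),
                   col.names = c("date", "team1", "runs1", "team2", "runs2"),
                   stringsAsFactors = FALSE, strip.white = TRUE)
print(scores)
```

Filtering on the date pattern means the code does not depend on how many junk lines sit above or below the scores, which may change as more games are added.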

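In the future, it would help if you could post a screenshot of what you consider junk, so that people trying to help can see the problem.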

Update

data <- read.fwf('http://masseyratings.com/scores.php?s=285971&sub=14342&all=1&mode=3&sch=on&format=0',
                     widths=c(10,26,3,26,3,4), sep = "\t", header = FALSE, skip = 9)

data <- data.frame(data)

# Read only the first data row (n = 1), which starts with HTML tagging.
# The negative widths skip the leading characters of that row.

firstrow <- read.fwf('http://masseyratings.com/scores.php?s=285971&sub=14342&all=1&mode=3&sch=on&format=0',
                  widths=c(-8,-1,-1,9,26,3,26,3,4), sep = "\t", header = FALSE, n = 1, skip = 8)
firstrow <- data.frame(firstrow,stringsAsFactors=FALSE)

firstrow[,1] <- paste("2",firstrow[1,1],sep = "")

# Create a vector that has the column headers
names <- c("date", "Team1","Runs", "Team 2","Runs","Something")
colnames(data) <- names



colnames(firstrow) <- names

finaldata <- rbind.data.frame(firstrow,data)

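The negative width values skip columns. I just played around with them until the only thing missing from the first row was the "2". Then I pasted the "2" back on and used the rbind function to create the complete data frame. I hope this helps.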
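
To make the role of negative widths concrete, here is a minimal sketch on made-up lines (the 4-character prefix, the field layout, and the column names are assumptions for the demo, not the layout of the Massey page). In `read.fwf`, a negative width tells R to skip that many characters rather than read them into a column.

```r
# Two made-up fixed-width records; the first 4 characters of each line are
# unwanted debris that we want read.fwf to skip.
lines <- sprintf("%s%s %-10s%2d",
                 c("<hr>", "xxxx"),
                 c("2016-04-03", "2016-04-04"),
                 c("Pirates", "Cardinals"),
                 c(4L, 1L))

# -4 skips the debris, 10 reads the date, -1 skips the space,
# 10 reads the team name, 2 reads the runs.
demo <- read.fwf(textConnection(lines),
                 widths = c(-4, 10, -1, 10, 2),
                 col.names = c("date", "team", "runs"),
                 stringsAsFactors = FALSE, strip.white = TRUE)
print(demo)
```

Only the positive widths produce columns, so `col.names` needs three entries here even though `widths` has five.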

I also tested it on this page: http://masseyratings.com/scores.php?s=285971&sub=14342&all=1&mode=2&sch=on&format=0 and it worked as expected.