I scraped a table from a web page using this code:
library(XML)
url2 <- "http://www.baseball-reference.com/leagues/MLB/"
data2 <- readHTMLTable(url2, stringAsFactor = FALSE)
It gave me a list that looks like this:
$teams_team_wins3000
Year G ARI ATL BLA BAL BOS CHC CHW CIN CLE COL DET HOU KCR ANA LAD FLA
1 2016 149 62 57 81 84 94 72 62 86 71 78 78 75 64 84 73
2 2015 162 79 67 81 78 97 76 64 81 68 74 86 95 85 92 71
3 2014 162 64 79 96 71 73 73 76 85 66 90 70 89 98 94 77
4 2013 163 81 96 85 97 66 63 90 92 74 93 51 86 78 92 62
5 2012 162 81 94 93 69 61 85 97 68 64 88 55 72 89 86 69
6 2011 162 94 89 69 90 71 79 79 80 73 95 56 71 86 82 72
7 2010 162 65 91 66 89 75 88 91 69 83 81 76 67 80 80 80
8 2009 163 70 86 64 95 83 79 78 65 92 86 74 65 97 95 87
9 2008 163 82 72 68 95 97 89 74 81 74 74 86 75 100 84 84
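If you like, you can copy the code at the top to get the same table. The problem is that R reads this as a list, and I want it to be a data frame.
Normally I would use this code to convert it to a data frame, but this time it does not work properly: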
do.call(rbind, data2) %>% as.data.frame
I'm still new to R. What I want to do is convert this list to a data frame so that I can structure the data like this:
Year Team Wins Games
2016 ARI 62 149
2016 ATL 57 149
Any help is appreciated.
Answer 0 (score: 1)
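A couple of issues. First, spelling: the argument is stringsAsFactors, not stringAsFactor, which is why the columns below still come out as factors. Second, there is a data frame in there, but because the function is designed to handle multiple tables, it comes back as an item in a list. You can extract it with "[[", just as you would from any list: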
str(data2[[1]])
'data.frame': 120 obs. of 33 variables:
$ Year: Factor w/ 117 levels "1901","1902",..: 116 115 114 113 112 111 110 109 108 107 ...
$ G : Factor w/ 15 levels "111","117","129",..: 6 12 12 13 12 12 12 13 13 13 ...
$ ARI : Factor w/ 19 levels "","100","51",..: 4 10 5 11 11 17 6 7 12 15 ...
$ ATL : Factor w/ 55 levels "101","103","104",..: 16 26 37 53 51 46 48 44 31 42 ...
$ BLA : Factor w/ 4 levels "","50","68","BLA": 1 1 1 1 1 1 1 1 1 1 ...
$ BAL : Factor w/ 53 levels "100","101","102",..: 37 37 50 40 47 26 23 21 25 26 ...
$ BOS : Factor w/ 51 levels "101","104","105",..: 35 29 22 48 21 41 40 46 46 47 ...
$ CHC : Factor w/ 47 levels "100","104","107",..: 42 44 21 14 10 19 23 31 44 33 ...
$ CHW : Factor w/ 46 levels "100","49","51",..: 20 24 21 11 32 27 35 27 36 20 ...
$ CIN : Factor w/ 45 levels "100","102","108",..: 10 11 22 36 42 25 37 24 20 18 ...
$ CLE : Factor w/ 44 levels "100","111","51",..: 31 26 30 37 13 25 14 10 26 40 ...
snipped rest of the 33 columns
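To illustrate the extraction step without hitting the website, here is a minimal sketch. The list `tables` and its contents are a hypothetical stand-in for what readHTMLTable() returns, built by hand from a few values in the question:

```r
# Toy stand-in for the list returned by readHTMLTable() (hypothetical data,
# not fetched from the website).
tables <- list(
  teams_team_wins3000 = data.frame(
    Year = c("2016", "2015"),
    G    = c("149", "162"),
    ARI  = c("62", "79"),
    stringsAsFactors = FALSE
  )
)

class(tables)                           # "list"
class(tables[1])                        # "list": single brackets keep the list wrapper
class(tables[[1]])                      # "data.frame": double brackets extract the element
class(tables[["teams_team_wins3000"]])  # "data.frame": extraction by name also works
```

Extracting by name is probably the safer choice if the page ever returns more than one table, since the position of a table in the list may change.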
Try:
data2 <- readHTMLTable(url2, stringsAsFactors = FALSE)
str(data2[[1]])
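Once the data frame is extracted, the Year / Team / Wins / Games layout from the question can be reached with a wide-to-long reshape. The following is only a sketch in base R under some assumptions: `wide` is a hand-built stand-in for data2[[1]] using a few values from the question, and the extra "Year" row mimics a repeated header row, which the "BLA" level in the str() output above suggests the real table contains (that is an inference, not something checked against the page).

```r
# Toy stand-in for data2[[1]] (hypothetical; values copied from the question).
wide <- data.frame(
  Year = c("2016", "2015", "Year"),
  G    = c("149", "162", "G"),
  ARI  = c("62", "79", "ARI"),
  ATL  = c("57", "67", "ATL"),
  stringsAsFactors = FALSE
)

# Drop rows that merely repeat the column headers (assumed to occur in the real table).
wide <- wide[wide$Year != "Year", ]

# Every column other than Year and G is treated as a team column.
team_cols <- setdiff(names(wide), c("Year", "G"))

# Stack the team columns on top of each other, repeating Year and G to match.
long <- data.frame(
  Year  = as.integer(rep(wide$Year, times = length(team_cols))),
  Team  = rep(team_cols, each = nrow(wide)),
  Wins  = as.integer(unlist(wide[team_cols], use.names = FALSE)),
  Games = as.integer(rep(wide$G, times = length(team_cols))),
  stringsAsFactors = FALSE
)

# Sort by most recent year first, then by team, as in the desired layout.
long <- long[order(-long$Year, long$Team), ]
rownames(long) <- NULL
long
```

On the real table, seasons in which a team did not exist appear as empty strings, which as.integer() turns into NA; those rows could then be dropped with long[!is.na(long$Wins), ]. Packages such as tidyr offer the same reshape in fewer lines, but the base R version avoids extra dependencies.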