Referencing column labels when creating a new variable in R

Time: 2017-03-04 22:20:22

Tags: r reference label col

I have read a data frame into R as follows:

library(rvest)  # provides read_html(), html_nodes(), html_table() and %>%
page.201702050atl = read_html("http://www.pro-football-reference.com/boxscores/201702050atl.htm")
comments.201702050atl = page.201702050atl %>% html_nodes(xpath = "//comment()")
team.stats.201702050atl = comments.201702050atl[27] %>% html_text() %>% read_html() %>% html_node("#team_stats") %>% html_table()
> team.stats.201702050atl
                                NWE           ATL
1         First Downs            37            17
2        Rush-Yds-TDs      25-104-2      18-104-1
3   Cmp-Att-Yd-TD-INT 43-63-466-2-1 17-23-284-2-0
4        Sacked-Yards          5-24          5-44
5      Net Pass Yards           442           240
6         Total Yards           546           344
7        Fumbles-Lost           1-1           1-1
8           Turnovers             2             1
9     Penalties-Yards          4-23          9-65
10   Third Down Conv.          7-14           1-8
11  Fourth Down Conv.           1-1           0-0
12 Time of Possession         40:31         23:27
> str(team.stats.201702050atl)
'data.frame':   12 obs. of  3 variables:
 $    : chr  "First Downs" "Rush-Yds-TDs" "Cmp-Att-Yd-TD-INT" "Sacked-Yards" ...
 $ NWE: chr  "37" "25-104-2" "43-63-466-2-1" "5-24" ...
 $ ATL: chr  "17" "18-104-1" "17-23-284-2-0" "5-44" ...

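As you can see, R scrapes this table with the 2nd and 3rd columns already labeled. I'd like to give these columns generic labels and move c("", "NWE", "ATL") into the table itself as a row so that I can work with it. Also, when I move that row into the table, I'd like to fill the empty cell with my own text. In other words, I'd like to end up with something that looks like this: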

> team.stats.201702050atl.a
                   V1            V2            V3
1                  Tm           NWE           ATL
2         First Downs            37            17
3        Rush-Yds-TDs      25-104-2      18-104-1
4   Cmp-Att-Yd-TD-INT 43-63-466-2-1 17-23-284-2-0
5        Sacked-Yards          5-24          5-44
6      Net Pass Yards           442           240
7         Total Yards           546           344
8        Fumbles-Lost           1-1           1-1
9           Turnovers             2             1
10    Penalties-Yards          4-23          9-65
11   Third Down Conv.          7-14           1-8
12  Fourth Down Conv.           1-1           0-0
13 Time of Possession         40:31         23:27

I know I can do this:

team.stats.201702050atl.a = as.data.frame(t(team.stats.201702050atl))
team.stats.201702050atl.a$r1 = c("Tm", "NWE", "ATL")
team.stats.201702050atl = as.data.frame(t(team.stats.201702050atl.a))

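...but how can I get R to reference the column labels directly, so that they end up in team.stats.201702050atl$V2 and team.stats.201702050atl$V3 without my typing them out explicitly? And how can I do this while inserting my own text into the first column of that row?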

1 Answer:

Answer 0: (score: 0)

You don't need to transpose. You can use rbind to add a vector of the column names as a row, for example:

team.stats.201702050atl2 <- rbind(c("Tm", "NWE", "ATL"), team.stats.201702050atl)

Or use colnames to supply the column names instead of typing them, then fill in the missing "Tm" value:

team.stats.201702050atl2 <- rbind(colnames(team.stats.201702050atl), team.stats.201702050atl)
team.stats.201702050atl2[1,1] <- "Tm"
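To try this without scraping the page, here is a minimal sketch on a small stand-in data frame. The object names stats and stats2 are made up for illustration, and the three rows are copied from the table in the question. Only base R is used.

```r
# Toy stand-in for the scraped table (hypothetical object name `stats`).
stats <- data.frame(
  stat = c("First Downs", "Total Yards", "Turnovers"),
  NWE  = c("37", "546", "2"),
  ATL  = c("17", "344", "1"),
  stringsAsFactors = FALSE
)
# Blank out the first column name, as in the scraped table.
colnames(stats)[1] <- ""

# Prepend the column names as the first row; values are matched by position.
stats2 <- rbind(colnames(stats), stats)

# Fill the cell that came from the empty column name.
stats2[1, 1] <- "Tm"

print(stats2)
```

Because every column is already character, adding a row of names does not change any column types.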

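See ?colnames and ?rownames for how to reference column and row names. For example, you can refer to specific column names by index, as in colnames(team.stats.201702050atl2)[1] or colnames(team.stats.201702050atl2)[2:3], which gives another approach: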

team.stats.201702050atl2 <- rbind(c("Tm", colnames(team.stats.201702050atl)[2:3]), team.stats.201702050atl)

Or a variant that does not hard-code the number of columns:

team.stats.201702050atl2 <- rbind(c("Tm", colnames(team.stats.201702050atl)[2:ncol(team.stats.201702050atl)]), team.stats.201702050atl)

Finally, use colnames to assign the new column names:

colnames(team.stats.201702050atl2) <- c("V1", "V2", "V3")
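If you would rather not type the generic labels either, they can be generated from the number of columns. The following is a sketch only: header_to_row is a hypothetical helper name, not a function from any package, and the toy data frame stands in for the scraped table.

```r
# Hypothetical helper: move the header into row 1, put `first_label` in its
# first cell, and rename the columns to V1, V2, ..., Vn.
header_to_row <- function(df, first_label = "Tm") {
  out <- rbind(c(first_label, colnames(df)[-1]), df)
  colnames(out) <- paste0("V", seq_len(ncol(out)))
  rownames(out) <- NULL  # renumber rows from 1
  out
}

# Toy stand-in for the scraped table (made-up object name).
stats <- data.frame(
  stat = c("First Downs", "Total Yards"),
  NWE  = c("37", "546"),
  ATL  = c("17", "344"),
  stringsAsFactors = FALSE
)
colnames(stats)[1] <- ""

result <- header_to_row(stats)
print(result)
```

The same call should work on team.stats.201702050atl, since the helper only relies on colnames, rbind and paste0.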