I have been searching Stack Overflow and YouTube for a way to do the following.
I have data in this format:
structure(list(year = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L), ID = c(222L,
222L, 333L, 333L, 222L, 222L, 333L, 333L), sport = c(" baseball",
" football", " baseball", " football", " baseball", " football",
" baseball", " football"), money_raised = c(5L, 6L, 4L, 5L, 5L,
6L, 4L, 5L), money_used = c(3L, 4L, 2L, 3L, 3L, 4L, 2L, 3L),
money_total = c(7L, 6L, 7L, 8L, 7L, 6L, 7L, 8L)), .Names = c("year",
"ID", "sport", "money_raised", "money_used", "money_total"), class = "data.frame", row.names = c(NA,
-8L))
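This is only an example of the data; in reality I have 5 sports per ID rather than 2.
I want to reorganize the data into columns so that there is only one row per ID and year, with a column for each sport's money raised, used, and total, so that my data looks like this: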
structure(list(year = c(1L, 1L), ID = c(222L, 333L), money_raised_baseball = c(5L,
4L), money_used_baseball = c(3L, 2L), money_total_baseball = c(7L,
7L), money_raised_football = c(6L, 5L), money_used_football = c(4L,
3L), money_total_football = c(6L, 8L)), .Names = c("year", "ID",
"money_raised_baseball", "money_used_baseball", "money_total_baseball",
"money_raised_football", "money_used_football", "money_total_football"
), class = "data.frame", row.names = c(NA, -2L))
Answer 0 (score: 0)
# Load package
library(tidyverse)
# Create the example data frame
dt <- read.csv(text = "year,ID,sport,money_raised,money_used,money_total
1,222,baseball,5,3,7
1,222,football,6,4,6
1,333,baseball,4,2,7
1,333,football,5,3,8
2,222,baseball,5,3,7
2,222,football,6,4,6
2,333,baseball,4,2,7
2,333,football,5,3,8",
stringsAsFactors = FALSE)
# Process the data
dt2 <- dt %>%
  gather(money, value, contains("money")) %>%
  unite(money_sport, money, sport, sep = "_") %>%
  spread(money_sport, value) %>%
  select(year, ID, money_raised_baseball, money_used_baseball, money_total_baseball,
         money_raised_football, money_used_football, money_total_football)
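
As a possible alternative to the gather/unite/spread chain above, the same reshaping can be sketched with `pivot_wider()`. This assumes tidyr 1.2.0 or later (for the `names_vary` argument), and the name `dt_wide` is only illustrative. The sketch starts from the question's original data, where the sport values carry a leading space, so it trims them first; that trimming step is an assumption about what the asker wants in the column names.

```r
library(tidyr)

# The question's data, including the leading spaces in `sport`
dt <- data.frame(
  year = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L),
  ID = c(222L, 222L, 333L, 333L, 222L, 222L, 333L, 333L),
  sport = c(" baseball", " football", " baseball", " football",
            " baseball", " football", " baseball", " football"),
  money_raised = c(5L, 6L, 4L, 5L, 5L, 6L, 4L, 5L),
  money_used = c(3L, 4L, 2L, 3L, 3L, 4L, 2L, 3L),
  money_total = c(7L, 6L, 7L, 8L, 7L, 6L, 7L, 8L),
  stringsAsFactors = FALSE
)

# Remove the leading whitespace so column names do not end up as "money_raised_ baseball"
dt$sport <- trimws(dt$sport)

# One row per year/ID; one column per money variable and sport.
# names_vary = "slowest" keeps each sport's columns together,
# so no explicit select() is needed to order them.
dt_wide <- pivot_wider(
  dt,
  id_cols = c(year, ID),
  names_from = sport,
  values_from = c(money_raised, money_used, money_total),
  names_vary = "slowest"
)
```

Because the column names are built from whatever values appear in `sport`, this should extend to 5 sports without listing each column by hand.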