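I have a wide-format data frame, abc.csv, with the variables ID, pc_2007 to pc_2011 (the values are postal codes for different years) and rd_2007 to rd_2011 (the values are the review dates for each year).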
ID | pc_2007 | pc_2008 | pc_2009 | pc_2010 | pc_2011 | rd_2007 | rd_2008 | rd_2009 | rd_2010 | rd_2011 |
---|---|---|---|---|---|---|---|---|---|---|
A | 1 | 4 | 7 | 10 | 13 | 16 | 19 | 22 | 25 | 28 |
B | 2 | 5 | 8 | 11 | 14 | 17 | 20 | 23 | 26 | 29 |
C | 3 | 6 | 9 | 12 | 15 | 18 | 21 | 24 | 27 | 30 |
I want to convert this data frame to long format, with one row per ID and year:
ID | year | pc | rd |
---|---|---|---|
A | 2007 | 1 | 16 |
A | 2008 | 4 | 19 |
A | 2009 | 7 | 22 |
Answer 0 (score: 2)
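You can use pivot_longer with the names_sep argument. The special '.value' entry in names_to tells pivot_longer to use the part of each column name before the separator (pc, rd) as the name of an output column, while the part after it goes into year.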
df2 <- tidyr::pivot_longer(df1,
                           cols = -ID,
                           names_to = c('.value', 'year'),
                           names_sep = '_')
df2
# ID year pc rd
# <chr> <chr> <int> <int>
# 1 A 2007 1 16
# 2 A 2008 4 19
# 3 A 2009 7 22
# 4 A 2010 10 25
# 5 A 2011 13 28
# 6 B 2007 2 17
# 7 B 2008 5 20
# 8 B 2009 8 23
# 9 B 2010 11 26
#10 B 2011 14 29
#11 C 2007 3 18
#12 C 2008 6 21
#13 C 2009 9 24
#14 C 2010 12 27
#15 C 2011 15 30
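Note that year comes back as a character column in the output above. If an integer year is preferred, one option is the names_transform argument of pivot_longer. The following is a minimal sketch, assuming tidyr 1.1.0 or later; the object name df2_int is only illustrative, and df1 is rebuilt here so the snippet runs on its own.

```r
library(tidyr)

# Example data, same values as in the question
df1 <- data.frame(
  ID = c("A", "B", "C"),
  pc_2007 = 1:3,   pc_2008 = 4:6,   pc_2009 = 7:9,
  pc_2010 = 10:12, pc_2011 = 13:15,
  rd_2007 = 16:18, rd_2008 = 19:21, rd_2009 = 22:24,
  rd_2010 = 25:27, rd_2011 = 28:30,
  stringsAsFactors = FALSE
)

# Same reshape as above, but convert the year part of the names to integer
df2_int <- pivot_longer(
  df1,
  cols = -ID,
  names_to = c(".value", "year"),
  names_sep = "_",
  names_transform = list(year = as.integer)
)

str(df2_int$year)
```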
Answer 1 (score: 0)
See the suggestion below.
# your data
df <- structure(list(ID = c("A", "B", "C"), pc_2007 = 1:3, pc_2008 = 4:6,
pc_2009 = 7:9, pc_2010 = 10:12, pc_2011 = 13:15, rd_2007 = 16:18,
rd_2008 = 19:21, rd_2009 = 22:24, rd_2010 = 25:27, rd_2011 = 28:30), class = "data.frame", row.names = c(NA,
-3L))
# packages needed
library(dplyr)
library(tidyr)
library(stringr)
# suggestion
df %>%
  # the column names combine a code and a year, so first pivot everything into
  # an intermediate long table and use regular expressions to extract each part ...
  pivot_longer(cols = 2:last_col(), names_to = "year_code", values_to = "value") %>%
  mutate(year = str_extract(year_code, "[0-9]{4}$"),
         code = str_extract(year_code, "^[a-z]{2}")) %>%
  select(-year_code) %>%
  # ... and then pivot the table back so that each code gets its own column
  pivot_wider(names_from = code, values_from = value)
Output:
ID year pc rd
<chr> <chr> <int> <int>
1 A 2007 1 16
2 A 2008 4 19
3 A 2009 7 22
4 A 2010 10 25
5 A 2011 13 28
6 B 2007 2 17
7 B 2008 5 20
8 B 2009 8 23
9 B 2010 11 26
10 B 2011 14 29
11 C 2007 3 18
12 C 2008 6 21
13 C 2009 9 24
14 C 2010 12 27
15 C 2011 15 30
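The same regular expressions can probably be handed to pivot_longer directly through its names_pattern argument, which avoids the intermediate table and the pivot_wider step. This is only a sketch of that variant: the pattern below assumes every column other than ID is named as two lowercase letters, an underscore, and a four-digit year, and the object name long_df is illustrative.

```r
library(tidyr)

# Example data, same values as in the question
df <- data.frame(
  ID = c("A", "B", "C"),
  pc_2007 = 1:3,   pc_2008 = 4:6,   pc_2009 = 7:9,
  pc_2010 = 10:12, pc_2011 = 13:15,
  rd_2007 = 16:18, rd_2008 = 19:21, rd_2009 = 22:24,
  rd_2010 = 25:27, rd_2011 = 28:30,
  stringsAsFactors = FALSE
)

# First capture group -> output column name (.value), second -> year
long_df <- pivot_longer(
  df,
  cols = -ID,
  names_to = c(".value", "year"),
  names_pattern = "^([a-z]{2})_([0-9]{4})$"
)

long_df
```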
Answer 2 (score: 0)
Here is a tidyverse approach, with pivot_longer followed by pivot_wider.
library(dplyr)
library(tidyr)
df1 %>%
  pivot_longer(
    cols = -ID,
    names_to = c("name", "year"),
    names_sep = "_"
  ) %>%
  pivot_wider(
    id_cols = c(ID, year),
    names_from = name,
    values_from = value
  )
## A tibble: 15 x 4
# ID year pc rd
# <chr> <chr> <int> <int>
# 1 A 2007 1 16
# 2 A 2008 4 19
# 3 A 2009 7 22
# 4 A 2010 10 25
# 5 A 2011 13 28
# 6 B 2007 2 17
# 7 B 2008 5 20
# 8 B 2009 8 23
# 9 B 2010 11 26
#10 B 2011 14 29
#11 C 2007 3 18
#12 C 2008 6 21
#13 C 2009 9 24
#14 C 2010 12 27
#15 C 2011 15 30
Data in dput format:
df1 <-
structure(list(ID = c("A", "B", "C"), pc_2007 = 1:3, pc_2008 = 4:6,
pc_2009 = 7:9, pc_2010 = 10:12, pc_2011 = 13:15, rd_2007 = 16:18,
rd_2008 = 19:21, rd_2009 = 22:24, rd_2010 = 25:27, rd_2011 = 28:30),
class = "data.frame", row.names = c(NA, -3L))
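For comparison, the same wide-to-long conversion can be sketched without any packages using base R's reshape(). This is not taken from any of the answers above; it assumes the column names split cleanly on "_" into a variable name and a year, and the object name long is illustrative.

```r
# Example data, same values as in the question
df1 <- data.frame(
  ID = c("A", "B", "C"),
  pc_2007 = 1:3,   pc_2008 = 4:6,   pc_2009 = 7:9,
  pc_2010 = 10:12, pc_2011 = 13:15,
  rd_2007 = 16:18, rd_2008 = 19:21, rd_2009 = 22:24,
  rd_2010 = 25:27, rd_2011 = 28:30,
  stringsAsFactors = FALSE
)

# reshape() splits each varying column name on sep into a variable name and a time
long <- reshape(
  df1,
  direction = "long",
  varying = names(df1)[-1],  # every column except ID
  sep = "_",
  idvar = "ID",
  timevar = "year"
)

# reshape() stacks the rows year by year; sort by ID and year to match the answers
long <- long[order(long$ID, long$year), ]
rownames(long) <- NULL

long
```

Unlike the pivot_longer calls, reshape() converts the year part of the names to numeric when it can.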