How to transform my data in R

Time: 2017-10-19 14:05:56

Tags: r

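I have data in the format shown below. To visualize the part I need, I want to change it to the format in the second example. Do you know how to change the format?

The first row is the age range: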
0–14    15–24   25–34   35–44   45–54   55–64   65 years and over   
1,873.4 1,088.4 1,296.4 1,157.2 1,207.5 1,177.5 1,498.7 
513.0   351.8   339.1   419.1   485.0   624.1   925.7   
1,049.9 666.4   594.2   682.9   645.7   650.2   727.1   
422.6   287.7   354.1   344.9   400.6   411.5   528.3   
2,069.1 1,234.7 1,429.0 1,310.3 1,323.1 1,229.6 1,514.9 
178.0   306.8   253.8   248.9   178.5   75.2    42.1    
2,247.2 1,541.5 1,682.9 1,559.2 1,501.5 1,304.8 1,557.0 

How can I convert the data so that it looks like this:

Age     Count
0-14    1,873.4
15-24   1,088.4
25-34   1,296.4
35-44   1,157.2
45-54   1,207.5
55-64   1,177.5
65+     1,498.7
0-14    513.0
15-24   351.8
25-34   339.1
35-44   419.1
45-54   485.0
55-64   624.1
65+     925.7
0-14    1,049.9
15-24   666.4
25-34   594.2
35-44   682.9
45-54   645.7
55-64   650.2
65+     727.1
0-14    422.6
15-24   287.7
25-34   354.1
35-44   344.9
45-54   400.6
55-64   411.5
65+     528.3

3 answers:

Answer 0 (score: 0)

I would also use reshape2::melt, but in a different way. First, the data in dput format.

Tamra <-
structure(list(`0–14` = c("1,873.4", "513.0", "1,049.9", "422.6", 
"2,069.1", "178.0", "2,247.2"), `15–24` = c("1,088.4", "351.8", 
"666.4", "287.7", "1,234.7", "306.8", "1,541.5"), `25–34` = c("1,296.4", 
"339.1", "594.2", "354.1", "1,429.0", "253.8", "1,682.9"), `35–44` = c("1,157.2", 
"419.1", "682.9", "344.9", "1,310.3", "248.9", "1,559.2"), `45–54` = c("1,207.5", 
"485.0", "645.7", "400.6", "1,323.1", "178.5", "1,501.5"), `55–64` = c("1,177.5", 
"624.1", "650.2", "411.5", "1,229.6", "75.2", "1,304.8"), `65+` = c("1,498.7", 
"925.7", "727.1", "528.3", "1,514.9", "42.1", "1,557.0")), .Names = c("0–14", 
"15–24", "25–34", "35–44", "45–54", "55–64", "65+"), class = "data.frame", row.names = c(NA, 
-7L))

Now the code.

molten <- reshape2::melt(Tamra, measure.vars = names(Tamra),
                         variable.name = "Age", value.name = "Count")
#head(molten)
#   Age   Count
#1 0–14 1,873.4
#2 0–14   513.0
#3 0–14 1,049.9
#4 0–14   422.6
#5 0–14 2,069.1
#6 0–14   178.0
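
Note that the head() output above lists every value for 0–14 first, whereas the question asks for the values row by row, and Count is still a character column. The following is only a sketch of one way to get the row-wise order, not part of the original answer. It assumes reshape2 is installed; Tamra_small, row_id and by_row are hypothetical names, and the stand-in data keeps just two rows and three columns with plain hyphens in the column names.

```r
library(reshape2)

# Small stand-in for Tamra (hypothetical: two rows, three columns only)
Tamra_small <- data.frame(
  `0-14`  = c("1,873.4", "513.0"),
  `15-24` = c("1,088.4", "351.8"),
  `65+`   = c("1,498.7", "925.7"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Remember which source row each value came from
Tamra_small$row_id <- seq_len(nrow(Tamra_small))

# Melt everything except the row id
molten_small <- melt(Tamra_small, id.vars = "row_id",
                     variable.name = "Age", value.name = "Count")

# Strip the thousands separator and convert to numeric
molten_small$Count <- as.numeric(gsub(",", "", molten_small$Count, fixed = TRUE))

# Sort by source row, then by age group (the factor levels keep column order)
by_row <- molten_small[order(molten_small$row_id, molten_small$Age),
                       c("Age", "Count")]
rownames(by_row) <- NULL
by_row
```

Carrying an explicit row id through melt is what makes the original row order recoverable afterwards; without it, the molten data no longer records which source row a value belonged to.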

Answer 1 (score: 0)

A solution with dplyr + tidyr:

library(dplyr)
library(tidyr)
df %>%
  gather(AgeRange, count) %>%
  mutate(count = as.numeric(gsub(",", "", count))) %>%
  arrange(rep(1:nrow(df), ncol(df)))

Result:

   AgeRange  count
1      0–14 1873.4
2     15–24 1088.4
3     25–34 1296.4
4     35–44 1157.2
5     45–54 1207.5
6     55–64 1177.5
7       65+ 1498.7
8      0–14  513.0
9     15–24  351.8
10    25–34  339.1
11    35–44  419.1
12    45–54  485.0
13    55–64  624.1
14      65+  925.7
15     0–14 1049.9
16    15–24  666.4
17    25–34  594.2
18    35–44  682.9
19    45–54  645.7
20    55–64  650.2
 ...

Note:

I also added a mutate step that converts the count column to numeric.

Data:

df = structure(list(`0–14` = c("1,873.4", "513.0", "1,049.9", "422.6", 
"2,069.1", "178.0", "2,247.2"), `15–24` = c("1,088.4", "351.8", 
"666.4", "287.7", "1,234.7", "306.8", "1,541.5"), `25–34` = c("1,296.4", 
"339.1", "594.2", "354.1", "1,429.0", "253.8", "1,682.9"), `35–44` = c("1,157.2", 
"419.1", "682.9", "344.9", "1,310.3", "248.9", "1,559.2"), `45–54` = c("1,207.5", 
"485.0", "645.7", "400.6", "1,323.1", "178.5", "1,501.5"), `55–64` = c("1,177.5", 
"624.1", "650.2", "411.5", "1,229.6", "75.2", "1,304.8"), `65+` = c("1,498.7", 
"925.7", "727.1", "528.3", "1,514.9", "42.1", "1,557.0")), .Names = c("0–14", 
"15–24", "25–34", "35–44", "45–54", "55–64", "65+"), class = "data.frame", row.names = c(NA, 
-7L))
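
In newer versions of tidyr (1.0.0 and later), gather() is superseded by pivot_longer(), which already emits the values row by row, so the arrange() step is not needed. The sketch below is an alternative under that assumption, not the answerer's code; df_small and long_small are hypothetical names, and the stand-in data keeps only two rows and three columns with plain hyphens in the column names.

```r
library(tidyr)

# Small stand-in for df (hypothetical: two rows, three columns only)
df_small <- data.frame(
  `0-14`  = c("1,873.4", "513.0"),
  `15-24` = c("1,088.4", "351.8"),
  `65+`   = c("1,498.7", "925.7"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# One output row per (source row, column) pair, in source-row order
long_small <- pivot_longer(df_small, cols = everything(),
                           names_to = "AgeRange", values_to = "count")

# Strip the thousands separator and convert to numeric
long_small$count <- as.numeric(gsub(",", "", long_small$count, fixed = TRUE))
long_small
```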

Answer 2 (score: -1)

If your table is named tab, I would do this:

# Load library
library(reshape)

# Use melt() function
melt.tab <- melt(tab)

What you are looking for are the second and third columns of melt.tab; that is:

melt.tab[,-1]
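
For comparison, the same two-column layout can be built in base R without any package. This is only a sketch, and it assumes the table is a numeric matrix whose column names are the age ranges; tab_small and long_tab are hypothetical names, and the stand-in holds just two rows and three columns.

```r
# Hypothetical stand-in for `tab`: numeric matrix, age ranges as column names
tab_small <- matrix(c(1873.4, 1088.4, 1498.7,
                      513.0,  351.8,  925.7),
                    nrow = 2, byrow = TRUE,
                    dimnames = list(NULL, c("0-14", "15-24", "65+")))

# t() turns R's column-major storage into row-by-row reading order,
# and rep() repeats the age labels once per source row
long_tab <- data.frame(
  Age   = rep(colnames(tab_small), times = nrow(tab_small)),
  Count = as.vector(t(tab_small)),
  stringsAsFactors = FALSE
)
long_tab
```

This version lists the values row by row, which is the order shown in the question.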