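I have data in the format shown below. To visualize the data I need, I want to change it to the format in the second example. Do you know how to change the format?
The first row is the age range: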
0–14 15–24 25–34 35–44 45–54 55–64 65 years and over
1,873.4 1,088.4 1,296.4 1,157.2 1,207.5 1,177.5 1,498.7
513.0 351.8 339.1 419.1 485.0 624.1 925.7
1,049.9 666.4 594.2 682.9 645.7 650.2 727.1
422.6 287.7 354.1 344.9 400.6 411.5 528.3
2,069.1 1,234.7 1,429.0 1,310.3 1,323.1 1,229.6 1,514.9
178.0 306.8 253.8 248.9 178.5 75.2 42.1
2,247.2 1,541.5 1,682.9 1,559.2 1,501.5 1,304.8 1,557.0
How can I convert the data so that it looks like this:
Age Count
0-14 1,873.4
15-24 1,088.4
25-34 1,296.4
35-44 1,157.2
45-54 1,207.5
55-64 1,177.5
65+ 1,498.7
0-14 513.0
15-24 351.8
25-34 339.1
35-44 419.1
45-54 485.0
55-64 624.1
65+ 925.7
0-14 1,049.9
15-24 666.4
25-34 594.2
35-44 682.9
45-54 645.7
55-64 650.2
65+ 727.1
0-14 422.6
15-24 287.7
25-34 354.1
35-44 344.9
45-54 400.6
55-64 411.5
65+ 528.3
Answer 0 (score: 0)
I would use reshape2::melt, though in a different way.
First, here is the data in dput format.
Tamra <-
structure(list(`0–14` = c("1,873.4", "513.0", "1,049.9", "422.6",
"2,069.1", "178.0", "2,247.2"), `15–24` = c("1,088.4", "351.8",
"666.4", "287.7", "1,234.7", "306.8", "1,541.5"), `25–34` = c("1,296.4",
"339.1", "594.2", "354.1", "1,429.0", "253.8", "1,682.9"), `35–44` = c("1,157.2",
"419.1", "682.9", "344.9", "1,310.3", "248.9", "1,559.2"), `45–54` = c("1,207.5",
"485.0", "645.7", "400.6", "1,323.1", "178.5", "1,501.5"), `55–64` = c("1,177.5",
"624.1", "650.2", "411.5", "1,229.6", "75.2", "1,304.8"), `65+` = c("1,498.7",
"925.7", "727.1", "528.3", "1,514.9", "42.1", "1,557.0")), .Names = c("0–14",
"15–24", "25–34", "35–44", "45–54", "55–64", "65+"), class = "data.frame", row.names = c(NA,
-7L))
Now the code.
molten <- reshape2::melt(Tamra, measure.vars = names(Tamra),
variable.name = "Age", value.name = "Count")
#head(molten)
# Age Count
#1 0–14 1,873.4
#2 0–14 513.0
#3 0–14 1,049.9
#4 0–14 422.6
#5 0–14 2,069.1
#6 0–14 178.0
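Note that the melted result is ordered column by column and Count is still a character column with thousands separators. The following is a minimal sketch, not part of the original answer, of one way to get the row-by-row order and numeric values the question asks for. It assumes a small subset of the question's data (first two rows, first three age groups); the names Tamra_small and row_id are made up for illustration.

```r
library(reshape2)

# Subset of the question's data, kept small so the example is self-contained.
# check.names = FALSE keeps the age-range labels as column names.
Tamra_small <- data.frame(
  `0–14`  = c("1,873.4", "513.0"),
  `15–24` = c("1,088.4", "351.8"),
  `25–34` = c("1,296.4", "339.1"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Hypothetical helper column: remember the original row number so it
# survives the melt as an id variable.
Tamra_small$row_id <- seq_len(nrow(Tamra_small))

molten <- melt(Tamra_small, id.vars = "row_id",
               variable.name = "Age", value.name = "Count")

# Strip the thousands separators and convert to numeric.
molten$Count <- as.numeric(gsub(",", "", molten$Count, fixed = TRUE))

# Sort by original row first, then by age group
# (the factor levels of Age follow the original column order).
molten <- molten[order(molten$row_id, molten$Age), c("Age", "Count")]
rownames(molten) <- NULL
molten
```

Carrying the row number through as an id variable is only one option; it avoids relying on the exact order in which melt stacks the columns.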
Answer 1 (score: 0)
A solution with dplyr + tidyr:
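library(dplyr)
library(tidyr)
df %>%
  # Stack all columns into key/value pairs (column by column)
  gather(AgeRange, count) %>%
  # Remove the thousands separators and convert to numeric
  mutate(count = as.numeric(gsub(",", "", count))) %>%
  # Restore the original row-by-row order
  arrange(rep(seq_len(nrow(df)), ncol(df)))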
Result:
AgeRange count
1 0–14 1873.4
2 15–24 1088.4
3 25–34 1296.4
4 35–44 1157.2
5 45–54 1207.5
6 55–64 1177.5
7 65+ 1498.7
8 0–14 513.0
9 15–24 351.8
10 25–34 339.1
11 35–44 419.1
12 45–54 485.0
13 55–64 624.1
14 65+ 925.7
15 0–14 1049.9
16 15–24 666.4
17 25–34 594.2
18 35–44 682.9
19 45–54 645.7
20 55–64 650.2
...
Note:
I also added a mutate step that converts the count column to numeric.
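In newer versions of tidyr (1.0.0 and later), gather is superseded by pivot_longer. As a hedged sketch, not part of the original answer, the same reshape could be written as follows. It assumes a small subset of the data (first two rows, first three age groups) under the made-up name df_small. pivot_longer emits the values row by row, so no arrange step should be needed.

```r
library(dplyr)
library(tidyr)

# Subset of the question's data, kept small so the example is self-contained.
df_small <- data.frame(
  `0–14`  = c("1,873.4", "513.0"),
  `15–24` = c("1,088.4", "351.8"),
  `25–34` = c("1,296.4", "339.1"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

long <- df_small %>%
  # One output row per (original row, column) pair, in row-by-row order.
  pivot_longer(everything(), names_to = "AgeRange", values_to = "count") %>%
  # Same conversion as in the answer: drop the commas, then as.numeric.
  mutate(count = as.numeric(gsub(",", "", count)))

long
```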
Data:
df = structure(list(`0–14` = c("1,873.4", "513.0", "1,049.9", "422.6",
"2,069.1", "178.0", "2,247.2"), `15–24` = c("1,088.4", "351.8",
"666.4", "287.7", "1,234.7", "306.8", "1,541.5"), `25–34` = c("1,296.4",
"339.1", "594.2", "354.1", "1,429.0", "253.8", "1,682.9"), `35–44` = c("1,157.2",
"419.1", "682.9", "344.9", "1,310.3", "248.9", "1,559.2"), `45–54` = c("1,207.5",
"485.0", "645.7", "400.6", "1,323.1", "178.5", "1,501.5"), `55–64` = c("1,177.5",
"624.1", "650.2", "411.5", "1,229.6", "75.2", "1,304.8"), `65+` = c("1,498.7",
"925.7", "727.1", "528.3", "1,514.9", "42.1", "1,557.0")), .Names = c("0–14",
"15–24", "25–34", "35–44", "45–54", "55–64", "65+"), class = "data.frame", row.names = c(NA,
-7L))
Answer 2 (score: -1)
If your table is named tab, I would do this:
# Load library
library(reshape)
# Use melt() function
melt.tab <- melt(tab)
What you are looking for is the second and third columns of melt.tab, that is:
melt.tab[,-1]
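How melt chooses its id variables depends on the class of tab and on its column types, so the result above may differ for a character data frame like the one in the question. For comparison, here is a minimal base R sketch, not part of the original answer, that produces the row-by-row layout without any reshaping package. It assumes a small subset of the question's data under the made-up name tab_small.

```r
# Subset of the question's data (first two rows, first three age groups).
tab_small <- data.frame(
  `0–14`  = c("1,873.4", "513.0"),
  `15–24` = c("1,088.4", "351.8"),
  `25–34` = c("1,296.4", "339.1"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# t() turns the original rows into columns; as.vector() then reads the
# matrix column by column, which walks the original data row by row.
values <- as.vector(t(as.matrix(tab_small)))

long <- data.frame(
  # The age labels repeat once per original row.
  Age = rep(names(tab_small), times = nrow(tab_small)),
  # Drop the thousands separators before converting to numeric.
  Count = as.numeric(gsub(",", "", values, fixed = TRUE)),
  stringsAsFactors = FALSE
)

long
```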