I am currently working with the data frame below.
head(pdo,24)
Date Year Month Value Season
1 198001 1980 1 0.06 Winter
2 198002 1980 2 0.60 Spring
3 198003 1980 3 0.60 Spring
4 198004 1980 4 0.72 Spring
5 198005 1980 5 0.57 Summer
6 198006 1980 6 -0.78 Summer
7 198007 1980 7 -0.32 Summer
8 198008 1980 8 -0.12 Fall
9 198009 1980 9 -0.29 Fall
10 198010 1980 10 0.92 Fall
11 198011 1980 11 0.70 Winter
12 198012 1980 12 0.36 Winter
13 198101 1981 1 1.18 Winter
14 198102 1981 2 1.25 Spring
15 198103 1981 3 1.16 Spring
16 198104 1981 4 1.01 Spring
17 198105 1981 5 1.22 Summer
18 198106 1981 6 1.77 Summer
19 198107 1981 7 0.71 Summer
20 198108 1981 8 -0.11 Fall
21 198109 1981 9 0.34 Fall
22 198110 1981 10 -0.15 Fall
23 198111 1981 11 0.45 Winter
24 198112 1981 12 0.60 Winter
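This is a two-year subset (1980-1981) of a larger data frame. I need a way to go through the entire data frame (1980-2014) so that each winter is selected as a consecutive run of months, even though it spans two calendar years.
What I need is: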
Date Year Month Value Season
11 198011 1980 11 0.70 Winter
12 198012 1980 12 0.36 Winter
13 198101 1981 1 1.18 Winter
Any idea how to do this? The reason I need it is so that I can average the "Value" column for winter.
Thanks for your help!
Answer 0 (score: 0)
Do you just want the mean of "Value" over the winter months? Using base R, you could do:
set.seed(123)
df <- data.frame(year = c(1980,1980,1980,1981,1981,1981,1982,1982,1982),
month = c(6,11,12,6,11,12,6,11,12),
season = c("Summer", "Winter", "Winter", "Summer", "Winter", "Winter", "Summer", "Winter", "Winter"),
value = sample(1:20, 9))
df
year month season value
1 1980 6 Summer 6
2 1980 11 Winter 15
3 1980 12 Winter 8
4 1981 6 Summer 16
5 1981 11 Winter 17
6 1981 12 Winter 1
7 1982 6 Summer 18
8 1982 11 Winter 12
9 1982 12 Winter 7
> mean(df[df$season == "Winter",]$value, na.rm = TRUE)
[1] 10
Answer 1 (score: 0)
You can augment your data with a column recording the year in which each season starts.
pdo$SeasonYear <- with(pdo, Year - (Season == "Winter" & Month < 6))
pdo[pdo$Season == "Winter",]
# Date Year Month Value Season SeasonYear
# 1 198001 1980 1 0.06 Winter 1979
# 11 198011 1980 11 0.70 Winter 1980
# 12 198012 1980 12 0.36 Winter 1980
# 13 198101 1981 1 1.18 Winter 1980
# 23 198111 1981 11 0.45 Winter 1981
# 24 198112 1981 12 0.60 Winter 1981
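As a side note on why the subtraction above works: in R arithmetic a logical value is coerced to a number (TRUE becomes 1, FALSE becomes 0), so a winter month in the first half of the year is moved back to the previous year. A minimal sketch with three made-up rows (the vectors below are illustrative and not taken from pdo):

```r
# Illustrative rows: Jan 1981 (winter), Nov 1981 (winter), Jun 1981 (summer)
Year   <- c(1981, 1981, 1981)
Month  <- c(1, 11, 6)
Season <- c("Winter", "Winter", "Summer")

# TRUE only for winter months that fall in the first half of the year
early_winter <- Season == "Winter" & Month < 6

# TRUE counts as 1 and FALSE as 0 in the subtraction
SeasonYear <- Year - early_winter
SeasonYear
# [1] 1980 1981 1981
```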
From here, you can average by season and season year:
aggregate(pdo$Value, list(Season = pdo$Season, SeasonYear = pdo$SeasonYear), mean)
# Season SeasonYear x
# 1 Winter 1979 0.06000000
# 2 Spring 1980 0.64000000
# 3 Summer 1980 -0.17666667
# 4 Fall 1980 0.17000000
# 5 Winter 1980 0.74666667
# 6 Spring 1981 1.14000000
# 7 Summer 1981 1.23333333
# 8 Fall 1981 0.02666667
# 9 Winter 1981 0.52500000
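Note that the 1979 and 1981 winters above are averaged over incomplete data, since the sample starts in January 1980 and ends in December 1981. If only complete winters are wanted, one possible sketch is to count the months per winter and filter on that count. It assumes a winter is exactly three months (November, December, January), as in the sample data; the names winter, result and complete are hypothetical and used only for this illustration.

```r
# Winter rows copied from the sample data in the question
winter <- data.frame(
  Year  = c(1980, 1980, 1980, 1981, 1981, 1981),
  Month = c(1, 11, 12, 1, 11, 12),
  Value = c(0.06, 0.70, 0.36, 1.18, 0.45, 0.60)
)

# January belongs to the winter that started the previous year
winter$SeasonYear <- with(winter, Year - (Month < 6))

# Mean and number of months for each winter
winter_mean <- tapply(winter$Value, winter$SeasonYear, mean)
winter_n    <- tapply(winter$Value, winter$SeasonYear, length)

result <- data.frame(
  SeasonYear = as.numeric(names(winter_mean)),
  Mean       = as.vector(winter_mean),
  Months     = as.vector(winter_n)
)

# Keep only winters with all three months (assumption: Nov, Dec, Jan)
complete <- result[result$Months == 3, ]
complete
```

With the two-year sample, only the 1980 winter (rows 11 to 13 of the original data) is complete, which matches the rows requested in the question.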
Reproducible data:
pdo <- read.table(text=' Date Year Month Value Season
198001 1980 1 0.06 Winter
198002 1980 2 0.60 Spring
198003 1980 3 0.60 Spring
198004 1980 4 0.72 Spring
198005 1980 5 0.57 Summer
198006 1980 6 -0.78 Summer
198007 1980 7 -0.32 Summer
198008 1980 8 -0.12 Fall
198009 1980 9 -0.29 Fall
198010 1980 10 0.92 Fall
198011 1980 11 0.70 Winter
198012 1980 12 0.36 Winter
198101 1981 1 1.18 Winter
198102 1981 2 1.25 Spring
198103 1981 3 1.16 Spring
198104 1981 4 1.01 Spring
198105 1981 5 1.22 Summer
198106 1981 6 1.77 Summer
198107 1981 7 0.71 Summer
198108 1981 8 -0.11 Fall
198109 1981 9 0.34 Fall
198110 1981 10 -0.15 Fall
198111 1981 11 0.45 Winter
198112 1981 12 0.60 Winter', header=TRUE)
pdo$Season <- factor(pdo$Season, levels = c("Spring", "Summer", "Fall", "Winter"))
I took the liberty of setting the factor levels explicitly so that the seasons are ordered correctly.
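To illustrate why the explicit levels matter: by default, factor() sorts the levels of a character vector alphabetically, which would put Fall first. A small sketch (the vector seasons and the variable names are made up for this example):

```r
seasons <- c("Winter", "Spring", "Summer", "Fall")

# Default: levels are sorted alphabetically
default_f <- factor(seasons)
levels(default_f)
# [1] "Fall"   "Spring" "Summer" "Winter"

# Explicit levels: the order given in `levels` is kept
ordered_f <- factor(seasons, levels = c("Spring", "Summer", "Fall", "Winter"))
levels(ordered_f)
# [1] "Spring" "Summer" "Fall"   "Winter"

# Sorting (and grouping functions such as aggregate) follows the level order
as.character(sort(ordered_f))
# [1] "Spring" "Summer" "Fall"   "Winter"
```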