Winter months spanning two years

Time: 2018-04-18 20:21:19

Tags: r

I am currently working with the data frame below.

 head(pdo,24)
     Date Year Month Value Season
1  198001 1980     1  0.06 Winter
2  198002 1980     2  0.60 Spring
3  198003 1980     3  0.60 Spring
4  198004 1980     4  0.72 Spring
5  198005 1980     5  0.57 Summer
6  198006 1980     6 -0.78 Summer
7  198007 1980     7 -0.32 Summer
8  198008 1980     8 -0.12   Fall
9  198009 1980     9 -0.29   Fall
10 198010 1980    10  0.92   Fall
11 198011 1980    11  0.70 Winter
12 198012 1980    12  0.36 Winter
13 198101 1981     1  1.18 Winter
14 198102 1981     2  1.25 Spring
15 198103 1981     3  1.16 Spring
16 198104 1981     4  1.01 Spring
17 198105 1981     5  1.22 Summer
18 198106 1981     6  1.77 Summer
19 198107 1981     7  0.71 Summer
20 198108 1981     8 -0.11   Fall
21 198109 1981     9  0.34   Fall
22 198110 1981    10 -0.15   Fall
23 198111 1981    11  0.45 Winter
24 198112 1981    12  0.60 Winter

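This is a 2-year subset (1980-1981) of a larger data frame. I need a way to go through the entire data frame (1980-2014) so that each winter's months are selected in sequence, even when they cross a year boundary.

What I need is: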

     Date Year Month Value Season
11 198011 1980    11  0.70 Winter
12 198012 1980    12  0.36 Winter
13 198101 1981     1  1.18 Winter

Any idea how to do this? I need this so that I can average the "Value" column for each winter.

Thanks for your help!

2 answers:

Answer 0 (score: 0)

Do you just want to extract the mean of "value" for the winter months? Using base R, you could do:

set.seed(123)
df <- data.frame(year = c(1980,1980,1980,1981,1981,1981,1982,1982,1982),
             month = c(6,11,12,6,11,12,6,11,12),
             season = c("Summer", "Winter", "Winter", "Summer", "Winter", "Winter", "Summer", "Winter", "Winter"),
             value = sample(1:20, 9))

df
  year month season value
1 1980     6 Summer     6
2 1980    11 Winter    15
3 1980    12 Winter     8
4 1981     6 Summer    16
5 1981    11 Winter    17
6 1981    12 Winter     1
7 1982     6 Summer    18
8 1982    11 Winter    12
9 1982    12 Winter     7

> mean(df[df$season == "Winter",]$value, na.rm = TRUE)
[1] 10

Answer 1 (score: 0)

You can augment your data with a column that records the year in which each season starts.

pdo$SeasonYear <- with(pdo, Year - (Season == "Winter" & Month < 6))
pdo[pdo$Season == "Winter",]
#      Date Year Month Value Season SeasonYear
# 1  198001 1980     1  0.06 Winter       1979
# 11 198011 1980    11  0.70 Winter       1980
# 12 198012 1980    12  0.36 Winter       1980
# 13 198101 1981     1  1.18 Winter       1980
# 23 198111 1981    11  0.45 Winter       1981
# 24 198112 1981    12  0.60 Winter       1981
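The expression above relies on R coercing a logical value to 0 or 1 in arithmetic, so a winter month early in the calendar year (here January) is assigned to the year in which its winter began. A minimal sketch on a toy data frame (`toy` and `shift` are hypothetical names, not from the question):

```r
# Toy data (hypothetical): winter months that straddle a year boundary
toy <- data.frame(
  Year   = c(1980, 1980, 1981, 1981, 1981),
  Month  = c(11, 12, 1, 2, 11),
  Season = c("Winter", "Winter", "Winter", "Spring", "Winter")
)

# TRUE is coerced to 1 and FALSE to 0, so only winter months
# in the first half of the year are shifted back by one year
shift <- with(toy, Season == "Winter" & Month < 6)
toy$SeasonYear <- toy$Year - shift

toy$SeasonYear
# [1] 1980 1980 1980 1981 1981
```

The cutoff `Month < 6` is arbitrary; any value that separates January from November and December would give the same result for this season definition.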

From here, you can average by season and season year:

aggregate(pdo$Value, list(Season = pdo$Season, SeasonYear = pdo$SeasonYear), mean)
#   Season SeasonYear           x
# 1 Winter       1979  0.06000000
# 2 Spring       1980  0.64000000
# 3 Summer       1980 -0.17666667
# 4   Fall       1980  0.17000000
# 5 Winter       1980  0.74666667
# 6 Spring       1981  1.14000000
# 7 Summer       1981  1.23333333
# 8   Fall       1981  0.02666667
# 9 Winter       1981  0.52500000
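If only the winter averages are needed, the same idea can be restricted to the winter rows. The sketch below also drops winters that are incomplete at the edges of the record; treating a winter as complete when it has 3 months (November, December, January) is an assumption based on the season labels in the question, and `winter`, `winter_means`, `n_months`, and `complete` are hypothetical names.

```r
# Winter rows from the question's sample data
winter <- data.frame(
  Year  = c(1980, 1980, 1980, 1981, 1981, 1981),
  Month = c(1, 11, 12, 1, 11, 12),
  Value = c(0.06, 0.70, 0.36, 1.18, 0.45, 0.60)
)

# Assign January to the winter that started in the previous calendar year
winter$SeasonYear <- winter$Year - (winter$Month < 6)

# Mean Value per winter; the formula interface keeps the column name "Value"
winter_means <- aggregate(Value ~ SeasonYear, data = winter, FUN = mean)

# Count the months available per winter and keep only complete winters
# (assumption: a complete winter has exactly 3 months)
n_months <- aggregate(Value ~ SeasonYear, data = winter, FUN = length)
complete <- winter_means[n_months$Value == 3, ]

complete
```

On this sample, only the winter starting in 1980 has all three months, so it is the only row kept.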

Reproducible data:

pdo <- read.table(text='  Date Year Month Value Season
198001 1980     1  0.06 Winter
198002 1980     2  0.60 Spring
198003 1980     3  0.60 Spring
198004 1980     4  0.72 Spring
198005 1980     5  0.57 Summer
198006 1980     6 -0.78 Summer
198007 1980     7 -0.32 Summer
198008 1980     8 -0.12   Fall
198009 1980     9 -0.29   Fall
198010 1980    10  0.92   Fall
198011 1980    11  0.70 Winter
198012 1980    12  0.36 Winter
198101 1981     1  1.18 Winter
198102 1981     2  1.25 Spring
198103 1981     3  1.16 Spring
198104 1981     4  1.01 Spring
198105 1981     5  1.22 Summer
198106 1981     6  1.77 Summer
198107 1981     7  0.71 Summer
198108 1981     8 -0.11   Fall
198109 1981     9  0.34   Fall
198110 1981    10 -0.15   Fall
198111 1981    11  0.45 Winter
198112 1981    12  0.60 Winter', header=TRUE)
pdo$Season <- factor(pdo$Season, levels = c("Spring", "Summer", "Fall", "Winter"))

I took the liberty of setting the factor levels explicitly so that the seasons are ordered correctly.
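As a small illustration of why the levels matter: without an explicit `levels` argument, `factor()` sorts the levels alphabetically, which would put Fall first. The variable names here are hypothetical.

```r
seasons <- c("Winter", "Spring", "Summer", "Fall")

# Default: levels are the sorted unique values (alphabetical)
default_levels <- levels(factor(seasons))
default_levels
# [1] "Fall"   "Spring" "Summer" "Winter"

# Explicit levels give the chronological order used in the answer
ordered_levels <- levels(factor(seasons,
                                levels = c("Spring", "Summer", "Fall", "Winter")))
ordered_levels
# [1] "Spring" "Summer" "Fall"   "Winter"
```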