I have a raw data frame (df) containing about 10 years of data (1994-2003). head(df) looks like this:
Sl.no Date Year Month Season val1 val2 val3
1 1 1993-12-01 1993 Dec Winter 21.0 16.0 3.0
2 2 1994-01-01 1994 Jan Winter 21.0 15.5 0.0
3 3 1994-02-01 1994 Feb Winter 21.0 18.5 0.0
4 4 1994-03-01 1994 Mar Spring 30.0 24.0 1.9
5 5 1994-04-01 1994 Apr Spring 35.5 27.0 0.5
6 6 1994-05-01 1994 May Spring 36.0 30.0 1.5
Because I want to convert Month to a factor in order to draw a boxplot, I used:
df$Month <- as.factor(format(df$Date, "%b"))
levels(df$Month) <- c("Jan","Feb","Mar", "Apr", "May", "Jun", "Jul",
"Aug", "Sep", "Oct", "Nov", "Dec")
However, the output is as follows (the months no longer match the order in the original df):
Sl.no Date Year Month Season val1 val2 val3
1 1 1993-12-01 1993 Mar Winter 21.0 16.0 3.0
2 2 1994-01-01 1994 May Winter 21.0 15.5 0.0
3 3 1994-02-01 1994 Apr Winter 21.0 18.5 0.0
4 4 1994-03-01 1994 Aug Spring 30.0 24.0 1.9
5 5 1994-04-01 1994 Jan Spring 35.5 27.0 0.5
6 6 1994-05-01 1994 Sep Spring 36.0 30.0 1.5
Note that in the df above the months are scrambled; they should follow the dates in order.
How can I fix this? Any help would be greatly appreciated. Kind regards
Answer 0 (score: 0)
Use
df$Month <- factor(format(df$Date, "%b"), month.abb, ordered = TRUE)
A demonstration of the problem you ran into:
set.seed(1)
M <- sample(month.abb, 20, TRUE)
M
# [1] "Apr" "May" "Jul" "Nov" "Mar" "Nov" "Dec" "Aug" "Aug" "Jan" "Mar" "Mar" "Sep" "May"
# [15] "Oct" "Jun" "Sep" "Dec" "May" "Oct"
your_attempt <- as.factor(M)
your_attempt
# [1] Apr May Jul Nov Mar Nov Dec Aug Aug Jan Mar Mar Sep May Oct Jun Sep Dec May Oct
# Levels: Apr Aug Dec Jan Jul Jun Mar May Nov Oct Sep
## At this step, you're basically asking R to replace "Apr" with "Jan",
## "Aug" with "Feb", and so on. Not what you're looking for....
levels(your_attempt) <- c("Jan", "Feb", "Mar", "Apr", "May", "Jun",
"Jul", "Aug", "Sep", "Oct", "Nov", "Dec")
your_attempt
# [1] Jan Aug May Sep Jul Sep Mar Feb Feb Apr Jul Jul Nov Aug Oct Jun Nov Mar Aug Oct
# Levels: Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
## ordered = TRUE not necessarily required. Depends on what you want to do
new_attempt <- factor(M, levels = month.abb, ordered = TRUE)
new_attempt
# [1] Apr May Jul Nov Mar Nov Dec Aug Aug Jan Mar Mar Sep May Oct Jun Sep Dec May Oct
# Levels: Jan < Feb < Mar < Apr < May < Jun < Jul < Aug < Sep < Oct < Nov < Dec
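Applied to a data frame shaped like the one in the question, the same idea can be sketched as follows. The six-row df here is a hypothetical miniature rebuilt from the head() output above, not the asker's full data. Because format(Date, "%b") returns month abbreviations in the current locale while month.abb is always English, this sketch derives the month number from the date instead; that substitution is an assumption made for portability, not part of the original answer.

```r
# Hypothetical miniature of the question's data frame (first six rows only)
df <- data.frame(
  Date = as.Date(c("1993-12-01", "1994-01-01", "1994-02-01",
                   "1994-03-01", "1994-04-01", "1994-05-01")),
  val1 = c(21.0, 21.0, 21.0, 30.0, 35.5, 36.0)
)

# Locale-independent month index: as.POSIXlt()$mon runs from 0 (Jan) to 11 (Dec)
month_index <- as.POSIXlt(df$Date)$mon + 1

# Fix the level order when the factor is created, rather than
# relabelling the alphabetically sorted levels afterwards
df$Month <- factor(month.abb[month_index], levels = month.abb)

as.character(df$Month)
# [1] "Dec" "Jan" "Feb" "Mar" "Apr" "May"

levels(df$Month)
# [1] "Jan" "Feb" "Mar" "Apr" "May" "Jun" "Jul" "Aug" "Sep" "Oct" "Nov" "Dec"

# Boxes are drawn in level order, i.e. Jan through Dec
boxplot(val1 ~ Month, data = df)
```

Each row keeps the month that matches its date, and only the ordering of the levels changes, which is what determines the order of the boxes on the x axis.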