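I am new to the tsibble package. I have monthly data that I coerced to a tsibble so that I can use the fable package. I have run into a few problems: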
library(dplyr)
library(fable)
library(lubridate)
library(tsibble)

test <- data.frame(
  YearMonth = c(20160101, 20160201, 20160301, 20160401, 20160501, 20160601,
                20160701, 20160801, 20160901, 20161001, 20161101, 20161201),
  Claims = c(13032647, 1668005, 24473616, 13640769, 17891432, 11596556,
             23176360, 7885872, 11948461, 16194792, 4971310, 18032363),
  Revenue = c(12603367, 18733242, 5862766, 3861877, 15407158, 24534258,
              15633646, 13720258, 24944078, 13375742, 4537475, 22988443)
)

test_ts <- test %>%
  mutate(YearMonth = ymd(YearMonth)) %>%
  as_tsibble(
    index = YearMonth,
    regular = FALSE # because it picks up gaps when I set it to TRUE
  )

# Are there any gaps?
has_gaps(test_ts, .full = TRUE)

model_new <- test_ts %>%
  model(
    snaive = SNAIVE(Claims)
  )
Any help would be greatly appreciated.
Answer 0 (score: 1)
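It looks like as_tsibble() cannot correctly recognize the interval of the YearMonth column because it is a Date object. Tucked away in the "Index" section of the help page is a note on why this can be a problem: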
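For a tbl_ts with a regular interval, a choice of index representation has to be made. For example, monthly data should correspond to a time index created by yearmonth or zoo::yearmon, instead of Date or POSIXct.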
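As the excerpt suggests, you can work around the problem with yearmonth(). That first requires a little string manipulation to get the values into a format that parses correctly.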
test_ts <- test %>%
  mutate(
    YearMonth = gsub("(.{2})01$", "-\\1", YearMonth) %>%
      yearmonth()
  ) %>%
  as_tsibble(
    index = YearMonth
  )
The model should now run without errors. I am not sure why the has_gaps() test reports that everything is fine in your example...
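To make the string manipulation in this answer easier to follow, here is a minimal base R sketch of what the regular expression does on its own. The variable names `raw` and `year_month_text` are made up for illustration, and only three of the question's values are used.

```r
# Base R only: show what the regular expression does to the raw values.
raw <- c(20160101, 20161001, 20161201)

# gsub() coerces the numbers to character, then rewrites the trailing
# "<MM>01" (two-digit month followed by day-of-month "01") as "-<MM>".
year_month_text <- gsub("(.{2})01$", "-\\1", raw)

print(year_month_text)
# [1] "2016-01" "2016-10" "2016-12"
```

Strings in this "YYYY-MM" shape are what the answer then passes to yearmonth().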
Answer 1 (score: 1)
You have a daily index, but you want a monthly one. The simplest fix is the tsibble::yearmonth() function, but you need to convert the dates to character first.
library(dplyr)
library(tsibble)

test <- data.frame(
  YearMonth = c(20160101, 20160201, 20160301, 20160401, 20160501, 20160601,
                20160701, 20160801, 20160901, 20161001, 20161101, 20161201),
  Claims = c(13032647, 1668005, 24473616, 13640769, 17891432, 11596556,
             23176360, 7885872, 11948461, 16194792, 4971310, 18032363),
  Revenue = c(12603367, 18733242, 5862766, 3861877, 15407158, 24534258,
              15633646, 13720258, 24944078, 13375742, 4537475, 22988443)
)

test_ts <- test %>%
  mutate(YearMonth = yearmonth(as.character(YearMonth))) %>%
  as_tsibble(index = YearMonth)
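As a rough illustration of why a Date index reads as daily rather than monthly, the base R sketch below measures the spacing between the first few dates from the question. It does not call tsibble at all. The names `dates` and `day_gaps` are hypothetical, and the link to how tsibble infers its interval is an assumption based on the help-page excerpt quoted above.

```r
# Base R only: the first four first-of-month values, parsed as Date.
dates <- as.Date(as.character(c(20160101, 20160201, 20160301, 20160401)),
                 format = "%Y%m%d")

# Consecutive months are 31, 29, and 31 days apart (2016 is a leap year),
# so the spacing is not constant when measured in days.
day_gaps <- as.integer(diff(dates))
print(day_gaps)
# [1] 31 29 31

# Viewed at year-month resolution, the same values step evenly by one month.
print(format(dates, "%Y-%m"))
# [1] "2016-01" "2016-02" "2016-03" "2016-04"
```

Measured in days the series looks uneven, which would fit the gaps reported with a Date index. A year-month index such as the one from yearmonth() sidesteps this.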