Equivalent of date_format() in BigQuery

Time: 2019-02-20 04:17:12

Tags: sql database google-bigquery

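My table has a date column containing strings in the format 2017-08-05 09-AM, and I am trying to convert it so that I end up with one column of type DATE for the date and one column of type TIME for the time.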

My query works the way I want in MySQL, but BigQuery does not support date_format(). Is there a similar way to convert the date string into separate date and time values?

2 answers:

Answer 0: (score: 1)

The following is for BigQuery Standard SQL:

#standardSQL
-- %I is the 12-hour clock hour; combined with %p it maps 12-AM to 00:00 and 11-PM to 23:00.
-- (%H is the 24-hour clock hour, so the AM/PM marker would not shift the hour.)
SELECT id, Symbol,
  DATE(PARSE_DATETIME('%Y-%m-%d %I-%p', a.date)) AS `date`,
  TIME(PARSE_DATETIME('%Y-%m-%d %I-%p', a.date)) AS time
FROM `project.dataset.table` a

You can test and play with the above using dummy data, as in the example below:

#standardSQL
WITH `project.dataset.table` AS (
  SELECT 1 id, '2017-08-05 09-AM' `date`, 'x' Symbol UNION ALL
  SELECT 2, '2019-02-05 12-AM', 'y' UNION ALL
  SELECT 3, '2019-01-31 11-PM', 'z'
)
SELECT id, Symbol,
  DATE(PARSE_DATETIME('%Y-%m-%d %I-%p', a.date)) AS `date`,
  TIME(PARSE_DATETIME('%Y-%m-%d %I-%p', a.date)) AS time
FROM `project.dataset.table` a
-- ORDER BY id

with the result:

Row id  Symbol  date        time
1   1   x       2017-08-05  09:00:00
2   2   y       2019-02-05  00:00:00
3   3   z       2019-01-31  23:00:00
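
If the goal is also to turn the parsed value back into text in a different layout, which is what date_format() does in MySQL, the closest BigQuery counterpart is FORMAT_DATETIME (with FORMAT_DATE and FORMAT_TIME for the separate types). The following is only a minimal sketch: the CTE names sample_rows and parsed, the column raw_date, and the output layouts are hypothetical and chosen for illustration. It parses the hour with %I (12-hour clock) so that %p is taken into account.

```sql
#standardSQL
-- Hypothetical sample rows; only the string format comes from the question.
WITH sample_rows AS (
  SELECT '2017-08-05 09-AM' AS raw_date UNION ALL
  SELECT '2019-01-31 11-PM'
),
parsed AS (
  -- %I = 12-hour clock hour, %p = AM/PM marker
  SELECT raw_date, PARSE_DATETIME('%Y-%m-%d %I-%p', raw_date) AS dt
  FROM sample_rows
)
SELECT
  raw_date,
  FORMAT_DATETIME('%d/%m/%Y', dt) AS date_text,  -- day/month/year as a STRING
  FORMAT_DATETIME('%H:%M', dt)    AS time_text   -- 24-hour clock as a STRING
FROM parsed
```

Note that FORMAT_DATETIME returns STRING values, so it suits display or export; for filtering and arithmetic, the DATE and TIME columns produced above are usually the better choice.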

Answer 1: (score: 0)

BigQuery has a PARSE_DATETIME function that can parse custom datetime formats.

In your case, this should help:

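select ID, extract(date from dt) as date, extract(time from dt) as time, Symbol from (
  select 
     a.ID as ID, 
     -- %Y-%d-%m reads the string as year-day-month; use '%Y-%m-%d' if the month comes first.
     -- %I (12-hour clock) is needed for %p (AM/PM) to affect the parsed hour.
     parse_datetime('%Y-%d-%m %I-%p', a.date) as dt, 
     a.Symbol as Symbol
  from 
     `crypto_market_data.BTC_1H` a
)
-- Sort in the outer query; an ORDER BY inside the subquery does not guarantee the final row order.
order by ID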
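
One practical caveat with PARSE_DATETIME is that a single string that does not match the format makes the whole query fail. A hedged sketch of one way around this is the SAFE. prefix, which returns NULL instead of raising an error. The CTE name sample_rows, the column raw_date, and the malformed row are hypothetical and exist only to illustrate the behavior.

```sql
#standardSQL
-- Hypothetical sample rows: one well-formed string and one malformed string.
WITH sample_rows AS (
  SELECT 1 AS id, '2017-08-05 09-AM' AS raw_date UNION ALL
  SELECT 2, 'not a date'
)
SELECT
  id,
  raw_date,
  -- SAFE. turns a parse error into NULL, so bad rows can be inspected afterwards.
  SAFE.PARSE_DATETIME('%Y-%m-%d %I-%p', raw_date) AS dt
FROM sample_rows
```

Filtering on dt IS NULL afterwards is a simple way to find the rows whose strings need cleaning before the date and time columns are derived.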