I obtained some time data from a source (listed below) and want to convert it.

6:360:00
6:375:00
6:390:00
6:405:00
7:420:00
7:435:00
7:450:00
7:465:00
8:480:00
8:495:00

to this:

060000
061500
063000
064500
070000
071500
073000
Answer 0 (score: 0)
The rules for the conversion aren't clear. It looks like we want to implement some kind of conversion:

from         to
---------    ------
6:360:00  -> 060000
6:375:00  -> 061500
6:390:00  -> 063000
6:405:00  -> 064500
7:420:00  -> 070000
7:435:00  -> 071500
7:450:00  -> 073000

It looks like the original string needs to be split into three parts.
SQL isn't the best tool for parsing strings, but we can hack something together.
Assuming that all values contain two colon characters, we can do this (using a user-defined variable here in place of a column name):

SET @foo := '6:375:00';
SELECT SUBSTRING_INDEX(@foo,':',1) AS p1
     , SUBSTRING_INDEX(SUBSTRING_INDEX(@foo,':',2),':',-1) AS p2
     , SUBSTRING_INDEX(@foo,':',-1) AS p3;

We get:

p1    p2   p3
----- ---- ----
6     375  00

We need to do some arithmetic. Do we take the part before the first colon (p1) and use it as the "hour"?
The second part (between the colons) looks like it could be minutes since midnight. We can take it modulo 60 and use the remainder as the minutes. (We could also use the result of integer division by 60 as the "hour".)
We can add zero to each part to convert it to a number, then format each part as a two-digit string with leading zeros, and finally concatenate the parts into a single string.

SELECT v.foo
     , CONCAT( RIGHT(CONCAT('00', v.p1 + 0 ),2)
             , RIGHT(CONCAT('00', (v.p2 + 0) MOD 60),2)
             , RIGHT(CONCAT('00', v.p3 + 0 ),2)
       ) AS bar
  FROM (
         SELECT @foo AS foo
              , SUBSTRING_INDEX(@foo,':',1) AS p1
              , SUBSTRING_INDEX(SUBSTRING_INDEX(@foo,':',2),':',-1) AS p2
              , SUBSTRING_INDEX(@foo,':',-1) AS p3
       ) v;

To use these operations against a column of a table, replace the three occurrences of @foo with a valid column name and add a FROM clause. Like this:

SELECT v.foo
     , CONCAT( RIGHT(CONCAT('00', v.p1 + 0 ),2)
             , RIGHT(CONCAT('00', (v.p2 + 0) MOD 60),2)
             , RIGHT(CONCAT('00', v.p3 + 0 ),2)
       ) AS bar
  FROM (
         SELECT t.foo AS foo
              , SUBSTRING_INDEX(t.foo,':',1) AS p1
              , SUBSTRING_INDEX(SUBSTRING_INDEX(t.foo,':',2),':',-1) AS p2
              , SUBSTRING_INDEX(t.foo,':',-1) AS p3
           FROM my_table t
       ) v;

The conversion breaks down when a source foo value does not contain two colon characters.
It isn't at all clear what we want to do with the converted value. (If this is a "time" value, I would be inclined to convert the value to a TIME datatype rather than a string.)
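
If a TIME value is what is wanted, something along these lines might work. This is only a sketch under the same assumptions as above: every value contains exactly two colons, the first part is the hour, and the second part is minutes since midnight. The alias bar_time is made up for illustration, and CAST(... AS UNSIGNED) is swapped in for the "+ 0" trick so that MAKETIME receives integers.

```sql
SET @foo := '6:375:00';

SELECT v.foo
     , MAKETIME( CAST(v.p1 AS UNSIGNED)         -- hour: the part before the first colon
               , CAST(v.p2 AS UNSIGNED) MOD 60  -- minute: remainder of the minutes since midnight
               , CAST(v.p3 AS UNSIGNED)         -- second: the part after the last colon
       ) AS bar_time                            -- hypothetical alias, TIME datatype
  FROM (
         SELECT @foo AS foo
              , SUBSTRING_INDEX(@foo,':',1) AS p1
              , SUBSTRING_INDEX(SUBSTRING_INDEX(@foo,':',2),':',-1) AS p2
              , SUBSTRING_INDEX(@foo,':',-1) AS p3
       ) v;
```

For '6:375:00' this should give 06:15:00. If the six-digit string is still needed for display, wrapping the MAKETIME expression in TIME_FORMAT(..., '%H%i%s') should render it without the colons. As with the string version, a value without two colons will not convert sensibly.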