Converting a date element in a structured array

Asked: 2019-03-04 07:40:20

Tags: python python-3.x pandas numpy jupyter-notebook

I have parsed a data file into a structured array whose second element is a Julian date (day of year).

array([(1957,  1, 9999, 9999, 9999, 9999, 9999, 9999, 9999, 9999, 9999, 9999, 9999, 9999),
   (1957, 13, 9999, 9999, 9999, 9999, 9999, 9999, 9999, 9999, 9999, 9999, 9999, 9999)],
  dtype=[('year', '<i4'), ('julian_date', '<i4'), ('1', '<i4'), ('2', '<i4'), ('3', '<i4'), ('4', '<i4'), ('5', '<i4'), ('6', '<i4'), ('7', '<i4'), ('8', '<i4'), ('9', '<i4'), ('10', '<i4'), ('11', '<i4'), ('12', '<i4')])

I just want to convert the Julian date to a month so that I can visualize my data.

I know how to do simple operations on the julian_date element, for example:

galveston['julian_date'] - 1
array([  0,  12,  24, ..., 336, 348, 360], dtype=int32)

I know I can use the datetime functions for this conversion:

datetime.date(1956, 1, 1) + datetime.timedelta(121 - 1)

This returns datetime.date(1956, 4, 30), where 4 is the month number.

However, I don't know how to apply this to my data. I'm new to Python and to programming in general, and I would appreciate any help.

OK, I think I'm getting closer with the .month attribute. When I enter the year and julian_date manually:

x = datetime.date(1957, 1, 1) + datetime.timedelta(1 - 1)
galveston['julian_date'] = x.month

it changes the element to the corresponding month:

array([(1957, 1, 9999, 9999, 9999, 9999, 9999, 9999, 9999, 9999, 9999, 9999, 9999, 9999),(1957, 1, 9999, 9999, 9999, 9999, 9999, 9999, 9999, 9999, 9999, 9999, 9999, 9999)],dtype=[('year', '<i4'), ('julian_date', '<i4'), ('1', '<i4'), ('2', '<i4'), ('3', '<i4'), ('4', '<i4'), ('5', '<i4'), ('6', '<i4'), ('7', '<i4'), ('8', '<i4'), ('9', '<i4'), ('10', '<i4'), ('11', '<i4'), ('12', '<i4')])

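However, every row receives the same month value, which is not what I want. When I tried to modify the code like this (which may be completely wrong, by the way):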

x = datetime.date('year', 1, 1) + datetime.timedelta('julian_date' - 1)
galveston['julian_date'] = x.month

I get an error.

Image of the error
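
A note on the attempt above: datetime.date and datetime.timedelta each take single integers, so passing the strings 'year' and 'julian_date' raises a TypeError. One way to apply the conversion row by row is sketched below. It assumes a small two-field stand-in array called sample (hypothetical; the real galveston array has more fields), and the helper day_of_year_to_month is a name introduced here purely for illustration.

```python
import datetime
import numpy as np

# Hypothetical stand-in for the galveston array, keeping only the two relevant fields
sample = np.array([(1957, 1), (1957, 13), (1956, 121)],
                  dtype=[('year', '<i4'), ('julian_date', '<i4')])

def day_of_year_to_month(year, day_of_year):
    """Return the month number for a year and a 1-based day of year."""
    # int() turns NumPy integers into plain Python ints for the datetime module
    date = datetime.date(int(year), 1, 1) + datetime.timedelta(days=int(day_of_year) - 1)
    return date.month

# Convert each row separately so that every row gets its own month
months = [day_of_year_to_month(y, d)
          for y, d in zip(sample['year'], sample['julian_date'])]
print(months)  # [1, 1, 4]
```

A Python-level loop like this is easy to follow but can be slow on large arrays; a vectorized conversion is usually preferable there.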

1 Answer:

Answer 0 (score: 0)

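I think I found the answer to my own question. I used the pandas library to convert the year and day of year to datetime64, and from there used the .month attribute that Adam suggested.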

# Import pandas library
import pandas as pd

# Create object x of datetime64 type
# Here I use galveston data array and to_datetime function
x = pd.to_datetime(galveston['year'] * 1000 + galveston['julian_date'], format='%Y%j')

# Here is a look at 20 dates from the result
x[20:40]
DatetimeIndex(['1957-08-29', '1957-09-10', '1957-09-22', '1957-10-04',
           '1957-10-16', '1957-10-28', '1957-11-09', '1957-11-21',
           '1957-12-03', '1957-12-15', '1957-12-27', '1958-01-01',
           '1958-01-13', '1958-01-25', '1958-02-06', '1958-02-18',
           '1958-03-02', '1958-03-14', '1958-03-26', '1958-04-07'],
          dtype='datetime64[ns]', freq=None)

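# Write the month over the julian_date field, then rename that field to
# 'month' (a structured array cannot gain a new field by plain assignment)
galveston['julian_date'] = x.month
galveston.dtype.names = ('year', 'month') + galveston.dtype.names[2:]

# The final result, where the second element is the month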
galveston[20:40]
array([(1957,  8, 1253, 1314, 1248, 1253, 1277, 1269, 1270, 1272, 1271, 1219, 1222, 1261),
   (1957,  9, 1284, 1284, 1277, 1258, 1292, 1349, 1309, 1428, 1439, 1261, 1271, 1344),
   (1957,  9, 1345, 1301, 1363, 1433, 1352, 1206, 1221, 1281, 1273, 1303, 1317, 1268),
   (1957, 10, 1172, 1177, 1307, 1319, 1280, 1215, 1166, 1208, 1281, 1384, 1510, 1598),
   (1957, 10, 1454, 1309, 1296, 1341, 1369, 1503, 1584, 1414, 1194, 1193, 1095, 1001),
   (1957, 10, 1060, 1111, 1159, 1141, 1184, 1193, 1140, 1205, 1287, 1278, 1348, 1148),
   (1957, 11, 1184, 1441, 1580, 1584, 1507, 1313, 1350, 1373, 1371, 1272, 1109, 1193),
   (1957, 11, 1284, 1365, 1199, 1064,  951, 1057, 1189, 1165, 1028,  666,  835,  996),
   (1957, 12, 1000,  929, 1116, 1253, 1084,  728,  625,  856,  617,  828, 1035, 1055),
   (1957, 12, 1011, 1069, 1090, 1105, 1142,  957,  949, 1127, 1180, 1089, 1090, 1083),
   (1957, 12, 1180, 1091, 1092, 1171, 1118, 9999, 9999, 9999, 9999, 9999, 9999, 9999),
   (1958,  1,  965, 1104, 1161, 1298, 1517, 1443, 1082,  879, 1011,  974, 1013, 1233),
   (1958,  1,  929,  823,  776,  675,  804,  873, 1159, 1158,  667,  784, 1218,  972),
   (1958,  1,  824,  888,  833,  914,  994,  964,  916,  641,  645,  780,  940,  986),
   (1958,  2,  928,  702,  833, 1071, 1117, 1020,  937,  910, 1079,  741,  757, 1082),
   (1958,  2, 1111, 1106, 1161, 1180, 1214, 1145,  964, 1008, 1125,  979,  841,  958),
   (1958,  3, 1061, 1016, 1112, 1237, 1146, 1142, 1119, 1011, 1070, 1169, 1191,  918),
   (1958,  3,  847,  924,  963, 1010,  937,  947,  907,  878,  950, 1050,  926,  936),
   (1958,  3,  954,  948, 1002, 1128, 1035,  995, 1111, 1164, 1140, 1147, 1231, 1127),
   (1958,  4,  980, 1221, 1315, 1033,  916, 1045, 1147, 1129,  942,  857,  946,  974)],
  dtype=[('year', '<i4'), ('month', '<i4'), ('1', '<i4'), ('2', '<i4'), ('3', '<i4'), ('4', '<i4'), ('5', '<i4'), ('6', '<i4'), ('7', '<i4'), ('8', '<i4'), ('9', '<i4'), ('10', '<i4'), ('11', '<i4'), ('12', '<i4')])
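
For comparison, the same conversion can probably be done with NumPy alone, using datetime64 arithmetic instead of pandas. This is only a sketch under assumptions: sample is a hypothetical two-field stand-in for the galveston array, and the month is derived from the number of months since the 1970 epoch rather than from a .month attribute.

```python
import numpy as np

# Hypothetical stand-in for the galveston array, keeping only the two relevant fields
sample = np.array([(1957, 1), (1957, 241), (1958, 1), (1958, 97)],
                  dtype=[('year', '<i4'), ('julian_date', '<i4')])

# January 1 of each year: years since 1970 -> datetime64[Y] -> datetime64[D]
years_since_epoch = sample['year'].astype('int64') - 1970
jan1 = years_since_epoch.astype('datetime64[Y]').astype('datetime64[D]')

# Add the zero-based day-of-year offset to get the calendar date
offsets = (sample['julian_date'].astype('int64') - 1).astype('timedelta64[D]')
dates = jan1 + offsets

# Months since 1970-01, reduced modulo 12 and shifted to the range 1..12
months = dates.astype('datetime64[M]').astype('int64') % 12 + 1

print(dates)   # 1957-01-01, 1957-08-29, 1958-01-01, 1958-04-07
print(months)  # 1, 8, 1, 4
```

The two non-trivial rows correspond to the dates 1957-08-29 and 1958-04-07 that appear in the pandas output above.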