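I have a data file with epoch (UNIX) timestamps, and I am trying to split the data day by day into separate files. For example, if the data covers 90 days, it should be split into 90 files. I don't know how to start. Sometimes I know the number of days and sometimes I don't, so I am looking for a better way to simply split the data by date. The columns are Data[0] Data[1] Timeepoch[2] Timeepoch[3]. Time_1 and Time_2 are the start time and the stop time.
Data (these are just a few lines):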
Data_1 Data_2 Time_1 Time_2
3436 1174 1756908 1759291
3436 3031 1756908 1759291
3436 1349 1756908 1759291
5372 937 1756913 1756983
4821 937 1756913 1756983
4376 937 1756913 1756983
2684 937 1756913 1756983
3826 896 1756961 1756971
3826 896 1756980 1756997
5372 937 1756983 1757045
4821 937 1756983 1757045
4376 937 1756983 1757045
2684 937 1756983 1757045
3826 896 1757003 1757053
4944 3715 1757009 1757491
4944 4391 1757009 1757491
2539 1431 1757014 1757337
5372 937 1757045 1757104
4821 937 1757045 1757104
4376 937 1757045 1757104
2684 937 1757045 1757104
896 606 1757053 1757064
3826 896 1757064 1757074
5045 4901 1757074 1757085
4921 4901 1757074 1757085
4901 3545 1757074 1757085
4901 3140 1757074 1757085
4901 4243 1757074 1757085
896 606 1757074 1757084
Answer 0: (score: 1)
To find the UTC date from a POSIX timestamp, just add it to the Epoch, e.g.:
>>> from datetime import date, timedelta
>>> date(1970, 1, 1) + timedelta(seconds=1756908)
datetime.date(1970, 1, 21)
Then create a mapping, date -> file, and use it to split the input file:
#!/usr/bin/env python
import fileinput
from datetime import date, timedelta

def get_date(line, epoch=date(1970, 1, 1)):
    try:
        timestamp = int(line.split()[2])  # timestamp from 3rd column
        return epoch + timedelta(seconds=timestamp)  # UTC date
    except Exception:
        return None  # can't parse timestamp

daily_files = {}  # date -> file
input_file = fileinput.input()
next(input_file)  # skip header
for line in input_file:
    d = get_date(line)
    file = daily_files.get(d)
    if file is None:  # file for the given date is not found
        file = daily_files[d] = open(str(d), 'w')  # open a new one
    file.write(line)

# close all files
for f in daily_files.values():
    try:
        f.close()
    except EnvironmentError:
        pass  # ignore errors
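A variant of the same date -> file mapping, as a minimal sketch rather than the answer's own code: `contextlib.ExitStack` closes every opened file even if an exception interrupts the loop. The function name `split_by_utc_date` and the `out_dir` parameter are hypothetical, and lines whose third column cannot be parsed (such as the header) are skipped instead of being collected in a file named `None`.

```python
import os
from contextlib import ExitStack
from datetime import date, timedelta

EPOCH = date(1970, 1, 1)

def split_by_utc_date(lines, out_dir="."):
    """Write each line to a file named after the UTC date of its 3rd column.

    Hypothetical helper: returns the sorted list of file names written.
    """
    daily_files = {}  # date -> open file
    with ExitStack() as stack:  # closes every registered file on exit
        for line in lines:
            try:
                day = EPOCH + timedelta(seconds=int(line.split()[2]))
            except (IndexError, ValueError):
                continue  # header or malformed line: skip it
            if day not in daily_files:
                path = os.path.join(out_dir, str(day))
                daily_files[day] = stack.enter_context(open(path, "w"))
            daily_files[day].write(line)
    return sorted(str(day) for day in daily_files)
```

Any iterable of lines works as input, including an open file object. Like the script above, this keeps one file handle open per day, so a very long date range could run into the operating system's open-file limit.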
Answer 1: (score: 1)
import itertools
import datetime

# Extract the date from the timestamp that is the third item in a line
# (Will be grouping by start timestamp)
def key(s):
    return datetime.date.fromtimestamp(int(s.split()[2]))

with open('in.txt') as in_f:
    next(in_f)  # skip the header line, which has no numeric timestamp
    for date, group in itertools.groupby(in_f, key=key):
        # Output to file that is named like "1970-01-01.txt"
        with open('{:%Y-%m-%d}.txt'.format(date), 'w') as out_f:
            out_f.writelines(group)
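Note that `itertools.groupby` only merges consecutive lines with the same key, so this approach relies on the input being sorted by start timestamp; otherwise a date that reappears later would reopen its file in 'w' mode and overwrite the earlier lines. For unsorted input, a minimal sketch (not part of the answer) is to collect lines per day in a dictionary first. The helper name `group_lines_by_day` is hypothetical, and UTC dates are used here, as an assumption, so the result does not depend on the machine's timezone.

```python
from collections import defaultdict
from datetime import datetime, timezone

def group_lines_by_day(lines):
    """Map each UTC date to the lines whose 3rd column falls on that date."""
    groups = defaultdict(list)
    for line in lines:
        try:
            timestamp = int(line.split()[2])
        except (IndexError, ValueError):
            continue  # header or malformed line: skip it
        # Assumption: UTC dates; the answer's date.fromtimestamp() uses local time
        day = datetime.fromtimestamp(timestamp, tz=timezone.utc).date()
        groups[day].append(line)
    return groups

def write_groups(groups):
    """Write each group to a file named like "1970-01-01.txt"."""
    for day, day_lines in groups.items():
        with open('{:%Y-%m-%d}.txt'.format(day), 'w') as out_f:
            out_f.writelines(day_lines)
```

This holds all lines in memory before writing, which is fine for small files but may not suit very large ones.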
Answer 2: (score: 0)
datetime.fromtimestamp(timestamp)
gives you a datetime object from the timestamp, and
datetime.fromtimestamp(timestamp).replace(second=0, minute=0, hour=0)
gives you a datetime object that keeps only the date components.
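As a small illustration of the two calls above, here is a sketch that truncates a timestamp to the start of its day. The function name `start_of_day` is hypothetical. Passing `tz=timezone.utc` and resetting `microsecond` are additions that the answer does not include: the first makes the result independent of the local timezone, the second matters when the timestamp is a float.

```python
from datetime import datetime, timezone

def start_of_day(timestamp):
    """Return the datetime for midnight (UTC) of the day containing timestamp."""
    # Assumption: UTC; without tz, fromtimestamp() uses the local timezone
    moment = datetime.fromtimestamp(timestamp, tz=timezone.utc)
    return moment.replace(hour=0, minute=0, second=0, microsecond=0)

day = start_of_day(1756908)
print(day.date())  # 1970-01-21
```

Two timestamps belong to the same day exactly when `start_of_day` returns equal values for them, so the result can serve as a grouping key.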
Answer 3: (score: 0)
The following code writes each line to a file named output-YYYY-MM-DD, where YYYY-MM-DD is extracted from the Time_2 column.
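from datetime import date

with open('infile.txt', 'r') as f:
    next(f)  # skip the header line (Data_1 Data_2 Time_1 Time_2)
    for line in f:
        fields = line.split()
        # Name the output file after the local date of Time_2 (4th column)
        day = date.fromtimestamp(float(fields[3]))
        with open('output-' + str(day), 'a') as outf:
            outf.write(line)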
This code is not efficient: it opens a file for every line. If you can ensure that the input data is sorted by end_time, it can be improved.
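One way the improvement might look, as a sketch under the stated assumption that lines are sorted by Time_2: keep the current output file open and switch files only when the date changes. The function name `split_sorted_by_end_time` and the `out_dir` parameter are hypothetical.

```python
import os
from datetime import date

def split_sorted_by_end_time(lines, out_dir="."):
    """Write each line to output-YYYY-MM-DD, switching files only on a new day.

    Assumes the lines are already sorted by the 4th column (Time_2).
    Returns the list of file paths opened, in order.
    """
    opened = []
    current_day = None
    out_f = None
    try:
        for line in lines:
            fields = line.split()
            try:
                day = date.fromtimestamp(float(fields[3]))  # local date
            except (IndexError, ValueError):
                continue  # header or malformed line: skip it
            if day != current_day:
                if out_f is not None:
                    out_f.close()
                path = os.path.join(out_dir, "output-" + str(day))
                out_f = open(path, "a")  # append mode, as in the code above
                opened.append(path)
                current_day = day
            out_f.write(line)
    finally:
        if out_f is not None:
            out_f.close()
    return opened
```

With sorted input this opens each output file once. If the input is not sorted, the output is still correct because of append mode, but a file is reopened every time the date changes.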