I have a simple *.csv file in which some columns contain dates in mm/dd/yy format. Here is an example:
$ cat somefile.csv
05/09/15,8,Apple,05/09/15
06/10/15,5,Banana,06/10/12
05/11/18,4,Carrot,09/03/18
02/09/15,2,Apple,01/09/15
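I would like an easy way to determine whether a column contains only valid dates, but I find myself struggling with counting both the '/' characters and the digits. Surely there is a simpler way to do this?
Edit (in response to @RahulAgarwal's answer):
Here is my script (still not working :( )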
###########
# IMPORTS #
###########
import csv
import sys
import numpy
from dateutil.parser import parse
###########################
# [1] Open input csv file #
###########################
myfile=open("input4.csv","r")
myreader = csv.reader(myfile)
############################
# [2] read header csv file #
############################
for myline in myreader:
    myheader=myline
    break
####################################################################
# [3] read and put in ds only data originating in specific columns #
####################################################################
for myline in myreader:
    for myColIndex in range(len(myline)):
        if (parse(myline[myColIndex])):
            print("column = {0}".format(myColIndex))
######################
# [4] Close csv file #
######################
myfile.close()
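Answer 0 (score: 2)
You can try the following to check for a valid date; parse raises a ValueError if the string cannot be interpreted as a date: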
from dateutil.parser import parse
parse("05/09/15")
Answer 1 (score: 1)
You can use the strptime method of the datetime class:
from datetime import datetime

# The question states the dates are mm/dd/yy, hence "%m/%d/%y".
def isDateValid(date, pattern = "%m/%d/%y"):
    try:
        datetime.strptime(date, pattern)
        return True
    except ValueError:
        return False
If the string does not match the pattern, the strptime method raises a ValueError.
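As a quick illustration of that behaviour, here is a minimal sketch that runs the same try/except check on a few values, most of them taken from the sample file in the question. The `%m/%d/%y` pattern is an assumption based on the mm/dd/yy format stated in the question, and `is_valid_date` and `samples` are illustrative names only.

```python
from datetime import datetime

def is_valid_date(text, pattern="%m/%d/%y"):
    """Return True if text parses as a date under pattern (assumed mm/dd/yy)."""
    try:
        datetime.strptime(text, pattern)
        return True
    except ValueError:
        return False

# Three values from the sample rows in the question, plus one impossible date.
samples = ["05/09/15", "Apple", "8", "02/30/15"]
results = {s: is_valid_date(s) for s in samples}
print(results)
```

Only `05/09/15` is accepted here. `Apple` and `8` do not match the pattern at all, and `02/30/15` matches the pattern but names a day that does not exist, which strptime also reports as a ValueError.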
Edit:
To make it work:
from datetime import datetime

# The question states the dates are mm/dd/yy, hence "%m/%d/%y".
def isDateValid(date, pattern = "%m/%d/%y"):
    try:
        datetime.strptime(date, pattern)
        return True
    except ValueError:
        return False

# load file
with open("filename.csv") as f:
    # split file into lines
    lines = f.readlines()

# replace new-line character
lines = [x.replace("\n", "") for x in lines]
# extract the header
header = lines[0]
# extract rows
rows = lines[1:]
# loop over every row
for rowNumber, row in enumerate(rows, 1):
    # split row into the separate columns
    columns = row.split(",")
    # setting default value for every row
    gotValidDate = False
    # loop over every column
    for column in columns:
        # check if the column got a valid date
        if isDateValid(column):
            gotValidDate = True
    # if at least one out of all columns in that row got a valid date
    # the row number gets printed
    if gotValidDate:
        print(f"Row {rowNumber} got at least one valid date")
(The code was written for Python 3.7.)
Answer 2 (score: 1)
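You can use one set to keep track of the columns seen in the file and a second set for the columns that failed to parse as a valid date; the difference between the two sets is then the set of columns that parsed as dates.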
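The answer's own code is not available here, so the following is only a minimal sketch of the two-set idea, not the answerer's implementation. It assumes the mm/dd/yy format from the question and uses datetime.strptime from the standard library for the validity check; `is_valid_date`, `date_columns` and `sample` are hypothetical names introduced for illustration.

```python
import csv
import io
from datetime import datetime

def is_valid_date(text, pattern="%m/%d/%y"):
    """Return True if text parses as a date under pattern (assumed mm/dd/yy)."""
    try:
        datetime.strptime(text, pattern)
        return True
    except ValueError:
        return False

def date_columns(rows):
    """Return the sorted indices of columns in which every value is a valid date."""
    seen = set()    # every column index encountered in the file
    failed = set()  # column indices with at least one value that is not a date
    for row in rows:
        for index, value in enumerate(row):
            seen.add(index)
            if not is_valid_date(value.strip()):
                failed.add(index)
    # Columns that were seen and never failed contain only valid dates.
    return sorted(seen - failed)

# Sample data from the question, read from a string to keep the sketch self-contained.
sample = """05/09/15,8,Apple,05/09/15
06/10/15,5,Banana,06/10/12
05/11/18,4,Carrot,09/03/18
02/09/15,2,Apple,01/09/15
"""
print(date_columns(csv.reader(io.StringIO(sample))))  # prints [0, 3]
```

For a real file, the same function can be fed `csv.reader(f)` from inside a `with open(...) as f:` block; if the file has a header row, skip it first with `next(reader)`.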