Using Python

Time: 2018-11-23 08:40:00

Tags: python csv date

I have a simple *.csv file in which some columns hold dates in mm/dd/yy format. Here is an example:

$ cat somefile.csv
05/09/15,8,Apple,05/09/15
06/10/15,5,Banana,06/10/12
05/11/18,4,Carrot,09/03/18
02/09/15,2,Apple,01/09/15

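I would like an easy way to determine whether a column contains only valid dates, but I find myself struggling with counting '/' characters and counting digits. Surely there is some simple way to do this?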

Edit (in response to @RahulAgarwal's answer)

Here is my script (still not working :( )

###########
# IMPORTS #
###########
import csv
import sys
import numpy
from dateutil.parser import parse

###########################
# [1] Open input csv file #
###########################
myfile=open("input4.csv","r")
myreader = csv.reader(myfile)

############################
# [2] read header csv file #
############################
for myline in myreader:
    myheader=myline
    break

####################################################################
# [3] read and put in ds only data originating in specific columns #
####################################################################
for myline in myreader:
    for myColIndex in range(len(myline)):
        if (parse(myline[myColIndex])):
            print("column = {0}".format(myColIndex))

######################
# [4] Close csv file #
######################
myfile.close()

3 answers:

Answer 0 (score: 2)

You can try the following to check for a valid date:

from dateutil.parser import parse
parse("05/09/15")

Answer 1 (score: 1)

You can use the strptime method of the datetime class:

from datetime import datetime
def isDateValid(date, pattern = "%m/%d/%y"):
    try:
        datetime.strptime(date, pattern)
        return True
    except ValueError:
        return False

If the string does not match the pattern, the strptime method raises a ValueError.
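As a quick illustration of that behaviour, here is a minimal sketch. The helper name `is_date_valid` and the sample strings are made up for this example, and the `%m/%d/%y` pattern is an assumption taken from the mm/dd/yy format described in the question.

```python
from datetime import datetime

def is_date_valid(text, pattern="%m/%d/%y"):
    """Return True if text matches pattern and names a real calendar date."""
    try:
        datetime.strptime(text, pattern)
        return True
    except ValueError:
        # Raised for a format mismatch as well as for impossible dates.
        return False

# Hypothetical sample values: a good date, month 13, February 30th,
# and two non-date cells like those in the question's file.
samples = ["05/09/15", "13/01/15", "02/30/15", "Apple", "8"]
results = {s: is_date_valid(s) for s in samples}
print(results)
```

Only the first sample should come back as valid. The others fail either because the text does not fit the pattern or because the date does not exist.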

Edit:

To make it work:

from datetime import datetime
def isDateValid(date, pattern = "%m/%d/%y"):
    try:
        datetime.strptime(date, pattern)
        return True
    except ValueError:
        return False

# load file
with open("filename.csv") as f:
    # split file into lines
    lines = f.readlines()

    # replace new-line character
    lines = [x.replace("\n", "") for x in lines]

    # extract the header
    header = lines[0]

    # extract rows
    rows = lines[1:]

    # loop over every row
    for rowNumber, row in enumerate(rows, 1):
        # split row into the separate columns
        columns = row.split(",")

        # setting default value for every row
        gotValidDate = False

        # loop over every column
        for column in columns:
            # check if the column got a valid date
            if isDateValid(column):
                gotValidDate = True

        # if at least one out of all columns in that row got a valid date
        # the row number gets printed
        if gotValidDate:
            print(f"Row {rowNumber} got at least one valid date")

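(The code was written in Python 3.7.)

Answer 2 (score: 1)

You can use one set to keep track of the columns seen in the file and a second set for the columns that failed to parse as a valid date. The difference between the two sets is then the set of columns that parsed as dates.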
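A minimal sketch of the two-set idea, not necessarily this answer's original code. It assumes `datetime.strptime` with a `%m/%d/%y` pattern as the validity check and a file without a header row. The function name `date_columns` and the in-memory `SAMPLE` (copied from the question's example data) are introduced here only for illustration.

```python
import csv
import io
from datetime import datetime

# Sample rows from the question, kept in memory so the sketch is self-contained.
SAMPLE = """05/09/15,8,Apple,05/09/15
06/10/15,5,Banana,06/10/12
05/11/18,4,Carrot,09/03/18
02/09/15,2,Apple,01/09/15
"""

def date_columns(lines, pattern="%m/%d/%y"):
    """Return the sorted indexes of columns whose every value parses as a date."""
    seen = set()       # every column index encountered
    not_dates = set()  # column indexes with at least one unparseable value
    for row in csv.reader(lines):
        for index, value in enumerate(row):
            seen.add(index)
            if index in not_dates:
                continue  # already ruled out, no need to parse again
            try:
                datetime.strptime(value.strip(), pattern)
            except ValueError:
                not_dates.add(index)
    return sorted(seen - not_dates)

print(date_columns(io.StringIO(SAMPLE)))  # prints [0, 3]
```

To run this against a real file, pass an open file object instead of the `io.StringIO` wrapper, and skip the first row if the file has a header.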
