Using Python to find the age in months and years from a given DOB in a CSV file

Time: 2013-05-17 16:35:02

Tags: python datetime csv replace

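I've made a number of adjustments to an automatically generated CSV report. I'm currently stuck on a part where I need to take a patient's DOB and convert it into an age in months and years. The original CSV already has an age column, and I've worked out how to convert the data in the DOB column into an age in days. However, I need to be able to convert that into months/years and then take the calculated value and replace the value in the existing field. The existing field is a hand-typed string with no really consistent format. The actual CSV has about 1700 rows and 18 columns and uses standard comma separators, so I've made up a shorter version as an example and used indentation to make it easier to read: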

Last_Name   First_Name   MI   age                 DOB          SSN         visit_date
Stalone     Frank        P    62yrs 10 months     07-30-1950   123456789   05-02-2013
Astley      Richard      P    47years3mo          02-06-1966   987654321   05-03-2013

What I want should look like this:

Last_Name   First_Name   MI   Age       DOB          SSN
Stalone     Frank        P    62y10mo   07-30-1950   123456789
Astley      Richard      P    47y3mo    02-06-1966   987654321
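Edit: I realized I can use date.year and date.month to subtract just the years and months, which makes those values easier to find. I'm editing my code now and will update it once I get it working, but I'm still having trouble with the second part of the question.

My code so far: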

import re
import csv
import datetime

with open('inputfile.csv', 'r') as fin, open('outputfile.csv', 'w') as fout:
    reader = csv.DictReader(fin)
    fieldnames = reader.fieldnames
    writer_clinics = csv.DictWriter(fout, fieldnames, dialect="excel")
    writer_clinics.writeheader()

    # Iterate over the rows directly; calling next(reader) inside the loop
    # would skip every other row.
    for data in reader:
        today = datetime.date.today()
        # The sample data uses dashes, so the format must be %m-%d-%Y.
        DOB = datetime.datetime.strptime(data["DOB"], "%m-%d-%Y").date()
        age_y = today.year - DOB.year
        age_m = today.month - DOB.month

        if age_m < 0:
            age_y = age_y - 1
            age_m = age_m + 12

        age = str(age_y) + " y " + str(age_m) + " mo "
        print(age)

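So, I'm trying to figure out how to write the age into the correct field in outputfile.csv.

Update 2: I've managed to get most of it writing, but I get an error when certain fields are left blank in the input file. My boss also wants the age to be calculated as of the actual date of the visit. My current mess of code: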

import re
import csv
import datetime

def getage(visit, dob):
    years = visit.year - dob.year
    months = visit.month - dob.month
    if visit.day < dob.day:
        months -= 1
    if months < 0:
        months += 12
        years -= 1
    return '%sy%smo'% (years, months)

with open('inputfile.csv','r') as fin, open('outputfile.csv','w') as fout:
    reader = csv.DictReader(fin)
    writer_clinics = csv.DictWriter(fout, reader.fieldnames, dialect="excel")
    writer_clinics.writeheader()

    for data in reader:
        visit_date = datetime.datetime.strptime(data["visit_date"], "%m-%d-%Y").date()
        DOB = datetime.datetime.strptime(data["DOB"], "%m-%d-%Y").date()
        data["Age"] = getage(visit_date, DOB)
        writer_clinics.writerow(data)

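3 Answers:

Answer 0: (score: 3)

You can't convert a number of days into years and months, because years and months contain varying numbers of days. You need to work out the year and month differences yourself.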

dob = datetime.datetime.strptime('07-30-1950', '%m-%d-%Y')
now = datetime.datetime.now()
years = now.year - dob.year
months = now.month - dob.month
if now.day < dob.day:
    months -= 1
while months < 0:
    months += 12
    years -= 1
age = '{}y{}mo'.format(years, months)

>>> print(age)
62y9mo
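
To make the point about day counts concrete, here is a minimal sketch that wraps the same arithmetic in a function taking an explicit reference date, so the result does not depend on when it is run. The name `age_years_months` is made up for this illustration, and the two date spans were picked only because they have the same length in days.

```python
import datetime

def age_years_months(dob, on_date):
    """Return (years, months) completed between dob and on_date.

    Hypothetical helper for illustration; same arithmetic as above.
    """
    years = on_date.year - dob.year
    months = on_date.month - dob.month
    if on_date.day < dob.day:
        months -= 1          # the current month is not complete yet
    if months < 0:
        months += 12         # borrow one year
        years -= 1
    return years, months

d = datetime.date
# Two spans of exactly 59 days that cover different numbers of whole months.
span_a = (d(2013, 1, 1), d(2013, 3, 1))
span_b = (d(2013, 7, 1), d(2013, 8, 29))
for start, end in (span_a, span_b):
    print((end - start).days, age_years_months(start, end))
# prints "59 (0, 2)" and then "59 (0, 1)"
```

Passing the visit date as `on_date` instead of today's date gives the age at the time of the appointment.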

Answer 1: (score: 1)

This code uses Mark Ransom's algorithm to compute the age correctly. It fills in the output CSV file as you requested in the question.

import re
import csv
import datetime

def getage(now, dob):
    years = now.year - dob.year
    months = now.month - dob.month
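    if now.day < dob.day:
        months -= 1
    # Normalize outside the day check, so a negative month difference is
    # also corrected when now.day >= dob.day.
    while months < 0:
        months += 12
        years -= 1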
    return '%sy%smo'% (years, months)

with open('inputfile.csv','r') as fin, open('outputfile.csv','w') as fout:
    reader = csv.DictReader(fin)
    writer_clinics = csv.DictWriter(fout, reader.fieldnames, dialect="excel")
    writer_clinics.writeheader()

    for data in reader:
        today = datetime.date.today()
        DOB = datetime.datetime.strptime(data["DOB"], "%m-%d-%Y").date()
        data["Age"] = getage(today, DOB)
        writer_clinics.writerow(data)

Note: I only tested this code with the CSV sample you provided above.
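
The question's second update adds two requirements: the age should be measured at the visit date, and some input fields may be blank. Below is a minimal sketch of one way to handle both. It assumes the column is named `age` as in the sample table, and that a row with a blank or unparseable date should simply keep its original age text; neither assumption is confirmed by the question. The helper names `parse_date` and `rewrite_ages` are hypothetical, and `io.StringIO` stands in for the real files.

```python
import csv
import datetime
import io

DATE_FORMAT = "%m-%d-%Y"

def parse_date(text):
    """Return a date, or None when the field is blank, missing or malformed."""
    try:
        return datetime.datetime.strptime(text.strip(), DATE_FORMAT).date()
    except (ValueError, AttributeError):
        return None

def getage(visit, dob):
    years = visit.year - dob.year
    months = visit.month - dob.month
    if visit.day < dob.day:
        months -= 1
    if months < 0:
        months += 12
        years -= 1
    return "%sy%smo" % (years, months)

def rewrite_ages(fin, fout, age_column="age"):
    """Copy every row, replacing the age only when both dates parse."""
    reader = csv.DictReader(fin)
    writer = csv.DictWriter(fout, reader.fieldnames, dialect="excel")
    writer.writeheader()
    for row in reader:
        visit = parse_date(row["visit_date"])
        dob = parse_date(row["DOB"])
        if visit and dob:
            row[age_column] = getage(visit, dob)
        writer.writerow(row)

# Sample rows from the question; the second DOB is blanked on purpose
# to exercise the fallback (an assumption made for this demo).
sample = (
    "Last_Name,First_Name,MI,age,DOB,SSN,visit_date\n"
    "Stalone,Frank,P,62yrs 10 months,07-30-1950,123456789,05-02-2013\n"
    "Astley,Richard,P,47years3mo,,987654321,05-03-2013\n"
)
out = io.StringIO()
rewrite_ages(io.StringIO(sample), out)
print(out.getvalue())
```

With real files on Python 3, open both with `newline=''`, as the `csv` module documentation recommends, so that the writer does not emit extra blank lines on Windows.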

Answer 2: (score: 0)

Have you tried the eGenix DateTime package?

>>> import mx.DateTime as dt
>>> a = dt.DateTime(2000, 1, 1)
>>> b = dt.DateTime(2013, 6, 17)
>>> x = dt.Age(b, a)
>>> x.years
13
>>> x.months
5