Question

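I have a CSV file that contains patch names, release dates, and further information in some other columns. I am trying to write a Python script that asks the user for a patch name and, once it has the input, checks whether the patch is in the CSV file and prints its release date.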

So far I have written the following code, which I based on this answer.

import csv

patch = raw_input("Please provide your Patchname: ")

with open("CSV_File1.csv") as my_file1:
    reader = csv.DictReader(my_file1)
    for row in reader:
        for k in row:
            if row[k] == patch:
                print "According to the CSV_File1 database: "+row[k]

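This lets me print the patch name to the screen. What I don't know is how to get at the column with the dates, so that I can print the date from the row that matches the patch name I entered.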

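In addition, I would like to check whether that patch is the latest one released. If it is not, the script should print the latest one and its release date. My problem is that the CSV file contains patch names for different software versions, so I can't simply print the last entry in the list. For example: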

PatchXXXYY,...other columns...,Release Date,...     <--- (this is the header row of the CSV file)
Patch10000,...,date
Patch10001,...,date
Patch10002,...,date
Patch10100,...,date
Patch10101,...,date
Patch10102,...,date
Patch10103,...,date
Patch20000,...,date
...

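So if my input is "Patch10000", I should get its release date plus the latest available patch, which in this case is Patch10002, along with its release date. It should not be Patch20000, because that belongs to a different software version. An ideal output would look something like this: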

According to the CSV_File1 database: Patch10100 was released on "date". The latest patch is "Patch10103", which was released on "date".

This is because in PatchXXXYY above, the "XXX" digits denote the software version and "YY" denotes the patch number. I hope that's clear.
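
To make the naming scheme concrete, here is a minimal sketch of how such a name could be split. It assumes every name is the literal prefix "Patch" followed by exactly five digits; the helper name split_patch_name is made up for illustration.

```python
def split_patch_name(name):
    """Split 'PatchXXXYY' into (version, patch_number) as integers."""
    prefix = "Patch"
    digits = name[len(prefix):]
    # Assumption: exactly three version digits followed by two patch digits.
    if not name.startswith(prefix) or len(digits) != 5 or not digits.isdigit():
        raise ValueError("unexpected patch name: {!r}".format(name))
    return int(digits[:3]), int(digits[3:])


print(split_patch_name("Patch10103"))  # (101, 3)
```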

Thanks in advance!

Answer 1

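You're almost there, though I'm a little confused: your sample data doesn't appear to have a header row. If it doesn't, you shouldn't be using DictReader; if it does, you can take this approach.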

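import csv

# `patch` is the user's input from the question's code, e.g. 'Patch10000'.
# Its first 8 characters ('Patch' + 'XXX') identify the software version.
version = patch[:8]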
latest_patch = ''
last_patch_data = None
with open("CSV_File1.csv") as my_file1:
    reader = csv.DictReader(my_file1)
    for row in reader:
        # This works because of ASCII ordering. First,
        # we make sure the package starts with the right
        # version - e.g. Patch200
        if row['Package'].startswith(version):
            # Now we grab the next two numbers, so from
            # Patch20042 we're grabbing '42'
            patch_number = row['Package'][8:10]
            # '02' > '' is true, and '42' > '02' is also True
            if patch_number > latest_patch:
                # If we have a greater patch number, we
                # want to store that, along with the row that
                # had that. We could just store the patch & date
                # but it's fine to store the whole row
                latest_patch = patch_number
                last_patch_data = row

        # No need to iterate over the keys, you *know* the
        # column containing the patch. Here it is assumed
        # to be titled 'Package'
        #for k in row:
        #    if row[k] == patch:
        if row['Package'] == patch:
            # assuming the date header is 'Registration'
            print("According to the CSV_File1 database: {patch!r}"
                  " was released on {date!r}".format(patch=row['Package'],
                                                     date=row['Registration']))

    # `None` is a singleton, which means that we can use `is`,
    # rather than `==`. If we didn't even *start* with the same
    # version, there was certainly no patch. You may prefer a
    # different message, of course.
    if last_patch_data is None:
        print('No patch found')
    else:
        print('The latest available patch is {patch!r},'
              ' which was released on {date!r}'.format(patch=last_patch_data['Package'],
                                                       date=last_patch_data['Registration']))
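
If the file turns out to have no header row, the same idea can be sketched with csv.reader and column positions instead of DictReader. This is only a sketch under assumptions: the function name find_release_info, the column positions (name in column 0, date in column 2), and the sample dates are all made up for illustration. It also compares the patch numbers as integers rather than as strings.

```python
import csv
import io


def find_release_info(lines, patch, name_col=0, date_col=2):
    """Return (date_of_patch, latest_name, latest_date) for the patch's version.

    `lines` is any iterable of CSV text lines without a header row.
    date_of_patch is None if the patch itself is not found; the latest_*
    values are None if no row shares the patch's 'PatchXXX' prefix.
    """
    version = patch[:8]  # e.g. 'Patch101'
    date_of_patch = None
    latest_row = None
    latest_number = -1
    for row in csv.reader(lines):
        # Skip blank or short rows that lack the columns we need.
        if len(row) <= max(name_col, date_col):
            continue
        name = row[name_col]
        number = name[8:10]
        if not name.startswith(version) or not number.isdigit():
            continue
        if name == patch:
            date_of_patch = row[date_col]
        # Keep the row with the highest patch number seen so far.
        if int(number) > latest_number:
            latest_number = int(number)
            latest_row = row
    if latest_row is None:
        return date_of_patch, None, None
    return date_of_patch, latest_row[name_col], latest_row[date_col]


# Made-up sample data standing in for CSV_File1.csv (no header row).
sample = io.StringIO(
    "Patch10100,x,2017-01-10\n"
    "Patch10101,x,2017-02-14\n"
    "Patch10103,x,2017-04-03\n"
    "Patch20000,x,2017-05-01\n"
)
print(find_release_info(sample, "Patch10100"))
```

With a real file, the open file object can be passed in place of the StringIO, since csv.reader accepts any iterable of lines.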

Answer 2

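The csv module works fine, but I just wanted to throw Pandas into the mix, since this could be a good use case for it. There may be better ways to handle this, but it's a fun example. It assumes your columns are labeled (Patch_Name, Release_Date), so you will need to correct those names.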

import pandas as pd

# on_bad_lines="skip" (pandas >= 1.3) replaces the removed error_bad_lines=False
my_file1 = pd.read_csv("CSV_File1.csv", on_bad_lines="skip")

# Python 3; on Python 2 this was raw_input()
patch = input("Please provide your Patchname: ")

#Find row that matches patch and store the index as idx
idx = my_file1[my_file1["Patch_Name"] == patch].index.tolist()

if idx:
    #Get the date value from row by index label
    #(.at replaces the removed DataFrame.get_value)
    date = my_file1.at[idx[0], "Release_Date"]
    print("According to the CSV_File1 database: {} {}".format(patch, date))
else:
    print("No patch found")

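There are many ways to filter and compare CSV data with Pandas. If I had more time I would give a more descriptive solution. I highly recommend looking at the Pandas documentation.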
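
As one possible example of such filtering, here is a rough sketch that finds the latest patch within the same software version. It keeps the assumed column labels (Patch_Name, Release_Date) from above; the function name latest_in_version and the sample dates are made up for illustration, and it assumes every name follows the PatchXXXYY pattern.

```python
import io

import pandas as pd

# Made-up sample data standing in for CSV_File1.csv.
csv_text = io.StringIO(
    "Patch_Name,Release_Date\n"
    "Patch10100,2017-01-10\n"
    "Patch10101,2017-02-14\n"
    "Patch10103,2017-04-03\n"
    "Patch20000,2017-05-01\n"
)
df = pd.read_csv(csv_text)


def latest_in_version(df, patch):
    """Return the row with the highest patch number for patch's version, or None."""
    version = patch[:8]  # e.g. 'Patch101'
    same_version = df[df["Patch_Name"].str.startswith(version, na=False)]
    if same_version.empty:
        return None
    # Characters 8-9 hold the two-digit patch number; compare them as integers.
    numbers = same_version["Patch_Name"].str[8:10].astype(int)
    return same_version.loc[numbers.idxmax()]


row = latest_in_version(df, "Patch10100")
if row is not None:
    print("The latest patch is {}, released on {}".format(
        row["Patch_Name"], row["Release_Date"]))
```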

How do I check user input against a CSV file and print data from a specific column?
