Question

I'm learning Python (I have a C/C++ background).

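I need to write something practical in Python while I learn. I have the following pseudocode (my first attempt at a Python script, after reading up on Python yesterday). Hopefully the snippet shows the logic of what I want to do. By the way, I'm using Python 2.6 on Ubuntu Karmic.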

Assume the script is invoked as: script_name.py directory_path

import csv, sys, os, glob

# Can I declare that the function accepts a dictionary as first arg?
def getItemValue(item, key, defval)
  return !item.haskey(key) ? defval : item[key]


dirname = sys.argv[1]

# declare some default values here
weight, is_male, default_city_id = 100, true, 1 

# fetch some data from a database table into a nested dictionary, indexed by a string
curr_dict = load_dict_from_db('foo')

#iterate through all the files matching *.csv in the specified folder
for infile in glob.glob( os.path.join(dirname, '*.csv') ):
  #get the file name (without the '.csv' extension)
  code = infile[0:-4]
  # open file, and iterate through the rows of the current file (a CSV file)
  f = open(infile, 'rt')
  try:
    reader = csv.reader(f)
    for row in reader:
      #lookup the id for the code in the dictionary
      id = curr_dict[code]['id']
      name = row['name']
      address1 = row['address1']
      address2 = row['address2']
      city_id = getItemValue(row, 'city_id', default_city_id)

      # insert row to database table

  finally:
    f.close()

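I have the following questions:

Is the code written in a Pythonic way (is there a better way to implement it)?
Given a table with the schema shown below, how do I write a Python function that fetches the data from the table and returns a dictionary indexed by a string (the name)?
How can I insert the row data into the table? (Ideally I'd like to use a transaction and commit just before the file is closed.)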

Table schema:

create table demo (id int, name varchar(32), weight float, city_id int);

[Edit]

Wayne et al.:

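To clarify, what I want is a set of rows. Each row can be looked up by a key (which means the row container is a dictionary, right?). Once a row has been retrieved by its key, I also want to be able to access the 'columns' in that row, meaning the row data is itself a dictionary. I don't know whether Python supports multidimensional array syntax for dictionaries, but the following expression should explain how I intend to use the data returned from the db. dataset['joe']['weight'] would first fetch the row indexed by the key 'joe' (which is a dictionary), and then index that dictionary with the key 'weight'. I'd like to know how to build such a dictionary of dictionaries from the retrieved data in the Pythonic way you showed before.

A simple approach would be to write something like: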

import pyodbc

mydict = {}
cnxn = pyodbc.connect(params)
cursor = cnxn.cursor()
cursor.execute("select user_id, user_name from users"):

for row in cursor:
   mydict[row.id] = row

Is this correct, or can it be written in a more Pythonic way?

Answer 1

To get a value from a dictionary, use the dict's get method:

>>> d = {1: 2}
>>> d.get(1, 3)
2
>>> d.get(5, 3)
3

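This eliminates the need for the getItemValue function. I won't comment on the existing syntax, since it's clearly foreign to Python. The correct syntax for the ternary in Python is: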

true_val if true_false_check else false_val
>>> 'a' if False else 'b'
'b'

But as I say below, you don't need it at all.

If you are using Python 2.6 or later, you should use the with statement instead of try-finally:

with open(infile) as f:
    reader = csv.reader(f)
    ... etc

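Seeing that you want row as a dictionary, you should be using csv.DictReader rather than a plain csv.reader. However, in your case that is unnecessary. Your SQL query could simply be constructed to access the fields of the row dict. In that case you wouldn't need to create the separate items city_id, name, etc. To add a default city_id to row if it doesn't exist, you can use the setdefault method: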

>>> d
{1: 2}
>>> d.setdefault(1, 3)
2
>>> d
{1: 2}
>>> d.setdefault(3, 3)
3
>>> d
{1: 2, 3: 3}

And for id, simply row['id'] = curr_dict[code]['id']
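To make the csv.DictReader and setdefault combination concrete, here is a minimal sketch. The column names, the sample data, and the DEFAULT_CITY_ID constant are assumptions for illustration, and io.StringIO stands in for a real file (Python 3).

```python
import csv
import io

DEFAULT_CITY_ID = 1  # assumed default, like default_city_id in the question

# In-memory stand-in for a CSV file; note there is no city_id column.
sample = io.StringIO("name,address1,address2\njoe,1 Main St,Apt 2\n")

rows = []
for row in csv.DictReader(sample):
    # Adds the key only when it is missing; an existing value is left alone.
    row.setdefault("city_id", DEFAULT_CITY_ID)
    rows.append(row)

print(rows[0]["name"], rows[0]["city_id"])
```

One caveat: if the file does have a city_id column but a cell is empty, DictReader yields an empty string for it, so setdefault will not replace it. In that case something like row['city_id'] or DEFAULT_CITY_ID may be closer to what is wanted.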

When slicing, you can omit the 0:

>>> 'abc.txt'[:-4]
'abc'

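Typically, Python's database libraries provide fetchone, fetchmany, and fetchall methods on the cursor. These return Row objects, which may support dict-like access, or plain tuples. It depends on the particular module you are using.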
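As one possible way to build the dictionary of dictionaries the question asks for, here is a sketch using the standard library's sqlite3 module. SQLite is only a stand-in for whatever database is actually in use, the sample rows are invented, and load_dict_from_db is the hypothetical helper name from the question; other DB-API modules expose row objects differently.

```python
import sqlite3

# In-memory SQLite database as a stand-in; the sample rows are made up.
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # rows then support access by column name
conn.execute("create table demo (id int, name varchar(32), weight float, city_id int)")
conn.executemany(
    "insert into demo values (?, ?, ?, ?)",
    [(1, "joe", 82.5, 3), (2, "ann", 61.0, 1)],
)

def load_dict_from_db(conn):
    """Return {name: {column: value}} for every row in demo."""
    cursor = conn.execute("select id, name, weight, city_id from demo")
    return dict((row["name"], dict(row)) for row in cursor)

dataset = load_dict_from_db(conn)
print(dataset["joe"]["weight"])
```

With this shape, dataset['joe'] is an ordinary dict, so dataset['joe']['weight'] works exactly as described in the question. If names are not unique, later rows silently overwrite earlier ones.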

Answer 2

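It looks mostly Pythonic to me.

The ternary operation should look like this instead (I believe this returns the result you expect):

return defval if key not in item else item[key]

Yes, you can pass a dictionary (or any other value) in basically any position. The only exception is *args and **kwargs (named that way by convention; technically you can use any names you like), which must appear in that order as the last one or two parameters.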
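To illustrate the ordering rule, here is a small sketch. The function name describe and its arguments are hypothetical.

```python
def describe(item, *args, **kwargs):
    # item can be any object, including a dict; extra positional arguments
    # are collected into the tuple args, extra keyword arguments into the
    # dict kwargs.
    return item, args, kwargs

result = describe({"id": 1}, "x", "y", verbose=True)
print(result)
```

Writing the signature as describe(**kwargs, *args) would be a syntax error, which is the "must appear in that order" part.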

To insert into a database, you can use the odbc module:

import odbc
conn = odbc.odbc('servernamehere')
cursor = conn.cursor()
cursor.execute("INSERT INTO mytable VALUES (42, 'Spam on Eggs', 'Spam on Wheat')")
conn.commit()

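You can read up on the odbc module or find plenty of examples. I'm sure there are other modules too, but that one should work for you.

To retrieve data, you can use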
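Since the question also asks about transactions, here is a hedged sketch of inserting all rows from one file in a single transaction. It uses the standard library's sqlite3 module as a stand-in (not the odbc module above), and the insert_rows helper and the sample rows are hypothetical. With sqlite3, using the connection as a context manager commits if the block succeeds and rolls back if it raises.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table demo (id int, name varchar(32), weight float, city_id int)")

def insert_rows(conn, rows):
    """Insert all rows in one transaction: commit on success, roll back on error."""
    with conn:
        conn.executemany(
            "insert into demo (id, name, weight, city_id) "
            "values (:id, :name, :weight, :city_id)",
            rows,
        )

# Made-up rows, shaped like the dicts csv.DictReader would produce.
rows = [
    {"id": 1, "name": "joe", "weight": 100.0, "city_id": 1},
    {"id": 2, "name": "ann", "weight": 61.0, "city_id": 4},
]
insert_rows(conn, rows)
count = conn.execute("select count(*) from demo").fetchone()[0]
print(count)
```

Calling insert_rows once per CSV file, inside the block that has the file open, gives the "commit before the file is closed" behaviour. Using placeholders instead of building the SQL string by hand also avoids quoting problems.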

cursor.execute("SELECT * FROM demo")
#Reads one record - returns a tuple
print cursor.fetchone()
#Reads the rest of the records - a list of tuples
print cursor.fetchall()

To write one of the records into a dictionary:

record = cursor.fetchone()
# Removes the 2nd element (at index 1) from the record
mydict[record[1]] = record[:1] + record[2:]

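If you want the whole shebang, that's practically screaming for a generator expression:

mydict = dict((record[1], record[:1] + record[2:]) for record in cursor.fetchall())

That should pack all the records neatly into a dictionary, using the name as the key.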

HTH

Answer 3

You need a colon after a def:

def getItemValue(item, key, defval):
    ...

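Boolean operators: in Python, ! -> not; && -> and; || -> or (for boolean operators, see http://docs.python.org/release/2.5.2/lib/boolean.html). There is no ? : operator in Python; there is a return (x) if (x) else (x) expression, although I personally rarely use it, preferring a plain if.

Booleans/None: True, False and None are capitalized.

Checking argument types: in Python, you generally don't declare the types of function parameters. You could put, for example, assert isinstance(item, dict), "dicts must be passed as the first parameter!" in the function, although this kind of "strict checking" is often discouraged, as it isn't always necessary in Python.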
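Putting the points above together, the helper from the question might look like the sketch below. The name get_item_value is a PEP 8 style rename of getItemValue, and the assert is included only to show the optional type check; in practice item.get(key, defval) does the same job.

```python
def get_item_value(item, key, defval):
    # Optional strict check; often left out in idiomatic Python.
    assert isinstance(item, dict), "dicts must be passed as the first parameter!"
    # 'not' replaces C's '!', and a plain if replaces the '? :' operator.
    if key not in item:
        return defval
    return item[key]

print(get_item_value({"city_id": 7}, "city_id", 1))
print(get_item_value({}, "city_id", 1))
```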

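Python keywords: default is not a reserved Python keyword, so it can be used as an argument or variable name (just FYI).

Style guide: PEP 8 (the Python style guide) says module imports should generally be on separate lines, one per line, though there are some exceptions (I must admit I often don't put import sys and os on separate lines, though I usually follow it otherwise.)

File open modes: rt isn't valid in Python 2.x. It will work, but the t will be ignored. See also http://docs.python.org/tutorial/inputoutput.html#reading-and-writing-files. It is valid in Python 3, though, so I don't think it will hurt if you want to force text mode, raising exceptions on binary characters (use rb if you want to read non-ASCII characters.)

Working with dictionaries: Python used to use dict.has_key(key), but you should now use key in dict (which has largely replaced it, see http://docs.python.org/library/stdtypes.html#mapping-types-dict.)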

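Splitting off the file extension: code = infile[0:-4] can be replaced with code = os.path.splitext(infile)[0] (splitext returns a tuple such as ('root', '.ext'), with the dot included in the extension; see http://docs.python.org/library/os.path.html#os.path.splitext).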
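A quick sketch of what splitext returns. The path data/abc.csv is a made-up example of what glob would yield. Note that the root still contains the directory part, so if code is meant to be the bare file name (as the comment in the question suggests), os.path.basename is probably needed as well.

```python
import os

infile = "data/abc.csv"  # hypothetical path, as glob would return it

# splitext keeps the directory in the root and the dot in the extension.
root, ext = os.path.splitext(infile)

# Strip the directory first to get just the file name without '.csv'.
code = os.path.splitext(os.path.basename(infile))[0]

print(root, ext, code)
```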

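EDIT: Removed the multiple variable declarations on a single line and added some formatting. Also corrected the claim that rt isn't a valid mode in Python, since it is valid in Python 3.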

How do I write this snippet in Python?
