Question

Here is the code!

import csv

def do_work():
      global data
      global b
      get_file()
      samples_subset1()
      return

def get_file():

      start_file='thefile.csv'

      with open(start_file, 'rb') as f:
        data = list(csv.reader(f))
        import collections
        counter = collections.defaultdict(int)

      for row in data:
        counter[row[10]] += 1
      return

def samples_subset1():

      with open('/pythonwork/samples_subset1.csv', 'wb') as outfile:
          writer = csv.writer(outfile)
          sample_cutoff=5000
          b_counter=0
          global b
          b=[]
          for row in data:
              if counter[row[10]] >= sample_cutoff:
                 global b
                 b.append(row) 
                 writer.writerow(row)
                 #print b[b_counter]
                 b_counter+=1
      return

I am a beginner in Python. The way my code runs is that I call do_work, and do_work calls the other functions. Here are my questions:

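If only 2 functions need to see data, should I make it global? If not, how should I call samples_subset1? Should I call it from get_file or from do_work?
The code works, but can you point out other good/bad things about the way it is written?
I am processing a csv file in multiple steps. I am breaking the steps down into separate functions such as get_file and samples_subset1, and I will add more. Should I keep doing it the way I do now, where do_work calls each individual function?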

Here is the new code, based on one of the answers below:

import csv
import collections

def do_work():
      global b
      (data,counter)=get_file('thefile.csv')
      samples_subset1(data, counter,'/pythonwork/samples_subset1.csv')
      return

def get_file(start_file):

      with open(start_file, 'rb') as f:
        global data
        data = list(csv.reader(f))
        counter = collections.defaultdict(int)

      for row in data:
        counter[row[10]] += 1
      return (data,counter)

def samples_subset1(data,counter,output_file):

      with open(output_file, 'wb') as outfile:
          writer = csv.writer(outfile)
          sample_cutoff=5000
          b_counter=0
          global b
          b=[]
          for row in data:
              if counter[row[10]] >= sample_cutoff:
                 global b
                 b.append(row) 
                 writer.writerow(row)
                 #print b[b_counter]
                 b_counter+=1
      return

Answer 1

As a rule of thumb, avoid global variables.

Here it is easy: have get_file return data, and then you can write

data = get_file()
samples_subset1(data)

Also, I would put all imports at the top of the file.
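
A minimal sketch of that advice, not the asker's actual code. It assumes Python 3 (the code in the question is Python 2, where csv files are opened in 'rb'/'wb' mode), and it turns the column index and cutoff into parameters. The function names load_rows, count_by_column, filter_frequent and write_rows are hypothetical, chosen only for illustration:

```python
import collections
import csv


def load_rows(path):
    """Read every row of a CSV file into a list of lists."""
    # Python 3: text mode with newline='' is what the csv module expects.
    with open(path, newline='') as f:
        return list(csv.reader(f))


def count_by_column(rows, column):
    """Count how many rows share each value in the given column."""
    return collections.Counter(row[column] for row in rows)


def filter_frequent(rows, counts, column, cutoff):
    """Keep rows whose value in `column` occurs at least `cutoff` times."""
    return [row for row in rows if counts[row[column]] >= cutoff]


def write_rows(path, rows):
    """Write rows to a CSV file."""
    with open(path, 'w', newline='') as f:
        csv.writer(f).writerows(rows)


def do_work(in_path, out_path, column=10, cutoff=5000):
    # Each step takes its inputs as arguments and returns its result,
    # so no `global` declarations are needed anywhere.
    rows = load_rows(in_path)
    counts = count_by_column(rows, column)
    subset = filter_frequent(rows, counts, column, cutoff)
    write_rows(out_path, subset)
    return subset
```

With this shape, do_work stays the single place that wires the steps together, and each step can be tried on its own with a small list of rows.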

Answer 2

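If you must use a global (and sometimes we must), you can define it in a Pythonic way and give only certain modules access to it, without the nasty global keyword at the top of all your functions/classes.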

Create a new module that contains only the global data (in your case, say, csvGlobals.py):

# create an instance of some data you want to share across modules
data=[]

Then every file that needs access to this data can do so like this:

import csvGlobals

csvGlobals.data = [1,2,3,4]
for i in csvGlobals.data:
    print(i)
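
One caveat with this pattern may be worth a sketch: the shared value has to be reached through the module name. The example below assumes Python 3 and, only to stay in one file, builds a stand-in for csvGlobals.py in memory with types.ModuleType; in a real project that would be an ordinary file containing `data = []`. The name snapshot is hypothetical.

```python
import sys
import types

# Stand-in for a real csvGlobals.py whose only content is `data = []`.
csvGlobals = types.ModuleType("csvGlobals")
csvGlobals.data = []
sys.modules["csvGlobals"] = csvGlobals

import csvGlobals                          # what other files would normally write
from csvGlobals import data as snapshot    # copies the *current* binding only

# Rebinding through the module is visible to everyone who imported the module...
csvGlobals.data = [1, 2, 3, 4]

# ...but a name created with `from ... import` still refers to the old list.
print(csvGlobals.data)  # [1, 2, 3, 4]
print(snapshot)         # []
```

So `import csvGlobals` followed by `csvGlobals.data` is the form that keeps all importers in sync.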

Answer 3

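If you want to share data between two or more functions, it is generally better to use a class, turning the functions into methods and the global variables into attributes on the class instance.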
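
As a rough sketch of what that could look like for the code in the question, assuming Python 3: the class name SampleFilter and its method names are hypothetical, and the column index 10 and cutoff 5000 are taken from the question as defaults.

```python
import collections
import csv


class SampleFilter:
    """Holds the rows and per-value counts that the processing steps share."""

    def __init__(self, column=10, cutoff=5000):
        self.column = column
        self.cutoff = cutoff
        self.rows = []                        # replaces the global `data`
        self.counts = collections.Counter()   # replaces the shared `counter`

    def load(self, path):
        """Read the CSV file and count the values in the chosen column."""
        with open(path, newline='') as f:
            self.rows = list(csv.reader(f))
        self.counts = collections.Counter(row[self.column] for row in self.rows)

    def frequent_rows(self):
        """Rows whose column value occurs at least `cutoff` times."""
        return [row for row in self.rows
                if self.counts[row[self.column]] >= self.cutoff]

    def write_subset(self, path):
        """Write the frequent rows to `path` and return them."""
        subset = self.frequent_rows()
        with open(path, 'w', newline='') as f:
            csv.writer(f).writerows(subset)
        return subset
```

Usage would be along the lines of `s = SampleFilter(); s.load('thefile.csv'); s.write_subset('/pythonwork/samples_subset1.csv')`, with further steps added as more methods.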

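By the way, you do not need a return statement at the end of every function. You only need an explicit return if you want to return a value or to return from the middle of the function.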
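
A small illustration of the point about return, assuming Python 3; the function names log_row and first_long_row are made up for this example:

```python
def log_row(row, seen):
    # No return statement: the function ends here and returns None.
    seen.append(row)


def first_long_row(rows, min_len):
    for row in rows:
        if len(row) >= min_len:
            return row  # explicit return: leave the function early with a value
    # Falling off the end also returns None, so no trailing `return` is needed.


seen = []
print(log_row(['a'], seen))                    # None
print(first_long_row([['a'], ['b', 'c']], 2))  # ['b', 'c']
print(first_long_row([['a']], 2))              # None
```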

Python: proper use of global variables
