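I have a function, "read_csv", that I use often in other Python scripts. Instead of copying it to the top of every script that uses it, I would like to keep it in a separate file and import it. The function is defined as follows: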
import numpy as np

def read_csv(filename, suffix=''):
    # Read column headers (to be variable names)
    with open(filename) as f:
        firstline = f.readline()                 # Read first line of csv
        firstline = firstline.replace("\n", "")  # Remove new line characters
        firstline = firstline.replace(" ", "")   # Remove spaces
        ColumnHeaders = firstline.split(",")     # Get array of column headers
    # Read in the data (omitting the first row containing column headers)
    data = np.genfromtxt(filename, skip_header=1, delimiter=",", filling_values=0)
    # Assign the data to arrays, with names of the variables generated from column headers
    Ind = 0
    for Var in ColumnHeaders:
        VarName = Var + suffix             # Define variable name appended with given suffix (if any)
        globals()[VarName] = data[:, Ind]  # Assign the columns of the data to variables named after the column headers
        Ind += 1                           # Move on to the next column
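The function reads a CSV file containing column headers with numbers beneath them, and writes the numbers into the "workspace" as arrays whose names are taken from the column headers. I saved the code above as "read_csv_nDI.py". In the same directory, I tried the following script: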
import numpy as np
from read_csv_nDI import read_csv
read_csv('test.csv')
where 'test.csv' is a CSV file that should work:
Track, Bin, RO_T,ZEZ2A_T,Intra,RO_T_evnt_amp,ZEZ2A_T_evnt_amp,Intra_evnt_amp,Intra_reservoir_amplitude_normalized_to_RO_T,Intra_reservoir_amplitude_normalized_to_ZEZ2A_T
1, 1, 2149.7307, 2110.3000, 2189.5908, 1000.3883, -766.3962, -687.7489, -0.6875, 0.8974
1, 2, 2151.7307, 2112.3000, 2191.5908, 1000.3883, -766.3962, -687.7489, -0.6875, 0.8974
1, 3, 2153.7307, 2114.3000, 2193.5908, 1000.3883, -766.3962, -687.7489, -0.6875, 0.8974
1, 4, 2155.7307, 2116.3000, 2195.5908, 1000.3883, -766.3962, -687.7489, -0.6875, 0.8974
1, 5, 2157.7307, 2118.3000, 2197.5908, 1000.3883, -766.3962, -687.7489, -0.6875, 0.8974
1, 6, 2159.7307, 2120.3000, 2199.5908, 1000.3883, -766.3962, -687.7489, -0.6875, 0.8974
1, 7, 2161.7307, 2122.3000, 2201.5908, 1000.3883, -766.3962, -687.7489, -0.6875, 0.8974
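However, if I run the script above and then enter dir(), I do not see the variables I expected, such as "RO_T" and "ZEZ2A_T". On the other hand, if I simply add the line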
read_csv('test.csv')
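after the function definition in the same Python script, it works, and I do see those variables after running the script. Why does it work only when the function is defined in the same script?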
Answer 0 (score: 2)
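globals() gives you the global variables of the file in which it is called, not of the file that calls the enclosing function, so inside read_csv it always gives you the globals of the file where read_csv is defined. When you import the function, the variables are therefore created in the namespace of the read_csv_nDI module rather than in your script's namespace, which is why dir() does not show them.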
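To make the scoping rule concrete, here is a minimal sketch that uses only the standard library. It builds a stand-in module in memory instead of a real file; the names helper_module and set_global are hypothetical and stand for read_csv_nDI and read_csv.

```python
import types

# Stand-in for a separate file such as read_csv_nDI.py (hypothetical contents).
helper = types.ModuleType("helper_module")
exec(
    "def set_global(name, value):\n"
    "    globals()[name] = value\n",
    helper.__dict__,
)

# Call the function from "outside" the module where it was defined.
helper.set_global("RO_T", [1.0, 2.0])

print("RO_T" in globals())      # is the name in the caller's namespace?
print(hasattr(helper, "RO_T"))  # is the name in the defining module's namespace?
```

The first check should report that the caller's namespace was not touched, while the second shows the name landed in the defining module. In the question's setup, this suggests the arrays could still be reached with "import read_csv_nDI" followed by "read_csv_nDI.RO_T", although returning the data is usually cleaner.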
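Cluttering your global namespace like this is probably not the best idea. It is better to just return the whole array. I also suggest looking at the pandas package, which can easily read a CSV into a DataFrame object that works much like a numpy array but is more convenient for most purposes.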
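As a rough sketch of the "return the data instead" idea, the variant below returns a dictionary keyed by column header rather than writing to globals. The name read_csv_columns is hypothetical, and the parsing uses the standard csv module so the example has no dependencies; with numpy, the same shape could be built from the array that np.genfromtxt returns.

```python
import csv

def read_csv_columns(filename, suffix=""):
    """Return {header + suffix: list of floats} for a CSV with one header row."""
    with open(filename, newline="") as f:
        reader = csv.reader(f)
        # Strip spaces from headers, as the original function does.
        headers = [h.replace(" ", "") for h in next(reader)]
        columns = {h + suffix: [] for h in headers}
        for row in reader:
            if not row:
                continue  # skip blank lines
            for h, cell in zip(headers, row):
                cell = cell.strip()
                # Empty cells become 0.0, mirroring filling_values=0.
                columns[h + suffix].append(float(cell) if cell else 0.0)
    return columns
```

The calling script then keeps control of its own names, for example cols = read_csv_columns('test.csv') followed by cols['RO_T'].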