Replacement function for pandas read_table does not work

Time: 2016-05-20 08:44:25

Tags: python csv numpy pandas scipy

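I am new to Python! For various reasons I cannot use pandas in my environment, so I wrote my own replacement for pandas read_table(). Basically, I am converting existing code that uses pandas.read_table. The pandas-based code that has to be replaced is as follows: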

import pandas as pd
import numpy as np
import scipy.sparse as sp  # lil_matrix lives in the scipy.sparse subpackage
data_file = pd.read_table(r'records.csv', sep = ';', header=None)

id = np.unique(data_file[0])
tags = np.unique(data_file[1])

number_of_rows = len(id)
number_of_columns = len(tags)

words_indices, letter_indices = {}, {}

for i in range(len(tags)):
     words_indices[tags[i]] = i

for i in range(len(id)):
     letter_indices[id[i]] = i


#scipy sparse matrix 
Vector = sp.lil_matrix((number_of_rows, number_of_columns))

#adds data into the sparse matrix
for line in data_file.values:
     u, i , r = map(str,line)
     Vector[letter_indices[u], words_indices[i]] = r

The CSV file contains about 100 records in this format:

REC000034232657,CRC FIX OE Resubmit,0.0073410, 45 

Now I have replaced pandas.read_table by reading the data directly from the database instead of from the .csv file:

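import pyodbc

# conn is assumed to be an open pyodbc connection created elsewhere in the script
def fetch_table(**kwargs):
    qry = kwargs['qrystr']
    try:
        cursor = conn.cursor()
        cursor.execute(qry)
        all_tuples = cursor.fetchall()
        return all_tuples
    except pyodbc.ProgrammingError as e:
        print("Exception occurred as:", type(e), e)
        return []  # return an empty list so callers can still iterate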

# pandas alternate code
total_col = 0
count = 0
dict_csv = {}

stmt = "select * from tickets;"
fetched_rows = fetch_table(qrystr = stmt)

for row in fetched_rows:
    total_col = len(row)
    break

for i in range(0,total_col):
    dict_csv[i] = []

for row in fetched_rows:
    for i in range(0,total_col):
        dict_csv[i].append(row[i])

# End of pandas alternate code

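The rest of the code continues exactly as in the earlier block, except that instead of data_file (returned by pd.read_table()) I now use dict_csv. So the for loop that adds data to the sparse matrix in the earlier code becomes: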

for line in dict_csv.values:
     u, i , r = map(str,line)
     Vector[letter_indices[u], words_indices[i]] = r

However, I get the following TypeError:

Traceback (most recent call last):
  File "C:\Python32\my_scripts\ds.py", line 132, in <module>
    for line in dict_csv.values:
TypeError: 'builtin_function_or_method' object is not iterable

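I understand that dict_csv.values is not returning an iterable list. Can anyone point out the mistake I am making? Also, the integer 45 comes back as Decimal(45); how can I get rid of that?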

Many thanks
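
Editorial note: the two points raised in the question can be illustrated with a minimal sketch. It is not a confirmed fix for the asker's script. It assumes that fetchall() returns one tuple-like row per record and that numeric columns arrive as decimal.Decimal. The sample rows and the helper plain_number are hypothetical and introduced only for illustration. On a pandas DataFrame, .values is an attribute holding an array of rows, whereas on a dict, values is a method that has to be called, and each item it yields from dict_csv is a whole column rather than a row.

```python
from decimal import Decimal

# Hypothetical rows shaped like the result of cursor.fetchall():
# one tuple per record, numeric columns arriving as Decimal.
fetched_rows = [
    ("REC000034232657", "CRC FIX OE Resubmit", Decimal("0.0073410"), Decimal("45")),
    ("REC000034232658", "CRC FIX OE Resubmit", Decimal("0.0100000"), Decimal("12")),
]


def plain_number(value):
    """Turn a Decimal into int (if whole) or float; pass other values through."""
    if isinstance(value, Decimal):
        if value == value.to_integral_value():
            return int(value)
        return float(value)
    return value


# Column-oriented dict as in the question: {column index: [values, ...]}
dict_csv = {}
for row in fetched_rows:
    for i, value in enumerate(row):
        dict_csv.setdefault(i, []).append(plain_number(value))

# dict.values is a method, so it must be called (or the columns looked up by
# key, as here). Each entry is one column, so zip(*columns) is used to
# walk the data row by row again.
columns = [dict_csv[i] for i in sorted(dict_csv)]
rows = list(zip(*columns))

# The sample record has four fields, so four names are unpacked here.
for u, i, r, extra in rows:
    print(u, i, r, extra)
```

Since fetched_rows is already row-oriented, iterating over it directly and applying plain_number to each field would avoid the round trip through dict_csv. The column dict is kept here only to mirror the structure used in the question.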

0 answers:

No answers