So far I have been reading a CSV file into a numpy array.
Example row:
20041207,7.04,7.18,6.88,7.10,25981485
Code:
import datetime
import numpy as np
import matplotlib.dates as dt
def mkdate(text):
    return dt.date2num(datetime.datetime.strptime(text, '%Y%m%d'))

np.genfromtxt(
    filename,
    delimiter=',',
    skip_header=1,
    usecols=[1, 2, 5, 3, 4, 6],
    names=('date', 'open', 'close', 'high', 'low', 'volume'),
    converters={'date': mkdate},
    dtype=(
        np.float64,
        np.float64,
        np.float64,
        np.float64,
        np.float64,
        np.int64
    )
)
Now I have to switch to a database. After fetching the relevant values from the database, the result looks like this (a list of tuples):
[(datetime.datetime(2004, 12, 7, 0, 0), Decimal('7.04000'), Decimal('7.10000'), Decimal('7.18000'), Decimal('6.88000'), 25981485L), (and so on), ... ]
Now I need to convert this into the same numpy array as before. I imagine it would look something like:
def mkdate(date):
    return dt.date2num(date)

np.somefunction(
    list_of_tuples,
    names=('date', 'open', 'close', 'high', 'low', 'volume'),
    converters={
        'date': mkdate,
        'open': float,
        'close': float,
        'high': float,
        'low': float,
        'volume': int,
    },
    dtype=(
        np.float64,
        np.float64,
        np.float64,
        np.float64,
        np.float64,
        np.int64
    )
)
So to summarize: I need to convert a list of tuples into a numpy array with named columns.
Answer 0 (score: 0)
If np.loadtxt() is not an option, maybe you can build a dict of 1D numpy arrays instead:
tofloat = lambda w: float(w)  # must be function not type
# column names in file order (plain dicts did not keep order before Python 3.7)
columns = ['date', 'open', 'close', 'high', 'low', 'volume']
converters = {
    'date': mkdate,  # mkdate as defined in the question
    'open': tofloat,
    'close': tofloat,
    'high': tofloat,
    'low': tofloat,
    'volume': lambda w: int(w),
}
# dict with 1D lists per column:
dd = dict((k, []) for k in columns)
with open(fname, 'r') as f:
    for line in f:  # read line
        ww = line.strip().split(',')  # split line into strings
        for w, k in zip(ww, columns):  # iterate over strings and column names
            conv = converters[k]  # separate name so the file handle f is not shadowed
            dd[k].append(conv(w))  # append to proper dict-entry list
# convert to dict with numpy arrays:
dd_a = dict((k, np.asarray(v)) for k, v in dd.items())
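For the list of tuples coming from the database, a structured array may be closer to what the question asks for. The following is only a minimal sketch under some assumptions: the row is the sample one from the question, the name `records` is made up for illustration, and `mkdate` here uses `date.toordinal()` as a stand-in so the snippet runs without matplotlib (swap in `dt.date2num(date)` to match the original code). It converts each value by hand and then lets `np.array` build the named fields from a list of `(name, type)` pairs:

```python
import datetime
from decimal import Decimal

import numpy as np

# Sample row copied from the question (database result as a list of tuples).
list_of_tuples = [
    (datetime.datetime(2004, 12, 7, 0, 0), Decimal('7.04000'), Decimal('7.10000'),
     Decimal('7.18000'), Decimal('6.88000'), 25981485),
]

def mkdate(date):
    # Stand-in so the sketch runs without matplotlib; replace with
    # matplotlib.dates.date2num(date) to match the question.
    return float(date.toordinal())

# Field names and types, in the same order as the values in each tuple.
dtype = [
    ('date', np.float64),
    ('open', np.float64),
    ('close', np.float64),
    ('high', np.float64),
    ('low', np.float64),
    ('volume', np.int64),
]

# Convert every value explicitly; each record must stay a tuple (not a list)
# for numpy to treat it as one structured element.
records = [
    (mkdate(d), float(o), float(c), float(h), float(l), int(v))
    for d, o, c, h, l, v in list_of_tuples
]

arr = np.array(records, dtype=dtype)

print(arr.dtype.names)
print(arr['open'], arr['volume'])
```

Columns can then be accessed by name, e.g. `arr['close']`, much like the array returned by `np.genfromtxt` with `names=...`.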