Character b prepended to the output of a numpy array?

Asked: 2019-03-22 22:35:49

Tags: python python-3.x numpy

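Below is my code, which works fine. The only problem is that the print output shows the character b at the start of each element.

I don't know why this happens.

Can anyone help?

The code is: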

    opening_duration_list = np.zeros(0, dtype={'names':('sitename', 'postcode', 'dur'),'formats':('S40', 'i4', 'f2')})
    with open(DATA_FILE) as f:
        rows = csv.DictReader(f)
        for row in rows:
            sitename = row['SITE NAME']
            postcode = row['POSTCODE']
            Open = row['Open']
            Close = row['Close']
            dur = compute_opening_duration(Open, Close)

            x = np.array([tuple((sitename+","+postcode+","+str(dur)).split(','))], dtype=opening_duration_list.dtype)
            #print(x['sitename'])
            opening_duration_list = np.append(opening_duration_list,x)

            if row is None:
                break

    for i in range(0,10):
        print("List No:",i+1,opening_duration_list[i])

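I don't understand how the character b ends up in front.

1 Answer:

Answer 0 (score: 0)

My guess is that your csv looks something like this (with dur standing in for Open and Close):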

In [122]:  txt = """ 
     ...: SITE NAME, POSTCODE, dur 
     ...: Armadale (WA), 6112, 8. 
     ...: Armidale (NSW), 2350, 8.5 
     ...: Newport, 3015, 6.5 
     ...: Townsville Jobseekers, 4814, 7.5 
     ...: Albany, 6330, 6.5 
     ...:  """                                                                  

I can load it with genfromtxt:

In [124]: data = np.genfromtxt(txt.splitlines(), delimiter=',', dtype=None, names=True, encoding=None)                                                   
In [125]: data                                                                  
Out[125]: 
array([('Armadale (WA)', 6112, 8. ), ('Armidale (NSW)', 2350, 8.5),
       ('Newport', 3015, 6.5), ('Townsville Jobseekers', 4814, 7.5),
       ('Albany', 6330, 6.5)],
      dtype=[('SITE_NAME', '<U21'), ('POSTCODE', '<i8'), ('dur', '<f8')])

And display it the way you do:

In [126]: for i in range(0,5): 
     ...:              print("List No:",i+1,data[i]) 
     ...:                                                                       
List No: 1 ('Armadale (WA)', 6112, 8.)
List No: 2 ('Armidale (NSW)', 2350, 8.5)
List No: 3 ('Newport', 3015, 6.5)
List No: 4 ('Townsville Jobseekers', 4814, 7.5)
List No: 5 ('Albany', 6330, 6.5)

Note that the dtype of the first field is U21, a unicode string.

Using your dtype:

In [127]: data = np.genfromtxt(txt.splitlines(), delimiter=',',
     ...:     dtype={'names':('sitename', 'postcode', 'dur'),'formats':('S40', 'i4', 'f2')},
     ...:     skip_header=1, encoding=None)
In [128]: data                                                                  
Out[128]: 
array([(b'SITE NAME',   -1, nan), (b'Armadale (WA)', 6112, 8. ),
       (b'Armidale (NSW)', 2350, 8.5), (b'Newport', 3015, 6.5),
       (b'Townsville Jobseekers', 4814, 7.5), (b'Albany', 6330, 6.5)],
      dtype=[('sitename', 'S40'), ('postcode', '<i4'), ('dur', '<f2')])

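Because you specified the 'S40' dtype, that field holds byte strings, and Python 3 displays byte strings with the b prefix. Use 'U40' instead to get ordinary (unicode) strings.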
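To make the difference concrete, here is a minimal sketch with a single made-up record (the values are illustrative, not taken from your file). It shows that an 'S40' field stores bytes, and that an existing bytes field can be converted to unicode with astype if you would rather not rebuild the array.

```python
import numpy as np

# Same field layout as in the question; 'S40' is a 40-byte bytestring field.
dt_bytes = np.dtype([('sitename', 'S40'), ('postcode', 'i4'), ('dur', 'f2')])

# One illustrative record (hypothetical values).
rec = np.array([(b'Newport', 3015, 6.5)], dtype=dt_bytes)

# Elements of an 'S' field are bytes objects, hence the b prefix when printed.
print(type(rec['sitename'][0]), rec[0])

# Converting just that field to a unicode dtype gives plain str values.
names = rec['sitename'].astype('U40')
print(names[0])
```

Declaring the field as 'U40' from the start avoids the conversion step altogether.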

===

As for your csv reader, I think the iteration works better if you accumulate a list of tuples:

    import csv
    import numpy as np

    # 'U40' (unicode) instead of 'S40' (bytes) avoids the b prefix
    dt = {'names':('sitename', 'postcode', 'dur'),'formats':('U40', 'i4', 'f2')}
    alist = []
    with open(DATA_FILE) as f:
        rows = csv.DictReader(f)
        for row in rows:
            sitename = row['SITE NAME']
            postcode = row['POSTCODE']
            Open = row['Open']
            Close = row['Close']
            dur = compute_opening_duration(Open, Close)
            # build the record directly; no need to join and split on ','
            x = (sitename, int(postcode), dur)
            alist.append(x)

    # convert once, after the loop
    opening_duration_list = np.array(alist, dtype=dt)

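np.append is awkward to use correctly, and it is slow because it copies the whole array on every call. This way you only need the compound dtype once. (But congratulations on getting np.append to work, especially with a compound dtype.)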
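Here is a self-contained sketch of the list-accumulation approach that runs without your data file. The sample rows, the 'HH:MM' format for Open and Close, and the hours_between helper are all assumptions on my part; hours_between only stands in for your compute_opening_duration, whose real behaviour I can't see.

```python
import csv
import io

import numpy as np

# Hypothetical sample data; the real file's Open/Close format is not shown.
SAMPLE_CSV = """SITE NAME,POSTCODE,Open,Close
Armadale (WA),6112,09:00,17:00
Newport,3015,09:30,16:00
"""


def hours_between(open_str, close_str):
    """Hypothetical stand-in for compute_opening_duration ('HH:MM' -> hours)."""
    def to_hours(s):
        h, m = s.split(':')
        return int(h) + int(m) / 60
    return to_hours(close_str) - to_hours(open_str)


dt = np.dtype([('sitename', 'U40'), ('postcode', 'i4'), ('dur', 'f2')])

# Collect plain Python tuples; appending to a list is cheap.
records = []
for row in csv.DictReader(io.StringIO(SAMPLE_CSV)):
    records.append((row['SITE NAME'],
                    int(row['POSTCODE']),
                    hours_between(row['Open'], row['Close'])))

# Build the structured array once, after the loop.
arr = np.array(records, dtype=dt)

for i, rec in enumerate(arr, start=1):
    print("List No:", i, rec)
```

With 'U40' for the name field, the printed records show ordinary strings rather than bytes.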