Question

I have some code that mostly works.

def upc_dict_to_pandas_dataframe(upc_dict):
    #This could be done in fewer lines but I split them for debugging purposes
    d = upc_dict.items()
    d = list(d)
    d = [list(i) for i in d]

    for i in range(len(d)):
        d[i] = np.array(d[i], dtype=object)
        d[i] = np.hstack(d[i])
        x = int(d[i][3])
        d[i][3] = x

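The last line, d[i][3] = x, does not assign x to d[i][3]. The element's original type is a numpy string, and I am trying to replace it with its integer form. However, the assignment seems to be skipped entirely. I even stepped through it in debug mode and watched the string get converted to an integer, but d[i][3] never changed.

Why is that, and how can I fix it?

Thanks.

Edit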

Here is the value of d after d = [list(i) for i in d]:

<class 'list'>: [['B01A8L6KKO', ['873124006834', 'HORI Premium Protector - Pikachu Edition for Nintendo New 2DS XL', 4408]], ['B00L59D9HG', ['045496891503', 'Nintendo 3DS AC Adapter', 148]], ['B00ND0EBP4', ['873124005110', 'HORI Retro Zelda Hard Pouch for Nintendo 3DS XL - Zelda Version Edition', 4403]], ['B01MSHC8WT', ['859986005680', 'Tend Insights John Deere 100 Indoor Wi-Fi Camera', 16007]], ['B07CFLK37X', ['859986005291', 'Lynx Indoor/Outdoor Pro HD Wifi Camera', -1]], ['B076ZWVR2R', ['859986005376', 'Lynx Solar Weatherproof Outdoor WiFi Surveillance Camera with Solar Panel, Facial Recognition, Night Vision, White', 23570]], ['B0716ZNTKS', ['859986005857', 'Tend Insights Minion Cam HD Wi-Fi Camera (Despicable Me 3', 17726]], ['B00MOVY01I', ['853376004284', 'Rocksteady XS Extra Battery and Charger', -1]]]
 _len_ = 8

Answer 1

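To index d[i][3] (d[i] looks like a nested list) and still use hstack, you need to turn d[i] back into a list afterwards. You can read more in the numpy hstack documentation.

So list(np.hstack(d[i])) converts the array back into list form. You can run a simple script yourself and see that np.array() does not return a list, because its result is already an array: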

import numpy as np

a = np.array([1,2,3])
print(np.array(a))

# outputs [1 2 3]

Answer 2

With the d you added:

In [28]: d[0]                                                                                                
Out[28]: 
['B01A8L6KKO',
 ['873124006834',
  'HORI Premium Protector - Pikachu Edition for Nintendo New 2DS XL',
  4408]]
In [29]: np.array(d[0], object)                                                                              
Out[29]: 
array(['B01A8L6KKO',
       list(['873124006834', 'HORI Premium Protector - Pikachu Edition for Nintendo New 2DS XL', 4408])],
      dtype=object)
In [30]: np.hstack(np.array(d[0], object))                                                                   
Out[30]: 
array(['B01A8L6KKO', '873124006834',
       'HORI Premium Protector - Pikachu Edition for Nintendo New 2DS XL',
       '4408'], dtype='<U64')

Even though an object dtype array was made from d[0], hstack has created a string dtype array.

In [31]: np.hstack(np.array(d[0], object))[3]                                                                
Out[31]: '4408'

Anything assigned to that array is converted to a string.

In [34]: x = np.hstack(np.array(d[0], object))                                                               
In [35]: x[3] = 123                                                                                          
In [36]: x                                                                                                   
Out[36]: 
array(['B01A8L6KKO', '873124006834',
       'HORI Premium Protector - Pikachu Edition for Nintendo New 2DS XL',
       '123'], dtype='<U64')

A list, however, has no common dtype constraint, so an element can be changed to an integer:

In [37]: x = list(np.hstack(np.array(d[0], object)))                                                         
In [38]: x[3] = 123                                                                                          
In [39]: x                                                                                                   
Out[39]: 
['B01A8L6KKO',
 '873124006834',
 'HORI Premium Protector - Pikachu Edition for Nintendo New 2DS XL',
 123]

hstack makes sure all of its inputs are arrays before passing them to concatenate:

In [49]: [np.atleast_1d(x) for x in d[0]]                                                                    
Out[49]: 
[array(['B01A8L6KKO'], dtype='<U10'), array(['873124006834',
        'HORI Premium Protector - Pikachu Edition for Nintendo New 2DS XL',
        '4408'], dtype='<U64')]

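This explains why the result of hstack has a string dtype. The np.array(d[0], object) step is not needed.

An alternative to the list() wrapper is to convert the string dtype array to object dtype: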
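The conversion to strings can be seen in isolation with a minimal sketch. The variable names and the shortened entry below are illustrative, not part of the original session; the point is only that an array built from a mixed list without an explicit dtype gets one dtype for all elements.

```python
import numpy as np

# One (ASIN, [UPC, title, rank]) entry, shortened from the question's data.
entry = ['B00L59D9HG', ['045496891503', 'Nintendo 3DS AC Adapter', 148]]

# hstack applies atleast_1d to each input. For the inner list no dtype is
# given, so NumPy picks a single dtype that can hold every element.
inner = np.atleast_1d(entry[1])
print(inner.dtype.kind)   # 'U' means a fixed-width unicode string dtype
print(repr(inner[2]))     # the integer 148 is already stored as text

# Requesting object dtype up front keeps each element as its own Python object.
inner_obj = np.array(entry[1], dtype=object)
print(type(inner_obj[2]))
```

This is why the object dtype has to be requested for the inner list itself, not only for the outer two-element list.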

In [52]: x = np.hstack(d[0]).astype(object)                                                                  
In [53]: x[3] = 123                                                                                          
In [54]: x                                                                                                   
Out[54]: 
array(['B01A8L6KKO', '873124006834',
       'HORI Premium Protector - Pikachu Edition for Nintendo New 2DS XL',
       123], dtype=object)

tolist is usually the better way to create a list from an array, though it makes little difference here: np.hstack(d[0]).tolist()

Another way to flatten the list is:
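The difference between the two is easier to see on a numeric array. The following is a small sketch with made-up values: list() keeps NumPy scalar objects as elements, while tolist() converts them to built-in Python types.

```python
import numpy as np

arr = np.array([4408, 148, -1])

as_list = list(arr)        # elements are still NumPy integer scalars
as_tolist = arr.tolist()   # elements are built-in Python ints

print(type(as_list[0]), type(as_tolist[0]))
```

For the string array in this question both approaches give elements that behave like str, which is why the choice matters little here.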

In [62]: x = np.hstack([np.array(j, object) for j in d[0]])                                                  
In [63]: x                                                                                                   
Out[63]: 
array(['B01A8L6KKO', '873124006834',
       'HORI Premium Protector - Pikachu Edition for Nintendo New 2DS XL',
       4408], dtype=object)

x[3] is still an integer.

But you can also flatten the list directly (since each entry consists of just a string and a list):

In [66]: [d[0][0], *d[0][1]]                                                                                 
Out[66]: 
['B01A8L6KKO',
 '873124006834',
 'HORI Premium Protector - Pikachu Edition for Nintendo New 2DS XL',
 4408]
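Applied to the whole dictionary, the direct flattening might look like the sketch below. The function name flatten_upc_dict and the two-entry sample are hypothetical, and the code assumes every value is a three-item list of UPC, title, and rank, as in the data shown in the question.

```python
def flatten_upc_dict(upc_dict):
    """Return one flat row [asin, upc, title, rank] per dictionary entry."""
    rows = []
    for asin, (upc, title, rank) in upc_dict.items():
        # int() is a no-op for values that are already ints and also
        # converts numeric strings such as '148'.
        rows.append([asin, upc, title, int(rank)])
    return rows


# Hypothetical sample with the same shape as the question's data.
sample = {
    'B00L59D9HG': ['045496891503', 'Nintendo 3DS AC Adapter', 148],
    'B00MOVY01I': ['853376004284', 'Rocksteady XS Extra Battery and Charger', -1],
}
rows = flatten_upc_dict(sample)
print(rows[0])
```

No NumPy is involved, so the rank stays an integer. The resulting list of rows could then be handed to pandas.DataFrame with whatever column names suit the data.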

Answer 3

I just figured out a quick fix.

Change this line:

d[i] = np.hstack(d[i])

to this:

d[i] = list(np.hstack(d[i]))

I think the problem is specific to numpy. I am still curious why it does not work with a numpy array.
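As a rough illustration of what is going on, the sketch below runs both variants side by side on two entries taken from the question's data. The helper names as_array and still_text are made up for this example. The assignment into the array is not skipped: the integer is stored, but it is converted back to text because the array has a string dtype.

```python
import numpy as np

d = [
    ['B00L59D9HG', ['045496891503', 'Nintendo 3DS AC Adapter', 148]],
    ['B00MOVY01I', ['853376004284', 'Rocksteady XS Extra Battery and Charger', -1]],
]

for i in range(len(d)):
    as_array = np.hstack(d[i])        # string dtype: every element is text
    as_array[3] = int(as_array[3])    # the int is converted straight back to text
    still_text = isinstance(as_array[3], str)

    d[i] = list(as_array)             # a list can hold mixed types
    d[i][3] = int(d[i][3])            # now the assignment sticks

print(still_text, [type(row[3]).__name__ for row in d])
# prints: True ['int', 'int']
```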

Can't change a value in a numpy array
