Question

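I'm struggling with code that builds a pixel map, specifically with the loop that groups the data belonging to each pixel of a selected area. I can't get past a KeyError. How should I handle it?

I'm using Python 3.7. I tried adding some checks inside the loop, but the loop doesn't finish, because the very first pixel it encounters appears to be empty. I also tried try: and except KeyError:, but then I ended up with a row that cannot be reshaped because, apparently, the loop simply skips the empty sub-DataFrames. Below are the main steps of the code; note that "lin" and "col" are integers giving the position of the pixel that each data point falls in:

First attempt: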

mean_val=[]
row=[]

for i in range (0,Ypix):

   for j in range (0,Xpix):

      data_pix = data.groupby(['lin', 'col']).get_group((i,j))[['ref', 'th']]

      if KeyError:
                data_pix = pd.DataFrame()

       else:
                mean_level= data_pix['ref'].mean()  
                row.append(mean_level)

mean_val = np.array(row).reshape(Ypix, Xpix)

Second attempt:

mean_val=[]
row = []

for i in range (0,Ypix):

  for j in range (0,Xpix):

      try:
         data_pix=data.groupby(['lin', 'col']).get_group((i,j))[['ref', 'th']]

      except KeyError:
         data_pix = pd.DataFrame()

      else:

         mean_level= data_pix['ref'].mean()  
         row.append(mean_level)

mean_val = np.array(row).reshape(Ypix, Xpix)

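I expect to reshape the row at the end to obtain the map, and I would like to get at least a blank pixel wherever there is no data, so that the reshape works correctly. The errors shown are the following:

First attempt: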

Traceback (most recent call last):
File "grid.py", line 385, in <module>
    proc.process()

File "grid.py", line 106, in process
    data_pix = data.groupby(['lin', 'col']).get_group((i,j))[['ref', 'th']]

File "C:\xxx\yyy\AppData\Local\Programs\Python\Python37\lib\site-packages\pandas\core\groupby\groupby.py", line 680, in get_group

  raise KeyError(name)

KeyError: (0, 0)

Second attempt:

Traceback (most recent call last):
  File "grid.py", line 379, in <module>
    proc.process()

File "grid.py", line 276, in process

   mean_val = np.array(row).reshape(Ypix, Xpix) 

ValueError: cannot reshape array of size 1506 into shape (50,50)

Can anyone help me?

Answer 1

If you really want to ignore the KeyError, you can write:

 except KeyError:
      pass
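
Note that ignoring the exception alone leaves the list shorter than Ypix * Xpix, which is what triggers the reshape error in the question. If the loop is kept, one possible variant is to append a placeholder for every missing pixel. The following is only a sketch: the sample data and the choice of NaN as the "no data" marker are assumptions, not taken from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: a 2 x 3 pixel grid where pixel (1, 1) has no rows.
data = pd.DataFrame({
    "lin": [0, 0, 0, 0, 1, 1],
    "col": [0, 0, 1, 2, 0, 2],
    "ref": [1.0, 3.0, 5.0, 7.0, 9.0, 11.0],
    "th":  [0.1, 0.2, 0.3, 0.4, 0.5, 0.6],
})
Ypix, Xpix = 2, 3

grouped = data.groupby(["lin", "col"])  # group once, outside the loops
row = []
for i in range(Ypix):
    for j in range(Xpix):
        try:
            data_pix = grouped.get_group((i, j))[["ref", "th"]]
        except KeyError:
            # No rows for this pixel: keep its slot with a "no data" marker.
            row.append(np.nan)
        else:
            row.append(data_pix["ref"].mean())

# The list now always has Ypix * Xpix entries, so the reshape succeeds.
mean_val = np.array(row).reshape(Ypix, Xpix)
print(mean_val)
```

Because every (i, j) pair contributes exactly one entry, the length of row no longer depends on which groups exist.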

Answer 2

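I guess that your groupby produces groups for only some of the possible combinations of i and j (for some i / j combinations there is no corresponding group).

Then handling the exception alone (as suggested in the other answer) will not work, because you:

collect data only for the existing groups,
then try to reshape it as if you had data for all groups.

My proposal is that, instead of collecting data for every i / j combination in a loop with exception handling for the missing groups, you should first compute an intermediate result that has one element for each existing group only. Something like: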

means = data.groupby(['lin', 'col'])['ref'].mean()

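The result is a Series with:

a MultiIndex composed of lin and col, the pixel coordinates,
values: the mean of ref in the current group.

Then transcode this table into the result table (of size Xpix * Ypix), filling the remaining cells with some value meaning "no data" (e.g. 0).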
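
One possible way to do this step directly in NumPy is sketched below. It is not necessarily what the answer has in mind: the sample data is hypothetical, and the sketch assumes that every lin value lies in range(Ypix) and every col value in range(Xpix).

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: a 2 x 3 pixel grid where pixel (1, 1) has no rows.
data = pd.DataFrame({
    "lin": [0, 0, 0, 0, 1, 1],
    "col": [0, 0, 1, 2, 0, 2],
    "ref": [1.0, 3.0, 5.0, 7.0, 9.0, 11.0],
})
Ypix, Xpix = 2, 3

# One mean per existing (lin, col) group.
means = data.groupby(["lin", "col"])["ref"].mean()

# Start from a grid filled with the "no data" value (0 here).
grid = np.zeros((Ypix, Xpix))

# Write each group mean into its pixel; missing pixels keep the fill value.
lin_idx = means.index.get_level_values("lin").to_numpy()
col_idx = means.index.get_level_values("col").to_numpy()
grid[lin_idx, col_idx] = means.to_numpy()
print(grid)
```

If 0 can also occur as a real mean, filling with np.nan instead (via np.full) keeps empty pixels distinguishable.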

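Note: since you didn't provide any sample data, I couldn't run any tests, so all of the above is based on my understanding of your case and will most likely need some corrections or completion to become really working code.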

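Addendum: how to perform the transcoding (example)

Assume that means, the source data, is as follows (it is printed here as a one-column DataFrame, with ref as the column):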

         ref
lin col     
0   0      1
    1      2
    2      3
1   0      4
    1      5
    2      6
2   0      7
    1      8
    2      9

Run:

Xpix = 5; Ypix = 5       # Target array size (example)
df1 = means.unstack()    # Move 'col' from the row index to the columns
# Drop top level from the column index ('ref'); skip this step if
# means is a Series, since the columns then have a single level
df1.columns = df1.columns.droplevel()
df1.columns.name = None  # Drop the name from the column index ('col')
df1.index.name = None    # Drop the name from the row index ('lin')
# Reindex (change the shape), and fill with "empty" values;
# rows ('lin') span Ypix, columns ('col') span Xpix
df1 = df1.reindex(index=range(Ypix), columns=range(Xpix), fill_value=0)

The result is:

   0  1  2  3  4
0  1  2  3  0  0
1  4  5  6  0  0
2  7  8  9  0  0
3  0  0  0  0  0
4  0  0  0  0  0

Now you have a DataFrame with a default column index and a default row index, but if you prefer, you can use df1.values, the underlying NumPy array.
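
As a sketch of that last step, the example above can be rebuilt and converted to an array as follows. The construction of the sample table is an assumption made for illustration. It uses to_numpy(), which recent pandas versions recommend over .values, and it selects the ref column before unstacking so that no droplevel call is needed.

```python
import pandas as pd

# Rebuild the 3 x 3 example values from above as a one-column DataFrame.
idx = pd.MultiIndex.from_product([range(3), range(3)], names=["lin", "col"])
means = pd.DataFrame({"ref": list(range(1, 10))}, index=idx)

Xpix, Ypix = 5, 5  # target array size (example)

# Selecting the column first yields a Series, so unstack() gives plain columns.
df1 = means["ref"].unstack()
df1 = df1.reindex(index=range(Ypix), columns=range(Xpix), fill_value=0)

# Extract the underlying NumPy array (Ypix rows, Xpix columns).
mean_val = df1.to_numpy()
print(mean_val.shape)
print(mean_val)
```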
