How can two 3D matrices be combined to form a 3D matrix of 2D matrices with the same shape?

Time: 2020-08-27 00:07:51

Tags: python python-3.x pandas numpy indexing

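I have a 3D matrix made up of 2D matrices, but they are not all the same size: the number of rows grows with each sample. I therefore want to pad each one with NaN rows on top so that they all have the same shape.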

Here are the samples:

# generated by this:
arr = np.asarray(df)
result = list((map(lambda i: arr[:i], range(1,df.shape[0]+1))))

[                                                                                  
   [2019-06-17 08:45:00     12089.89     12089.89    12087.71      12087.71  ],  

   [2019-06-17 08:45:00     12089.89     12089.89    12087.71      12087.71     
   2019-06-17 08:46:00     12087.91          NaN    12087.71      12087.91   ],  

   [2019-06-17 08:45:00     12089.89     12089.89    12087.71      12087.71     
   2019-06-17 08:46:00     12087.91          NaN    12087.71      12087.91    
   2019-06-17 08:47:00     12088.21     12088.21    12084.21      12085.21   ],   

   [2019-06-17 08:45:00     12089.89     12089.89    12087.71      12087.71     
   2019-06-17 08:46:00     12087.91          NaN    12087.71      12087.91    
   2019-06-17 08:47:00     12088.21     12088.21    12084.21      12085.21      
   2019-06-17 08:48:00     12085.09     12090.21    12084.91      12089.41   ],  

   [2019-06-17 08:45:00     12089.89     12089.89    12087.71      12087.71     
   2019-06-17 08:46:00     12087.91          NaN    12087.71      12087.91    
   2019-06-17 08:47:00     12088.21     12088.21    12084.21      12085.21      
   2019-06-17 08:48:00     12085.09     12090.21    12084.91      12089.41     
   2019-06-17 08:49:00     12089.71     12090.21    12087.21      12088.21    ], 

   [2019-06-17 08:45:00     12089.89     12089.89    12087.71      12087.71     
   2019-06-17 08:46:00     12087.91          NaN    12087.71      12087.91    
   2019-06-17 08:47:00     12088.21     12088.21    12084.21      12085.21      
   2019-06-17 08:48:00     12085.09     12090.21    12084.91      12089.41     
   2019-06-17 08:49:00     12089.71     12090.21    12087.21      12088.21     
   2019-06-17 08:50:00     12504.11     12504.11    12504.11      12504.11    ], 

   [2019-06-17 08:45:00     12089.89     12089.89    12087.71      12087.71     
   2019-06-17 08:46:00     12087.91          NaN    12087.71      12087.91    
   2019-06-17 08:47:00     12088.21     12088.21    12084.21      12085.21      
   2019-06-17 08:48:00     12085.09     12090.21    12084.91      12089.41     
   2019-06-17 08:49:00     12089.71     12090.21    12087.21      12088.21     
   2019-06-17 08:50:00     12504.11     12504.11    12504.11      12504.11     
   2019-06-17 08:51:00     12504.11          NaN    12503.11      12503.11    ],

   [2019-06-17 08:45:00     12089.89     12089.89    12087.71      12087.71     
   2019-06-17 08:46:00     12087.91          NaN    12087.71      12087.91    
   2019-06-17 08:47:00     12088.21     12088.21    12084.21      12085.21      
   2019-06-17 08:48:00     12085.09     12090.21    12084.91      12089.41     
   2019-06-17 08:49:00     12089.71     12090.21    12087.21      12088.21     
   2019-06-17 08:50:00     12504.11     12504.11    12504.11      12504.11     
   2019-06-17 08:51:00     12504.11          NaN    12503.11      12503.11    
   2019-06-17 08:52:00     12504.11     12504.11    12503.11      12503.11    ],  

   [2019-06-17 08:45:00     12089.89     12089.89    12087.71      12087.71     
   2019-06-17 08:46:00     12087.91          NaN    12087.71      12087.91    
   2019-06-17 08:47:00     12088.21     12088.21    12084.21      12085.21      
   2019-06-17 08:48:00     12085.09     12090.21    12084.91      12089.41     
   2019-06-17 08:49:00     12089.71     12090.21    12087.21      12088.21     
   2019-06-17 08:50:00     12504.11     12504.11    12504.11      12504.11     
   2019-06-17 08:51:00     12504.11          NaN    12503.11      12503.11    
   2019-06-17 08:52:00     12504.11     12504.11    12503.11      12503.11    
   2019-06-17 08:53:00     12503.61     12503.61    12503.61      12503.61     ],   

   [2019-06-17 08:45:00     12089.89     12089.89    12087.71      12087.71     
   2019-06-17 08:46:00     12087.91          NaN    12087.71      12087.91    
   2019-06-17 08:47:00     12088.21     12088.21    12084.21      12085.21      
   2019-06-17 08:48:00     12085.09     12090.21    12084.91      12089.41     
   2019-06-17 08:49:00     12089.71     12090.21    12087.21      12088.21     
   2019-06-17 08:50:00     12504.11     12504.11    12504.11      12504.11     
   2019-06-17 08:51:00     12504.11          NaN    12503.11      12503.11    
   2019-06-17 08:52:00     12504.11     12504.11    12503.11      12503.11    
   2019-06-17 08:53:00     12503.61     12503.61    12503.61      12503.61     
   2019-06-17 08:54:00     12503.61     12503.61    12503.11      12503.11  ]
                                                                               ]

Expected result:

[ 
   [               NaN          NaN          NaN         NaN           NaN   
                   NaN          NaN          NaN         NaN           NaN 
                   NaN          NaN          NaN         NaN           NaN 
                   NaN          NaN          NaN         NaN           NaN 
                   NaN          NaN          NaN         NaN           NaN 
                   NaN          NaN          NaN         NaN           NaN 
                   NaN          NaN          NaN         NaN           NaN 
                   NaN          NaN          NaN         NaN           NaN  
                   NaN          NaN          NaN         NaN           NaN                                                                 
   2019-06-17 08:45:00     12089.89     12089.89    12087.71      12087.71  ],  

   [               NaN          NaN          NaN         NaN           NaN   
                   NaN          NaN          NaN         NaN           NaN 
                   NaN          NaN          NaN         NaN           NaN 
                   NaN          NaN          NaN         NaN           NaN 
                   NaN          NaN          NaN         NaN           NaN 
                   NaN          NaN          NaN         NaN           NaN 
                   NaN          NaN          NaN         NaN           NaN     
                   NaN          NaN          NaN         NaN           NaN 
   2019-06-17 08:45:00     12089.89     12089.89    12087.71      12087.71     
   2019-06-17 08:46:00     12087.91          NaN    12087.71      12087.91   ],  

   [               NaN          NaN          NaN         NaN           NaN   
                   NaN          NaN          NaN         NaN           NaN 
                   NaN          NaN          NaN         NaN           NaN 
                   NaN          NaN          NaN         NaN           NaN 
                   NaN          NaN          NaN         NaN           NaN 
                   NaN          NaN          NaN         NaN           NaN  
                   NaN          NaN          NaN         NaN           NaN    
   2019-06-17 08:45:00     12089.89     12089.89    12087.71      12087.71     
   2019-06-17 08:46:00     12087.91          NaN    12087.71      12087.91    
   2019-06-17 08:47:00     12088.21     12088.21    12084.21      12085.21   ],   

   [               NaN          NaN          NaN         NaN           NaN   
                   NaN          NaN          NaN         NaN           NaN 
                   NaN          NaN          NaN         NaN           NaN 
                   NaN          NaN          NaN         NaN           NaN 
                   NaN          NaN          NaN         NaN           NaN  
                   NaN          NaN          NaN         NaN           NaN  
   2019-06-17 08:45:00     12089.89     12089.89    12087.71      12087.71     
   2019-06-17 08:46:00     12087.91          NaN    12087.71      12087.91    
   2019-06-17 08:47:00     12088.21     12088.21    12084.21      12085.21      
   2019-06-17 08:48:00     12085.09     12090.21    12084.91      12089.41   ],  

   [               NaN          NaN          NaN         NaN           NaN   
                   NaN          NaN          NaN         NaN           NaN 
                   NaN          NaN          NaN         NaN           NaN 
                   NaN          NaN          NaN         NaN           NaN 
                   NaN          NaN          NaN         NaN           NaN  
   2019-06-17 08:45:00     12089.89     12089.89    12087.71      12087.71     
   2019-06-17 08:46:00     12087.91          NaN    12087.71      12087.91    
   2019-06-17 08:47:00     12088.21     12088.21    12084.21      12085.21      
   2019-06-17 08:48:00     12085.09     12090.21    12084.91      12089.41     
   2019-06-17 08:49:00     12089.71     12090.21    12087.21      12088.21    ], 

   [               NaN          NaN          NaN         NaN           NaN   
                   NaN          NaN          NaN         NaN           NaN 
                   NaN          NaN          NaN         NaN           NaN 
                   NaN          NaN          NaN         NaN           NaN  
   2019-06-17 08:45:00     12089.89     12089.89    12087.71      12087.71     
   2019-06-17 08:46:00     12087.91          NaN    12087.71      12087.91    
   2019-06-17 08:47:00     12088.21     12088.21    12084.21      12085.21      
   2019-06-17 08:48:00     12085.09     12090.21    12084.91      12089.41     
   2019-06-17 08:49:00     12089.71     12090.21    12087.21      12088.21     
   2019-06-17 08:50:00     12504.11     12504.11    12504.11      12504.11    ], 

   [               NaN          NaN          NaN         NaN           NaN   
                   NaN          NaN          NaN         NaN           NaN 
                   NaN          NaN          NaN         NaN           NaN   
   2019-06-17 08:45:00     12089.89     12089.89    12087.71      12087.71     
   2019-06-17 08:46:00     12087.91          NaN    12087.71      12087.91    
   2019-06-17 08:47:00     12088.21     12088.21    12084.21      12085.21      
   2019-06-17 08:48:00     12085.09     12090.21    12084.91      12089.41     
   2019-06-17 08:49:00     12089.71     12090.21    12087.21      12088.21     
   2019-06-17 08:50:00     12504.11     12504.11    12504.11      12504.11     
   2019-06-17 08:51:00     12504.11          NaN    12503.11      12503.11    ],

   [               NaN          NaN          NaN         NaN           NaN   
                   NaN          NaN          NaN         NaN           NaN  
   2019-06-17 08:45:00     12089.89     12089.89    12087.71      12087.71     
   2019-06-17 08:46:00     12087.91          NaN    12087.71      12087.91    
   2019-06-17 08:47:00     12088.21     12088.21    12084.21      12085.21      
   2019-06-17 08:48:00     12085.09     12090.21    12084.91      12089.41     
   2019-06-17 08:49:00     12089.71     12090.21    12087.21      12088.21     
   2019-06-17 08:50:00     12504.11     12504.11    12504.11      12504.11     
   2019-06-17 08:51:00     12504.11          NaN    12503.11      12503.11    
   2019-06-17 08:52:00     12504.11     12504.11    12503.11      12503.11    ],  

   [               NaN          NaN          NaN         NaN           NaN   
   2019-06-17 08:45:00     12089.89     12089.89    12087.71      12087.71     
   2019-06-17 08:46:00     12087.91          NaN    12087.71      12087.91    
   2019-06-17 08:47:00     12088.21     12088.21    12084.21      12085.21      
   2019-06-17 08:48:00     12085.09     12090.21    12084.91      12089.41     
   2019-06-17 08:49:00     12089.71     12090.21    12087.21      12088.21     
   2019-06-17 08:50:00     12504.11     12504.11    12504.11      12504.11     
   2019-06-17 08:51:00     12504.11          NaN    12503.11      12503.11    
   2019-06-17 08:52:00     12504.11     12504.11    12503.11      12503.11    
   2019-06-17 08:53:00     12503.61     12503.61    12503.61      12503.61     ],   

   [2019-06-17 08:45:00     12089.89     12089.89    12087.71      12087.71     
   2019-06-17 08:46:00     12087.91          NaN    12087.71      12087.91    
   2019-06-17 08:47:00     12088.21     12088.21    12084.21      12085.21      
   2019-06-17 08:48:00     12085.09     12090.21    12084.91      12089.41     
   2019-06-17 08:49:00     12089.71     12090.21    12087.21      12088.21     
   2019-06-17 08:50:00     12504.11     12504.11    12504.11      12504.11     
   2019-06-17 08:51:00     12504.11          NaN    12503.11      12503.11    
   2019-06-17 08:52:00     12504.11     12504.11    12503.11      12503.11    
   2019-06-17 08:53:00     12503.61     12503.61    12503.61      12503.61     
   2019-06-17 08:54:00     12503.61     12503.61    12503.11      12503.11  ]
                                                                               ]

What is an efficient way to do this? (The data has roughly 100,000 to 500,000 samples.)

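  • Is it possible to do this in batches? (Take the first 10% of the samples, append them to a list, then the next 10%, and so on. In that case the target length for each sample would be the length of the last sample in its batch.)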

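Edit: Otherwise, is there a way to generate "result" already in the expected form in one step? For example by creating a second dataframe filled with NaN? Something like this (pseudocode):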

result = list((map(lambda i: nanarr[:j-i]+arr[:i], range(1,df.shape[0]+1))))

1 Answer:

Answer 0: (score: 0)

I assume result is what you pasted above.

If result is a list of lists, you can transform it as follows to get the output you asked for above:

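import numpy as np

# Length of the longest sample; shorter samples are padded up to it.
longest_length = max(len(item) for item in result)
new_result = []
for L in result:
    # Prepend NaN entries (np.nan; the np.NaN alias was removed in NumPy 2.0).
    # list(L) keeps the concatenation valid if L is an array rather than a list.
    new_result.append([np.nan] * (longest_length - len(L)) + list(L))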

This is about as "efficient" as you can get without using compiled code.
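If relying on NumPy's compiled routines is acceptable, one alternative is to preallocate a NaN-filled array and copy each growing prefix into the bottom of its slice. The following is only a minimal sketch under assumptions: all columns are numeric (the timestamp column would have to be dropped or converted first), and `pad_growing_prefixes` is a hypothetical helper name that does not appear in the question.

```python
import numpy as np

def pad_growing_prefixes(arr):
    """Return an (N, N, M) array whose i-th slice is arr[:i+1] with NaN rows above it."""
    arr = np.asarray(arr, dtype=float)  # assumption: numeric columns only
    n, m = arr.shape
    out = np.full((n, n, m), np.nan)    # every slice starts out as all NaN
    for i in range(n):
        # The last i + 1 rows of slice i receive the first i + 1 rows of arr.
        out[i, n - i - 1:] = arr[: i + 1]
    return out

# Small illustrative input (made-up numbers, not the question's data).
demo = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
padded = pad_growing_prefixes(demo)
print(padded.shape)
```

This still allocates the full N x N x M output, so it is only practical when that array fits in memory.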

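The problem as posed is inherently inefficient. The output you are constructing has N**2 * M values, where N is the number of samples and M is the number of values per sample, and it contains a large amount of duplicated data. If you need a more efficient solution, try to find a way to write your code so that it does not need this duplication.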
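To put the N**2 * M figure in perspective: with N = 100,000 samples and, as an assumption, M = 4 float64 columns, a materialized output would hold 4 × 10^10 values, or about 320 GB. One possible way to avoid the duplication is to pad the data once and expose the samples as overlapping windows. The sketch below uses `numpy.lib.stride_tricks.sliding_window_view` (available in NumPy 1.20 and later); it assumes numeric columns, and `padded_prefix_view` is a hypothetical helper name.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def padded_prefix_view(arr):
    """Return a read-only (N, N, M) view; slice i is arr[:i+1] with NaN rows above it."""
    arr = np.asarray(arr, dtype=float)  # assumption: numeric columns only
    n, m = arr.shape
    # Put n - 1 NaN rows on top once: shape (2n - 1, m).
    stacked = np.vstack([np.full((n - 1, m), np.nan), arr])
    # Windows of n consecutive rows; the window axis is appended last: (n, m, n).
    windows = sliding_window_view(stacked, n, axis=0)
    # Reorder the axes to (sample, row, column) without copying any data.
    return windows.transpose(0, 2, 1)

# Small illustrative input (made-up numbers, not the question's data).
demo = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
view = padded_prefix_view(demo)
print(view.shape)
```

Only the (2N - 1) x M padded array is stored. A single sample can be copied on demand with `np.array(view[i])`, whereas calling `np.array(view)` on the whole view would materialize the full output again.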