Question

I have two datasets:

ds1: a DEM (digital elevation model) file, stored as a 2D numpy array,

ds2: an array showing the areas (pixels) that hold excess water.

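I have a while loop that spreads (and changes) the excess volume in each pixel according to the elevations of its 8 neighbors and of the pixel itself, until the excess volume in every pixel is below a threshold d = 0.05. So in each iteration I need to find the indices of the pixels in ds2 whose excess volume is greater than 0.05, and exit the while loop if there are none.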


The problem is that numpy.argwhere(ds2 > 0.05) is very slow. I am looking for a faster alternative.

Answer 1

Both (arr > 0.05).nonzero() and np.where(arr > 0.05) were about 22-25% faster than argwhere in my tests.

For example:

while exit_code == "No":
    index_of_pixels_with_excess_volume = numpy.where(ds2 > 0.05)
    if not index_of_pixels_with_excess_volume[0].size:
        exit_code = "Yes"
    else:
        for pixel in zip(*index_of_pixels_with_excess_volume):
            ...  # per-pixel update, not shown here

However, I worry that any gain from where over argwhere will be lost in that final for loop over the zipped indices. If that is the case, let me know and I will happily delete this answer.

Answer 2

Make a sample 2D array:

In [584]: arr = np.random.rand(1000,1000)

Find a small number of its elements:

In [587]: np.where(arr>.999)
Out[587]: 
(array([  1,   1,   1, ..., 997, 999, 999], dtype=int32),
 array([273, 471, 584, ..., 745, 310, 679], dtype=int32))
In [588]: _[0].shape
Out[588]: (1034,)

Time the individual pieces of argwhere:

In [589]: timeit arr>.999
2.65 ms ± 116 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
In [590]: timeit np.count_nonzero(arr>.999)
2.79 ms ± 26 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
In [591]: timeit np.nonzero(arr>.999)
6 ms ± 10 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
In [592]: timeit np.argwhere(arr>.999)
6.06 ms ± 58.4 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

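So a bit under half of the time (about 2.65 ms out of 6 ms) goes to the > test, and the rest goes to finding the True elements. Turning the where tuple into a 2-column array is fast: argwhere costs barely more than nonzero.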
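As a small illustration of that breakdown (a sketch, not part of the original timings): np.argwhere on a mask gives the same result as transposing the np.nonzero tuple, and if the loop only needs to know whether any pixel is still above the threshold, np.count_nonzero or mask.any() avoids building the index arrays at all. The array size, seed, and threshold below are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)          # fixed seed, arbitrary
arr = rng.random((200, 200))            # small stand-in for the 1000x1000 array
mask = arr > 0.999                      # the comparison step

# argwhere is the nonzero tuple stacked into an (n, 2) array
rows, cols = np.nonzero(mask)
pairs = np.transpose((rows, cols))
same_as_argwhere = np.array_equal(pairs, np.argwhere(mask))

# cheap checks that do not build any index arrays
n_hits = np.count_nonzero(mask)
any_hit = bool(mask.any())
```

For the loop in the question, this suggests testing mask.any() for the exit condition and only computing indices when there is something left to process.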

Now, if the goal is just to find the first value above the threshold, argmax is fast.

In [593]: np.argmax(arr>.999)
Out[593]: 1273    # can unravel this to (1,273)
In [594]: timeit np.argmax(arr>.999)
2.76 ms ± 143 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

argmax short-circuits on a boolean array, so the actual run time will vary with where the first True value is found.
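One thing to watch with this approach: argmax returns 0 when no element is True, so the result has to be checked before it is trusted. A minimal sketch, where first_index_above is a hypothetical helper name and the demo array is made up:

```python
import numpy as np

def first_index_above(a, threshold):
    """Return (row, col) of the first element > threshold in C order, or None."""
    mask = a > threshold
    flat = int(np.argmax(mask))          # position of the first True in the flattened mask
    if not mask.flat[flat]:              # argmax returns 0 when nothing is True
        return None
    return tuple(int(i) for i in np.unravel_index(flat, a.shape))

demo = np.zeros((3, 4))
demo[1, 2] = 0.9
demo[2, 0] = 0.7
hit = first_index_above(demo, 0.5)       # first value above 0.5
miss = first_index_above(demo, 1.0)      # nothing is above 1.0
```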

flatnonzero is faster than where:

In [595]: np.flatnonzero(arr>.999)
Out[595]: array([  1273,   1471,   1584, ..., 997745, 999310, 999679], dtype=int32)
In [596]: timeit np.flatnonzero(arr>.999)
3.05 ms ± 26.6 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
In [599]: np.unravel_index(np.flatnonzero(arr>.999),arr.shape)
Out[599]: 
(array([  1,   1,   1, ..., 997, 999, 999], dtype=int32),
 array([273, 471, 584, ..., 745, 310, 679], dtype=int32))
In [600]: timeit np.unravel_index(np.flatnonzero(arr>.999),arr.shape)
3.05 ms ± 3.58 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
In [601]: timeit np.transpose(np.unravel_index(np.flatnonzero(arr>.999),arr.shape))
3.1 ms ± 5.86 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

This gives the same result as np.argwhere(arr>.999).

Interestingly, the flatnonzero approach cuts the time in half! I didn't expect such a big improvement.
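The flatnonzero route can be wrapped in a small function. This is a sketch: argwhere_via_flatnonzero is a hypothetical name, and the size of the speedup will likely depend on the NumPy version and hardware, so it is worth re-timing before relying on it.

```python
import numpy as np

def argwhere_via_flatnonzero(mask):
    """Same (n, ndim) index array as np.argwhere(mask), built from flat indices."""
    flat = np.flatnonzero(mask)                       # 1D positions of the True values
    return np.transpose(np.unravel_index(flat, mask.shape))

rng = np.random.default_rng(1)                        # fixed seed, arbitrary
arr = rng.random((300, 300))
mask = arr > 0.999
result = argwhere_via_flatnonzero(mask)
matches = np.array_equal(result, np.argwhere(mask))
```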

Comparing iteration speed:

Iterating over the 2D array from argwhere:

In [607]: pixels = np.argwhere(arr>.999)
In [608]: timeit [pixel for pixel in pixels]
347 µs ± 5.29 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

Iterating over the tuples from where, transposed with zip(*):

In [609]: idx = np.where(arr>.999)
In [610]: timeit [pixel for pixel in zip(*idx)]
256 µs ± 147 ns per loop (mean ± std. dev. of 7 runs, 1000 loops each)

Iterating over an array is usually a little slower than iterating over a list, or in this case over zipped arrays.

In [611]: [pixel for pixel in pixels][:5]
Out[611]: 
[array([  1, 273], dtype=int32),
 array([  1, 471], dtype=int32),
 array([  1, 584], dtype=int32),
 array([  1, 826], dtype=int32),
 array([  2, 169], dtype=int32)]
In [612]: [pixel for pixel in zip(*idx)][:5]
Out[612]: [(1, 273), (1, 471), (1, 584), (1, 826), (2, 169)]

One is a list of arrays, the other a list of tuples. But converting those tuples to arrays one at a time is slow:

In [614]: timeit [np.array(pixel) for pixel in zip(*idx)]
2.26 ms ± 4.94 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

Iterating over the flat nonzero array is faster:

In [617]: fdx = np.flatnonzero(arr>.999)
In [618]: fdx[:5]
Out[618]: array([1273, 1471, 1584, 1826, 2169], dtype=int32)
In [619]: timeit [i for i in fdx]
112 µs ± 23.5 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

But applying unravel to those values one at a time takes time.

def foo(idx):    # a simplified unravel
    return idx//1000, idx%1000

In [628]: timeit [foo(i) for i in fdx]
1.12 ms ± 1.02 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

Add this 1 ms to the 3 ms needed to generate fdx, and the flatnonzero approach may still come out ahead. But at best we are talking about a 2x speedup.
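The per-element cost of foo can probably be avoided by unravelling the whole index array in one call and only then iterating. A sketch under that assumption, using np.divmod in place of the per-element foo (untimed here; the array and seed are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(2)                 # fixed seed, arbitrary
arr = rng.random((1000, 1000))
fdx = np.flatnonzero(arr > 0.999)

# one vectorized call instead of calling foo(i) for every element
rows, cols = np.divmod(fdx, arr.shape[1])

# convert to plain Python ints once, then iterate over a list of tuples
pixels = list(zip(rows.tolist(), cols.tolist()))
```

The resulting pixels list holds the same (row, col) pairs, in the same order, as zip(*np.where(arr > 0.999)).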

Answer 3

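Another solution to my problem, for those who may be interested: I found that "code vectorization" using numpy tricks can cut the run time significantly by eliminating for or while loops. I found these two websites very helpful in explaining code vectorization.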

https://github.com/rougier/from-python-to-numpy/blob/master/04-code-vectorization.rst#id36

https://www.labri.fr/perso/nrougier/teaching/numpy/numpy.html
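To give a flavor of what vectorization could look like for this kind of problem, here is a sketch of one building block: finding, for every cell at once, the lowest elevation among its 8 neighbors by comparing shifted views of a padded array. The actual spreading rule is not given in the question, so this is not the asker's method; lowest_neighbor, the sample dem and ds2 values, and the can_drain mask are all made up for illustration.

```python
import numpy as np

def lowest_neighbor(dem):
    """For every cell, the minimum elevation among its 8 neighbors.

    Edge cells only see the neighbors that exist (the padding is +inf).
    """
    padded = np.pad(dem.astype(float), 1, mode="constant", constant_values=np.inf)
    nrows, ncols = dem.shape
    lowest = np.full(dem.shape, np.inf)
    # 8 whole-array comparisons instead of a Python loop over every pixel
    for dr in (0, 1, 2):
        for dc in (0, 1, 2):
            if dr == 1 and dc == 1:
                continue                      # skip the cell itself
            shifted = padded[dr:dr + nrows, dc:dc + ncols]
            np.minimum(lowest, shifted, out=lowest)
    return lowest

dem = np.array([[5.0, 4.0, 3.0],
                [6.0, 1.0, 2.0],
                [7.0, 8.0, 9.0]])
ds2 = np.array([[0.00, 0.10, 0.00],
                [0.00, 0.20, 0.01],
                [0.06, 0.00, 0.00]])

low = lowest_neighbor(dem)
# pixels that hold excess water AND have a lower neighbor to drain to
can_drain = (ds2 > 0.05) & (low < dem)
```

The remaining loop runs over the 8 neighbor offsets only, so its cost does not grow with the number of pixels.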

Replacing numpy.argwhere to speed up a loop in Python
