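The Halide generator converts RGBA to HSV and changes the H value to something else.
The copy to the device takes about 75 ms and the run takes about 430 ms, but the copy from the device back to the host takes 4.3 seconds.
I'm not sure why copying 126 MB from host to device takes so much less time than copying 31 MB from device to host.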
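A quick sanity check of the sizes and rates reported in the log, as a minimal sketch. It assumes the 126 MB figure is the 31 MB uint8 image expanded to one 32-bit float per element (the generated shader declares `float data[]`); that is a reading of the log, not something confirmed against the Halide runtime source.

```python
# Back-of-the-envelope check of the transfer sizes and rates from the log.
WIDTH, HEIGHT, CHANNELS = 4032, 1960, 4   # "width:4032 height:1960", RGBA

host_bytes = WIDTH * HEIGHT * CHANNELS    # uint8 host buffer
# Assumption: the device buffer stores one 32-bit float per element,
# as the shader's `float data[]` declaration suggests.
device_bytes = host_bytes * 4

to_device_ms = 75.63319    # "Time: 7.563319e+01 ms for copy to dev"
to_host_ms = 4382.263      # "Time: 4.382263e+03 ms for copy to host"


def mb_per_s(num_bytes, millis):
    """Effective throughput in MB/s (1 MB = 1e6 bytes)."""
    return num_bytes / 1e6 / (millis / 1e3)


to_device_rate = mb_per_s(device_bytes, to_device_ms)
to_host_rate = mb_per_s(host_bytes, to_host_ms)

print(f"host buffer:    {host_bytes} bytes")
print(f"device buffer:  {device_bytes} bytes")
print(f"copy to device: {to_device_rate:.0f} MB/s")
print(f"copy to host:   {to_host_rate:.1f} MB/s")
print(f"slowdown:       {to_device_rate / to_host_rate:.0f}x")
```

The two byte counts match the `126443520` and `31610880` values in the log. On these numbers the upload runs at roughly 1.7 GB/s while the readback manages only about 7 MB/s.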
Debug log for the problem:
```
2019-07-26 15:12:37.453 5045-5045/com.example.hellohalide D/halide_native: reading bitmap info...
2019-07-26 15:12:37.453 5045-5045/com.example.hellohalide D/halide_native: width:4032 height:1960 stride:16128
2019-07-26 15:12:37.453 5045-5045/com.example.hellohalide E/halide_native: The selected filter is :: Cartoon
2019-07-26 15:12:37.453 5045-5045/com.example.hellohalide D/halide_native: reading bitmap pixels...
2019-07-26 15:12:37.454 5045-5045/com.example.hellohalide I/halide: Entering Pipeline cartoon
2019-07-26 15:12:37.454 5045-5045/com.example.hellohalide I/halide: Target: arm-64-android-debug-openglcompute
2019-07-26 15:12:37.454 5045-5045/com.example.hellohalide I/halide: Input Buffer input8: buffer(0, 0x0, 0x72d4600000, 1, uint8, {0, 4032, 4}, {0, 1960, 16128}, {0, 4, 1})
2019-07-26 15:12:37.454 5045-5045/com.example.hellohalide I/halide: Output Buffer curved: buffer(0, 0x0, 0x72d2600000, 0, uint8, {0, 4032, 4}, {0, 1960, 16128}, {0, 4, 1})
2019-07-26 15:12:37.460 5045-5045/com.example.hellohalide I/halide: Halide running on 0x72e4ccdc2a
2019-07-26 15:12:37.460 5045-5045/com.example.hellohalide I/halide: Compute shader source for: kernel_curved_s0_y_yo___block_id_y
2019-07-26 15:12:37.460 5045-5045/com.example.hellohalide I/halide:
compute shader generated:
'''
#version 310 es
#extension GL_ANDROID_extension_pack_es31a : require
float float_from_bits(int x) { return intBitsToFloat(int(x)); }
layout(location = 0) uniform int _curved_extent_0;
layout(location = 1) uniform int _curved_extent_1;
layout(location = 2) uniform int _curved_min_0;
layout(location = 3) uniform int _curved_min_1;
layout(location = 4) uniform int _curved_stride_1;
layout(location = 5) uniform int _input8_min_0;
layout(location = 6) uniform int _input8_min_1;
layout(location = 7) uniform int _input8_min_2;
layout(location = 8) uniform int _input8_stride_1;
layout(binding=9) buffer buffer9 { float data[]; } _curved;
layout(binding=10) buffer buffer10 { float data[]; } _input8;
void main()
{
int _curved_s0_y_yoXX_block_id_y = int(gl_WorkGroupID.y);
int _curved_s0_x_xoXX_block_id_x = int(gl_WorkGroupID.x);
int XX_thread_id_y = int(gl_LocalInvocationID.y);
int XX_thread_id_x = int(gl_LocalInvocationID.x);
int _0 = _curved_s0_y_yoXX_block_id_y * int(8);
float _1 = float(_0);
int _2 = _
```
Debug logs:
```
2019-07-26 15:12:37.534 5045-5045/com.example.hellohalide I/halide: Active Uniforms: 9
2019-07-26 15:12:37.534 5045-5045/com.example.hellohalide I/halide: Uniform 0 Type: 5124 Name: _input8_stride_1 location: 8
2019-07-26 15:12:37.534 5045-5045/com.example.hellohalide I/halide: Uniform 1 Type: 5124 Name: _input8_min_2 location: 7
2019-07-26 15:12:37.534 5045-5045/com.example.hellohalide I/halide: Uniform 2 Type: 5124 Name: _input8_min_1 location: 6
2019-07-26 15:12:37.534 5045-5045/com.example.hellohalide I/halide: Uniform 3 Type: 5124 Name: _input8_min_0 location: 5
2019-07-26 15:12:37.534 5045-5045/com.example.hellohalide I/halide: Uniform 4 Type: 5124 Name: _curved_stride_1 location: 4
2019-07-26 15:12:37.534 5045-5045/com.example.hellohalide I/halide: Uniform 5 Type: 5124 Name: _curved_min_1 location: 3
2019-07-26 15:12:37.534 5045-5045/com.example.hellohalide I/halide: Uniform 6 Type: 5124 Name: _curved_min_0 location: 2
2019-07-26 15:12:37.534 5045-5045/com.example.hellohalide I/halide: Uniform 7 Type: 5124 Name: _curved_extent_1 location: 1
2019-07-26 15:12:37.534 5045-5045/com.example.hellohalide I/halide: Uniform 8 Type: 5124 Name: _curved_extent_0 location: 0
2019-07-26 15:12:37.534 5045-5045/com.example.hellohalide I/halide: Time: 8.070062e+01 ms
2019-07-26 15:12:37.534 5045-5045/com.example.hellohalide I/halide: halide_copy_to_device validating input buffer: buffer(0, 0x0, 0x72d2600000, 0, uint8, {0, 4032, 4}, {0, 1960, 16128}, {0, 4, 1})
2019-07-26 15:12:37.534 5045-5045/com.example.hellohalide I/halide: halide_device_malloc validating input buffer: buffer(0, 0x0, 0x72d2600000, 0, uint8, {0, 4032, 4}, {0, 1960, 16128}, {0, 4, 1})
2019-07-26 15:12:37.534 5045-5045/com.example.hellohalide I/halide: halide_device_malloc: target device interface 0x72e303d0f8
2019-07-26 15:12:37.534 5045-5045/com.example.hellohalide I/halide: OpenGLCompute: halide_openglcompute_device_malloc (user_context: 0x0, buf: 0x72e303e238)
2019-07-26 15:12:37.534 5045-5045/com.example.hellohalide I/halide: allocating buffer, extents: 4032x1960x4x0 strides: 4x16128x1x0 (type: uint8)
2019-07-26 15:12:37.534 5045-5045/com.example.hellohalide I/halide: openglcompute_device_malloc: initialization completed.
2019-07-26 15:12:37.603 5045-5045/com.example.hellohalide I/halide: Allocated dev_buffer(i.e. vbo) 1
2019-07-26 15:12:37.603 5045-5045/com.example.hellohalide I/halide: Time: 6.894481e+01 ms for malloc
2019-07-26 15:12:37.603 5045-5045/com.example.hellohalide I/halide: halide_copy_to_device validating input buffer: buffer(0, 0x0, 0x72d4600000, 1, uint8, {0, 4032, 4}, {0, 1960, 16128}, {0, 4, 1})
2019-07-26 15:12:37.603 5045-5045/com.example.hellohalide I/halide: halide_device_malloc validating input buffer: buffer(0, 0x0, 0x72d4600000, 1, uint8, {0, 4032, 4}, {0, 1960, 16128}, {0, 4, 1})
2019-07-26 15:12:37.603 5045-5045/com.example.hellohalide I/halide: halide_device_malloc: target device interface 0x72e303d0f8
2019-07-26 15:12:37.603 5045-5045/com.example.hellohalide I/halide: OpenGLCompute: halide_openglcompute_device_malloc (user_context: 0x0, buf: 0x72e303e1c0)
2019-07-26 15:12:37.603 5045-5045/com.example.hellohalide I/halide: allocating buffer, extents: 4032x1960x4x0 strides: 4x16128x1x1 (type: uint8)
2019-07-26 15:12:37.603 5045-5045/com.example.hellohalide I/halide: openglcompute_device_malloc: initialization completed.
2019-07-26 15:12:37.657 5045-5045/com.example.hellohalide I/halide: Allocated dev_buffer(i.e. vbo) 2
2019-07-26 15:12:37.657 5045-5045/com.example.hellohalide I/halide: Time: 5.314331e+01 ms for malloc
2019-07-26 15:12:37.657 5045-5045/com.example.hellohalide I/halide: halide_copy_to_device 0x72e303e1c0 host is dirty
2019-07-26 15:12:37.657 5045-5045/com.example.hellohalide I/halide: OGLC: halide_openglcompute_copy_to_device (user_context: 0x0, buf: 0x72e303e1c0, the_buffer:2)
2019-07-26 15:12:37.657 5045-5045/com.example.hellohalide I/halide: Calling global_state.MapBufferRange(GL_ARRAY_BUFFER, 0, 126443520, GL_MAP_READ_BIT|GL_MAP_WRITE_BIT)
2019-07-26 15:12:37.657 5045-5045/com.example.hellohalide I/halide: c.extent[0] = 4032
2019-07-26 15:12:37.657 5045-5045/com.example.hellohalide I/halide: c.extent[1] = 1960
2019-07-26 15:12:37.657 5045-5045/com.example.hellohalide I/halide: c.extent[0] = 4
2019-07-26 15:12:37.732 5045-5045/com.example.hellohalide I/halide: copied 126443520 bytes from 0x72d4600000 to the device.
2019-07-26 15:12:37.732 5045-5045/com.example.hellohalide I/halide: Time: 7.563319e+01 ms for copy to dev
2019-07-26 15:12:37.732 5045-5045/com.example.hellohalide I/halide: OpenGLCompute: halide_openglcompute_run (user_context: 0x0, entry: kernel_curved_s0_y_yo___block_id_y, blocks: 504x245x1, threads: 8x8x1, shmem: 0, num_attributes: 0, num_coords_dim0: 0, num_coords_dim1: 0
2019-07-26 15:12:37.732 5045-5045/com.example.hellohalide I/halide: args 0 int32 [0x7a800000fc0 ...] 0
2019-07-26 15:12:37.732 5045-5045/com.example.hellohalide I/halide: args 1 int32 [0x7a8 ...] 0
2019-07-26 15:12:37.732 5045-5045/com.example.hellohalide I/halide: args 2 int32 [0x0 ...] 0
2019-07-26 15:12:37.732 5045-5045/com.example.hellohalide I/halide: args 3 int32 [0x3f0000000000 ...] 0
2019-07-26 15:12:37.732 5045-5045/com.example.hellohalide I/halide: args 4 int32 [0x3f00 ...] 0
2019-07-26 15:12:37.732 5045-5045/com.example.hellohalide I/halide: args 5 int32 [0x0 ...] 0
2019-07-26 15:12:37.732 5045-5045/com.example.hellohalide I/halide: args 6 int32 [0x0 ...] 0
2019-07-26 15:12:37.732 5045-5045/com.example.hellohalide I/halide: args 7 int32 [0x3f0000000000 ...] 0
2019-07-26 15:12:37.732 5045-5045/com.example.hellohalide I/halide: args 8 int32 [0xf705181000003f00 ...] 0
2019-07-26 15:12:37.732 5045-5045/com.example.hellohalide I/halide: args 9 uint8 [0x1 ...] 1
2019-07-26 15:12:37.732 5045-5045/com.example.hellohalide I/halide: args 10 uint8 [0x2 ...] 1
2019-07-26 15:12:38.165 5045-5045/com.example.hellohalide I/halide: Time: 4.323138e+02 ms for run
2019-07-26 15:12:38.165 5045-5045/com.example.hellohalide I/halide: Exiting Pipeline cartoon
```
copy_to_host takes 4.3 seconds:
```
2019-07-26 15:12:38.165 5045-5045/com.example.hellohalide I/halide: halide_copy_to_host validating input buffer: buffer(1, 0x72e303d0f8, 0x72d2600000, 2, uint8, {0, 4032, 4}, {0, 1960, 16128}, {0, 4, 1})
2019-07-26 15:12:38.165 5045-5045/com.example.hellohalide I/halide: copy_to_host_already_locked 0x72e303e238 dev_dirty is true
2019-07-26 15:12:38.165 5045-5045/com.example.hellohalide I/halide: OGLC: halide_openglcompute_copy_to_host (user_context: 0x0, buf: 0x72e303e238, the_buffer:1, size=31610880)
2019-07-26 15:12:38.188 5045-5045/com.example.hellohalide I/halide: c.extent[0] = 4032
2019-07-26 15:12:38.188 5045-5045/com.example.hellohalide I/halide: c.extent[1] = 1960
2019-07-26 15:12:38.188 5045-5045/com.example.hellohalide I/halide: c.extent[0] = 4
2019-07-26 15:12:42.547 5045-5045/com.example.hellohalide I/halide: copied 31610880 bytes to the host.
2019-07-26 15:12:42.547 5045-5045/com.example.hellohalide I/halide: Time: 4.382263e+03 ms for copy to host
2019-07-26 15:12:42.547 5045-5045/com.example.hellohalide D/halide_native: Time taken: 5093477 (0)
```
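To see where the time goes, the `Time: ... ms` lines from the log above can be summed per stage. This is a rough sketch: the `LOG` string is those six lines with timestamps trimmed, the `(unlabeled)` tag for the first line is a placeholder of mine, and reading `Time taken: 5093477` as microseconds is an assumption.

```python
import re

# Timing lines copied from the log above (timestamps and package tags trimmed).
LOG = """\
I/halide: Time: 8.070062e+01 ms
I/halide: Time: 6.894481e+01 ms for malloc
I/halide: Time: 5.314331e+01 ms for malloc
I/halide: Time: 7.563319e+01 ms for copy to dev
I/halide: Time: 4.323138e+02 ms for run
I/halide: Time: 4.382263e+03 ms for copy to host
"""

# Captures the number and, if present, the stage label after "for".
TIME_RE = re.compile(r"Time: ([0-9.e+-]+) ms(?: for (.+))?")


def stage_times(log):
    """Return (label, milliseconds) for every 'Time: ... ms' line."""
    stages = []
    for line in log.splitlines():
        match = TIME_RE.search(line)
        if match:
            label = match.group(2) or "(unlabeled)"
            stages.append((label, float(match.group(1))))
    return stages


stages = stage_times(LOG)
total_ms = sum(ms for _, ms in stages)
for label, ms in stages:
    print(f"{label:>14}: {ms:9.2f} ms  {100 * ms / total_ms:5.1f}%")
print(f"{'total':>14}: {total_ms:9.2f} ms")
```

The stages add up to about 5093 ms, which agrees with the final `Time taken: 5093477` if that value is in microseconds, and the copy to host accounts for roughly 86% of it.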