Question

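I am interested in collecting information about L1 cache accesses and misses on my Maxwell card. However, I noticed that nvprof does not list any metrics or events related to the L1 cache, and according to the documentation there are indeed no longer any L1-cache-related metrics for compute capability 5.x.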

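I would like to know whether there is an indirect way to retrieve these metrics, or whether they will be exposed at some point in the near future.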

My idea is to simply use subtraction to obtain the number of L1 cache misses:

# L1 misses = (# of L2 accesses) - (# of L2 accesses from texture cache)

However, this approach may not be 100% accurate.
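
The subtraction above can be written as a small host-side helper. This is only a sketch: `estimate_l1_misses` and its two parameters are hypothetical names, and the post does not say which profiler counters would supply the two totals, so the inputs are treated as already-collected numbers.

```cpp
#include <cassert>
#include <cstdint>

// Estimate L1 misses as
//   (# of L2 accesses) - (# of L2 accesses from texture cache).
// Both arguments are hypothetical counter totals; which profiler events
// would provide them is an assumption left to the caller.
std::uint64_t estimate_l1_misses(std::uint64_t l2_accesses_total,
                                 std::uint64_t l2_accesses_from_tex)
{
    // The texture-originated share cannot exceed the total number of accesses.
    assert(l2_accesses_from_tex <= l2_accesses_total);
    return l2_accesses_total - l2_accesses_from_tex;
}
```

With made-up totals, `estimate_l1_misses(40000, 25000)` evaluates to 15000. The result is only as trustworthy as the assumption that every L2 access not coming from the texture cache is an L1 miss.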

What I am most interested in is retrieving the global L1 cache hit rate.

[sj755@localhost vectorAdd]$ nvprof --metrics tex_cache_hit_rate --events tex0_cache_sector_misses,tex1_cache_sector_misses,tex0_cache_sector_queries,tex1_cache_sector_queries ./vectorAdd
[Vector addition of 50000 elements]
==2450== NVPROF is profiling process 2450, command: ./vectorAdd
Copy input data from the host memory to the CUDA device
CUDA kernel launch with 196 blocks of 256 threads
==2450== Warning: Some kernel(s) will be replayed on device 0 in order to collect all events/metrics.
Copy output data from the CUDA device to the host memory
Test PASSED
Done
==2450== Profiling application: ./vectorAdd
==2450== Profiling result:
==2450== Event result:
Invocations                                Event Name         Min         Max         Avg
Device "GeForce GTX 970 (0)"
    Kernel: vectorAdd(float const *, float const *, float*, int)
          1                 tex0_cache_sector_queries       18732       18732       18732
          1                 tex1_cache_sector_queries       18768       18768       18768
          1                  tex0_cache_sector_misses       12504       12504       12504
          1                  tex1_cache_sector_misses       12496       12496       12496

==2450== Metric result:
Invocations                               Metric Name                        Metric Description         Min         Max         Avg
Device "GeForce GTX 970 (0)"
    Kernel: vectorAdd(float const *, float const *, float*, int)
          1                        tex_cache_hit_rate                    Texture Cache Hit Rate      50.00%      50.00%      50.00%
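
As a rough cross-check on the event counts above, a naive hit rate can be computed as (queries - misses) / queries. This is a sketch under the assumption that queries and misses count the same unit; `naive_hit_rate` is a hypothetical helper and is not how nvprof defines `tex_cache_hit_rate`.

```cpp
#include <cassert>
#include <cstdint>

// Naive hit rate in [0, 1]: (queries - misses) / queries.
// Assumption: both counters are measured in the same unit.
// This is NOT nvprof's definition of tex_cache_hit_rate.
double naive_hit_rate(std::uint64_t queries, std::uint64_t misses)
{
    // More misses than queries would mean the counters are not comparable.
    assert(misses <= queries);
    if (queries == 0) {
        return 0.0;  // no queries recorded, so report 0 rather than divide by zero
    }
    return static_cast<double>(queries - misses) / static_cast<double>(queries);
}
```

Summing both texture units from the output above gives `naive_hit_rate(18732 + 18768, 12504 + 12496)`, which is about 0.333. That differs from the 50.00% reported by the metric, so the metric is evidently not this simple ratio of the two event totals.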

__global__ void
vectorAdd(const float *A, const float *B, float *C, int numElements)
{
    int i = blockDim.x * blockIdx.x + threadIdx.x;

    if (i < numElements)
    {
        C[i] = A[i] + B[i];
    }
}

Retrieving L1 cache metrics and events on Maxwell

0 answers: