批改娘 (Judge Girl) 10098. Print Device Information (CUDA)

Contents

  1. Problem
  2. Sample Input
  3. Sample Output
  4. Compilation Flags
  5. Notes
  6. Solution

Problem

Use CUDA to print device information. Refer to the course lecture notes.

Sample Input

no input

Sample Output

3 devices found supporting CUDA
----------------------------------
Device GeForce GTX 980 Ti
----------------------------------
Device memory: 6442254336
Memory per-block: 49152
Register per-block: 65536
Warp size: 32
Memory pitch: 2147483647
Constant Memory: 65536
Max thread per-block: 1024
Max thread dim: 1024 / 1024 / 64
Max grid size: 2147483647 / 65535 / 65535
Ver: 5.2
Clock: 1190000
Texture Alignment: 512
----------------------------------
Device GeForce GTX 970
----------------------------------
Device memory: 4294770688
Memory per-block: 49152
Register per-block: 65536
Warp size: 32
Memory pitch: 2147483647
Constant Memory: 65536
Max thread per-block: 1024
Max thread dim: 1024 / 1024 / 64
Max grid size: 2147483647 / 65535 / 65535
Ver: 5.2
Clock: 1228000
Texture Alignment: 512
----------------------------------
Device GeForce GTX 770
----------------------------------
Device memory: 2147287040
Memory per-block: 49152
Register per-block: 65536
Warp size: 32
Memory pitch: 2147483647
Constant Memory: 65536
Max thread per-block: 1024
Max thread dim: 1024 / 1024 / 64
Max grid size: 2147483647 / 65535 / 65535
Ver: 3.0
Clock: 1137000
Texture Alignment: 512

Compilation Flags

$ nvcc hello.cu -o hello
$ ./hello

Notes

See the solution page for the exact output format.

Solution

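Just in case, the code handles the situation where no device is found; this is fairly common when the driver version does not match. After that, it simply prints the fields stored in cudaDeviceProp. The %zu conversion is used for fields of type size_t.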

#include <stdio.h>
#include <cuda_runtime.h>

const char splitLine[] = "----------------------------------";

// Print the properties of one device in the format the judge expects.
void output(const cudaDeviceProp &devInfo) {
    puts(splitLine);
    printf("Device %s\n", devInfo.name);
    puts(splitLine);
    // size_t fields use %zu; int fields use %d.
    printf(" Device memory: \t%zu\n", devInfo.totalGlobalMem);
    printf(" Memory per-block: \t%zu\n", devInfo.sharedMemPerBlock);
    printf(" Register per-block: \t%d\n", devInfo.regsPerBlock);
    printf(" Warp size: \t\t%d\n", devInfo.warpSize);
    printf(" Memory pitch: \t\t%zu\n", devInfo.memPitch);
    printf(" Constant Memory: \t%zu\n", devInfo.totalConstMem);
    printf(" Max thread per-block: \t%d\n", devInfo.maxThreadsPerBlock);
    printf(" Max thread dim: \t%d / %d / %d\n",
           devInfo.maxThreadsDim[0], devInfo.maxThreadsDim[1], devInfo.maxThreadsDim[2]);
    printf(" Max grid size: \t%d / %d / %d\n",
           devInfo.maxGridSize[0], devInfo.maxGridSize[1], devInfo.maxGridSize[2]);
    printf(" Ver: \t\t\t%d.%d\n", devInfo.major, devInfo.minor);
    printf(" Clock: \t\t%d\n", devInfo.clockRate);
    printf(" Texture Alignment: \t%zu\n", devInfo.textureAlignment);
}

int main() {
    int cudaDeviceCnt = 0;
    // Treat any failure (for example, a driver mismatch) as "no device".
    if (cudaGetDeviceCount(&cudaDeviceCnt) != cudaSuccess)
        cudaDeviceCnt = 0;
    printf("%d devices found supporting CUDA\n", cudaDeviceCnt);

    if (cudaDeviceCnt == 0) {
        printf("No supported GPU\n");
        return 0;
    }

    // Query and print each device in index order.
    for (int i = 0; i < cudaDeviceCnt; i++) {
        cudaDeviceProp devInfo;
        cudaGetDeviceProperties(&devInfo, i);
        output(devInfo);
    }
    return 0;
}
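
The note about %zu can be checked without a GPU. The following is a minimal sketch in plain C, not part of the original solution: format_bytes is a hypothetical helper introduced only for illustration, and the byte count used in the usage note is the GTX 970 figure from the sample output. It formats a size_t with %zu, the conversion that C99 defines for that type.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper (an assumption for illustration, not from the
   original solution): writes "Device memory: <bytes>" into buf using
   %zu, the printf conversion defined for size_t.
   Returns the number of characters snprintf would have written. */
int format_bytes(char *buf, size_t cap, size_t bytes) {
    return snprintf(buf, cap, "Device memory: %zu", bytes);
}
```

Calling format_bytes(buf, sizeof buf, (size_t)4294770688ULL) with a 64-byte buffer fills it with "Device memory: 4294770688". That value is larger than INT_MAX on platforms with a 32-bit int, which is why printing such a field with %d would be the wrong choice.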