CUDA Course Outline
| Day | Description | Duration |
|-----|-------------|----------|
| 1 | Difference between GPGPU and GPU; Introduction to GPU hardware; Introduction to various GPU programming models; Various GPU vendors and their GPUs | 2 hrs |
| 2 | CUDA execution model; Understanding CUDA threads and the thread hierarchy | 2 hrs |
| 3 | Multidimensional mapping of the data space; Warp scheduling and divergence | 2 hrs |
| 4 | Dimensions of grids and blocks: 1D, 2D, 3D; CUDA program execution workflow | 2 hrs |
| 5 | Understanding the role of the CUDA API in the host program; Using the CUDA API to query device information and capabilities; Using the CUDA API to allocate and deallocate device memory | 2 hrs |
| 6 | Using the CUDA API to copy data between host and device; Using the CUDA API to launch kernels and synchronise threads; Using the CUDA API to handle errors and exceptions | 2 hrs |
| 7 | Introduction to the CUDA memory hierarchy; Introduction to the various GPU memories and their scope; Memory access coalescing | 2 hrs |
| 8 | Memory allocation and data transfer on the CPU; Memory allocation and data transfer in CUDA | 2 hrs |
| 9 | Understanding the difference between host and device execution models; Using CUDA threads, blocks, and grids to define the parallelism | 2 hrs |
| 10 | Thread mapping; Using CUDA built-in thread variables, such as threadIdx, blockIdx, blockDim, etc. | 2 hrs |
| 11 | Real-world applications of CUDA programming; Case studies of CUDA-accelerated applications | 2 hrs |
| 12 | Best practices for debugging CUDA code; Profiling CUDA applications for performance optimisation | 2 hrs |
| 13 | Understanding the factors that affect the performance of CUDA programs; Tips and tricks for optimising CUDA applications | 2 hrs |
| 14 | Introduction to NumPy for GPU; Examples in Python | 2 hrs |
| 15 | | 2 hrs |