Course Outline
Introduction
- What is ROCm?
- What is HIP?
- ROCm vs CUDA vs OpenCL
- Overview of ROCm and HIP features and architecture
- Setting up the Development Environment
Getting Started
- Creating a new ROCm project using Visual Studio Code
- Exploring the project structure and files
- Compiling and running the program
- Displaying the output using printf and fprintf
ROCm API
- Understanding the role of the ROCm API in host programs
- Using the ROCm API to query device information and capabilities
- Using the ROCm API to allocate and deallocate device memory
- Using the ROCm API to copy data between host and device
- Using the ROCm API to launch kernels and synchronize the host with the device
- Using the ROCm API to handle errors and exceptions
HIP Language
- Understanding the role of the HIP language in device programs
- Using the HIP language to write kernels that execute on the GPU and manipulate data
- Using HIP data types, qualifiers, operators, and expressions
- Using HIP built-in functions, variables, and libraries for common tasks and operations
ROCm and HIP Memory Model
- Understanding the difference between host and device memory models
- Using ROCm and HIP memory spaces, such as global, shared, constant, and local
- Using ROCm and HIP memory objects, such as pointers, arrays, textures, and surfaces
- Using ROCm and HIP memory access modes, such as read-only, write-only, and read-write
- Understanding ROCm and HIP memory consistency models and synchronization mechanisms
ROCm and HIP Execution Model
- Understanding the difference between host and device execution models
- Using ROCm and HIP threads, blocks, and grids to define parallelism
- Using ROCm and HIP built-in thread index variables, such as threadIdx.x, blockIdx.x, blockDim.x (legacy names hipThreadIdx_x, hipBlockIdx_x, hipBlockDim_x), etc.
- Using ROCm and HIP block functions, such as __syncthreads, __threadfence_block, etc.
- Using ROCm and HIP grid-level features, such as gridDim.x (legacy name hipGridDim_x), grid-wide synchronization through cooperative groups, etc.
Debugging
- Understanding common errors and bugs in ROCm and HIP programs
- Using the Visual Studio Code debugger to inspect variables, breakpoints, call stack, etc.
- Using the ROCm Debugger to debug ROCm and HIP programs on AMD devices
- Using the ROCm Profiler to analyze ROCm and HIP programs on AMD devices
Optimization
- Understanding factors that affect the performance of ROCm and HIP programs
- Using ROCm and HIP coalescing techniques to improve memory throughput
- Using ROCm and HIP caching and prefetching techniques to reduce memory latency
- Using ROCm and HIP shared memory and local memory techniques to optimize memory accesses and bandwidth
- Using ROCm and HIP profiling tools to measure and improve execution time and resource utilization
Summary and Next Steps
Requirements
- Familiarity with C/C++ and parallel programming concepts.
- Basic knowledge of computer architecture and memory hierarchy.
- Experience using command-line tools and code editors.
Audience
- Developers aiming to learn how to use ROCm and HIP for programming AMD GPUs and exploiting their parallelism.
- Developers seeking to write high-performance, scalable code that runs across various AMD devices.
- Programmers interested in exploring the low-level aspects of GPU programming and optimizing code performance.
28 Hours