Course Outline

Introduction

  • What is GPU programming?
  • Why use GPU programming?
  • What are the challenges and trade-offs of GPU programming?
  • Which frameworks and tools are available for GPU programming?
  • How do you choose the right framework and tools for your application?

OpenCL

  • What is OpenCL?
  • What are the advantages and disadvantages of OpenCL?
  • Setting up the development environment for OpenCL
  • Creating a basic OpenCL program that performs vector addition
  • Using the OpenCL API to query device information, allocate and deallocate device memory, copy data between host and device, launch kernels, and synchronize threads
  • Writing kernels in OpenCL C that execute on the device and manipulate data
  • Using OpenCL built-in functions, variables, and libraries to perform common tasks and operations
  • Leveraging OpenCL memory spaces, such as global, local, constant, and private, to optimize data transfers and memory accesses
  • Using the OpenCL execution model to control work-items, work-groups, and ND-ranges that define parallelism
  • Debugging and testing OpenCL programs using tools like CodeXL
  • Optimizing OpenCL programs using techniques such as coalescing, caching, prefetching, and profiling
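
As a rough illustration of the execution-model topic above, the following host-only C++ sketch emulates how a 1-D ND-range is split into work-groups and work-items for vector addition. It does not call the OpenCL API; the function names `round_up` and `vector_add_ndrange` are hypothetical, and the inner loop body only stands in for what a kernel would do with the value of `get_global_id(0)`, assuming a zero global offset.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Round n up to the next multiple of local_size. Host code often does this
// when picking a global size that must be divisible by the work-group size.
std::size_t round_up(std::size_t n, std::size_t local_size) {
    assert(local_size > 0);
    return (n + local_size - 1) / local_size * local_size;
}

// Host-side emulation of a 1-D ND-range running a vector-add kernel.
// Each (group, local) pair stands in for one work-item.
std::vector<float> vector_add_ndrange(const std::vector<float>& a,
                                      const std::vector<float>& b,
                                      std::size_t local_size) {
    assert(a.size() == b.size());
    const std::size_t n = a.size();
    std::vector<float> c(n, 0.0f);
    const std::size_t global_size = round_up(n, local_size);
    const std::size_t num_groups = global_size / local_size;

    for (std::size_t group = 0; group < num_groups; ++group) {
        for (std::size_t local = 0; local < local_size; ++local) {
            // Same arithmetic as get_global_id(0) with a zero global offset.
            const std::size_t gid = group * local_size + local;
            if (gid < n) {  // guard the padded tail of the ND-range
                c[gid] = a[gid] + b[gid];
            }
        }
    }
    return c;
}
```

Calling `vector_add_ndrange(a, b, 4)` on five-element inputs emulates a global size of 8 split into two work-groups, with the bounds check skipping the three padded work-items.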

CUDA

  • What is CUDA?
  • What are the advantages and disadvantages of CUDA?
  • Setting up the development environment for CUDA
  • Creating a basic CUDA program that performs vector addition
  • Using the CUDA API to query device information, allocate and deallocate device memory, copy data between host and device, launch kernels, and synchronize threads
  • Writing kernels in CUDA C/C++ that execute on the device and manipulate data
  • Using CUDA built-in functions, variables, and libraries to perform common tasks and operations
  • Leveraging CUDA memory spaces, such as global, shared, constant, and local, to optimize data transfers and memory accesses
  • Using the CUDA execution model to control threads, blocks, and grids that define parallelism
  • Debugging and testing CUDA programs using tools such as CUDA-GDB, CUDA-MEMCHECK, and NVIDIA Nsight
  • Optimizing CUDA programs using techniques such as coalescing, caching, prefetching, and profiling
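
To make the coalescing topic above more concrete, here is a small host-only C++ sketch that counts how many aligned memory segments a group of consecutive threads would touch for a given access stride. It is a simplified model, not a description of any particular GPU: the function name `segments_touched` is hypothetical, the base address is assumed to be segment-aligned, and values such as 32 threads per warp and 128-byte segments are assumptions typical of NVIDIA hardware rather than guarantees.

```cpp
#include <cassert>
#include <cstddef>
#include <set>

// Count the distinct aligned memory segments touched when `threads`
// consecutive threads each read one element at index tid * stride.
// Assumes the array starts at a segment-aligned address (offset 0).
std::size_t segments_touched(std::size_t threads, std::size_t stride,
                             std::size_t element_bytes,
                             std::size_t segment_bytes) {
    assert(segment_bytes > 0);
    std::set<std::size_t> segments;
    for (std::size_t tid = 0; tid < threads; ++tid) {
        const std::size_t byte_address = tid * stride * element_bytes;
        segments.insert(byte_address / segment_bytes);
    }
    return segments.size();
}
```

Under these assumptions, 32 threads reading 4-byte floats with stride 1 fall into a single 128-byte segment, while a stride of 32 spreads the same reads over 32 segments, which is the pattern coalescing-oriented optimizations try to avoid.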

ROCm

  • What is ROCm?
  • What are the advantages and disadvantages of ROCm?
  • Setting up the development environment for ROCm
  • Creating a basic ROCm program that performs vector addition
  • Using the ROCm API to query device information, allocate and deallocate device memory, copy data between host and device, launch kernels, and synchronize threads
  • Writing kernels for ROCm in C/C++ that execute on the device and manipulate data
  • Using ROCm built-in functions, variables, and libraries to perform common tasks and operations
  • Leveraging ROCm memory spaces, such as global, local, constant, and private, to optimize data transfers and memory accesses
  • Using the ROCm execution model to control threads, blocks, and grids that define parallelism
  • Debugging and testing ROCm programs using tools like ROCm Debugger and ROCm Profiler
  • Optimizing ROCm programs using techniques such as coalescing, caching, prefetching, and profiling

HIP

  • What is HIP?
  • What are the advantages and disadvantages of HIP?
  • Setting up the development environment for HIP
  • Creating a basic HIP program that performs vector addition
  • Using the HIP language to write kernels that execute on the device and manipulate data
  • Using HIP built-in functions, variables, and libraries to perform common tasks and operations
  • Leveraging HIP memory spaces, such as global, shared, constant, and local, to optimize data transfers and memory accesses
  • Using the HIP execution model to control threads, blocks, and grids that define parallelism
  • Debugging and testing HIP programs using tools like ROCm Debugger and ROCm Profiler
  • Optimizing HIP programs using techniques such as coalescing, caching, prefetching, and profiling
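
The shared-memory and thread/block topics above are often introduced through a block-level sum reduction. The host-only C++ sketch below emulates that pattern on the CPU; the function name `block_reduce_sum` is hypothetical, the vector stands in for one block's shared-memory buffer, and the block size is assumed to be a power of two. On a device, each pass of the outer loop would end with a block-wide barrier such as `__syncthreads()`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Host-side emulation of a tree reduction over one block's shared buffer.
// `shared` is taken by value so the caller's data is left unchanged.
float block_reduce_sum(std::vector<float> shared) {
    const std::size_t block_size = shared.size();
    // This sketch assumes a non-empty, power-of-two block size.
    assert(block_size > 0 && (block_size & (block_size - 1)) == 0);

    for (std::size_t stride = block_size / 2; stride > 0; stride /= 2) {
        // On a device, every thread with tid < stride does this step at once.
        for (std::size_t tid = 0; tid < stride; ++tid) {
            shared[tid] += shared[tid + stride];
        }
        // A block-wide barrier would go here before the next pass.
    }
    return shared[0];
}
```

Within a pass, each emulated thread reads only from the upper half of the active range and writes only to the lower half, so running the threads one after another gives the same result as running them in parallel.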

Comparison

  • Comparing features, performance, and compatibility of OpenCL, CUDA, ROCm, and HIP
  • Evaluating GPU programs using benchmarks and metrics
  • Learning best practices and tips for GPU programming
  • Exploring current and future trends and challenges in GPU programming

Summary and Next Steps

Requirements

  • Understanding of the C/C++ language and parallel programming concepts
  • Foundational knowledge of computer architecture and memory hierarchy
  • Experience with command-line tools and code editors

Audience

  • Developers seeking to learn the basics of GPU programming and the primary frameworks and tools for building GPU applications
  • Developers aiming to write portable and scalable code compatible with various platforms and devices
  • Programmers interested in exploring the benefits and challenges of GPU programming and optimization

Duration

  • 21 hours
