C++ AMP
Accelerated Massive Parallelism with Microsoft Visual C++
Kate Gregory and
Ade Miller
Capitalize on the faster GPU processors in today’s computers with the C++ AMP code library—and bring massive parallelism to your project. With this practical book, experienced C++ developers will learn parallel programming fundamentals with C++
AMP through detailed examples, code snippets, and case studies. Learn the advantages of parallelism and get best practices for harnessing this library in your applications.
Discover how to:
- Gain greater code performance using graphics processing units (GPUs)
- Choose accelerators that enable you to write code for GPUs
- Apply thread tiles, tile barriers, and tile static memory
- Debug C++ AMP code with Microsoft Visual Studio®
- Use profiling tools to track the performance of your code
Kate Gregory maintains the book's homepage, which contains updates, links and news of speaking engagements.
Get the Book
The book is now available for purchase, online and in good bookstores! You can read preview material on both Amazon.com (paper and Kindle versions) and the O’Reilly web site (DRM-free eBook, PDF and paper versions). You can also read it through Safari Books Online. The list prices are: print $36.99, eBook $29.99, both $40.69.
If you like the book and want to write a review of it then Amazon.com is the place to do that.
Download the case studies and sample code for each chapter
The N-body case study shows how to use C++ AMP to get the most out of your GPU hardware in a computational application. It contains several implementations of the classic n-body problem that models particles moving under the influence of gravity.
The code has implementations for simple and tiled C++ AMP kernels as well as an implementation that runs on more than one GPU. The accompanying CPU based sample also includes single and multi-core implementations of the same algorithm. The case study also
shows the use of inter-op with DirectX to minimize the overhead of displaying your application’s results.
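For orientation, the all-pairs calculation at the heart of the n-body problem can be sketched in standard, single-threaded C++. This is an assumed minimal form, not the case study's code: the Particle struct, the nbody_step function, the softening term and the default gravitational constant are all illustrative choices.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical particle type for illustration; the case study's own types may differ.
struct Particle {
    float x, y, z;     // position
    float vx, vy, vz;  // velocity
    float mass;
};

// Advance every particle by one time step `dt` using all-pairs gravity.
// `softening` keeps the distance term away from zero for very close particles.
void nbody_step(std::vector<Particle>& particles, float dt,
                float g = 1.0f, float softening = 1e-3f) {
    assert(dt > 0.0f);
    const std::size_t n = particles.size();
    std::vector<float> ax(n, 0.0f), ay(n, 0.0f), az(n, 0.0f);

    // O(n^2) pass: accumulate the acceleration on particle i from every other particle j.
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            if (i == j) continue;
            const float dx = particles[j].x - particles[i].x;
            const float dy = particles[j].y - particles[i].y;
            const float dz = particles[j].z - particles[i].z;
            const float r2 = dx * dx + dy * dy + dz * dz + softening * softening;
            const float inv_r = 1.0f / std::sqrt(r2);
            const float s = g * particles[j].mass * inv_r * inv_r * inv_r;
            ax[i] += s * dx;
            ay[i] += s * dy;
            az[i] += s * dz;
        }
    }

    // Update velocities first, then positions (semi-implicit Euler).
    for (std::size_t i = 0; i < n; ++i) {
        Particle& p = particles[i];
        p.vx += ax[i] * dt;
        p.vy += ay[i] * dt;
        p.vz += az[i] * dt;
        p.x += p.vx * dt;
        p.y += p.vy * dt;
        p.z += p.vz * dt;
    }
}
```

The outer loop over i is the part that a data-parallel version would spread across many threads, because each particle's acceleration can be computed independently of the others.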
The Cartoonizer case study demonstrates braided parallelism, using both the available cores on the CPU and any available GPU(s). It implements color simplification and edge detection algorithms using C++ AMP and orchestrates the processing
of images using the
Parallel Patterns Library and
Asynchronous Agents Library. Single accelerator implementations of simple, tiled and texture based algorithms are all shown. In addition, the case study also shows two approaches for dividing the cartoonizing workload up across more than one accelerator,
either by splitting images into subsections or forking the pipeline and processing images on separate accelerators before multiplexing them back into the correct sequence.
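The "multiplexing back into the correct sequence" step of the forked pipeline can be pictured as a small reorder buffer. The sketch below is standard C++ and is only an assumption about how such a stage might look; the Frame and Multiplexer names, the string payload and the sequence numbering from zero are hypothetical and are not taken from the case study, which builds its pipeline on the Asynchronous Agents Library.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical frame: a sequence number plus a payload standing in for image data.
struct Frame {
    std::size_t sequence;
    std::string payload;
};

// Collects frames that finish out of order and releases them in sequence order.
class Multiplexer {
public:
    // Accept one finished frame and return every frame that is now ready, in order.
    std::vector<Frame> push(Frame frame) {
        assert(frame.sequence >= next_);  // frames already released must not reappear
        const std::size_t seq = frame.sequence;
        pending_.emplace(seq, std::move(frame));

        std::vector<Frame> ready;
        for (auto it = pending_.find(next_); it != pending_.end();
             it = pending_.find(next_)) {
            ready.push_back(std::move(it->second));
            pending_.erase(it);
            ++next_;
        }
        return ready;
    }

private:
    std::size_t next_ = 0;                  // next sequence number to release
    std::map<std::size_t, Frame> pending_;  // frames waiting for earlier ones
};
```

A frame processed on a faster accelerator simply waits in the buffer until all earlier frames have arrived.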
The Reduction case study shows twelve different implementations of the reduce algorithm. Each implementation shows different approaches and the book discusses their performance characteristics and the trade-offs associated with each implementation. Reduction is an important data parallel operation so it is worth considering its implementation in some detail.
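To make the idea concrete, here is a minimal sketch of one tiled, tree-style reduction, written as plain sequential C++ rather than C++ AMP so that it builds with any compiler. It is not one of the book's twelve implementations; the function name tiled_reduce, the int element type and the scratch buffer (which stands in for tile static memory) are assumptions for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sum `data` by reducing fixed-size tiles with a stride-halving tree, then
// repeating on the per-tile partial sums until a single value remains.
// On a GPU each tile would be handled by a group of threads; here the loops
// simply run one after another.
int tiled_reduce(std::vector<int> data, std::size_t tile_size) {
    // The stride-halving loop below assumes a power-of-two tile of at least 2.
    assert(tile_size >= 2 && (tile_size & (tile_size - 1)) == 0);
    if (data.empty()) return 0;

    while (data.size() > 1) {
        const std::size_t tiles = (data.size() + tile_size - 1) / tile_size;
        std::vector<int> partial(tiles, 0);

        for (std::size_t t = 0; t < tiles; ++t) {
            // Copy the tile into scratch storage, padding a short last tile with zeros.
            std::vector<int> scratch(tile_size, 0);
            const std::size_t begin = t * tile_size;
            for (std::size_t k = 0; k < tile_size && begin + k < data.size(); ++k) {
                scratch[k] = data[begin + k];
            }

            // Tree reduction: halve the number of active elements each pass.
            for (std::size_t stride = tile_size / 2; stride > 0; stride /= 2) {
                for (std::size_t i = 0; i < stride; ++i) {
                    scratch[i] += scratch[i + stride];
                }
                // A parallel kernel would need a tile barrier at this point so
                // that every thread sees the sums written during this pass.
            }
            partial[t] = scratch[0];
        }
        data.swap(partial);
    }
    return data[0];
}
```

For example, calling tiled_reduce on the integers 1 to 100 with a tile size of 8 first produces 13 partial sums, then 2, then the final total.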
Finally, all the code samples associated with the other chapters in the book can also be found here.
System Requirements
You will need at least Visual Studio 2012 Professional to run the samples. However, you will need Visual Studio 2012 Ultimate to use some of the parallel diagnostic tools such as the Concurrency Visualizer. If you are using Visual Studio 2013 then you will need to download and install the Concurrency Visualizer extension, as this no longer ships as part of Visual Studio. The DirectX SDK (June 2010) is also required to build the N-body case study and Chapter 11 samples.
You can also use Visual Studio Express 2012 for Windows Desktop to build and run the sample projects and case studies. The sample projects are classic style native applications so will not load in Visual Studio Express 2012 for Windows 8.
Note: Debugging and WARP accelerator support is now available on Windows 7 with the Platform Update for Windows 7. In addition NVidia now supports hardware debugging. The following blog posts outline how to use these:
- C++ AMP CPU fallback support now available on Windows 7
- C++ AMP GPU debugging now available on Windows 7
- Remote GPU Debugging on NVidia Hardware
Demos and Talks
Ade Miller gave a talk at the NVidia GPU Technology Conference; you can find the slide deck for this talk on his blog. If you attended GTC and missed the talk, a video of it is available on the GTC web site.
- S3317 - An Overview of Accelerated Parallelism with C++ AMP
The source code for the examples in the talk is checked in to the Extras folder in the source tree.
Don McCrady (Development Lead for C++ AMP) and Jim Radigan (Architect on Windows C++) gave an excellent talk on performance programming with C++ and C++ AMP at //BUILD 2012. The C++ AMP material is towards the end.
Kate Gregory spoke at TechEd 2012. You can watch the talk here:
Daniel Moth gave two talks on C++ AMP at the GPU Technology Conference 2012. These included a demonstration of the Cartoonizer case study.
There is also a shorter Channel 9 video, also by Daniel, that walks through the Cartoonizer application:
The C++ AMP team blogs frequently on the Parallel Programming in Native Code blog. Check there for new updates about C++ AMP and parallel programming in general. They also have a collection of samples that are very helpful for understanding how to implement solutions to specific problems in C++ AMP:
The Parallel Computing in C++ and Native Code forum on MSDN is a good place to ask questions about C++ AMP.