LaptopVideo2Go Forums






Whether you are exploring mountains of geological data, researching solutions to complex scientific problems, or racing to model fast-moving financial markets, you need a computing platform that delivers the highest throughput and lowest latency possible. GPU-accelerated clusters and workstations are widely recognized for providing the tremendous horsepower required to perform compute-intensive workloads, and your applications can achieve even faster results with NVIDIA GPUDirect™.

Introduced in June 2010, GPUDirect initially accelerated communication with network and storage devices and was supported by InfiniBand solutions from Mellanox and QLogic. It has continued to evolve: in 2011 it added peer-to-peer communication between GPUs and optimized APIs for video solutions. Support for RDMA between GPUs and third-party devices was announced with CUDA 5.0, which will be released in September 2012.

Using GPUDirect, third-party network adapters, solid-state drives (SSDs), and other devices can read and write CUDA host and device memory directly. This eliminates unnecessary system memory copies and CPU overhead, which significantly reduces data transfer times on NVIDIA Tesla™ and Quadro™ products.

For more information, see the GPUDirect Technology Overview presentation.

Key Features:

  • Accelerated Communication With Network And Storage Devices
    Avoid unnecessary system memory copies and CPU overhead by copying data directly to/from pinned CUDA host memory
  • Peer-To-Peer Transfers Between GPUs
    Use high-speed DMA transfers to copy data from one GPU directly to another GPU in the same system
  • Peer-To-Peer Memory Access
    Optimize communication between GPUs using NUMA-style access to memory on other GPUs from within CUDA kernels
  • RDMA
    Eliminate CPU bandwidth and latency bottlenecks using direct memory access (DMA) between GPUs and other PCIe devices, resulting in significantly improved MPI send/receive efficiency between GPUs and other nodes (new in CUDA 5)
  • GPUDirect For Video
    Optimized pipeline for frame-based devices such as frame grabbers, video switchers, HD-SDI capture, and CameraLink devices.
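
As a rough illustration of the peer-to-peer transfer feature listed above, here is a minimal sketch using the CUDA runtime API. It assumes a system with at least two peer-capable GPUs at device indices 0 and 1. The helper name `peer_copy_demo` is made up for this example and is not part of CUDA. On machines without such a pair, the function simply reports that it skipped the copy.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Bail out with -1 on any CUDA error (a sketch: buffers are not freed on this path).
#define CHECK(call) do { if ((call) != cudaSuccess) return -1; } while (0)

// Hypothetical helper: copies n ints from GPU 0 to GPU 1 with cudaMemcpyPeer.
// Returns 1 if the copy was verified, 0 if skipped, -1 on a CUDA error.
int peer_copy_demo(size_t n) {
    int count = 0;
    if (cudaGetDeviceCount(&count) != cudaSuccess || count < 2) return 0;

    // Peer access is directional, so check both directions (assumed devices 0 and 1).
    int can01 = 0, can10 = 0;
    CHECK(cudaDeviceCanAccessPeer(&can01, 0, 1));
    CHECK(cudaDeviceCanAccessPeer(&can10, 1, 0));
    if (!can01 || !can10) return 0;

    const size_t bytes = n * sizeof(int);
    std::vector<int> src(n), dst(n, 0);
    for (size_t i = 0; i < n; ++i) src[i] = static_cast<int>(i);

    int *d0 = nullptr, *d1 = nullptr;

    // Source buffer on GPU 0.
    CHECK(cudaSetDevice(0));
    CHECK(cudaDeviceEnablePeerAccess(1, 0));
    CHECK(cudaMalloc(&d0, bytes));
    CHECK(cudaMemcpy(d0, src.data(), bytes, cudaMemcpyHostToDevice));

    // Destination buffer on GPU 1.
    CHECK(cudaSetDevice(1));
    CHECK(cudaDeviceEnablePeerAccess(0, 0));
    CHECK(cudaMalloc(&d1, bytes));

    // GPU-to-GPU copy; with peer access enabled it need not stage through host memory.
    CHECK(cudaMemcpyPeer(d1, 1, d0, 0, bytes));
    CHECK(cudaMemcpy(dst.data(), d1, bytes, cudaMemcpyDeviceToHost));

    bool ok = (src == dst);

    CHECK(cudaFree(d1));
    CHECK(cudaSetDevice(0));
    CHECK(cudaFree(d0));
    return ok ? 1 : -1;
}

int main() {
    int r = peer_copy_demo(1 << 20);
    if (r == 1)      std::printf("peer copy verified\n");
    else if (r == 0) std::printf("skipped: no peer-capable GPU pair found\n");
    else             std::printf("CUDA error during peer copy\n");
    return r < 0 ? 1 : 0;
}
```

With peer access enabled, a kernel running on one GPU can also dereference pointers allocated on the other. That corresponds to the "Peer-To-Peer Memory Access" item, which this sketch does not exercise.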
