Ginkgo  Generated from pipelines/1556235455 branch based on develop. Ginkgo version 1.9.0
A numerical linear algebra library targeting many-core architectures
gko::CudaExecutor Class Reference

This is the Executor subclass which represents the CUDA device.

#include <ginkgo/core/base/executor.hpp>

Public Member Functions

std::shared_ptr< Executor > get_master () noexcept override
 Returns the master OmpExecutor of this Executor.
 
std::shared_ptr< const Executor > get_master () const noexcept override
 Returns the master OmpExecutor of this Executor.
 
void synchronize () const override
 Synchronize the operations launched on the executor with its master.
 
scoped_device_id_guard get_scoped_device_id_guard () const override
 
std::string get_description () const override
 
int get_device_id () const noexcept
 Get the CUDA device id of the device associated with this executor.
 
int get_num_warps_per_sm () const noexcept
 Get the number of warps per SM of this executor.
 
int get_num_multiprocessor () const noexcept
 Get the number of multiprocessors of this executor.
 
int get_num_warps () const noexcept
 Get the number of warps of this executor.
 
int get_warp_size () const noexcept
 Get the warp size of this executor.
 
int get_major_version () const noexcept
 Get the major version of compute capability.
 
int get_minor_version () const noexcept
 Get the minor version of compute capability.
 
cublasContext * get_cublas_handle () const
 Get the cublas handle for this executor.
 
cublasContext * get_blas_handle () const
 Get the cublas handle for this executor.
 
cusparseContext * get_cusparse_handle () const
 Get the cusparse handle for this executor.
 
cusparseContext * get_sparselib_handle () const
 Get the cusparse handle for this executor.
 
std::vector< int > get_closest_pus () const
 Get the closest PUs.
 
int get_closest_numa () const
 Get the closest NUMA node.
 
CUstream_st * get_stream () const
 Returns the CUDA stream used by this executor.
 
virtual void run (const Operation &op) const=0
 Runs the specified Operation using this Executor.
 
template<typename ClosureOmp , typename ClosureCuda , typename ClosureHip , typename ClosureDpcpp >
void run (const ClosureOmp &op_omp, const ClosureCuda &op_cuda, const ClosureHip &op_hip, const ClosureDpcpp &op_dpcpp) const
 Runs one of the passed in functors, depending on the Executor type.
 
template<typename ClosureReference , typename ClosureOmp , typename ClosureCuda , typename ClosureHip , typename ClosureDpcpp >
void run (std::string name, const ClosureReference &op_ref, const ClosureOmp &op_omp, const ClosureCuda &op_cuda, const ClosureHip &op_hip, const ClosureDpcpp &op_dpcpp) const
 Runs one of the passed in functors, depending on the Executor type.
 

Static Public Member Functions

static std::shared_ptr< CudaExecutor > create (int device_id, std::shared_ptr< Executor > master, bool device_reset, allocation_mode alloc_mode=default_cuda_alloc_mode, CUstream_st *stream=nullptr)
 Creates a new CudaExecutor.
 
static std::shared_ptr< CudaExecutor > create (int device_id, std::shared_ptr< Executor > master, std::shared_ptr< CudaAllocatorBase > alloc=std::make_shared< CudaAllocator >(), CUstream_st *stream=nullptr)
 Creates a new CudaExecutor with a custom allocator and device stream.
 
static int get_num_devices ()
 Get the number of devices present on the system.
 

Detailed Description

This is the Executor subclass which represents the CUDA device.

Member Function Documentation

◆ create() [1/2]

static std::shared_ptr<CudaExecutor> gko::CudaExecutor::create ( int  device_id,
std::shared_ptr< Executor >  master,
bool  device_reset,
allocation_mode  alloc_mode = default_cuda_alloc_mode,
CUstream_st *  stream = nullptr 
)
static

Creates a new CudaExecutor.

Parameters
device_id     the CUDA device id of this device
master        an executor on the host that is used to invoke the device kernels
device_reset  this option no longer has any effect
alloc_mode    the allocation mode that the executor should operate on; see allocation_mode for more details
stream        the stream to execute operations on

◆ create() [2/2]

static std::shared_ptr<CudaExecutor> gko::CudaExecutor::create ( int  device_id,
std::shared_ptr< Executor >  master,
std::shared_ptr< CudaAllocatorBase >  alloc = std::make_shared< CudaAllocator >(),
CUstream_st *  stream = nullptr 
)
static

Creates a new CudaExecutor with a custom allocator and device stream.

Parameters
device_id  the CUDA device id of this device
master     an executor on the host that is used to invoke the device kernels
alloc      the allocator to use for device memory allocations
stream     the stream to execute operations on
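
Because `alloc` and `stream` have default values, this overload can be called with only a device id and a host executor. The following is a minimal sketch of guarding that call against an out-of-range device id. `create_if_available` is a hypothetical helper, not part of Ginkgo, and it assumes that valid ids run from 0 to `get_num_devices() - 1`. The executor types are template parameters so that the sketch compiles without the Ginkgo headers; they are only assumed to provide the static functions listed on this page.

```cpp
#include <memory>

// Hypothetical helper (not a Ginkgo function): returns a device executor for
// `device_id`, or nullptr when the id is out of range.
// Assumptions: CudaExec provides static get_num_devices() and
// create(int, std::shared_ptr<...>); HostExec provides static create().
template <typename CudaExec, typename HostExec>
std::shared_ptr<CudaExec> create_if_available(int device_id)
{
    const int num_devices = CudaExec::get_num_devices();
    if (device_id < 0 || device_id >= num_devices) {
        return nullptr;
    }
    // Two-argument call: the allocator and the stream keep their defaults.
    return CudaExec::create(device_id, HostExec::create());
}
```

With Ginkgo available, the helper would presumably be instantiated as `create_if_available<gko::CudaExecutor, gko::OmpExecutor>(0)`, with the returned pointer checked before use.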

◆ get_blas_handle()

cublasContext* gko::CudaExecutor::get_blas_handle ( ) const
inline

Get the cublas handle for this executor.

Returns
the cublas handle (cublasContext*) for this executor

◆ get_closest_numa()

int gko::CudaExecutor::get_closest_numa ( ) const
inline

Get the closest NUMA node.

Returns
the NUMA node closest to this device

◆ get_closest_pus()

std::vector<int> gko::CudaExecutor::get_closest_pus ( ) const
inline

Get the closest PUs.

Returns
the array of PUs closest to this device

◆ get_cublas_handle()

cublasContext* gko::CudaExecutor::get_cublas_handle ( ) const
inline

Get the cublas handle for this executor.

Returns
the cublas handle (cublasContext*) for this executor

◆ get_cusparse_handle()

cusparseContext* gko::CudaExecutor::get_cusparse_handle ( ) const
inline

Get the cusparse handle for this executor.

Returns
the cusparse handle (cusparseContext*) for this executor

◆ get_description()

std::string gko::CudaExecutor::get_description ( ) const
override virtual

Returns
a textual representation of the executor and its device.

Implements gko::Executor.

◆ get_master() [1/2]

std::shared_ptr<const Executor> gko::CudaExecutor::get_master ( ) const
override virtual noexcept

Returns the master OmpExecutor of this Executor.

Returns
the master OmpExecutor of this Executor.

Implements gko::Executor.

◆ get_master() [2/2]

std::shared_ptr<Executor> gko::CudaExecutor::get_master ( )
override virtual noexcept

Returns the master OmpExecutor of this Executor.

Returns
the master OmpExecutor of this Executor.

Implements gko::Executor.

◆ get_sparselib_handle()

cusparseContext* gko::CudaExecutor::get_sparselib_handle ( ) const
inline

Get the cusparse handle for this executor.

Returns
the cusparse handle (cusparseContext*) for this executor

◆ get_stream()

CUstream_st* gko::CudaExecutor::get_stream ( ) const
inline

Returns the CUDA stream used by this executor.

Can be nullptr for the default stream.

Returns
the stream used to execute kernels and memory operations.

◆ run() [1/3]

template<typename ClosureOmp , typename ClosureCuda , typename ClosureHip , typename ClosureDpcpp >
void gko::Executor::run ( const ClosureOmp &  op_omp,
const ClosureCuda &  op_cuda,
const ClosureHip &  op_hip,
const ClosureDpcpp &  op_dpcpp 
) const
inline

Runs one of the passed in functors, depending on the Executor type.

Template Parameters
ClosureOmp    type of op_omp
ClosureCuda   type of op_cuda
ClosureHip    type of op_hip
ClosureDpcpp  type of op_dpcpp
Parameters
op_omp    functor to run in case of an OmpExecutor or ReferenceExecutor
op_cuda   functor to run in case of a CudaExecutor
op_hip    functor to run in case of a HipExecutor
op_dpcpp  functor to run in case of a DpcppExecutor
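
The parameter list above describes a dispatch in which exactly one functor is invoked, chosen by the executor type. The sketch below models that selection with stand-in types to make the behavior concrete. It is not Ginkgo's implementation: `Backend`, `run_for` and `selected_closure` are hypothetical names, and the closures are assumed to take no arguments.

```cpp
#include <string>

// Hypothetical stand-in for the executor kinds named in the parameter list;
// not a Ginkgo type.
enum class Backend { reference, omp, cuda, hip, dpcpp };

// Simplified model of the four-functor overload: one closure is invoked,
// selected by the backend. As in the parameter description, op_omp also
// serves the reference backend.
template <typename ClosureOmp, typename ClosureCuda, typename ClosureHip,
          typename ClosureDpcpp>
void run_for(Backend backend, const ClosureOmp& op_omp,
             const ClosureCuda& op_cuda, const ClosureHip& op_hip,
             const ClosureDpcpp& op_dpcpp)
{
    switch (backend) {
    case Backend::reference:
    case Backend::omp:
        op_omp();
        break;
    case Backend::cuda:
        op_cuda();
        break;
    case Backend::hip:
        op_hip();
        break;
    case Backend::dpcpp:
        op_dpcpp();
        break;
    }
}

// Records which closure ran, so the dispatch can be observed.
std::string selected_closure(Backend backend)
{
    std::string selected;
    run_for(
        backend, [&] { selected = "omp"; }, [&] { selected = "cuda"; },
        [&] { selected = "hip"; }, [&] { selected = "dpcpp"; });
    return selected;
}
```

In this model, `selected_closure(Backend::cuda)` yields `"cuda"`, while both `Backend::omp` and `Backend::reference` yield `"omp"`. On a CudaExecutor, the corresponding call would run only `op_cuda`.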

◆ run() [2/3]

virtual void gko::Executor::run ( const Operation &  op ) const
pure virtual

Runs the specified Operation using this Executor.

Parameters
op  the operation to run

◆ run() [3/3]

template<typename ClosureReference , typename ClosureOmp , typename ClosureCuda , typename ClosureHip , typename ClosureDpcpp >
void gko::Executor::run ( std::string  name,
const ClosureReference &  op_ref,
const ClosureOmp &  op_omp,
const ClosureCuda &  op_cuda,
const ClosureHip &  op_hip,
const ClosureDpcpp &  op_dpcpp 
) const
inline

Runs one of the passed in functors, depending on the Executor type.

Template Parameters
ClosureReference  type of op_ref
ClosureOmp        type of op_omp
ClosureCuda       type of op_cuda
ClosureHip        type of op_hip
ClosureDpcpp      type of op_dpcpp
Parameters
name      the name of the operation
op_ref    functor to run in case of a ReferenceExecutor
op_omp    functor to run in case of an OmpExecutor
op_cuda   functor to run in case of a CudaExecutor
op_hip    functor to run in case of a HipExecutor
op_dpcpp  functor to run in case of a DpcppExecutor

The documentation for this class was generated from the following file: