Ginkgo
Generated from pipelines/1589998975 branch based on develop. Ginkgo version 1.10.0
A numerical linear algebra library targeting many-core architectures
Operations can be used to define functionalities whose implementations differ among devices.
#include <ginkgo/core/base/executor.hpp>
Public Member Functions
virtual void run (std::shared_ptr< const OmpExecutor >) const
virtual void run (std::shared_ptr< const HipExecutor >) const
virtual void run (std::shared_ptr< const DpcppExecutor >) const
virtual void run (std::shared_ptr< const CudaExecutor >) const
virtual void run (std::shared_ptr< const ReferenceExecutor > executor) const
virtual const char * get_name () const noexcept
    Returns the operation's name.
Operations can be used to define functionalities whose implementations differ among devices.
This is done by extending the Operation class and implementing the overloads of the Operation::run() method for all Executor types. When invoking the Executor::run() method with the Operation as input, the library will select the Operation::run() overload corresponding to the dynamic type of the Executor instance.
Consider an overload of operator<< for Executors, which prints some basic device information (e.g. device type and id) of the Executor to a C++ stream.
One possible implementation would be to use RTTI to find the dynamic type of the Executor. However, using the Operation feature of Ginkgo, there is a more elegant approach which utilizes polymorphism. The first step is to define an Operation that will print the desired information for each Executor type.
Using DeviceInfoPrinter, the implementation of operator<< is as simple as calling the run() method of the executor.
Now it is possible to stream any Executor directly to an output stream, which produces the expected device information for each Executor type.
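The whole walkthrough (an Operation subclass, operator<<, and its use) can be sketched in a self-contained form. This is not Ginkgo code: every class in the hypothetical `dispatch_sketch` namespace is a simplified stand-in written for illustration, the run() overloads take raw pointers rather than the std::shared_ptr parameters listed above, only three executor types are modelled, and the `describe` helper is an addition that is not part of the library. The printed labels ("OMP", "CUDA(id)", "Reference CPU") are likewise illustrative choices.

```cpp
#include <iostream>
#include <memory>
#include <sstream>
#include <string>

namespace dispatch_sketch {

class OmpExecutor;
class ReferenceExecutor;
class CudaExecutor;

// Stand-in for Operation: one run() overload per executor type.
class Operation {
public:
    virtual ~Operation() = default;
    virtual void run(const OmpExecutor*) const = 0;
    virtual void run(const CudaExecutor*) const = 0;
    // Optional overload: falls back to the OmpExecutor overload by default.
    virtual void run(const ReferenceExecutor* exec) const;
};

// Stand-in for Executor: run() forwards to the overload matching the
// dynamic type of the executor (double dispatch, no RTTI needed).
class Executor {
public:
    virtual ~Executor() = default;
    virtual void run(const Operation& op) const = 0;
};

class OmpExecutor : public Executor {
public:
    void run(const Operation& op) const override { op.run(this); }
};

class ReferenceExecutor : public OmpExecutor {
public:
    void run(const Operation& op) const override { op.run(this); }
};

class CudaExecutor : public Executor {
public:
    explicit CudaExecutor(int device_id) : device_id_{device_id} {}
    int get_device_id() const { return device_id_; }
    void run(const Operation& op) const override { op.run(this); }

private:
    int device_id_;
};

inline void Operation::run(const ReferenceExecutor* exec) const
{
    run(static_cast<const OmpExecutor*>(exec));
}

// The operation from the text: prints device information per executor type.
class DeviceInfoPrinter : public Operation {
public:
    explicit DeviceInfoPrinter(std::ostream& os) : os_{os} {}

    void run(const OmpExecutor*) const override { os_ << "OMP"; }
    void run(const CudaExecutor* exec) const override
    {
        os_ << "CUDA(" << exec->get_device_id() << ")";
    }
    void run(const ReferenceExecutor*) const override
    {
        os_ << "Reference CPU";
    }

private:
    std::ostream& os_;
};

// operator<< only has to hand the operation to the executor.
inline std::ostream& operator<<(std::ostream& os, const Executor& exec)
{
    exec.run(DeviceInfoPrinter{os});
    return os;
}

// Hypothetical convenience helper: returns the printed text as a string.
inline std::string describe(const Executor& exec)
{
    std::ostringstream out;
    out << exec;
    return out.str();
}

}  // namespace dispatch_sketch
```

With these stand-ins, streaming a `dispatch_sketch::CudaExecutor{0}` to `std::cout` selects the CudaExecutor overload of DeviceInfoPrinter::run(), while a ReferenceExecutor selects its own overload because DeviceInfoPrinter overrides it. An operation that leaves the ReferenceExecutor overload out would fall back to the OmpExecutor one.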
One might feel that this code is too complicated for such a simple task. Luckily, there is an overload of the Executor::run() method designed to facilitate writing simple operations like this one. The method takes four closures as input: one that is run for OMP executors, one for CUDA executors, one for HIP executors, and one for DPC++ executors. Using this method, there is no need to implement an Operation subclass.
Using this approach, however, it is impossible to distinguish between an OmpExecutor and a ReferenceExecutor, as both of them call the OMP closure.
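The closure-based variant and its limitation can be illustrated with a similar self-contained sketch. Again, nothing here is Ginkgo code: the classes in the hypothetical `closure_sketch` namespace are stand-ins, the closures are passed as std::function objects (an assumption made for brevity), and only two closures (OMP and CUDA) are modelled instead of four.

```cpp
#include <functional>
#include <iostream>
#include <sstream>
#include <string>

namespace closure_sketch {

using Closure = std::function<void()>;

// Stand-in for Executor with a closure-based run(): each executor type
// invokes the closure that belongs to its backend.
class Executor {
public:
    virtual ~Executor() = default;
    virtual void run(const Closure& omp, const Closure& cuda) const = 0;
};

class OmpExecutor : public Executor {
public:
    void run(const Closure& omp, const Closure&) const override { omp(); }
};

// Inherits run() unchanged, so it also invokes the OMP closure.
class ReferenceExecutor : public OmpExecutor {};

class CudaExecutor : public Executor {
public:
    explicit CudaExecutor(int device_id) : device_id_{device_id} {}
    int get_device_id() const { return device_id_; }
    void run(const Closure&, const Closure& cuda) const override { cuda(); }

private:
    int device_id_;
};

// No Operation subclass is needed: the closures carry the per-device code.
inline std::ostream& operator<<(std::ostream& os, const Executor& exec)
{
    exec.run(
        [&] { os << "OMP"; },  // OMP closure
        [&] {                  // CUDA closure, only run by a CudaExecutor
            os << "CUDA("
               << static_cast<const CudaExecutor&>(exec).get_device_id()
               << ")";
        });
    return os;
}

// Hypothetical convenience helper: returns the printed text as a string.
inline std::string describe(const Executor& exec)
{
    std::ostringstream out;
    out << exec;
    return out.str();
}

}  // namespace closure_sketch
```

In this sketch an OmpExecutor and a ReferenceExecutor both print "OMP", which is exactly the loss of information described above. When the two must be told apart, the Operation subclass approach remains the one to use.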