Pressure Poisson with NekRS + Ginkgo#

Replaces NekRS’s built-in Preconditioned Conjugate Gradient (PCG) with Ginkgo’s CG solver for the pressure Poisson equation. NekRS’s matrix-free spectral element operator is wrapped as a Ginkgo LinOp, keeping all data on the device with zero-copy pointer sharing.

Compatible with NekRS v26+.

Integration Overview#

NekRS does not assemble a sparse matrix. The Laplacian is applied matrix-free via tensor-product spectral element operations. To use Ginkgo’s Krylov solvers, we wrap NekRS’s elliptic operator as a custom gko::LinOp subclass that calls NekRS’s op() method internally.

Key points:

No sparse matrix — Ginkgo’s solver works with the LinOp interface, not CSR
Zero-copy — OCCA memory objects and Ginkgo arrays share the same pointers
UDF plugin — examples compile as shared libraries loaded by NekRS at runtime
Shared wrapper — the NekRSOperator class lives in nekrs/nekrs_ginkgo.hpp and is shared between both NekRS examples

Step-by-Step#

Wrapping the Operator#

NekRS’s elliptic::op() applies $A \cdot x$ matrix-free. We wrap it as a gko::LinOp (defined in nekrs/nekrs_ginkgo.hpp):

// Wraps NekRS's elliptic operator as a Ginkgo LinOp.
// The operator applies A*x matrix-free via spectral element kernels.
class NekRSOperator : public gko::EnableLinOp<NekRSOperator> {
    friend class gko::EnablePolymorphicObject<NekRSOperator, gko::LinOp>;

public:
    static std::unique_ptr<NekRSOperator> create(
        std::shared_ptr<const gko::Executor> exec, gko::dim<2> size,
        elliptic* solver)
    {
        return std::unique_ptr<NekRSOperator>(
            new NekRSOperator(std::move(exec), size, solver));
    }

protected:
    // Required by EnableLinOp for clone/clear support.
    explicit NekRSOperator(std::shared_ptr<const gko::Executor> exec)
        : gko::EnableLinOp<NekRSOperator>(std::move(exec)), solver_(nullptr)
    {}

    NekRSOperator(std::shared_ptr<const gko::Executor> exec, gko::dim<2> size,
                  elliptic* solver)
        : gko::EnableLinOp<NekRSOperator>(std::move(exec), size),
          solver_(solver)
    {}

    void apply_impl(const gko::LinOp* b, gko::LinOp* x) const override
    {
        auto dense_b = gko::as<gko::matrix::Dense<dfloat>>(b);
        auto dense_x = gko::as<gko::matrix::Dense<dfloat>>(x);

        const auto N = solver_->fieldOffset();

        // Wrap Ginkgo's raw pointers as OCCA memory (zero-copy).
        auto o_b = platform->device.occaDevice().wrapMemory<dfloat>(
            dense_b->get_const_values(), N);
        auto o_x = platform->device.occaDevice().wrapMemory<dfloat>(
            dense_x->get_values(), N);

        // Apply matrix-free spectral element operator: o_x = A * o_b.
        solver_->op(o_b, o_x);
    }

    void apply_impl(const gko::LinOp* alpha, const gko::LinOp* b,
                    const gko::LinOp* beta, gko::LinOp* x) const override
    {
        auto dense_x = gko::as<gko::matrix::Dense<dfloat>>(x);
        auto tmp = dense_x->clone();
        this->apply_impl(b, tmp.get());
        dense_x->scale(beta);
        dense_x->add_scaled(alpha, tmp.get());
    }

private:
    elliptic* solver_;
};

The Ginkgo Solve Function#

Builds a Ginkgo CG solver from the wrapped operator:

// Solves the elliptic system using Ginkgo's CG solver with NekRS's
// matrix-free operator wrapped as a Ginkgo LinOp.
inline void nekrs_ginkgo_solve(elliptic* solver, occa::memory o_rhs,
                               occa::memory o_sol)
{
    auto exec = gko::ReferenceExecutor::create();

    const gko::size_type N = solver->fieldOffset();
    auto op_size = gko::dim<2>{N, N};

    auto A = gko::share(NekRSOperator::create(exec, op_size, solver));

    // Wrap OCCA memory as Ginkgo Dense vectors (zero-copy).
    auto b = gko::matrix::Dense<dfloat>::create(
        exec, gko::dim<2>{N, 1},
        gko::array<dfloat>::view(exec, N, static_cast<dfloat*>(o_rhs.ptr())),
        1);
    auto x = gko::matrix::Dense<dfloat>::create(
        exec, gko::dim<2>{N, 1},
        gko::array<dfloat>::view(exec, N, static_cast<dfloat*>(o_sol.ptr())),
        1);

    auto gko_solver =
        gko::solver::Cg<dfloat>::build()
            .with_criteria(
                gko::stop::Iteration::build().with_max_iters(200u),
                gko::stop::ResidualNorm<dfloat>::build().with_reduction_factor(
                    static_cast<dfloat>(1e-6)))
            .on(exec)
            ->generate(A);

    gko_solver->apply(b, x);
}

NekRS UDF Integration#

The solve function is called from NekRS’s per-timestep hook. NekRS automatically injects the nrs variable into UDF functions.

void UDF_ExecuteStep(double time, int tstep)
{
    auto* pSolver = nrs->fluid->ellipticSolverP;
    if (!pSolver || tstep == 0) return;

    // Get the pressure RHS and solution as OCCA memory.
    auto o_rhs = nrs->fluid->o_explicitTerms("pressure");
    auto o_sol = nrs->fluid->o_solution("pressure");

    // Solve using Ginkgo's CG with NekRS's operator.
    nekrs_ginkgo_solve(pSolver, static_cast<occa::memory&>(o_rhs),
                       static_cast<occa::memory&>(o_sol));

    if (tstep % 100 == 0)
        std::cout << "[Ginkgo] Pressure solve at step " << tstep
                  << ", time = " << time << std::endl;
}

Building#

NekRS UDFs compile as shared libraries (plugins), not standalone executables:

cmake -S . -B build --preset default \
  -DNEKRS_DIR=/path/to/nekrs \
  -DGinkgo_DIR=/path/to/ginkgo
cmake --build build

This produces libpoisson.so.

Running#

Note: Running has not been tested end-to-end. The code compiles against the NekRS v26 API, validating the integration pattern, but actually executing it requires a full NekRS case setup.

To run, you would need a NekRS case directory with .par and .re2 files (e.g., the ethier example bundled with NekRS). Copy the built shared library into the case directory and configure the .par to load it:

# Copy a NekRS example case
cp -r /path/to/nekrs/examples/ethier /tmp/test-case
# Copy the UDF plugin
cp build/libpoisson.so /tmp/test-case/
# Run
cd /tmp/test-case && mpirun -np 1 nekrs --setup ethier

The UDF_ExecuteStep hook intercepts the pressure solve at every time step. Note that NekRS still calls its own pressure solver internally — a production integration would need to disable the built-in solve and route it through the Ginkgo path exclusively.

Expected Output (approximate)#

[Ginkgo] Pressure DOFs: 262144
[Ginkgo] Pressure solve at step 100, time = 0.2
[Ginkgo] Pressure solve at step 200, time = 0.4