Context

class Context
#include <context.hpp>

A Context provides configuration settings and reusable resources.

Public Functions

explicit Context(SplaProcessingUnit pu)

Constructor of Context with default configuration for given processing unit.

Parameters:

pu[in] Processing unit to be used for computations.

Context(Context&&) = default

Default move constructor.

Context(const Context&) = delete

Disabled copy constructor.

Context &operator=(Context&&) = default

Default move assignment operator.

Context &operator=(const Context&) = delete

Disabled copy assignment operator.
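The four declarations above make Context a move-only type. The following is a minimal sketch of what that means in practice. `MoveOnlyResource` and `make_resource` are hypothetical stand-ins, not part of the library; they only mirror the documented special member functions so that the example compiles on its own.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical stand-in that declares the same special member functions
// as documented for Context. It is not the library class.
class MoveOnlyResource {
public:
  explicit MoveOnlyResource(int id) : id_(id) {}

  MoveOnlyResource(MoveOnlyResource&&) = default;                  // move constructor
  MoveOnlyResource(const MoveOnlyResource&) = delete;              // no copy constructor
  MoveOnlyResource& operator=(MoveOnlyResource&&) = default;       // move assignment
  MoveOnlyResource& operator=(const MoveOnlyResource&) = delete;   // no copy assignment

  int id() const { return id_; }

private:
  int id_;
};

// Compile-time checks of the resulting semantics.
static_assert(std::is_move_constructible<MoveOnlyResource>::value, "movable");
static_assert(std::is_move_assignable<MoveOnlyResource>::value, "move-assignable");
static_assert(!std::is_copy_constructible<MoveOnlyResource>::value, "not copyable");
static_assert(!std::is_copy_assignable<MoveOnlyResource>::value, "not copy-assignable");

// Returning by value works because the type is movable.
inline MoveOnlyResource make_resource(int id) { return MoveOnlyResource(id); }
```

A type like this can be returned from factory functions and stored in containers that support move-only elements, but it has to be passed by reference or with `std::move` rather than copied.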

SplaProcessingUnit processing_unit() const

Access a Context parameter.

Returns:

Processing unit used.

int num_tiles() const

Access a Context parameter.

Returns:

Number of tiles used to overlap computation and communication.

int tile_size_host() const

Access a Context parameter.

Returns:

Size of tiles on host. Used for partitioning communication.

int tile_size_gpu() const

Access a Context parameter.

Returns:

Target size of tiles on GPU.

int op_threshold_gpu() const

Access a Context parameter.

Returns:

Operations threshold, below which computation may be done on host, even if the processing unit is set to GPU. For GEMM, the number of operations is estimated as 2mnk.

int gpu_device_id() const

Access a Context parameter.

Returns:

ID of the GPU used for computations. This is a fixed parameter, set by querying the device ID at context creation.

std::uint_least64_t allocated_memory_host() const

Access a Context parameter.

Returns:

Total allocated memory on host in bytes used for internal buffers. Does not include allocations through standard C++ allocators. May change with use of context.

std::uint_least64_t allocated_memory_pinned() const

Access a Context parameter.

Returns:

Total allocated pinned memory on host in bytes used for internal buffers. Does not include allocations through standard C++ allocators. May change with use of context.

std::uint_least64_t allocated_memory_gpu() const

Access a Context parameter.

Returns:

Total allocated memory on GPU in bytes used for internal buffers. Does not include allocations by device libraries like cuBLAS / rocBLAS. May change with use of context.

void set_num_tiles(int numTiles)

Set the number of tiles.

Parameters:

numTiles[in] Number of tiles.

void set_tile_size_host(int tileSizeHost)

Set the tile size used for computations on host and partitioning of communication.

Parameters:

tileSizeHost[in] Tile size.

void set_op_threshold_gpu(int opThresholdGPU)

Set the operations threshold, below which computation may be done on Host, even if processing unit is set to GPU.

For GEMM, the number of operations is estimated as 2mnk.

Parameters:

opThresholdGPU[in] Threshold in number of operations.
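The 2mnk estimate can be sketched as plain arithmetic, which may help when choosing a threshold value. The helpers `estimate_gemm_ops` and `below_op_threshold` are hypothetical names introduced for illustration; how the library evaluates the threshold internally is not stated here, so the strict "below" comparison is an assumption taken from the wording above.

```cpp
#include <cstdint>

// Estimated operation count of a GEMM with dimensions m, n and k: 2*m*n*k.
// 64-bit arithmetic is used because the product can exceed the range of int.
inline std::int64_t estimate_gemm_ops(std::int64_t m, std::int64_t n, std::int64_t k) {
  return 2 * m * n * k;
}

// Hypothetical helper: true if the estimate lies below the threshold, in
// which case the computation may be done on host (assumed strict "<").
inline bool below_op_threshold(std::int64_t m, std::int64_t n, std::int64_t k,
                               int opThresholdGPU) {
  return estimate_gemm_ops(m, n, k) < opThresholdGPU;
}
```

For example, a 100 x 100 x 100 multiplication is estimated at 2,000,000 operations, so a threshold of 4000 would not move it to host, while a 10 x 10 x 10 multiplication (2000 operations) would fall below it.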

void set_tile_size_gpu(int tileSizeGPU)

Set the tile size used for computations on GPU.

Parameters:

tileSizeGPU[in] Tile size on GPU.

void set_alloc_host(std::function<void*(std::size_t)> allocateFunc, std::function<void(void*)> deallocateFunc)

Set the allocation and deallocation functions for host memory.

Internal default uses a memory pool for better performance.

Parameters:
  • allocateFunc[in] Function allocating given size in bytes.

  • deallocateFunc[in] Function to deallocate memory allocated using allocateFunc.
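As an illustration of the two function signatures, the sketch below builds a matching allocate / deallocate pair on top of `std::malloc` and `std::free`. `AllocCounter`, `make_allocate` and `make_deallocate` are hypothetical names, not part of the library, and the counter exists only to make the pairing observable. The call to `set_alloc_host` is shown as a comment because the example is written to compile without the library.

```cpp
#include <cstddef>
#include <cstdlib>
#include <functional>

// Hypothetical bookkeeping: number of allocations not yet released.
struct AllocCounter {
  int live = 0;
};

// Builds a function with the signature void*(std::size_t), allocating the
// given size in bytes.
inline std::function<void*(std::size_t)> make_allocate(AllocCounter& counter) {
  return [&counter](std::size_t bytes) -> void* {
    void* ptr = std::malloc(bytes);
    if (ptr != nullptr) ++counter.live;
    return ptr;
  };
}

// Builds the matching function with the signature void(void*), releasing
// memory obtained from the allocate function above.
inline std::function<void(void*)> make_deallocate(AllocCounter& counter) {
  return [&counter](void* ptr) {
    if (ptr != nullptr) {
      std::free(ptr);
      --counter.live;
    }
  };
}

// Assumed usage with an existing context object `ctx`:
//   AllocCounter counter;
//   ctx.set_alloc_host(make_allocate(counter), make_deallocate(counter));
```

Because both lambdas capture the counter by reference, the counter has to outlive every object that holds these functions.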

void set_alloc_pinned(std::function<void*(std::size_t)> allocateFunc, std::function<void(void*)> deallocateFunc)

Set the allocation and deallocation functions for pinned host memory.

Internal default uses a memory pool for better performance.

Parameters:
  • allocateFunc[in] Function allocating given size in bytes.

  • deallocateFunc[in] Function to deallocate memory allocated using allocateFunc.

void set_alloc_gpu(std::function<void*(std::size_t)> allocateFunc, std::function<void(void*)> deallocateFunc)

Set the allocation and deallocation functions for GPU memory.

Internal default uses a memory pool for better performance.

Parameters:
  • allocateFunc[in] Function allocating given size in bytes.

  • deallocateFunc[in] Function to deallocate memory allocated using allocateFunc.