GEMM - SSBTR

General matrix multiplication functions for computing \( C \leftarrow \alpha A^H B + \beta C \) with stripe-stripe-block distribution, where computation may be limited to triangular part.

  ------ T     ------
  |    |       |    |
  |    |       |    |
  ------       ------
  |    |       |    |        -------         -------
  |    |       |    |        |  |  |         |  |  |
  ------   *   ------    +   -------   -->   -------
  |    |       |    |           |  |            |  |
  |    |       |    |           ----            ----
  ------       ------           C               C
  |    |       |    |
  |    |       |    |
  ------       ------
    A            B

Functions

void pgemm_ssbtr(int m, int n, int kLocal, SplaOperation opA, float alpha, const float *A, int lda, const float *B, int ldb, float beta, float *C, int ldc, int cRowOffset, int cColOffset, SplaFillMode cFillMode, MatrixDistribution &distC, Context &ctx)

Computes a distributed general matrix multiplication of the form \( C \leftarrow \alpha A^H B + \beta C \) in single precision.

\(A\) and \(B\) are only split along the row dimension (stripes), while \(C\) can be distributed as any supported MatrixDistribution type. The fill mode of \(C\) indicates the part of the matrix which must be computed, while any other part may or may not be computed. It is therefore not a strict limitation. For example, given SPLA_FILL_MODE_UPPER, a small matrix may still be fully computed, while a large matrix will be computed block wise, such that the computed blocks cover the upper triangle. The fill mode is always in refenrence to the full matrix, so offsets are taken into account.

Parameters
  • m[in] Number of rows of \(A^H\)

  • n[in] Number of columns of \(B\)

  • kLocal[in] Number rows of \(B\) and number of columns of \(A^H\) stored at calling MPI rank. This number may differ for each rank.

  • opA[in] Operation applied when reading matrix A. Must be SPLA_OP_TRANSPOSE or SPLA_OP_CONJ_TRANSPOSE.

  • alpha[in] Scaling of multiplication of \(A^H\) and \(B\)

  • A[in] Pointer to matrix \(A\).

  • lda[in] Leading dimension of \(A\) with lda \(\geq\) kLocal.

  • B[in] Pointer to matrix \(B\).

  • ldb[in] Leading dimension of \(B\) with ldb \(\geq\) kLocal.

  • beta[in] Scaling of \(C\) before summation.

  • C[out] Pointer to global matrix \(C\).

  • ldc[in] Leading dimension of \(C\) with ldc \(\geq\) loc(m), where loc(m) is the number of locally stored rows of \(C\).

  • cRowOffset[in] Row offset in the global matrix \(C\), identifying the first row of the submatrix \(C\).

  • cColOffset[in] Column offset in the global matrix \(C\), identifying the first coloumn of the submatrix \(C\).

  • cFillMode[in] Fill mode of matrix C.

  • distC[in] Matrix distribution of global matrix \(C\).

  • ctx[in] Context, which provides configuration settings and reusable resources.

void pgemm_ssbtr(int m, int n, int kLocal, SplaOperation opA, double alpha, const double *A, int lda, const double *B, int ldb, double beta, double *C, int ldc, int cRowOffset, int cColOffset, SplaFillMode cFillMode, MatrixDistribution &distC, Context &ctx)

Computes a distributed general matrix multiplication of the form \( C \leftarrow \alpha A^H B + \beta C \) in double precision.

See documentation above.

void pgemm_ssbtr(int m, int n, int kLocal, SplaOperation opA, std::complex<float> alpha, const std::complex<float> *A, int lda, const std::complex<float> *B, int ldb, std::complex<float> beta, std::complex<float> *C, int ldc, int cRowOffset, int cColOffset, SplaFillMode cFillMode, MatrixDistribution &distC, Context &ctx)

Computes a distributed general matrix multiplication of the form \( C \leftarrow \alpha A^H B + \beta C \) in single precision for complex types.

See documentation above.

void pgemm_ssbtr(int m, int n, int kLocal, SplaOperation opA, std::complex<double> alpha, const std::complex<double> *A, int lda, const std::complex<double> *B, int ldb, std::complex<double> beta, std::complex<double> *C, int ldc, int cRowOffset, int cColOffset, SplaFillMode cFillMode, MatrixDistribution &distC, Context &ctx)

Computes a distributed general matrix multiplication of the form \( C \leftarrow \alpha A^H B + \beta C \) in double precision for complex types.

See documentation above.