3.2. LAPACK Auxiliary Functions

These are functions that support more advanced LAPACK routines. The auxiliary functions are divided into the following categories:

Note

Throughout the APIs’ descriptions, we use the following notations:

  • x[i] stands for the i-th element of vector x, while A[i,j] represents the element in the i-th row and j-th column of matrix A. Indices are 1-based, i.e. x[1] is the first element of x.

  • If X is a real vector or matrix, \(X^T\) indicates its transpose; if X is complex, then \(X^H\) represents its conjugate transpose. When X could be real or complex, we use X’ to indicate X transposed or X conjugate transposed, accordingly.

  • x_i \(=x_i\); we sometimes use both notations, \(x_i\) when displaying mathematical equations, and x_i in the text describing the function parameters.

3.2.1. Vector and Matrix manipulations

List of vector and matrix manipulations

rocsolver_<type>lacgv()

rocblas_status rocsolver_zlacgv(rocblas_handle handle, const rocblas_int n, rocblas_double_complex *x, const rocblas_int incx)
rocblas_status rocsolver_clacgv(rocblas_handle handle, const rocblas_int n, rocblas_float_complex *x, const rocblas_int incx)

LACGV conjugates the complex vector x.

It conjugates the n entries of a complex vector x with increment incx.

Parameters
  • [in] handle: rocblas_handle.

  • [in] n: rocblas_int. n >= 0.

    The dimension of vector x.

  • [inout] x: pointer to type. Array on the GPU of size at least n (size depends on the value of incx).

    On entry, the vector x. On exit, each entry is overwritten with its conjugate value.

  • [in] incx: rocblas_int. incx != 0.

    The distance between two consecutive elements of x. If incx is negative, the elements of x are indexed in reverse order.

rocsolver_<type>laswp()

rocblas_status rocsolver_zlaswp(rocblas_handle handle, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_int k1, const rocblas_int k2, const rocblas_int *ipiv, const rocblas_int incx)
rocblas_status rocsolver_claswp(rocblas_handle handle, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_int k1, const rocblas_int k2, const rocblas_int *ipiv, const rocblas_int incx)
rocblas_status rocsolver_dlaswp(rocblas_handle handle, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_int k1, const rocblas_int k2, const rocblas_int *ipiv, const rocblas_int incx)
rocblas_status rocsolver_slaswp(rocblas_handle handle, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_int k1, const rocblas_int k2, const rocblas_int *ipiv, const rocblas_int incx)

LASWP performs a series of row interchanges on the matrix A.

Row interchanges are done one by one. If \(\text{ipiv}[k_1 + (j - k_1) \cdot \text{abs}(\text{incx})] = r\), then the j-th row of A will be interchanged with the r-th row of A, for \(j = k_1,k_1+1,\dots,k_2\). Indices \(k_1\) and \(k_2\) are 1-based indices.

Parameters
  • [in] handle: rocblas_handle.

  • [in] n: rocblas_int. n >= 0.

    The number of columns of the matrix A.

  • [inout] A: pointer to type. Array on the GPU of dimension lda*n.

    On entry, the matrix to which the row interchanges will be applied. On exit, the resulting permuted matrix.

  • [in] lda: rocblas_int. lda > 0.

    The leading dimension of the array A.

  • [in] k1: rocblas_int. k1 > 0.

    The k_1 index. It is the first element of ipiv for which a row interchange will be done. This is a 1-based index.

  • [in] k2: rocblas_int. k2 > k1 > 0.

    The k_2 index. k_2 - k_1 + 1 is the number of elements of ipiv for which a row interchange will be done. This is a 1-based index.

  • [in] ipiv: pointer to rocblas_int. Array on the GPU of dimension at least k_1 + (k_2 - k_1)*abs(incx).

    The vector of pivot indices. Only the elements in positions k_1 through k_1 + (k_2 - k_1)*abs(incx) of this vector are accessed. Elements of ipiv are considered 1-based.

  • [in] incx: rocblas_int. incx != 0.

    The distance between successive values of ipiv. If incx is negative, the pivots are applied in reverse order.

3.2.2. Householder reflections

rocsolver_<type>larfg()

rocblas_status rocsolver_zlarfg(rocblas_handle handle, const rocblas_int n, rocblas_double_complex *alpha, rocblas_double_complex *x, const rocblas_int incx, rocblas_double_complex *tau)
rocblas_status rocsolver_clarfg(rocblas_handle handle, const rocblas_int n, rocblas_float_complex *alpha, rocblas_float_complex *x, const rocblas_int incx, rocblas_float_complex *tau)
rocblas_status rocsolver_dlarfg(rocblas_handle handle, const rocblas_int n, double *alpha, double *x, const rocblas_int incx, double *tau)
rocblas_status rocsolver_slarfg(rocblas_handle handle, const rocblas_int n, float *alpha, float *x, const rocblas_int incx, float *tau)

LARFG generates a Householder reflector H of order n.

The reflector H is such that

\[\begin{split} H'\left[\begin{array}{c} \text{alpha}\\ x \end{array}\right]=\left[\begin{array}{c} \text{beta}\\ 0 \end{array}\right] \end{split}\]

where x is an n-1 vector, and alpha and beta are scalars. Matrix H can be generated as

\[\begin{split} H = I - \text{tau}\left[\begin{array}{c} 1\\ v \end{array}\right]\left[\begin{array}{cc} 1 & v' \end{array}\right] \end{split}\]

where v is an n-1 vector, and tau is a scalar known as the Householder scalar. The vector

\[\begin{split} \bar{v}=\left[\begin{array}{c} 1\\ v \end{array}\right] \end{split}\]

is the Householder vector associated with the reflection.

Note

The matrix H is orthogonal/unitary (i.e. \(H'H=HH'=I\)). It is symmetric when real (i.e. \(H^T=H\)), but not Hermitian when complex (i.e. \(H^H\neq H\) in general).

Parameters
  • [in] handle: rocblas_handle.

  • [in] n: rocblas_int. n >= 0.

    The order (size) of reflector H.

  • [inout] alpha: pointer to type. A scalar on the GPU.

    On entry, the scalar alpha. On exit, it is overwritten with beta.

  • [inout] x: pointer to type. Array on the GPU of size at least n-1 (size depends on the value of incx).

    On entry, the vector x, On exit, it is overwritten with vector v.

  • [in] incx: rocblas_int. incx > 0.

    The distance between two consecutive elements of x.

  • [out] tau: pointer to type. A scalar on the GPU.

    The Householder scalar tau.

rocsolver_<type>larft()

rocblas_status rocsolver_zlarft(rocblas_handle handle, const rocblas_direct direct, const rocblas_storev storev, const rocblas_int n, const rocblas_int k, rocblas_double_complex *V, const rocblas_int ldv, rocblas_double_complex *tau, rocblas_double_complex *T, const rocblas_int ldt)
rocblas_status rocsolver_clarft(rocblas_handle handle, const rocblas_direct direct, const rocblas_storev storev, const rocblas_int n, const rocblas_int k, rocblas_float_complex *V, const rocblas_int ldv, rocblas_float_complex *tau, rocblas_float_complex *T, const rocblas_int ldt)
rocblas_status rocsolver_dlarft(rocblas_handle handle, const rocblas_direct direct, const rocblas_storev storev, const rocblas_int n, const rocblas_int k, double *V, const rocblas_int ldv, double *tau, double *T, const rocblas_int ldt)
rocblas_status rocsolver_slarft(rocblas_handle handle, const rocblas_direct direct, const rocblas_storev storev, const rocblas_int n, const rocblas_int k, float *V, const rocblas_int ldv, float *tau, float *T, const rocblas_int ldt)

LARFT generates the triangular factor T of a block reflector H of order n.

The block reflector H is defined as the product of k Householder matrices

\[\begin{split} \begin{array}{cl} H = H_1H_2\cdots H_k & \: \text{if direct indicates forward direction, or} \\ H = H_k\cdots H_2H_1 & \: \text{if direct indicates backward direction} \end{array} \end{split}\]

The triangular factor T is upper triangular in the forward direction and lower triangular in the backward direction. If storev is column-wise, then

\[ H = I - VTV' \]

where the i-th column of matrix V contains the Householder vector associated with \(H_i\). If storev is row-wise, then

\[ H = I - V'TV \]

where the i-th row of matrix V contains the Householder vector associated with \(H_i\).

Parameters
  • [in] handle: rocblas_handle.

  • [in] direct: rocblas_direct.

    Specifies the direction in which the Householder matrices are applied.

  • [in] storev: rocblas_storev.

    Specifies how the Householder vectors are stored in matrix V.

  • [in] n: rocblas_int. n >= 0.

    The order (size) of the block reflector.

  • [in] k: rocblas_int. k >= 1.

    The number of Householder matrices forming H.

  • [in] V: pointer to type. Array on the GPU of size ldv*k if column-wise, or ldv*n if row-wise.

    The matrix of Householder vectors.

  • [in] ldv: rocblas_int. ldv >= n if column-wise, or ldv >= k if row-wise.

    Leading dimension of V.

  • [in] tau: pointer to type. Array of k scalars on the GPU.

    The vector of all the Householder scalars.

  • [out] T: pointer to type. Array on the GPU of dimension ldt*k.

    The triangular factor. T is upper triangular if direct indicates forward direction, otherwise it is lower triangular. The rest of the array is not used.

  • [in] ldt: rocblas_int. ldt >= k.

    The leading dimension of T.

rocsolver_<type>larf()

rocblas_status rocsolver_zlarf(rocblas_handle handle, const rocblas_side side, const rocblas_int m, const rocblas_int n, rocblas_double_complex *x, const rocblas_int incx, const rocblas_double_complex *alpha, rocblas_double_complex *A, const rocblas_int lda)
rocblas_status rocsolver_clarf(rocblas_handle handle, const rocblas_side side, const rocblas_int m, const rocblas_int n, rocblas_float_complex *x, const rocblas_int incx, const rocblas_float_complex *alpha, rocblas_float_complex *A, const rocblas_int lda)
rocblas_status rocsolver_dlarf(rocblas_handle handle, const rocblas_side side, const rocblas_int m, const rocblas_int n, double *x, const rocblas_int incx, const double *alpha, double *A, const rocblas_int lda)
rocblas_status rocsolver_slarf(rocblas_handle handle, const rocblas_side side, const rocblas_int m, const rocblas_int n, float *x, const rocblas_int incx, const float *alpha, float *A, const rocblas_int lda)

LARF applies a Householder reflector H to a general matrix A.

The Householder reflector H, of order m or n, is to be applied to an m-by-n matrix A from the left or the right, depending on the value of side. H is given by

\[ H = I - \text{alpha}\cdot xx' \]

where alpha is the Householder scalar and x is a Householder vector. H is never actually computed.

Parameters
  • [in] handle: rocblas_handle.

  • [in] side: rocblas_side.

    Determines whether H is applied from the left or the right.

  • [in] m: rocblas_int. m >= 0.

    Number of rows of A.

  • [in] n: rocblas_int. n >= 0.

    Number of columns of A.

  • [in] x: pointer to type. Array on the GPU of size at least 1 + (m-1)*abs(incx) if left side, or at least 1 + (n-1)*abs(incx) if right side.

    The Householder vector x.

  • [in] incx: rocblas_int. incx != 0.

    Distance between two consecutive elements of x. If incx < 0, the elements of x are indexed in reverse order.

  • [in] alpha: pointer to type. A scalar on the GPU.

    The Householder scalar. If alpha = 0, then H = I (A will remain the same; x is never used)

  • [inout] A: pointer to type. Array on the GPU of size lda*n.

    On entry, the matrix A. On exit, it is overwritten with H*A (or A*H).

  • [in] lda: rocblas_int. lda >= m.

    Leading dimension of A.

rocsolver_<type>larfb()

rocblas_status rocsolver_zlarfb(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_direct direct, const rocblas_storev storev, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_double_complex *V, const rocblas_int ldv, rocblas_double_complex *T, const rocblas_int ldt, rocblas_double_complex *A, const rocblas_int lda)
rocblas_status rocsolver_clarfb(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_direct direct, const rocblas_storev storev, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_float_complex *V, const rocblas_int ldv, rocblas_float_complex *T, const rocblas_int ldt, rocblas_float_complex *A, const rocblas_int lda)
rocblas_status rocsolver_dlarfb(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_direct direct, const rocblas_storev storev, const rocblas_int m, const rocblas_int n, const rocblas_int k, double *V, const rocblas_int ldv, double *T, const rocblas_int ldt, double *A, const rocblas_int lda)
rocblas_status rocsolver_slarfb(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_direct direct, const rocblas_storev storev, const rocblas_int m, const rocblas_int n, const rocblas_int k, float *V, const rocblas_int ldv, float *T, const rocblas_int ldt, float *A, const rocblas_int lda)

LARFB applies a block reflector H to a general m-by-n matrix A.

The block reflector H is applied in one of the following forms, depending on the values of side and trans:

\[\begin{split} \begin{array}{cl} HA & \: \text{(No transpose from the left),}\\ H'A & \: \text{(Transpose or conjugate transpose from the left),}\\ AH & \: \text{(No transpose from the right), or}\\ AH' & \: \text{(Transpose or conjugate transpose from the right).} \end{array} \end{split}\]

The block reflector H is defined as the product of k Householder matrices as

\[\begin{split} \begin{array}{cl} H = H_1H_2\cdots H_k & \: \text{if direct indicates forward direction, or} \\ H = H_k\cdots H_2H_1 & \: \text{if direct indicates backward direction} \end{array} \end{split}\]

H is never stored. It is calculated as

\[ H = I - VTV' \]

where the i-th column of matrix V contains the Householder vector associated with \(H_i\), if storev is column-wise; or

\[ H = I - V'TV \]

where the i-th row of matrix V contains the Householder vector associated with \(H_i\), if storev is row-wise. T is the associated triangular factor as computed by LARFT.

Parameters
  • [in] handle: rocblas_handle.

  • [in] side: rocblas_side.

    Specifies from which side to apply H.

  • [in] trans: rocblas_operation.

    Specifies whether the block reflector or its transpose/conjugate transpose is to be applied.

  • [in] direct: rocblas_direct.

    Specifies the direction in which the Householder matrices are to be applied to generate H.

  • [in] storev: rocblas_storev.

    Specifies how the Householder vectors are stored in matrix V.

  • [in] m: rocblas_int. m >= 0.

    Number of rows of matrix A.

  • [in] n: rocblas_int. n >= 0.

    Number of columns of matrix A.

  • [in] k: rocblas_int. k >= 1.

    The number of Householder matrices.

  • [in] V: pointer to type. Array on the GPU of size ldv*k if column-wise, ldv*n if row-wise and applying from the right, or ldv*m if row-wise and applying from the left.

    The matrix of Householder vectors.

  • [in] ldv: rocblas_int. ldv >= k if row-wise, ldv >= m if column-wise and applying from the left, or ldv >= n if column-wise and applying from the right.

    Leading dimension of V.

  • [in] T: pointer to type. Array on the GPU of dimension ldt*k.

    The triangular factor of the block reflector.

  • [in] ldt: rocblas_int. ldt >= k.

    The leading dimension of T.

  • [inout] A: pointer to type. Array on the GPU of size lda*n.

    On entry, the matrix A. On exit, it is overwritten with H*A, A*H, H’*A, or A*H’.

  • [in] lda: rocblas_int. lda >= m.

    Leading dimension of A.

3.2.3. Bidiagonal forms

List of functions for bidiagonal forms

rocsolver_<type>labrd()

rocblas_status rocsolver_zlabrd(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_double_complex *A, const rocblas_int lda, double *D, double *E, rocblas_double_complex *tauq, rocblas_double_complex *taup, rocblas_double_complex *X, const rocblas_int ldx, rocblas_double_complex *Y, const rocblas_int ldy)
rocblas_status rocsolver_clabrd(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_float_complex *A, const rocblas_int lda, float *D, float *E, rocblas_float_complex *tauq, rocblas_float_complex *taup, rocblas_float_complex *X, const rocblas_int ldx, rocblas_float_complex *Y, const rocblas_int ldy)
rocblas_status rocsolver_dlabrd(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, double *A, const rocblas_int lda, double *D, double *E, double *tauq, double *taup, double *X, const rocblas_int ldx, double *Y, const rocblas_int ldy)
rocblas_status rocsolver_slabrd(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, float *A, const rocblas_int lda, float *D, float *E, float *tauq, float *taup, float *X, const rocblas_int ldx, float *Y, const rocblas_int ldy)

LABRD computes the bidiagonal form of the first k rows and columns of a general m-by-n matrix A, as well as the matrices X and Y needed to reduce the remaining part of A.

The reduced form is given by:

\[ B = Q'AP \]

where the leading k-by-k block of B is upper bidiagonal if m >= n, or lower bidiagonal if m < n. Q and P are orthogonal/unitary matrices represented as the product of Householder matrices

\[\begin{split} \begin{array}{cl} Q = H_1H_2\cdots H_k, & \text{and} \\ P = G_1G_2\cdots G_k. \end{array} \end{split}\]

Each Householder matrix \(H_i\) and \(G_i\) is given by

\[\begin{split} \begin{array}{cl} H_i = I - \text{tauq}[i]\cdot v_iv_i', & \text{and} \\ G_i = I - \text{taup}[i]\cdot u_iu_i'. \end{array} \end{split}\]

If m >= n, the first i-1 elements of the Householder vector \(v_i\) are zero, and \(v_i[i]=1\); while the first i elements of the Householder vector \(u_i\) are zero, and \(u_i[i+1]=1\). If m < n, the first i elements of the Householder vector \(v_i\) are zero, and \(v_i[i+1]=1\); while the first i-1 elements of the Householder vector \(u_i\) are zero, and \(u_i[i]=1\).

The unreduced part of the matrix A can be updated using the block update

\[ A = A - VY' - XU' \]

where V and U are the m-by-k and n-by-k matrices formed with the vectors \(v_i\) and \(u_i\), respectively.

Parameters
  • [in] handle: rocblas_handle.

  • [in] m: rocblas_int. m >= 0.

    The number of rows of the matrix A.

  • [in] n: rocblas_int. n >= 0.

    The number of columns of the matrix A.

  • [in] k: rocblas_int. min(m,n) >= k >= 0.

    The number of leading rows and columns of matrix A that will be reduced.

  • [inout] A: pointer to type. Array on the GPU of dimension lda*n.

    On entry, the m-by-n matrix to be reduced. On exit, the first k elements on the diagonal and superdiagonal (if m >= n), or subdiagonal (if m < n), contain the bidiagonal form B. If m >= n, the elements below the diagonal of the first k columns are the possibly non-zero elements of the Householder vectors associated with Q, while the elements above the superdiagonal of the first k rows are the n - i - 1 possibly non-zero elements of the Householder vectors related to P. If m < n, the elements below the subdiagonal of the first k columns are the m - i - 1 possibly non-zero elements of the Householder vectors related to Q, while the elements above the diagonal of the first k rows are the n - i possibly non-zero elements of the vectors associated with P.

  • [in] lda: rocblas_int. lda >= m.

    specifies the leading dimension of A.

  • [out] D: pointer to real type. Array on the GPU of dimension k.

    The diagonal elements of B.

  • [out] E: pointer to real type. Array on the GPU of dimension k.

    The off-diagonal elements of B.

  • [out] tauq: pointer to type. Array on the GPU of dimension k.

    The Householder scalars associated with matrix Q.

  • [out] taup: pointer to type. Array on the GPU of dimension k.

    The Householder scalars associated with matrix P.

  • [out] X: pointer to type. Array on the GPU of dimension ldx*k.

    The m-by-k matrix needed to update the unreduced part of A.

  • [in] ldx: rocblas_int. ldx >= m.

    The leading dimension of X.

  • [out] Y: pointer to type. Array on the GPU of dimension ldy*k.

    The n-by-k matrix needed to update the unreduced part of A.

  • [in] ldy: rocblas_int. ldy >= n.

    The leading dimension of Y.

rocsolver_<type>bdsqr()

rocblas_status rocsolver_zbdsqr(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, const rocblas_int nv, const rocblas_int nu, const rocblas_int nc, double *D, double *E, rocblas_double_complex *V, const rocblas_int ldv, rocblas_double_complex *U, const rocblas_int ldu, rocblas_double_complex *C, const rocblas_int ldc, rocblas_int *info)
rocblas_status rocsolver_cbdsqr(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, const rocblas_int nv, const rocblas_int nu, const rocblas_int nc, float *D, float *E, rocblas_float_complex *V, const rocblas_int ldv, rocblas_float_complex *U, const rocblas_int ldu, rocblas_float_complex *C, const rocblas_int ldc, rocblas_int *info)
rocblas_status rocsolver_dbdsqr(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, const rocblas_int nv, const rocblas_int nu, const rocblas_int nc, double *D, double *E, double *V, const rocblas_int ldv, double *U, const rocblas_int ldu, double *C, const rocblas_int ldc, rocblas_int *info)
rocblas_status rocsolver_sbdsqr(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, const rocblas_int nv, const rocblas_int nu, const rocblas_int nc, float *D, float *E, float *V, const rocblas_int ldv, float *U, const rocblas_int ldu, float *C, const rocblas_int ldc, rocblas_int *info)

BDSQR computes the singular value decomposition (SVD) of an n-by-n bidiagonal matrix B, using the implicit QR algorithm.

The SVD of B has the form:

\[ B = QSP' \]

where S is the n-by-n diagonal matrix of singular values of B, the columns of Q are the left singular vectors of B, and the columns of P are its right singular vectors.

The computation of the singular vectors is optional; this function accepts input matrices U (of size nu-by-n) and V (of size n-by-nv) that are overwritten with \(UQ\) and \(P'V\). If nu = 0 no left vectors are computed; if nv = 0 no right vectors are computed.

Optionally, this function can also compute \(Q'C\) for a given n-by-nc input matrix C.

Parameters
  • [in] handle: rocblas_handle.

  • [in] uplo: rocblas_fill.

    Specifies whether B is upper or lower bidiagonal.

  • [in] n: rocblas_int. n >= 0.

    The number of rows and columns of matrix B.

  • [in] nv: rocblas_int. nv >= 0.

    The number of columns of matrix V.

  • [in] nu: rocblas_int. nu >= 0.

    The number of rows of matrix U.

  • [in] nc: rocblas_int. nu >= 0.

    The number of columns of matrix C.

  • [inout] D: pointer to real type. Array on the GPU of dimension n.

    On entry, the diagonal elements of B. On exit, if info = 0, the singular values of B in decreasing order; if info > 0, the diagonal elements of a bidiagonal matrix orthogonally equivalent to B.

  • [inout] E: pointer to real type. Array on the GPU of dimension n-1.

    On entry, the off-diagonal elements of B. On exit, if info > 0, the off-diagonal elements of a bidiagonal matrix orthogonally equivalent to B (if info = 0 this matrix converges to zero).

  • [inout] V: pointer to type. Array on the GPU of dimension ldv*nv.

    On entry, the matrix V. On exit, it is overwritten with P’*V. (Not referenced if nv = 0).

  • [in] ldv: rocblas_int. ldv >= n if nv > 0, or ldv >=1 if nv = 0.

    The leading dimension of V.

  • [inout] U: pointer to type. Array on the GPU of dimension ldu*n.

    On entry, the matrix U. On exit, it is overwritten with U*Q. (Not referenced if nu = 0).

  • [in] ldu: rocblas_int. ldu >= nu.

    The leading dimension of U.

  • [inout] C: pointer to type. Array on the GPU of dimension ldc*nc.

    On entry, the matrix C. On exit, it is overwritten with Q’*C. (Not referenced if nc = 0).

  • [in] ldc: rocblas_int. ldc >= n if nc > 0, or ldc >=1 if nc = 0.

    The leading dimension of C.

  • [out] info: pointer to a rocblas_int on the GPU.

    If info = 0, successful exit. If info = i > 0, i elements of E have not converged to zero.

3.2.4. Tridiagonal forms

rocsolver_<type>latrd()

rocblas_status rocsolver_zlatrd(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, const rocblas_int k, rocblas_double_complex *A, const rocblas_int lda, double *E, rocblas_double_complex *tau, rocblas_double_complex *W, const rocblas_int ldw)
rocblas_status rocsolver_clatrd(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, const rocblas_int k, rocblas_float_complex *A, const rocblas_int lda, float *E, rocblas_float_complex *tau, rocblas_float_complex *W, const rocblas_int ldw)
rocblas_status rocsolver_dlatrd(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, const rocblas_int k, double *A, const rocblas_int lda, double *E, double *tau, double *W, const rocblas_int ldw)
rocblas_status rocsolver_slatrd(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, const rocblas_int k, float *A, const rocblas_int lda, float *E, float *tau, float *W, const rocblas_int ldw)

LATRD computes the tridiagonal form of k rows and columns of a symmetric/hermitian matrix A, as well as the matrix W needed to update the remaining part of A.

The reduced form is given by:

\[ T = Q'AQ \]

If uplo is lower, the first k rows and columns of T form the tridiagonal block. If uplo is upper, then the last k rows and columns of T form the tridiagonal block. Q is an orthogonal/unitary matrix represented as the product of Householder matrices

\[\begin{split} \begin{array}{cl} Q = H_1H_2\cdots H_k & \text{if uplo indicates lower, or}\\ Q = H_nH_{n-1}\cdots H_{n-k+1} & \text{if uplo is upper}. \end{array} \end{split}\]

Each Householder matrix \(H_i\) is given by

\[ H_i = I - \text{tau}[i]\cdot v_iv_i' \]

where tau[i] is the corresponding Householder scalar. When uplo indicates lower, the first i elements of the Householder vector \(v_i\) are zero, and \(v_i[i+1] = 1\). If uplo is upper, the last n-i elements of the Householder vector \(v_i\) are zero, and \(v_i[i] = 1\).

The unreduced part of the matrix A can be updated using a rank update of the form:

\[ A = A - VW' - WV' \]

where V is the n-by-k matrix formed by the vectors \(v_i\).

Parameters
  • [in] handle: rocblas_handle.

  • [in] uplo: rocblas_fill.

    Specifies whether the upper or lower part of the matrix A is stored. If uplo indicates lower (or upper), then the upper (or lower) part of A is not used.

  • [in] n: rocblas_int. n >= 0.

    The number of rows and columns of the matrix A.

  • [in] k: rocblas_int. 0 <= k <= n.

    The number of rows and columns of the matrix A to be reduced.

  • [inout] A: pointer to type. Array on the GPU of dimension lda*n.

    On entry, the n-by-n matrix to be reduced. On exit, if uplo is lower, the first k columns have been reduced to tridiagonal form (given in the diagonal elements of A and the array E), the elements below the diagonal contain the possibly non-zero entries of the Householder vectors associated with Q, stored as columns. If uplo is upper, the last k columns have been reduced to tridiagonal form (given in the diagonal elements of A and the array E), the elements above the diagonal contain the possibly non-zero entries of the Householder vectors associated with Q, stored as columns.

  • [in] lda: rocblas_int. lda >= n.

    The leading dimension of A.

  • [out] E: pointer to real type. Array on the GPU of dimension n-1.

    If upper (lower), the last (first) k elements of E are the off-diagonal elements of the computed tridiagonal block.

  • [out] tau: pointer to type. Array on the GPU of dimension n-1.

    If upper (lower), the last (first) k elements of tau are the Householder scalars related to Q.

  • [out] W: pointer to type. Array on the GPU of dimension ldw*k.

    The n-by-k matrix needed to update the unreduced part of A.

  • [in] ldw: rocblas_int. ldw >= n.

    The leading dimension of W.

rocsolver_<type>sterf()

rocblas_status rocsolver_dsterf(rocblas_handle handle, const rocblas_int n, double *D, double *E, rocblas_int *info)
rocblas_status rocsolver_ssterf(rocblas_handle handle, const rocblas_int n, float *D, float *E, rocblas_int *info)

STERF computes the eigenvalues of a symmetric tridiagonal matrix.

The eigenvalues of the symmetric tridiagonal matrix are computed by the Pal-Walker-Kahan variant of the QL/QR algorithm, and returned in increasing order.

The matrix is not represented explicitly, but rather as the array of diagonal elements D and the array of symmetric off-diagonal elements E.

Parameters
  • [in] handle: rocblas_handle.

  • [in] n: rocblas_int. n >= 0.

    The number of rows and columns of the tridiagonal matrix.

  • [inout] D: pointer to real type. Array on the GPU of dimension n.

    On entry, the diagonal elements of the tridiagonal matrix. On exit, if info = 0, the eigenvalues in increasing order. If info > 0, the diagonal elements of a tridiagonal matrix that is similar to the original matrix (i.e. has the same eigenvalues).

  • [inout] E: pointer to real type. Array on the GPU of dimension n-1.

    On entry, the off-diagonal elements of the tridiagonal matrix. On exit, if info = 0, this array converges to zero. If info > 0, the off-diagonal elements of a tridiagonal matrix that is similar to the original matrix (i.e. has the same eigenvalues).

  • [out] info: pointer to a rocblas_int on the GPU.

    If info = 0, successful exit. If info = i > 0, STERF did not converge. i elements of E did not converge to zero.

rocsolver_<type>steqr()

rocblas_status rocsolver_zsteqr(rocblas_handle handle, const rocblas_evect evect, const rocblas_int n, double *D, double *E, rocblas_double_complex *C, const rocblas_int ldc, rocblas_int *info)
rocblas_status rocsolver_csteqr(rocblas_handle handle, const rocblas_evect evect, const rocblas_int n, float *D, float *E, rocblas_float_complex *C, const rocblas_int ldc, rocblas_int *info)
rocblas_status rocsolver_dsteqr(rocblas_handle handle, const rocblas_evect evect, const rocblas_int n, double *D, double *E, double *C, const rocblas_int ldc, rocblas_int *info)
rocblas_status rocsolver_ssteqr(rocblas_handle handle, const rocblas_evect evect, const rocblas_int n, float *D, float *E, float *C, const rocblas_int ldc, rocblas_int *info)

STEQR computes the eigenvalues and (optionally) eigenvectors of a symmetric tridiagonal matrix.

The eigenvalues of the symmetric tridiagonal matrix are computed by the implicit QL/QR algorithm, and returned in increasing order.

The matrix is not represented explicitly, but rather as the array of diagonal elements D and the array of symmetric off-diagonal elements E. When D and E correspond to the tridiagonal form of a full symmetric/Hermitian matrix, as returned by, e.g., SYTRD or HETRD, the eigenvectors of the original matrix can also be computed, depending on the value of evect.

Parameters
  • [in] handle: rocblas_handle.

  • [in] evect: rocblas_evect.

    Specifies how the eigenvectors are computed.

  • [in] n: rocblas_int. n >= 0.

    The number of rows and columns of the tridiagonal matrix.

  • [inout] D: pointer to real type. Array on the GPU of dimension n.

    On entry, the diagonal elements of the tridiagonal matrix. On exit, if info = 0, the eigenvalues in increasing order. If info > 0, the diagonal elements of a tridiagonal matrix that is similar to the original matrix (i.e. has the same eigenvalues).

  • [inout] E: pointer to real type. Array on the GPU of dimension n-1.

    On entry, the off-diagonal elements of the tridiagonal matrix. On exit, if info = 0, this array converges to zero. If info > 0, the off-diagonal elements of a tridiagonal matrix that is similar to the original matrix (i.e. has the same eigenvalues).

  • [inout] C: pointer to type. Array on the GPU of dimension ldc*n.

    On entry, if evect is original, the orthogonal/unitary matrix used for the reduction to tridiagonal form as returned by, e.g.,

    ORGTR or UNGTR. On exit, it is overwritten with the eigenvectors of the original symmetric/Hermitian matrix (if evect is original), or the eigenvectors of the tridiagonal matrix (if evect is tridiagonal). (Not referenced if evect is none).

  • [in] ldc: rocblas_int. ldc >= n if evect is original or tridiagonal.

    Specifies the leading dimension of C. (Not referenced if evect is none).

  • [out] info: pointer to a rocblas_int on the GPU.

    If info = 0, successful exit. If info = i > 0, STEQR did not converge. i elements of E did not converge to zero.

rocsolver_<type>stedc()

rocblas_status rocsolver_zstedc(rocblas_handle handle, const rocblas_evect evect, const rocblas_int n, double *D, double *E, rocblas_double_complex *C, const rocblas_int ldc, rocblas_int *info)
rocblas_status rocsolver_cstedc(rocblas_handle handle, const rocblas_evect evect, const rocblas_int n, float *D, float *E, rocblas_float_complex *C, const rocblas_int ldc, rocblas_int *info)
rocblas_status rocsolver_dstedc(rocblas_handle handle, const rocblas_evect evect, const rocblas_int n, double *D, double *E, double *C, const rocblas_int ldc, rocblas_int *info)
rocblas_status rocsolver_sstedc(rocblas_handle handle, const rocblas_evect evect, const rocblas_int n, float *D, float *E, float *C, const rocblas_int ldc, rocblas_int *info)

STEDC computes the eigenvalues and (optionally) eigenvectors of a symmetric tridiagonal matrix.

This function uses the divide and conquer method to compute the eigenvectors. The eigenvalues are returned in increasing order.

The matrix is not represented explicitly, but rather as the array of diagonal elements D and the array of symmetric off-diagonal elements E. When D and E correspond to the tridiagonal form of a full symmetric/Hermitian matrix, as returned by, e.g., SYTRD or HETRD, the eigenvectors of the original matrix can also be computed, depending on the value of evect.

Parameters
  • [in] handle: rocblas_handle.

  • [in] evect: rocblas_evect.

    Specifies how the eigenvectors are computed.

  • [in] n: rocblas_int. n >= 0.

    The number of rows and columns of the tridiagonal matrix.

  • [inout] D: pointer to real type. Array on the GPU of dimension n.

    On entry, the diagonal elements of the tridiagonal matrix. On exit, if info = 0, the eigenvalues in increasing order.

  • [inout] E: pointer to real type. Array on the GPU of dimension n-1.

    On entry, the off-diagonal elements of the tridiagonal matrix. On exit, if info = 0, the values of this array are destroyed.

  • [inout] C: pointer to type. Array on the GPU of dimension ldc*n.

    On entry, if evect is original, the orthogonal/unitary matrix used for the reduction to tridiagonal form as returned by, e.g.,

    ORGTR or UNGTR. On exit, if info = 0, it is overwritten with the eigenvectors of the original symmetric/Hermitian matrix (if evect is original), or the eigenvectors of the tridiagonal matrix (if evect is tridiagonal). (Not referenced if evect is none).

  • [in] ldc: rocblas_int. ldc >= n if evect is original or tridiagonal.

    Specifies the leading dimension of C. (Not referenced if evect is none).

  • [out] info: pointer to a rocblas_int on the GPU.

    If info = 0, successful exit. If info = i > 0, STEDC failed to compute an eigenvalue on the sub-matrix formed by the rows and columns info/(n+1) through mod(info,n+1).

3.2.5. Symmetric matrices

List of functions for symmetric matrices

rocsolver_<type>lasyf()

rocblas_status rocsolver_zlasyf(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, const rocblas_int nb, rocblas_int *kb, rocblas_double_complex *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_int *info)
rocblas_status rocsolver_clasyf(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, const rocblas_int nb, rocblas_int *kb, rocblas_float_complex *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_int *info)
rocblas_status rocsolver_dlasyf(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, const rocblas_int nb, rocblas_int *kb, double *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_int *info)
rocblas_status rocsolver_slasyf(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, const rocblas_int nb, rocblas_int *kb, float *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_int *info)

LASYF computes a partial factorization of a symmetric matrix \(A\) using Bunch-Kaufman diagonal pivoting.

The partial factorization has the form

\[\begin{split} A = \left[ \begin{array}{cc} I & U_{12} \\ 0 & U_{22} \end{array} \right] \left[ \begin{array}{cc} A_{11} & 0 \\ 0 & D \end{array} \right] \left[ \begin{array}{cc} I & 0 \\ U_{12}^T & U_{22}^T \end{array} \right] \end{split}\]

or

\[\begin{split} A = \left[ \begin{array}{cc} L_{11} & 0 \\ L_{21} & I \end{array} \right] \left[ \begin{array}{cc} D & 0 \\ 0 & A_{22} \end{array} \right] \left[ \begin{array}{cc} L_{11}^T & L_{21}^T \\ 0 & I \end{array} \right] \end{split}\]

depending on the value of uplo. The order of the block diagonal matrix \(D\) is either \(nb\) or \(nb-1\), and is returned in the argument \(kb\).

Parameters
  • [in] handle: rocblas_handle.

  • [in] uplo: rocblas_fill.

    Specifies whether the upper or lower part of the matrix A is stored. If uplo indicates lower (or upper), then the upper (or lower) part of A is not used.

  • [in] n: rocblas_int. n >= 0.

    The number of rows and columns of the matrix A.

  • [in] nb: rocblas_int. 2 <= nb <= n.

    The number of columns of A to be factored.

  • [out] kb: pointer to a rocblas_int on the GPU.

    The number of columns of A that were actually factored (either nb or nb-1).

  • [inout] A: pointer to type. Array on the GPU of dimension lda*n.

    On entry, the symmetric matrix A to be factored. On exit, the partially factored matrix.

  • [in] lda: rocblas_int. lda >= n.

    Specifies the leading dimension of A.

  • [out] ipiv: pointer to rocblas_int. Array on the GPU of dimension n.

    The vector of pivot indices. Elements of ipiv are 1-based indices. If uplo is upper, then only the last kb elements of ipiv will be set. For n - kb < k <= n, if ipiv[k] > 0 then rows and columns k and ipiv[k] were interchanged and D[k,k] is a 1-by-1 diagonal block. If, instead, ipiv[k] = ipiv[k-1] < 0, then rows and columns k-1 and -ipiv[k] were interchanged and D[k-1,k-1] to D[k,k] is a 2-by-2 diagonal block. If uplo is lower, then only the first kb elements of ipiv will be set. For 1 <= k <= kb, if ipiv[k] > 0 then rows and columns k and ipiv[k] were interchanged and D[k,k] is a 1-by-1 diagonal block. If, instead, ipiv[k] = ipiv[k+1] < 0, then rows and columns k+1 and -ipiv[k] were interchanged and D[k,k] to D[k+1,k+1] is a 2-by-2 diagonal block.

  • [out] info: pointer to a rocblas_int on the GPU.

    If info = 0, successful exit. If info[i] = j > 0, D is singular. D[j,j] is the first diagonal zero.

3.2.6. Orthonormal matrices

rocsolver_<type>org2r()

rocblas_status rocsolver_dorg2r(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, double *A, const rocblas_int lda, double *ipiv)
rocblas_status rocsolver_sorg2r(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, float *A, const rocblas_int lda, float *ipiv)

ORG2R generates an m-by-n Matrix Q with orthonormal columns.

(This is the unblocked version of the algorithm).

The matrix Q is defined as the first n columns of the product of k Householder reflectors of order m

\[ Q = H_1H_2\cdots H_k. \]

The Householder matrices \(H_i\) are never stored, they are computed from its corresponding Householder vectors \(v_i\) and scalars \(\text{ipiv}[i]\), as returned by GEQRF.

Parameters
  • [in] handle: rocblas_handle.

  • [in] m: rocblas_int. m >= 0.

    The number of rows of the matrix Q.

  • [in] n: rocblas_int. 0 <= n <= m.

    The number of columns of the matrix Q.

  • [in] k: rocblas_int. 0 <= k <= n.

    The number of Householder reflectors.

  • [inout] A: pointer to type. Array on the GPU of dimension lda*n.

    On entry, the matrix A as returned by

    GEQRF, with the Householder vectors in the first k columns. On exit, the computed matrix Q.

  • [in] lda: rocblas_int. lda >= m.

    Specifies the leading dimension of A.

  • [in] ipiv: pointer to type. Array on the GPU of dimension at least k.

    The Householder scalars as returned by

    GEQRF.

rocsolver_<type>orgqr()

rocblas_status rocsolver_dorgqr(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, double *A, const rocblas_int lda, double *ipiv)
rocblas_status rocsolver_sorgqr(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, float *A, const rocblas_int lda, float *ipiv)

ORGQR generates an m-by-n Matrix Q with orthonormal columns.

(This is the blocked version of the algorithm).

The matrix Q is defined as the first n columns of the product of k Householder reflectors of order m

\[ Q = H_1H_2\cdots H_k \]

The Householder matrices \(H_i\) are never stored, they are computed from its corresponding Householder vectors \(v_i\) and scalars \(\text{ipiv}[i]\), as returned by GEQRF.

Parameters
  • [in] handle: rocblas_handle.

  • [in] m: rocblas_int. m >= 0.

    The number of rows of the matrix Q.

  • [in] n: rocblas_int. 0 <= n <= m.

    The number of columns of the matrix Q.

  • [in] k: rocblas_int. 0 <= k <= n.

    The number of Householder reflectors.

  • [inout] A: pointer to type. Array on the GPU of dimension lda*n.

    On entry, the matrix A as returned by

    GEQRF, with the Householder vectors in the first k columns. On exit, the computed matrix Q.

  • [in] lda: rocblas_int. lda >= m.

    Specifies the leading dimension of A.

  • [in] ipiv: pointer to type. Array on the GPU of dimension at least k.

    The Householder scalars as returned by

    GEQRF.

rocsolver_<type>orgl2()

rocblas_status rocsolver_dorgl2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, double *A, const rocblas_int lda, double *ipiv)
rocblas_status rocsolver_sorgl2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, float *A, const rocblas_int lda, float *ipiv)

ORGL2 generates an m-by-n Matrix Q with orthonormal rows.

(This is the unblocked version of the algorithm).

The matrix Q is defined as the first m rows of the product of k Householder reflectors of order n

\[ Q = H_kH_{k-1}\cdots H_1 \]

The Householder matrices \(H_i\) are never stored, they are computed from its corresponding Householder vectors \(v_i\) and scalars \(\text{ipiv}[i]\), as returned by GELQF.

Parameters
  • [in] handle: rocblas_handle.

  • [in] m: rocblas_int. 0 <= m <= n.

    The number of rows of the matrix Q.

  • [in] n: rocblas_int. n >= 0.

    The number of columns of the matrix Q.

  • [in] k: rocblas_int. 0 <= k <= m.

    The number of Householder reflectors.

  • [inout] A: pointer to type. Array on the GPU of dimension lda*n.

    On entry, the matrix A as returned by

    GELQF, with the Householder vectors in the first k rows. On exit, the computed matrix Q.

  • [in] lda: rocblas_int. lda >= m.

    Specifies the leading dimension of A.

  • [in] ipiv: pointer to type. Array on the GPU of dimension at least k.

    The Householder scalars as returned by

    GELQF.

rocsolver_<type>orglq()

rocblas_status rocsolver_dorglq(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, double *A, const rocblas_int lda, double *ipiv)
rocblas_status rocsolver_sorglq(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, float *A, const rocblas_int lda, float *ipiv)

ORGLQ generates an m-by-n Matrix Q with orthonormal rows.

(This is the blocked version of the algorithm).

The matrix Q is defined as the first m rows of the product of k Householder reflectors of order n

\[ Q = H_kH_{k-1}\cdots H_1 \]

The Householder matrices \(H_i\) are never stored, they are computed from its corresponding Householder vectors \(v_i\) and scalars \(\text{ipiv}[i]\), as returned by GELQF.

Parameters
  • [in] handle: rocblas_handle.

  • [in] m: rocblas_int. 0 <= m <= n.

    The number of rows of the matrix Q.

  • [in] n: rocblas_int. n >= 0.

    The number of columns of the matrix Q.

  • [in] k: rocblas_int. 0 <= k <= m.

    The number of Householder reflectors.

  • [inout] A: pointer to type. Array on the GPU of dimension lda*n.

    On entry, the matrix A as returned by

    GELQF, with the Householder vectors in the first k rows. On exit, the computed matrix Q.

  • [in] lda: rocblas_int. lda >= m.

    Specifies the leading dimension of A.

  • [in] ipiv: pointer to type. Array on the GPU of dimension at least k.

    The Householder scalars as returned by

    GELQF.

rocsolver_<type>org2l()

rocblas_status rocsolver_dorg2l(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, double *A, const rocblas_int lda, double *ipiv)
rocblas_status rocsolver_sorg2l(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, float *A, const rocblas_int lda, float *ipiv)

ORG2L generates an m-by-n Matrix Q with orthonormal columns.

(This is the unblocked version of the algorithm).

The matrix Q is defined as the last n columns of the product of k Householder reflectors of order m

\[ Q = H_kH_{k-1}\cdots H_1 \]

The Householder matrices \(H_i\) are never stored, they are computed from its corresponding Householder vectors \(v_i\) and scalars \(\text{ipiv}[i]\), as returned by GEQLF.

Parameters
  • [in] handle: rocblas_handle.

  • [in] m: rocblas_int. m >= 0.

    The number of rows of the matrix Q.

  • [in] n: rocblas_int. 0 <= n <= m.

    The number of columns of the matrix Q.

  • [in] k: rocblas_int. 0 <= k <= n.

    The number of Householder reflectors.

  • [inout] A: pointer to type. Array on the GPU of dimension lda*n.

    On entry, the matrix A as returned by

    GEQLF, with the Householder vectors in the last k columns. On exit, the computed matrix Q.

  • [in] lda: rocblas_int. lda >= m.

    Specifies the leading dimension of A.

  • [in] ipiv: pointer to type. Array on the GPU of dimension at least k.

    The Householder scalars as returned by

    GEQLF.

rocsolver_<type>orgql()

rocblas_status rocsolver_dorgql(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, double *A, const rocblas_int lda, double *ipiv)
rocblas_status rocsolver_sorgql(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, float *A, const rocblas_int lda, float *ipiv)

ORGQL generates an m-by-n Matrix Q with orthonormal columns.

(This is the blocked version of the algorithm).

The matrix Q is defined as the last n column of the product of k Householder reflectors of order m

\[ Q = H_kH_{k-1}\cdots H_1 \]

The Householder matrices \(H_i\) are never stored, they are computed from its corresponding Householder vectors \(v_i\) and scalars \(\text{ipiv}[i]\), as returned by GEQLF.

Parameters
  • [in] handle: rocblas_handle.

  • [in] m: rocblas_int. m >= 0.

    The number of rows of the matrix Q.

  • [in] n: rocblas_int. 0 <= n <= m.

    The number of columns of the matrix Q.

  • [in] k: rocblas_int. 0 <= k <= n.

    The number of Householder reflectors.

  • [inout] A: pointer to type. Array on the GPU of dimension lda*n.

    On entry, the matrix A as returned by

    GEQLF, with the Householder vectors in the last k columns. On exit, the computed matrix Q.

  • [in] lda: rocblas_int. lda >= m.

    Specifies the leading dimension of A.

  • [in] ipiv: pointer to type. Array on the GPU of dimension at least k.

    The Householder scalars as returned by

    GEQLF.

rocsolver_<type>orgbr()

rocblas_status rocsolver_dorgbr(rocblas_handle handle, const rocblas_storev storev, const rocblas_int m, const rocblas_int n, const rocblas_int k, double *A, const rocblas_int lda, double *ipiv)
rocblas_status rocsolver_sorgbr(rocblas_handle handle, const rocblas_storev storev, const rocblas_int m, const rocblas_int n, const rocblas_int k, float *A, const rocblas_int lda, float *ipiv)

ORGBR generates an m-by-n Matrix Q with orthonormal rows or columns.

If storev is column-wise, then the matrix Q has orthonormal columns. If m >= k, Q is defined as the first n columns of the product of k Householder reflectors of order m

\[ Q = H_1H_2\cdots H_k \]

If m < k, Q is defined as the product of Householder reflectors of order m

\[ Q = H_1H_2\cdots H_{m-1} \]

On the other hand, if storev is row-wise, then the matrix Q has orthonormal rows. If n > k, Q is defined as the first m rows of the product of k Householder reflectors of order n

\[ Q = H_kH_{k-1}\cdots H_1 \]

If n <= k, Q is defined as the product of Householder reflectors of order n

\[ Q = H_{n-1}H_{n-2}\cdots H_1 \]

The Householder matrices \(H_i\) are never stored, they are computed from its corresponding Householder vectors \(v_i\) and scalars \(\text{ipiv}[i]\), as returned by GEBRD in its arguments A and tauq or taup.

Parameters
  • [in] handle: rocblas_handle.

  • [in] storev: rocblas_storev.

    Specifies whether to work column-wise or row-wise.

  • [in] m: rocblas_int. m >= 0.

    The number of rows of the matrix Q. If row-wise, then min(n,k) <= m <= n.

  • [in] n: rocblas_int. n >= 0.

    The number of columns of the matrix Q. If column-wise, then min(m,k) <= n <= m.

  • [in] k: rocblas_int. k >= 0.

    The number of columns (if storev is column-wise) or rows (if row-wise) of the original matrix reduced by

    GEBRD.

  • [inout] A: pointer to type. Array on the GPU of dimension lda*n.

    On entry, the Householder vectors as returned by

    GEBRD. On exit, the computed matrix Q.

  • [in] lda: rocblas_int. lda >= m.

    Specifies the leading dimension of A.

  • [in] ipiv: pointer to type. Array on the GPU of dimension min(m,k) if column-wise, or min(n,k) if row-wise.

    The Householder scalars as returned by

    GEBRD.

rocsolver_<type>orgtr()

rocblas_status rocsolver_dorgtr(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, double *A, const rocblas_int lda, double *ipiv)
rocblas_status rocsolver_sorgtr(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, float *A, const rocblas_int lda, float *ipiv)

ORGTR generates an n-by-n orthogonal Matrix Q.

Q is defined as the product of n-1 Householder reflectors of order n. If uplo indicates upper, then Q has the form

\[ Q = H_{n-1}H_{n-2}\cdots H_1 \]

On the other hand, if uplo indicates lower, then Q has the form

\[ Q = H_1H_2\cdots H_{n-1} \]

The Householder matrices \(H_i\) are never stored, they are computed from its corresponding Householder vectors \(v_i\) and scalars \(\text{ipiv}[i]\), as returned by SYTRD in its arguments A and tau.

Parameters
  • [in] handle: rocblas_handle.

  • [in] uplo: rocblas_fill.

    Specifies whether the

    SYTRD factorization was upper or lower triangular. If uplo indicates lower (or upper), then the upper (or lower) part of A is not used.

  • [in] n: rocblas_int. n >= 0.

    The number of rows and columns of the matrix Q.

  • [inout] A: pointer to type. Array on the GPU of dimension lda*n.

    On entry, the Householder vectors as returned by

    SYTRD. On exit, the computed matrix Q.

  • [in] lda: rocblas_int. lda >= n.

    Specifies the leading dimension of A.

  • [in] ipiv: pointer to type. Array on the GPU of dimension n-1.

    The Householder scalars as returned by

    SYTRD.

rocsolver_<type>orm2r()

rocblas_status rocsolver_dorm2r(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, double *A, const rocblas_int lda, double *ipiv, double *C, const rocblas_int ldc)
rocblas_status rocsolver_sorm2r(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, float *A, const rocblas_int lda, float *ipiv, float *C, const rocblas_int ldc)

ORM2R multiplies a matrix Q with orthonormal columns by a general m-by-n matrix C.

(This is the unblocked version of the algorithm).

The matrix Q is applied in one of the following forms, depending on the values of side and trans:

\[\begin{split} \begin{array}{cl} QC & \: \text{No transpose from the left,}\\ Q^TC & \: \text{Transpose from the left,}\\ CQ & \: \text{No transpose from the right, and}\\ CQ^T & \: \text{Transpose from the right.} \end{array} \end{split}\]

Q is defined as the product of k Householder reflectors

\[ Q = H_1H_2 \cdots H_k \]

of order m if applying from the left, or n if applying from the right. Q is never stored, it is calculated from the Householder vectors and scalars returned by the QR factorization GEQRF.

Parameters
  • [in] handle: rocblas_handle.

  • [in] side: rocblas_side.

    Specifies from which side to apply Q.

  • [in] trans: rocblas_operation.

    Specifies whether the matrix Q or its transpose is to be applied.

  • [in] m: rocblas_int. m >= 0.

    Number of rows of matrix C.

  • [in] n: rocblas_int. n >= 0.

    Number of columns of matrix C.

  • [in] k: rocblas_int. k >= 0; k <= m if side is left, k <= n if side is right.

    The number of Householder reflectors that form Q.

  • [in] A: pointer to type. Array on the GPU of size lda*k.

    The Householder vectors as returned by

    GEQRF in the first k columns of its argument A.

  • [in] lda: rocblas_int. lda >= m if side is left, or lda >= n if side is right.

    Leading dimension of A.

  • [in] ipiv: pointer to type. Array on the GPU of dimension at least k.

    The Householder scalars as returned by

    GEQRF.

  • [inout] C: pointer to type. Array on the GPU of size ldc*n.

    On entry, the matrix C. On exit, it is overwritten with Q*C, C*Q, Q’*C, or C*Q’.

  • [in] ldc: rocblas_int. ldc >= m.

    Leading dimension of C.

rocsolver_<type>ormqr()

rocblas_status rocsolver_dormqr(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, double *A, const rocblas_int lda, double *ipiv, double *C, const rocblas_int ldc)
rocblas_status rocsolver_sormqr(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, float *A, const rocblas_int lda, float *ipiv, float *C, const rocblas_int ldc)

ORMQR multiplies a matrix Q with orthonormal columns by a general m-by-n matrix C.

(This is the blocked version of the algorithm).

The matrix Q is applied in one of the following forms, depending on the values of side and trans:

\[\begin{split} \begin{array}{cl} QC & \: \text{No transpose from the left,}\\ Q^TC & \: \text{Transpose from the left,}\\ CQ & \: \text{No transpose from the right, and}\\ CQ^T & \: \text{Transpose from the right.} \end{array} \end{split}\]

Q is defined as the product of k Householder reflectors

\[ Q = H_1H_2\cdots H_k \]

of order m if applying from the left, or n if applying from the right. Q is never stored, it is calculated from the Householder vectors and scalars returned by the QR factorization GEQRF.

Parameters
  • [in] handle: rocblas_handle.

  • [in] side: rocblas_side.

    Specifies from which side to apply Q.

  • [in] trans: rocblas_operation.

    Specifies whether the matrix Q or its transpose is to be applied.

  • [in] m: rocblas_int. m >= 0.

    Number of rows of matrix C.

  • [in] n: rocblas_int. n >= 0.

    Number of columns of matrix C.

  • [in] k: rocblas_int. k >= 0; k <= m if side is left, k <= n if side is right.

    The number of Householder reflectors that form Q.

  • [in] A: pointer to type. Array on the GPU of size lda*k.

    The Householder vectors as returned by

    GEQRF in the first k columns of its argument A.

  • [in] lda: rocblas_int. lda >= m if side is left, or lda >= n if side is right.

    Leading dimension of A.

  • [in] ipiv: pointer to type. Array on the GPU of dimension at least k.

    The Householder scalars as returned by

    GEQRF.

  • [inout] C: pointer to type. Array on the GPU of size ldc*n.

    On entry, the matrix C. On exit, it is overwritten with Q*C, C*Q, Q’*C, or C*Q’.

  • [in] ldc: rocblas_int. ldc >= m.

    Leading dimension of C.

rocsolver_<type>orml2()

rocblas_status rocsolver_dorml2(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, double *A, const rocblas_int lda, double *ipiv, double *C, const rocblas_int ldc)
rocblas_status rocsolver_sorml2(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, float *A, const rocblas_int lda, float *ipiv, float *C, const rocblas_int ldc)

ORML2 multiplies a matrix Q with orthonormal rows by a general m-by-n matrix C.

(This is the unblocked version of the algorithm).

The matrix Q is applied in one of the following forms, depending on the values of side and trans:

\[\begin{split} \begin{array}{cl} QC & \: \text{No transpose from the left,}\\ Q^TC & \: \text{Transpose from the left,}\\ CQ & \: \text{No transpose from the right, and}\\ CQ^T & \: \text{Transpose from the right.} \end{array} \end{split}\]

Q is defined as the product of k Householder reflectors

\[ Q = H_kH_{k-1}\cdots H_1 \]

of order m if applying from the left, or n if applying from the right. Q is never stored, it is calculated from the Householder vectors and scalars returned by the LQ factorization GELQF.

Parameters
  • [in] handle: rocblas_handle.

  • [in] side: rocblas_side.

    Specifies from which side to apply Q.

  • [in] trans: rocblas_operation.

    Specifies whether the matrix Q or its transpose is to be applied.

  • [in] m: rocblas_int. m >= 0.

    Number of rows of matrix C.

  • [in] n: rocblas_int. n >= 0.

    Number of columns of matrix C.

  • [in] k: rocblas_int. k >= 0; k <= m if side is left, k <= n if side is right.

    The number of Householder reflectors that form Q.

  • [in] A: pointer to type. Array on the GPU of size lda*m if side is left, or lda*n if side is right.

    The Householder vectors as returned by

    GELQF in the first k rows of its argument A.

  • [in] lda: rocblas_int. lda >= k.

    Leading dimension of A.

  • [in] ipiv: pointer to type. Array on the GPU of dimension at least k.

    The Householder scalars as returned by

    GELQF.

  • [inout] C: pointer to type. Array on the GPU of size ldc*n.

    On entry, the matrix C. On exit, it is overwritten with Q*C, C*Q, Q’*C, or C*Q’.

  • [in] ldc: rocblas_int. ldc >= m.

    Leading dimension of C.

rocsolver_<type>ormlq()

rocblas_status rocsolver_dormlq(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, double *A, const rocblas_int lda, double *ipiv, double *C, const rocblas_int ldc)
rocblas_status rocsolver_sormlq(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, float *A, const rocblas_int lda, float *ipiv, float *C, const rocblas_int ldc)

ORMLQ multiplies a matrix Q with orthonormal rows by a general m-by-n matrix C.

(This is the blocked version of the algorithm).

The matrix Q is applied in one of the following forms, depending on the values of side and trans:

\[\begin{split} \begin{array}{cl} QC & \: \text{No transpose from the left,}\\ Q^TC & \: \text{Transpose from the left,}\\ CQ & \: \text{No transpose from the right, and}\\ CQ^T & \: \text{Transpose from the right.} \end{array} \end{split}\]

Q is defined as the product of k Householder reflectors

\[ Q = H_kH_{k-1}\cdots H_1 \]

of order m if applying from the left, or n if applying from the right. Q is never stored, it is calculated from the Householder vectors and scalars returned by the LQ factorization GELQF.

Parameters
  • [in] handle: rocblas_handle.

  • [in] side: rocblas_side.

    Specifies from which side to apply Q.

  • [in] trans: rocblas_operation.

    Specifies whether the matrix Q or its transpose is to be applied.

  • [in] m: rocblas_int. m >= 0.

    Number of rows of matrix C.

  • [in] n: rocblas_int. n >= 0.

    Number of columns of matrix C.

  • [in] k: rocblas_int. k >= 0; k <= m if side is left, k <= n if side is right.

    The number of Householder reflectors that form Q.

  • [in] A: pointer to type. Array on the GPU of size lda*m if side is left, or lda*n if side is right.

    The Householder vectors as returned by

    GELQF in the first k rows of its argument A.

  • [in] lda: rocblas_int. lda >= k.

    Leading dimension of A.

  • [in] ipiv: pointer to type. Array on the GPU of dimension at least k.

    The Householder scalars as returned by

    GELQF.

  • [inout] C: pointer to type. Array on the GPU of size ldc*n.

    On entry, the matrix C. On exit, it is overwritten with Q*C, C*Q, Q’*C, or C*Q’.

  • [in] ldc: rocblas_int. ldc >= m.

    Leading dimension of C.

rocsolver_<type>orm2l()

rocblas_status rocsolver_dorm2l(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, double *A, const rocblas_int lda, double *ipiv, double *C, const rocblas_int ldc)
rocblas_status rocsolver_sorm2l(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, float *A, const rocblas_int lda, float *ipiv, float *C, const rocblas_int ldc)

ORM2L multiplies a matrix Q with orthonormal columns by a general m-by-n matrix C.

(This is the unblocked version of the algorithm).

The matrix Q is applied in one of the following forms, depending on the values of side and trans:

\[\begin{split} \begin{array}{cl} QC & \: \text{No transpose from the left,}\\ Q^TC & \: \text{Transpose from the left,}\\ CQ & \: \text{No transpose from the right, and}\\ CQ^T & \: \text{Transpose from the right.} \end{array} \end{split}\]

Q is defined as the product of k Householder reflectors

\[ Q = H_kH_{k-1}\cdots H_1 \]

of order m if applying from the left, or n if applying from the right. Q is never stored, it is calculated from the Householder vectors and scalars returned by the QL factorization GEQLF.

Parameters
  • [in] handle: rocblas_handle.

  • [in] side: rocblas_side.

    Specifies from which side to apply Q.

  • [in] trans: rocblas_operation.

    Specifies whether the matrix Q or its transpose is to be applied.

  • [in] m: rocblas_int. m >= 0.

    Number of rows of matrix C.

  • [in] n: rocblas_int. n >= 0.

    Number of columns of matrix C.

  • [in] k: rocblas_int. k >= 0; k <= m if side is left, k <= n if side is right.

    The number of Householder reflectors that form Q.

  • [in] A: pointer to type. Array on the GPU of size lda*k.

    The Householder vectors as returned by

    GEQLF in the last k columns of its argument A.

  • [in] lda: rocblas_int. lda >= m if side is left, lda >= n if side is right.

    Leading dimension of A.

  • [in] ipiv: pointer to type. Array on the GPU of dimension at least k.

    The Householder scalars as returned by

    GEQLF.

  • [inout] C: pointer to type. Array on the GPU of size ldc*n.

    On entry, the matrix C. On exit, it is overwritten with Q*C, C*Q, Q’*C, or C*Q’.

  • [in] ldc: rocblas_int. ldc >= m.

    Leading dimension of C.

rocsolver_<type>ormql()

rocblas_status rocsolver_dormql(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, double *A, const rocblas_int lda, double *ipiv, double *C, const rocblas_int ldc)
rocblas_status rocsolver_sormql(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, float *A, const rocblas_int lda, float *ipiv, float *C, const rocblas_int ldc)

ORMQL multiplies a matrix Q with orthonormal columns by a general m-by-n matrix C.

(This is the blocked version of the algorithm).

The matrix Q is applied in one of the following forms, depending on the values of side and trans:

\[\begin{split} \begin{array}{cl} QC & \: \text{No transpose from the left,}\\ Q^TC & \: \text{Transpose from the left,}\\ CQ & \: \text{No transpose from the right, and}\\ CQ^T & \: \text{Transpose from the right.} \end{array} \end{split}\]

Q is defined as the product of k Householder reflectors

\[ Q = H_kH_{k-1}\cdots H_1 \]

of order m if applying from the left, or n if applying from the right. Q is never stored, it is calculated from the Householder vectors and scalars returned by the QL factorization GEQLF.

Parameters
  • [in] handle: rocblas_handle.

  • [in] side: rocblas_side.

    Specifies from which side to apply Q.

  • [in] trans: rocblas_operation.

    Specifies whether the matrix Q or its transpose is to be applied.

  • [in] m: rocblas_int. m >= 0.

    Number of rows of matrix C.

  • [in] n: rocblas_int. n >= 0.

    Number of columns of matrix C.

  • [in] k: rocblas_int. k >= 0; k <= m if side is left, k <= n if side is right.

    The number of Householder reflectors that form Q.

  • [in] A: pointer to type. Array on the GPU of size lda*k.

    The Householder vectors as returned by

    GEQLF in the last k columns of its argument A.

  • [in] lda: rocblas_int. lda >= m if side is left, lda >= n if side is right.

    Leading dimension of A.

  • [in] ipiv: pointer to type. Array on the GPU of dimension at least k.

    The Householder scalars as returned by

    GEQLF.

  • [inout] C: pointer to type. Array on the GPU of size ldc*n.

    On entry, the matrix C. On exit, it is overwritten with Q*C, C*Q, Q’*C, or C*Q’.

  • [in] ldc: rocblas_int. ldc >= m.

    Leading dimension of C.

rocsolver_<type>ormbr()

rocblas_status rocsolver_dormbr(rocblas_handle handle, const rocblas_storev storev, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, double *A, const rocblas_int lda, double *ipiv, double *C, const rocblas_int ldc)
rocblas_status rocsolver_sormbr(rocblas_handle handle, const rocblas_storev storev, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, float *A, const rocblas_int lda, float *ipiv, float *C, const rocblas_int ldc)

ORMBR multiplies a matrix Q with orthonormal rows or columns by a general m-by-n matrix C.

If storev is column-wise, then the matrix Q has orthonormal columns. If storev is row-wise, then the matrix Q has orthonormal rows. The matrix Q is applied in one of the following forms, depending on the values of side and trans:

\[\begin{split} \begin{array}{cl} QC & \: \text{No transpose from the left,}\\ Q^TC & \: \text{Transpose from the left,}\\ CQ & \: \text{No transpose from the right, and}\\ CQ^T & \: \text{Transpose from the right.} \end{array} \end{split}\]

The order q of the orthogonal matrix Q is q = m if applying from the left, or q = n if applying from the right.

When storev is column-wise, if q >= k, then Q is defined as the product of k Householder reflectors

\[ Q = H_1H_2\cdots H_k, \]

and if q < k, then Q is defined as the product

\[ Q = H_1H_2\cdots H_{q-1}. \]

When storev is row-wise, if q > k, then Q is defined as the product of k Householder reflectors

\[ Q = H_1H_2\cdots H_k, \]

and if q <= k, Q is defined as the product

\[ Q = H_1H_2\cdots H_{q-1}. \]

The Householder matrices \(H_i\) are never stored, they are computed from its corresponding Householder vectors and scalars as returned by GEBRD in its arguments A and tauq or taup.

Parameters
  • [in] handle: rocblas_handle.

  • [in] storev: rocblas_storev.

    Specifies whether to work column-wise or row-wise.

  • [in] side: rocblas_side.

    Specifies from which side to apply Q.

  • [in] trans: rocblas_operation.

    Specifies whether the matrix Q or its transpose is to be applied.

  • [in] m: rocblas_int. m >= 0.

    Number of rows of matrix C.

  • [in] n: rocblas_int. n >= 0.

    Number of columns of matrix C.

  • [in] k: rocblas_int. k >= 0.

    The number of columns (if storev is column-wise) or rows (if row-wise) of the original matrix reduced by

    GEBRD.

  • [in] A: pointer to type. Array on the GPU of size lda*min(q,k) if column-wise, or lda*q if row-wise.

    The Householder vectors as returned by

    GEBRD.

  • [in] lda: rocblas_int. lda >= q if column-wise, or lda >= min(q,k) if row-wise.

    Leading dimension of A.

  • [in] ipiv: pointer to type. Array on the GPU of dimension at least min(q,k).

    The Householder scalars as returned by

    GEBRD.

  • [inout] C: pointer to type. Array on the GPU of size ldc*n.

    On entry, the matrix C. On exit, it is overwritten with Q*C, C*Q, Q’*C, or C*Q’.

  • [in] ldc: rocblas_int. ldc >= m.

    Leading dimension of C.

rocsolver_<type>ormtr()

rocblas_status rocsolver_dormtr(rocblas_handle handle, const rocblas_side side, const rocblas_fill uplo, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, double *ipiv, double *C, const rocblas_int ldc)
rocblas_status rocsolver_sormtr(rocblas_handle handle, const rocblas_side side, const rocblas_fill uplo, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, float *ipiv, float *C, const rocblas_int ldc)

ORMTR multiplies an orthogonal matrix Q by a general m-by-n matrix C.

The matrix Q is applied in one of the following forms, depending on the values of side and trans:

\[\begin{split} \begin{array}{cl} QC & \: \text{No transpose from the left,}\\ Q^TC & \: \text{Transpose from the left,}\\ CQ & \: \text{No transpose from the right, and}\\ CQ^T & \: \text{Transpose from the right.} \end{array} \end{split}\]

The order q of the orthogonal matrix Q is q = m if applying from the left, or q = n if applying from the right.

Q is defined as a product of q-1 Householder reflectors. If uplo indicates upper, then Q has the form

\[ Q = H_{q-1}H_{q-2}\cdots H_1. \]

On the other hand, if uplo indicates lower, then Q has the form

\[ Q = H_1H_2\cdots H_{q-1} \]

The Householder matrices \(H_i\) are never stored, they are computed from its corresponding Householder vectors and scalars as returned by SYTRD in its arguments A and tau.

Parameters
  • [in] handle: rocblas_handle.

  • [in] side: rocblas_side.

    Specifies from which side to apply Q.

  • [in] uplo: rocblas_fill.

    Specifies whether the

    SYTRD factorization was upper or lower triangular. If uplo indicates lower (or upper), then the upper (or lower) part of A is not used.

  • [in] trans: rocblas_operation.

    Specifies whether the matrix Q or its transpose is to be applied.

  • [in] m: rocblas_int. m >= 0.

    Number of rows of matrix C.

  • [in] n: rocblas_int. n >= 0.

    Number of columns of matrix C.

  • [in] A: pointer to type. Array on the GPU of size lda*q.

    On entry, the Householder vectors as returned by

    SYTRD.

  • [in] lda: rocblas_int. lda >= q.

    Leading dimension of A.

  • [in] ipiv: pointer to type. Array on the GPU of dimension at least q-1.

    The Householder scalars as returned by

    SYTRD.

  • [inout] C: pointer to type. Array on the GPU of size ldc*n.

    On entry, the matrix C. On exit, it is overwritten with Q*C, C*Q, Q’*C, or C*Q’.

  • [in] ldc: rocblas_int. ldc >= m.

    Leading dimension of C.

3.2.7. Unitary matrices

rocsolver_<type>ung2r()

rocblas_status rocsolver_zung2r(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv)
rocblas_status rocsolver_cung2r(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv)

UNG2R generates an m-by-n complex Matrix Q with orthonormal columns.

(This is the unblocked version of the algorithm).

The matrix Q is defined as the first n columns of the product of k Householder reflectors of order m

\[ Q = H_1H_2\cdots H_k \]

The Householder matrices \(H_i\) are never stored, they are computed from its corresponding Householder vectors \(v_i\) and scalars \(\text{ipiv}[i]\), as returned by GEQRF.

Parameters
  • [in] handle: rocblas_handle.

  • [in] m: rocblas_int. m >= 0.

    The number of rows of the matrix Q.

  • [in] n: rocblas_int. 0 <= n <= m.

    The number of columns of the matrix Q.

  • [in] k: rocblas_int. 0 <= k <= n.

    The number of Householder reflectors.

  • [inout] A: pointer to type. Array on the GPU of dimension lda*n.

    On entry, the matrix A as returned by

    GEQRF, with the Householder vectors in the first k columns. On exit, the computed matrix Q.

  • [in] lda: rocblas_int. lda >= m.

    Specifies the leading dimension of A.

  • [in] ipiv: pointer to type. Array on the GPU of dimension at least k.

    The Householder scalars as returned by

    GEQRF.

rocsolver_<type>ungqr()

rocblas_status rocsolver_zungqr(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv)
rocblas_status rocsolver_cungqr(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv)

UNGQR generates an m-by-n complex Matrix Q with orthonormal columns.

(This is the blocked version of the algorithm).

The matrix Q is defined as the first n columns of the product of k Householder reflectors of order m

\[ Q = H_1H_2\cdots H_k \]

Householder matrices \(H_i\) are never stored, they are computed from its corresponding Householder vectors \(v_i\) and scalars \(\text{ipiv}[i]\), as returned by GEQRF.

Parameters
  • [in] handle: rocblas_handle.

  • [in] m: rocblas_int. m >= 0.

    The number of rows of the matrix Q.

  • [in] n: rocblas_int. 0 <= n <= m.

    The number of columns of the matrix Q.

  • [in] k: rocblas_int. 0 <= k <= n.

    The number of Householder reflectors.

  • [inout] A: pointer to type. Array on the GPU of dimension lda*n.

    On entry, the matrix A as returned by

    GEQRF, with the Householder vectors in the first k columns. On exit, the computed matrix Q.

  • [in] lda: rocblas_int. lda >= m.

    Specifies the leading dimension of A.

  • [in] ipiv: pointer to type. Array on the GPU of dimension at least k.

    The Householder scalars as returned by

    GEQRF.

rocsolver_<type>ungl2()

rocblas_status rocsolver_zungl2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv)
rocblas_status rocsolver_cungl2(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv)

UNGL2 generates an m-by-n complex Matrix Q with orthonormal rows.

(This is the unblocked version of the algorithm).

The matrix Q is defined as the first m rows of the product of k Householder reflectors of order n

\[ Q = H_k^HH_{k-1}^H\cdots H_1^H \]

The Householder matrices \(H_i\) are never stored, they are computed from its corresponding Householder vectors \(v_i\) and scalars \(\text{ipiv}[i]\), as returned by GELQF.

Parameters
  • [in] handle: rocblas_handle.

  • [in] m: rocblas_int. 0 <= m <= n.

    The number of rows of the matrix Q.

  • [in] n: rocblas_int. n >= 0.

    The number of columns of the matrix Q.

  • [in] k: rocblas_int. 0 <= k <= m.

    The number of Householder reflectors.

  • [inout] A: pointer to type. Array on the GPU of dimension lda*n.

    On entry, the matrix A as returned by

    GELQF, with the Householder vectors in the first k rows. On exit, the computed matrix Q.

  • [in] lda: rocblas_int. lda >= m.

    Specifies the leading dimension of A.

  • [in] ipiv: pointer to type. Array on the GPU of dimension at least k.

    The Householder scalars as returned by

    GELQF.

rocsolver_<type>unglq()

rocblas_status rocsolver_zunglq(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv)
rocblas_status rocsolver_cunglq(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv)

UNGLQ generates an m-by-n complex Matrix Q with orthonormal rows.

(This is the blocked version of the algorithm).

The matrix Q is defined as the first m rows of the product of k Householder reflectors of order n

\[ Q = H_k^HH_{k-1}^H\cdots H_1^H \]

The Householder matrices \(H_i\) are never stored, they are computed from its corresponding Householder vectors \(v_i\) and scalars \(\text{ipiv}[i]\), as returned by GELQF.

Parameters
  • [in] handle: rocblas_handle.

  • [in] m: rocblas_int. 0 <= m <= n.

    The number of rows of the matrix Q.

  • [in] n: rocblas_int. n >= 0.

    The number of columns of the matrix Q.

  • [in] k: rocblas_int. 0 <= k <= m.

    The number of Householder reflectors.

  • [inout] A: pointer to type. Array on the GPU of dimension lda*n.

    On entry, the matrix A as returned by

    GELQF, with the Householder vectors in the first k rows. On exit, the computed matrix Q.

  • [in] lda: rocblas_int. lda >= m.

    Specifies the leading dimension of A.

  • [in] ipiv: pointer to type. Array on the GPU of dimension at least k.

    The Householder scalars as returned by

    GELQF.

rocsolver_<type>ung2l()

rocblas_status rocsolver_zung2l(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv)
rocblas_status rocsolver_cung2l(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv)

UNG2L generates an m-by-n complex Matrix Q with orthonormal columns.

(This is the unblocked version of the algorithm).

The matrix Q is defined as the last n columns of the product of k Householder reflectors of order m

\[ Q = H_kH_{k-1}\cdots H_1 \]

The Householder matrices \(H_i\) are never stored, they are computed from its corresponding Householder vectors \(v_i\) and scalars \(\text{ipiv}[i]\), as returned by GEQLF.

Parameters
  • [in] handle: rocblas_handle.

  • [in] m: rocblas_int. m >= 0.

    The number of rows of the matrix Q.

  • [in] n: rocblas_int. 0 <= n <= m.

    The number of columns of the matrix Q.

  • [in] k: rocblas_int. 0 <= k <= n.

    The number of Householder reflectors.

  • [inout] A: pointer to type. Array on the GPU of dimension lda*n.

    On entry, the matrix A as returned by

    GEQLF, with the Householder vectors in the last k columns. On exit, the computed matrix Q.

  • [in] lda: rocblas_int. lda >= m.

    Specifies the leading dimension of A.

  • [in] ipiv: pointer to type. Array on the GPU of dimension at least k.

    The Householder scalars as returned by

    GEQLF.

rocsolver_<type>ungql()

rocblas_status rocsolver_zungql(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv)
rocblas_status rocsolver_cungql(rocblas_handle handle, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv)

UNGQL generates an m-by-n complex Matrix Q with orthonormal columns.

(This is the blocked version of the algorithm).

The matrix Q is defined as the last n columns of the product of k Householder reflectors of order m

\[ Q = H_kH_{k-1}\cdots H_1 \]

The Householder matrices \(H_i\) are never stored, they are computed from its corresponding Householder vectors \(v_i\) and scalars \(\text{ipiv}[i]\), as returned by GEQLF.

Parameters
  • [in] handle: rocblas_handle.

  • [in] m: rocblas_int. m >= 0.

    The number of rows of the matrix Q.

  • [in] n: rocblas_int. 0 <= n <= m.

    The number of columns of the matrix Q.

  • [in] k: rocblas_int. 0 <= k <= n.

    The number of Householder reflectors.

  • [inout] A: pointer to type. Array on the GPU of dimension lda*n.

    On entry, the matrix A as returned by

    GEQLF, with the Householder vectors in the last k columns. On exit, the computed matrix Q.

  • [in] lda: rocblas_int. lda >= m.

    Specifies the leading dimension of A.

  • [in] ipiv: pointer to type. Array on the GPU of dimension at least k.

    The Householder scalars as returned by

    GEQLF.

rocsolver_<type>ungbr()

rocblas_status rocsolver_zungbr(rocblas_handle handle, const rocblas_storev storev, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv)
rocblas_status rocsolver_cungbr(rocblas_handle handle, const rocblas_storev storev, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv)

UNGBR generates an m-by-n complex Matrix Q with orthonormal rows or columns.

If storev is column-wise, then the matrix Q has orthonormal columns. If m >= k, Q is defined as the first n columns of the product of k Householder reflectors of order m

\[ Q = H_1H_2\cdots H_k \]

If m < k, Q is defined as the product of Householder reflectors of order m

\[ Q = H_1H_2\cdots H_{m-1} \]

On the other hand, if storev is row-wise, then the matrix Q has orthonormal rows. If n > k, Q is defined as the first m rows of the product of k Householder reflectors of order n

\[ Q = H_kH_{k-1}\cdots H_1 \]

If n <= k, Q is defined as the product of Householder reflectors of order n

\[ Q = H_{n-1}H_{n-2}\cdots H_1 \]

The Householder matrices \(H_i\) are never stored, they are computed from its corresponding Householder vectors \(v_i\) and scalars \(\text{ipiv}[i]\), as returned by GEBRD in its arguments A and tauq or taup.

Parameters
  • [in] handle: rocblas_handle.

  • [in] storev: rocblas_storev.

    Specifies whether to work column-wise or row-wise.

  • [in] m: rocblas_int. m >= 0.

    The number of rows of the matrix Q. If row-wise, then min(n,k) <= m <= n.

  • [in] n: rocblas_int. n >= 0.

    The number of columns of the matrix Q. If column-wise, then min(m,k) <= n <= m.

  • [in] k: rocblas_int. k >= 0.

    The number of columns (if storev is column-wise) or rows (if row-wise) of the original matrix reduced by

    GEBRD.

  • [inout] A: pointer to type. Array on the GPU of dimension lda*n.

    On entry, the Householder vectors as returned by

    GEBRD. On exit, the computed matrix Q.

  • [in] lda: rocblas_int. lda >= m.

    Specifies the leading dimension of A.

  • [in] ipiv: pointer to type. Array on the GPU of dimension min(m,k) if column-wise, or min(n,k) if row-wise.

    The Householder scalars as returned by

    GEBRD.

rocsolver_<type>ungtr()

rocblas_status rocsolver_zungtr(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv)
rocblas_status rocsolver_cungtr(rocblas_handle handle, const rocblas_fill uplo, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv)

UNGTR generates an n-by-n unitary Matrix Q.

Q is defined as the product of n-1 Householder reflectors of order n. If uplo indicates upper, then Q has the form

\[ Q = H_{n-1}H_{n-2}\cdots H_1 \]

On the other hand, if uplo indicates lower, then Q has the form

\[ Q = H_1H_2\cdots H_{n-1} \]

The Householder matrices \(H_i\) are never stored, they are computed from its corresponding Householder vectors \(v_i\) and scalars \(\text{ipiv}[i]\), as returned by HETRD in its arguments A and tau.

Parameters
  • [in] handle: rocblas_handle.

  • [in] uplo: rocblas_fill.

    Specifies whether the

    HETRD factorization was upper or lower triangular. If uplo indicates lower (or upper), then the upper (or lower) part of A is not used.

  • [in] n: rocblas_int. n >= 0.

    The number of rows and columns of the matrix Q.

  • [inout] A: pointer to type. Array on the GPU of dimension lda*n.

    On entry, the Householder vectors as returned by

    HETRD. On exit, the computed matrix Q.

  • [in] lda: rocblas_int. lda >= m.

    Specifies the leading dimension of A.

  • [in] ipiv: pointer to type. Array on the GPU of dimension n-1.

    The Householder scalars as returned by

    HETRD.

rocsolver_<type>unm2r()

rocblas_status rocsolver_zunm2r(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv, rocblas_double_complex *C, const rocblas_int ldc)
rocblas_status rocsolver_cunm2r(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv, rocblas_float_complex *C, const rocblas_int ldc)

UNM2R multiplies a complex matrix Q with orthonormal columns by a general m-by-n matrix C.

(This is the unblocked version of the algorithm).

The matrix Q is applied in one of the following forms, depending on the values of side and trans:

\[\begin{split} \begin{array}{cl} QC & \: \text{No transpose from the left,}\\ Q^HC & \: \text{Conjugate transpose from the left,}\\ CQ & \: \text{No transpose from the right, and}\\ CQ^H & \: \text{Conjugate transpose from the right.} \end{array} \end{split}\]

Q is defined as the product of k Householder reflectors

\[ Q = H_1H_2\cdots H_k \]

of order m if applying from the left, or n if applying from the right. Q is never stored, it is calculated from the Householder vectors and scalars returned by the QR factorization GEQRF.

Parameters
  • [in] handle: rocblas_handle.

  • [in] side: rocblas_side.

    Specifies from which side to apply Q.

  • [in] trans: rocblas_operation.

    Specifies whether the matrix Q or its conjugate transpose is to be applied.

  • [in] m: rocblas_int. m >= 0.

    Number of rows of matrix C.

  • [in] n: rocblas_int. n >= 0.

    Number of columns of matrix C.

  • [in] k: rocblas_int. k >= 0; k <= m if side is left, k <= n if side is right.

    The number of Householder reflectors that form Q.

  • [in] A: pointer to type. Array on the GPU of size lda*k.

    The Householder vectors as returned by

    GEQRF in the first k columns of its argument A.

  • [in] lda: rocblas_int. lda >= m if side is left, or lda >= n if side is right.

    Leading dimension of A.

  • [in] ipiv: pointer to type. Array on the GPU of dimension at least k.

    The Householder scalars as returned by

    GEQRF.

  • [inout] C: pointer to type. Array on the GPU of size ldc*n.

    On entry, the matrix C. On exit, it is overwritten with Q*C, C*Q, Q’*C, or C*Q’.

  • [in] ldc: rocblas_int. ldc >= m.

    Leading dimension of C.

rocsolver_<type>unmqr()

rocblas_status rocsolver_zunmqr(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv, rocblas_double_complex *C, const rocblas_int ldc)
rocblas_status rocsolver_cunmqr(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv, rocblas_float_complex *C, const rocblas_int ldc)

UNMQR multiplies a complex matrix Q with orthonormal columns by a general m-by-n matrix C.

(This is the blocked version of the algorithm).

The matrix Q is applied in one of the following forms, depending on the values of side and trans:

\[\begin{split} \begin{array}{cl} QC & \: \text{No transpose from the left,}\\ Q^HC & \: \text{Conjugate transpose from the left,}\\ CQ & \: \text{No transpose from the right, and}\\ CQ^H & \: \text{Conjugate transpose from the right.} \end{array} \end{split}\]

Q is defined as the product of k Householder reflectors

\[ Q = H_1H_2\cdots H_k \]

of order m if applying from the left, or n if applying from the right. Q is never stored, it is calculated from the Householder vectors and scalars returned by the QR factorization GEQRF.

Parameters
  • [in] handle: rocblas_handle.

  • [in] side: rocblas_side.

    Specifies from which side to apply Q.

  • [in] trans: rocblas_operation.

    Specifies whether the matrix Q or its conjugate transpose is to be applied.

  • [in] m: rocblas_int. m >= 0.

    Number of rows of matrix C.

  • [in] n: rocblas_int. n >= 0.

    Number of columns of matrix C.

  • [in] k: rocblas_int. k >= 0; k <= m if side is left, k <= n if side is right.

    The number of Householder reflectors that form Q.

  • [in] A: pointer to type. Array on the GPU of size lda*k.

    The Householder vectors as returned by

    GEQRF in the first k columns of its argument A.

  • [in] lda: rocblas_int. lda >= m if side is left, or lda >= n if side is right.

    Leading dimension of A.

  • [in] ipiv: pointer to type. Array on the GPU of dimension at least k.

    The Householder scalars as returned by

    GEQRF.

  • [inout] C: pointer to type. Array on the GPU of size ldc*n.

    On entry, the matrix C. On exit, it is overwritten with Q*C, C*Q, Q’*C, or C*Q’.

  • [in] ldc: rocblas_int. ldc >= m.

    Leading dimension of C.

rocsolver_<type>unml2()

rocblas_status rocsolver_zunml2(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv, rocblas_double_complex *C, const rocblas_int ldc)
rocblas_status rocsolver_cunml2(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv, rocblas_float_complex *C, const rocblas_int ldc)

UNML2 multiplies a complex matrix Q with orthonormal rows by a general m-by-n matrix C.

(This is the unblocked version of the algorithm).

The matrix Q is applied in one of the following forms, depending on the values of side and trans:

\[\begin{split} \begin{array}{cl} QC & \: \text{No transpose from the left,}\\ Q^HC & \: \text{Conjugate transpose from the left,}\\ CQ & \: \text{No transpose from the right, and}\\ CQ^H & \: \text{Conjugate transpose from the right.} \end{array} \end{split}\]

Q is defined as the product of k Householder reflectors

\[ Q = H_k^HH_{k-1}^H\cdots H_1^H \]

of order m if applying from the left, or n if applying from the right. Q is never stored, it is calculated from the Householder vectors and scalars returned by the LQ factorization GELQF.

Parameters
  • [in] handle: rocblas_handle.

  • [in] side: rocblas_side.

    Specifies from which side to apply Q.

  • [in] trans: rocblas_operation.

    Specifies whether the matrix Q or its conjugate transpose is to be applied.

  • [in] m: rocblas_int. m >= 0.

    Number of rows of matrix C.

  • [in] n: rocblas_int. n >= 0.

    Number of columns of matrix C.

  • [in] k: rocblas_int. k >= 0; k <= m if side is left, k <= n if side is right.

    The number of Householder reflectors that form Q.

  • [in] A: pointer to type. Array on the GPU of size lda*m if side is left, or lda*n if side is right.

    The Householder vectors as returned by

    GELQF in the first k rows of its argument A.

  • [in] lda: rocblas_int. lda >= k.

    Leading dimension of A.

  • [in] ipiv: pointer to type. Array on the GPU of dimension at least k.

    The Householder scalars as returned by

    GELQF.

  • [inout] C: pointer to type. Array on the GPU of size ldc*n.

    On entry, the matrix C. On exit, it is overwritten with Q*C, C*Q, Q’*C, or C*Q’.

  • [in] ldc: rocblas_int. ldc >= m.

    Leading dimension of C.

rocsolver_<type>unmlq()

rocblas_status rocsolver_zunmlq(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv, rocblas_double_complex *C, const rocblas_int ldc)
rocblas_status rocsolver_cunmlq(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv, rocblas_float_complex *C, const rocblas_int ldc)

UNMLQ multiplies a complex matrix Q with orthonormal rows by a general m-by-n matrix C.

(This is the blocked version of the algorithm).

The matrix Q is applied in one of the following forms, depending on the values of side and trans:

\[\begin{split} \begin{array}{cl} QC & \: \text{No transpose from the left,}\\ Q^HC & \: \text{Conjugate transpose from the left,}\\ CQ & \: \text{No transpose from the right, and}\\ CQ^H & \: \text{Conjugate transpose from the right.} \end{array} \end{split}\]

Q is defined as the product of k Householder reflectors

\[ Q = H_k^HH_{k-1}^H\cdots H_1^H \]

of order m if applying from the left, or n if applying from the right. Q is never stored, it is calculated from the Householder vectors and scalars returned by the LQ factorization GELQF.

Parameters
  • [in] handle: rocblas_handle.

  • [in] side: rocblas_side.

    Specifies from which side to apply Q.

  • [in] trans: rocblas_operation.

    Specifies whether the matrix Q or its conjugate transpose is to be applied.

  • [in] m: rocblas_int. m >= 0.

    Number of rows of matrix C.

  • [in] n: rocblas_int. n >= 0.

    Number of columns of matrix C.

  • [in] k: rocblas_int. k >= 0; k <= m if side is left, k <= n if side is right.

    The number of Householder reflectors that form Q.

  • [in] A: pointer to type. Array on the GPU of size lda*m if side is left, or lda*n if side is right.

    The Householder vectors as returned by

    GELQF in the first k rows of its argument A.

  • [in] lda: rocblas_int. lda >= k.

    Leading dimension of A.

  • [in] ipiv: pointer to type. Array on the GPU of dimension at least k.

    The Householder scalars as returned by

    GELQF.

  • [inout] C: pointer to type. Array on the GPU of size ldc*n.

    On entry, the matrix C. On exit, it is overwritten with Q*C, C*Q, Q’*C, or C*Q’.

  • [in] ldc: rocblas_int. ldc >= m.

    Leading dimension of C.

rocsolver_<type>unm2l()

rocblas_status rocsolver_zunm2l(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv, rocblas_double_complex *C, const rocblas_int ldc)
rocblas_status rocsolver_cunm2l(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv, rocblas_float_complex *C, const rocblas_int ldc)

UNM2L multiplies a complex matrix Q with orthonormal columns by a general m-by-n matrix C.

(This is the unblocked version of the algorithm).

The matrix Q is applied in one of the following forms, depending on the values of side and trans:

\[\begin{split} \begin{array}{cl} QC & \: \text{No transpose from the left,}\\ Q^HC & \: \text{Conjugate transpose from the left,}\\ CQ & \: \text{No transpose from the right, and}\\ CQ^H & \: \text{Conjugate transpose from the right.} \end{array} \end{split}\]

Q is defined as the product of k Householder reflectors

\[ Q = H_kH_{k-1}\cdots H_1 \]

of order m if applying from the left, or n if applying from the right. Q is never stored, it is calculated from the Householder vectors and scalars returned by the QL factorization GEQLF.

Parameters
  • [in] handle: rocblas_handle.

  • [in] side: rocblas_side.

    Specifies from which side to apply Q.

  • [in] trans: rocblas_operation.

    Specifies whether the matrix Q or its conjugate transpose is to be applied.

  • [in] m: rocblas_int. m >= 0.

    Number of rows of matrix C.

  • [in] n: rocblas_int. n >= 0.

    Number of columns of matrix C.

  • [in] k: rocblas_int. k >= 0; k <= m if side is left, k <= n if side is right.

    The number of Householder reflectors that form Q.

  • [in] A: pointer to type. Array on the GPU of size lda*k.

    The Householder vectors as returned by

    GEQLF in the last k columns of its argument A.

  • [in] lda: rocblas_int. lda >= m if side is left, lda >= n if side is right.

    Leading dimension of A.

  • [in] ipiv: pointer to type. Array on the GPU of dimension at least k.

    The Householder scalars as returned by

    GEQLF.

  • [inout] C: pointer to type. Array on the GPU of size ldc*n.

    On entry, the matrix C. On exit, it is overwritten with Q*C, C*Q, Q’*C, or C*Q’.

  • [in] ldc: rocblas_int. ldc >= m.

    Leading dimension of C.

rocsolver_<type>unmql()

rocblas_status rocsolver_zunmql(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv, rocblas_double_complex *C, const rocblas_int ldc)
rocblas_status rocsolver_cunmql(rocblas_handle handle, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv, rocblas_float_complex *C, const rocblas_int ldc)

UNMQL multiplies a complex matrix Q with orthonormal columns by a general m-by-n matrix C.

(This is the blocked version of the algorithm).

The matrix Q is applied in one of the following forms, depending on the values of side and trans:

\[\begin{split} \begin{array}{cl} QC & \: \text{No transpose from the left,}\\ Q^HC & \: \text{Conjugate transpose from the left,}\\ CQ & \: \text{No transpose from the right, and}\\ CQ^H & \: \text{Conjugate transpose from the right.} \end{array} \end{split}\]

Q is defined as the product of k Householder reflectors

\[ Q = H_kH_{k-1}\cdots H_1 \]

of order m if applying from the left, or n if applying from the right. Q is never stored, it is calculated from the Householder vectors and scalars returned by the QL factorization GEQLF.

Parameters
  • [in] handle: rocblas_handle.

  • [in] side: rocblas_side.

    Specifies from which side to apply Q.

  • [in] trans: rocblas_operation.

    Specifies whether the matrix Q or its conjugate transpose is to be applied.

  • [in] m: rocblas_int. m >= 0.

    Number of rows of matrix C.

  • [in] n: rocblas_int. n >= 0.

    Number of columns of matrix C.

  • [in] k: rocblas_int. k >= 0; k <= m if side is left, k <= n if side is right.

    The number of Householder reflectors that form Q.

  • [in] A: pointer to type. Array on the GPU of size lda*k.

    The Householder vectors as returned by

    GEQLF in the last k columns of its argument A.

  • [in] lda: rocblas_int. lda >= m if side is left, lda >= n if side is right.

    Leading dimension of A.

  • [in] ipiv: pointer to type. Array on the GPU of dimension at least k.

    The Householder scalars as returned by

    GEQLF.

  • [inout] C: pointer to type. Array on the GPU of size ldc*n.

    On entry, the matrix C. On exit, it is overwritten with Q*C, C*Q, Q’*C, or C*Q’.

  • [in] ldc: rocblas_int. ldc >= m.

    Leading dimension of C.

rocsolver_<type>unmbr()

rocblas_status rocsolver_zunmbr(rocblas_handle handle, const rocblas_storev storev, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv, rocblas_double_complex *C, const rocblas_int ldc)
rocblas_status rocsolver_cunmbr(rocblas_handle handle, const rocblas_storev storev, const rocblas_side side, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, const rocblas_int k, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv, rocblas_float_complex *C, const rocblas_int ldc)

UNMBR multiplies a complex matrix Q with orthonormal rows or columns by a general m-by-n matrix C.

If storev is column-wise, then the matrix Q has orthonormal columns. If storev is row-wise, then the matrix Q has orthonormal rows. The matrix Q is applied in one of the following forms, depending on the values of side and trans:

\[\begin{split} \begin{array}{cl} QC & \: \text{No transpose from the left,}\\ Q^HC & \: \text{Conjugate transpose from the left,}\\ CQ & \: \text{No transpose from the right, and}\\ CQ^H & \: \text{Conjugate transpose from the right.} \end{array} \end{split}\]

The order q of the unitary matrix Q is q = m if applying from the left, or q = n if applying from the right.

When storev is column-wise, if q >= k, then Q is defined as the product of k Householder reflectors

\[ Q = H_1H_2\cdots H_k, \]

and if q < k, then Q is defined as the product

\[ Q = H_1H_2\cdots H_{q-1}. \]

When storev is row-wise, if q > k, then Q is defined as the product of k Householder reflectors

\[ Q = H_1H_2\cdots H_k, \]

and if q <= k, Q is defined as the product

\[ Q = H_1H_2\cdots H_{q-1}. \]

The Householder matrices \(H_i\) are never stored, they are computed from its corresponding Householder vectors and scalars as returned by GEBRD in its arguments A and tauq or taup.

Parameters
  • [in] handle: rocblas_handle.

  • [in] storev: rocblas_storev.

    Specifies whether to work column-wise or row-wise.

  • [in] side: rocblas_side.

    Specifies from which side to apply Q.

  • [in] trans: rocblas_operation.

    Specifies whether the matrix Q or its conjugate transpose is to be applied.

  • [in] m: rocblas_int. m >= 0.

    Number of rows of matrix C.

  • [in] n: rocblas_int. n >= 0.

    Number of columns of matrix C.

  • [in] k: rocblas_int. k >= 0.

    The number of columns (if storev is column-wise) or rows (if row-wise) of the original matrix reduced by

    GEBRD.

  • [in] A: pointer to type. Array on the GPU of size lda*min(q,k) if column-wise, or lda*q if row-wise.

    The Householder vectors as returned by

    GEBRD.

  • [in] lda: rocblas_int. lda >= q if column-wise, or lda >= min(q,k) if row-wise.

    Leading dimension of A.

  • [in] ipiv: pointer to type. Array on the GPU of dimension at least min(q,k).

    The Householder scalars as returned by

    GEBRD.

  • [inout] C: pointer to type. Array on the GPU of size ldc*n.

    On entry, the matrix C. On exit, it is overwritten with Q*C, C*Q, Q’*C, or C*Q’.

  • [in] ldc: rocblas_int. ldc >= m.

    Leading dimension of C.

rocsolver_<type>unmtr()

rocblas_status rocsolver_zunmtr(rocblas_handle handle, const rocblas_side side, const rocblas_fill uplo, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *ipiv, rocblas_double_complex *C, const rocblas_int ldc)
rocblas_status rocsolver_cunmtr(rocblas_handle handle, const rocblas_side side, const rocblas_fill uplo, const rocblas_operation trans, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *ipiv, rocblas_float_complex *C, const rocblas_int ldc)

UNMTR multiplies a unitary matrix Q by a general m-by-n matrix C.

The matrix Q is applied in one of the following forms, depending on the values of side and trans:

\[\begin{split} \begin{array}{cl} QC & \: \text{No transpose from the left,}\\ Q^HC & \: \text{Conjugate transpose from the left,}\\ CQ & \: \text{No transpose from the right, and}\\ CQ^H & \: \text{Conjugate transpose from the right.} \end{array} \end{split}\]

The order q of the unitary matrix Q is q = m if applying from the left, or q = n if applying from the right.

Q is defined as a product of q-1 Householder reflectors. If uplo indicates upper, then Q has the form

\[ Q = H_{q-1}H_{q-2}\cdots H_1. \]

On the other hand, if uplo indicates lower, then Q has the form

\[ Q = H_1H_2\cdots H_{q-1} \]

The Householder matrices \(H_i\) are never stored, they are computed from its corresponding Householder vectors and scalars as returned by HETRD in its arguments A and tau.

Parameters
  • [in] handle: rocblas_handle.

  • [in] side: rocblas_side.

    Specifies from which side to apply Q.

  • [in] uplo: rocblas_fill.

    Specifies whether the

    HETRD factorization was upper or lower triangular. If uplo indicates lower (or upper), then the upper (or lower) part of A is not used.

  • [in] trans: rocblas_operation.

    Specifies whether the matrix Q or its conjugate transpose is to be applied.

  • [in] m: rocblas_int. m >= 0.

    Number of rows of matrix C.

  • [in] n: rocblas_int. n >= 0.

    Number of columns of matrix C.

  • [in] A: pointer to type. Array on the GPU of size lda*q.

    On entry, the Householder vectors as returned by

    HETRD.

  • [in] lda: rocblas_int. lda >= q.

    Leading dimension of A.

  • [in] ipiv: pointer to type. Array on the GPU of dimension at least q-1.

    The Householder scalars as returned by

    HETRD.

  • [inout] C: pointer to type. Array on the GPU of size ldc*n.

    On entry, the matrix C. On exit, it is overwritten with Q*C, C*Q, Q’*C, or C*Q’.

  • [in] ldc: rocblas_int. ldc >= m.

    Leading dimension of C.