3.4. Lapack-like Functions

This section describes other LAPACK-like routines provided by rocSOLVER. They are grouped into the subcategories below.

Note

Throughout the APIs’ descriptions, we use the following notations:

  • x[i] stands for the i-th element of vector x, while A[i,j] represents the element in the i-th row and j-th column of matrix A. Indices are 1-based, i.e. x[1] is the first element of x.

  • If X is a real vector or matrix, \(X^T\) indicates its transpose; if X is complex, then \(X^H\) represents its conjugate transpose. When X could be real or complex, we use X’ to indicate X transposed or X conjugate transposed, accordingly.

  • x_i \(=x_i\); we sometimes use both notations, \(x_i\) when displaying mathematical equations, and x_i in the text describing the function parameters.
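As a small illustration of the 1-based notation, the sketch below maps an element written A[i,j] in the text to a 0-based offset in a host copy of the matrix. It assumes column-major storage with leading dimension lda, which is the usual LAPACK convention; the helper names `idx1` and `get_elem` are hypothetical and not part of rocSOLVER.

```c
#include <stddef.h>

/* Offset of the element written A[i,j] in the text (1-based indices),
   assuming column-major storage with leading dimension lda. */
size_t idx1(int i, int j, int lda)
{
    return (size_t)(i - 1) + (size_t)(j - 1) * (size_t)lda;
}

/* Read A[i,j] from a host copy of the matrix. */
double get_elem(const double *A, int lda, int i, int j)
{
    return A[idx1(i, j, lda)];
}
```

For a 3-by-2 matrix stored with lda = 3, A[1,1] is at offset 0 and A[3,2] is at offset 5.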

3.4.1. Triangular factorizations

rocsolver_<type>getf2_npvt()

rocblas_status rocsolver_zgetf2_npvt(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_int *info)
rocblas_status rocsolver_cgetf2_npvt(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_int *info)
rocblas_status rocsolver_dgetf2_npvt(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, rocblas_int *info)
rocblas_status rocsolver_sgetf2_npvt(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, rocblas_int *info)

GETF2_NPVT computes the LU factorization of a general m-by-n matrix A without partial pivoting.

(This is the unblocked Level-2-BLAS version of the algorithm. An optimized internal implementation without rocBLAS calls could be executed with small and mid-size matrices if optimizations are enabled (default option). For more details, see the “Tuning rocSOLVER performance” section of the Library Design Guide).

The factorization has the form

\[ A = LU \]

where L is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and U is upper triangular (upper trapezoidal if m < n).

Note: Although this routine can offer better performance, Gaussian elimination without pivoting is not backward stable. If numerical accuracy is compromised, use the legacy-LAPACK-like API GETF2 routines instead.
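To make the stability caveat concrete, here is a minimal host-side sketch in plain C (it does not call rocSOLVER). It factors the 2-by-2 matrix with rows (eps, 1) and (1, 1) without pivoting and rebuilds the [2,2] entry from the computed factors. The function name `npvt_rebuilt_a22` and the sample values of eps are illustrative assumptions.

```c
/* Factor [[eps, 1], [1, 1]] without pivoting in double precision and
   return the [2,2] entry of the product L*U built from the computed
   factors. In exact arithmetic the result is 1. */
double npvt_rebuilt_a22(double eps)
{
    double u11 = eps, u12 = 1.0;
    double l21 = 1.0 / u11;        /* multiplier, huge when eps is tiny */
    double u22 = 1.0 - l21 * u12;  /* the 1.0 is lost to rounding when l21 is huge */
    return l21 * u12 + u22;
}
```

With eps = 1e-20 the rebuilt entry is 0 instead of 1, while eps = 0.5 reproduces it exactly. Interchanging the two rows first, as partial pivoting would, avoids the large multiplier.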

Parameters
  • [in] handle: rocblas_handle.

  • [in] m: rocblas_int. m >= 0.

    The number of rows of the matrix A.

  • [in] n: rocblas_int. n >= 0.

    The number of columns of the matrix A.

  • [inout] A: pointer to type. Array on the GPU of dimension lda*n.

    On entry, the m-by-n matrix A to be factored. On exit, the factors L and U from the factorization. The unit diagonal elements of L are not stored.

  • [in] lda: rocblas_int. lda >= m.

    Specifies the leading dimension of A.

  • [out] info: pointer to a rocblas_int on the GPU.

    If info = 0, successful exit. If info = j > 0, U is singular. U[j,j] is the first zero element in the diagonal. The factorization from this point might be incomplete.
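The output layout and the meaning of info described above can be sketched with a host-side reference in plain C. This is not the library's implementation: the name `lu_npvt_ref` is hypothetical, column-major storage is assumed, and stopping at the first zero diagonal element is a choice made for this sketch.

```c
/* Host-side reference for an unpivoted, unblocked LU factorization.
   A is column-major, m-by-n, with leading dimension lda.
   On return A holds U on and above the diagonal and the multipliers of L
   below it (the unit diagonal of L is not stored). The return value
   mimics info: 0, or the 1-based index j of the first zero U[j,j].
   This sketch stops at the first zero diagonal element. */
int lu_npvt_ref(int m, int n, double *A, int lda)
{
    int kmax = (m < n) ? m : n;
    for (int k = 0; k < kmax; ++k) {
        double pivot = A[k + k * lda];
        if (pivot == 0.0)
            return k + 1;
        /* scale the column below the diagonal: multipliers of L */
        for (int i = k + 1; i < m; ++i)
            A[i + k * lda] /= pivot;
        /* rank-1 update of the trailing submatrix */
        for (int j = k + 1; j < n; ++j)
            for (int i = k + 1; i < m; ++i)
                A[i + j * lda] -= A[i + k * lda] * A[k + j * lda];
    }
    return 0;
}
```

Note that a zero value can appear on the diagonal even when A is nonsingular, for example when A[1,1] = 0, because no rows are interchanged.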

rocsolver_<type>getf2_npvt_batched()

rocblas_status rocsolver_zgetf2_npvt_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)
rocblas_status rocsolver_cgetf2_npvt_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)
rocblas_status rocsolver_dgetf2_npvt_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)
rocblas_status rocsolver_sgetf2_npvt_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)

GETF2_NPVT_BATCHED computes the LU factorization of a batch of general m-by-n matrices without partial pivoting.

(This is the unblocked Level-2-BLAS version of the algorithm. An optimized internal implementation without rocBLAS calls could be executed with small and mid-size matrices if optimizations are enabled (default option). For more details, see the “Tuning rocSOLVER performance” section of the Library Design Guide).

The factorization of matrix \(A_i\) in the batch has the form

\[ A_i = L_iU_i \]

where \(L_i\) is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and \(U_i\) is upper triangular (upper trapezoidal if m < n).

Note: Although this routine can offer better performance, Gaussian elimination without pivoting is not backward stable. If numerical accuracy is compromised, use the legacy-LAPACK-like API GETF2_BATCHED routines instead.

Parameters
  • [in] handle: rocblas_handle.

  • [in] m: rocblas_int. m >= 0.

    The number of rows of all matrices A_i in the batch.

  • [in] n: rocblas_int. n >= 0.

    The number of columns of all matrices A_i in the batch.

  • [inout] A: array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n.

    On entry, the m-by-n matrices A_i to be factored. On exit, the factors L_i and U_i from the factorizations. The unit diagonal elements of L_i are not stored.

  • [in] lda: rocblas_int. lda >= m.

    Specifies the leading dimension of matrices A_i.

  • [out] info: pointer to rocblas_int. Array of batch_count integers on the GPU.

    If info[i] = 0, successful exit for factorization of A_i. If info[i] = j > 0, U_i is singular. U_i[j,j] is the first zero element in the diagonal. The factorization from this point might be incomplete.

  • [in] batch_count: rocblas_int. batch_count >= 0.

    Number of matrices in the batch.
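The batched calling pattern, with one pointer per matrix and one info entry per matrix, can be sketched on the host as follows. This is a plain C illustration under the assumption of column-major storage; `lu_npvt_batched_ref` is a hypothetical name, and in the real routine the matrices and info reside on the GPU.

```c
/* Host-side sketch of the batched pattern: A is an array of batch_count
   pointers, one per matrix, and info has one entry per matrix.
   Each matrix is factored independently without pivoting. */
void lu_npvt_batched_ref(int m, int n, double *const A[], int lda,
                         int *info, int batch_count)
{
    int kmax = (m < n) ? m : n;
    for (int b = 0; b < batch_count; ++b) {
        double *Ab = A[b];
        info[b] = 0;
        for (int k = 0; k < kmax; ++k) {
            double pivot = Ab[k + k * lda];
            if (pivot == 0.0) {
                info[b] = k + 1;   /* 1-based index of the zero diagonal element */
                break;             /* this sketch stops; other matrices continue */
            }
            for (int i = k + 1; i < m; ++i)
                Ab[i + k * lda] /= pivot;
            for (int j = k + 1; j < n; ++j)
                for (int i = k + 1; i < m; ++i)
                    Ab[i + j * lda] -= Ab[i + k * lda] * Ab[k + j * lda];
        }
    }
}
```

A failure in one matrix only affects its own info entry, so every entry should be checked after the call.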

rocsolver_<type>getf2_npvt_strided_batched()

rocblas_status rocsolver_zgetf2_npvt_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)
rocblas_status rocsolver_cgetf2_npvt_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)
rocblas_status rocsolver_dgetf2_npvt_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)
rocblas_status rocsolver_sgetf2_npvt_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)

GETF2_NPVT_STRIDED_BATCHED computes the LU factorization of a batch of general m-by-n matrices without partial pivoting.

(This is the unblocked Level-2-BLAS version of the algorithm. An optimized internal implementation without rocBLAS calls could be executed with small and mid-size matrices if optimizations are enabled (default option). For more details, see the “Tuning rocSOLVER performance” section of the Library Design Guide).

The factorization of matrix \(A_i\) in the batch has the form

\[ A_i = L_iU_i \]

where \(L_i\) is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and \(U_i\) is upper triangular (upper trapezoidal if m < n).

Note: Although this routine can offer better performance, Gaussian elimination without pivoting is not backward stable. If numerical accuracy is compromised, use the legacy-LAPACK-like API GETF2_STRIDED_BATCHED routines instead.

Parameters
  • [in] handle: rocblas_handle.

  • [in] m: rocblas_int. m >= 0.

    The number of rows of all matrices A_i in the batch.

  • [in] n: rocblas_int. n >= 0.

    The number of columns of all matrices A_i in the batch.

  • [inout] A: pointer to type. Array on the GPU (the size depends on the value of strideA).

    On entry, the m-by-n matrices A_i to be factored. On exit, the factors L_i and U_i from the factorization. The unit diagonal elements of L_i are not stored.

  • [in] lda: rocblas_int. lda >= m.

    Specifies the leading dimension of matrices A_i.

  • [in] strideA: rocblas_stride.

    Stride from the start of one matrix A_i to the next one A_(i+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.

  • [out] info: pointer to rocblas_int. Array of batch_count integers on the GPU.

    If info[i] = 0, successful exit for factorization of A_i. If info[i] = j > 0, U_i is singular. U_i[j,j] is the first zero element in the diagonal. The factorization from this point might be incomplete.

  • [in] batch_count: rocblas_int. batch_count >= 0.

    Number of matrices in the batch.
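The strided layout can be sketched with two small host-side helpers. These are illustrative assumptions, not rocSOLVER functions: `strided_matrix` and `strided_min_elems` are hypothetical names, `ptrdiff_t` stands in for rocblas_stride, and the size formula covers only the normal use case strideA >= lda*n.

```c
#include <stddef.h>

/* Start of matrix A_i inside a strided batch (0-based batch index i). */
double *strided_matrix(double *A, ptrdiff_t strideA, int i)
{
    return A + (ptrdiff_t)i * strideA;
}

/* Number of elements the buffer must hold in the normal use case
   strideA >= lda*n: batch_count - 1 full strides plus one last matrix. */
size_t strided_min_elems(int n, int lda, ptrdiff_t strideA, int batch_count)
{
    if (batch_count <= 0)
        return 0;
    return (size_t)(batch_count - 1) * (size_t)strideA
         + (size_t)lda * (size_t)n;
}
```

For example, three 3-by-2 matrices with lda = 3 and strideA = 6 need a buffer of 18 elements, and the third matrix starts at offset 12.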

rocsolver_<type>getrf_npvt()

rocblas_status rocsolver_zgetrf_npvt(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_int *info)
rocblas_status rocsolver_cgetrf_npvt(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_int *info)
rocblas_status rocsolver_dgetrf_npvt(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, rocblas_int *info)
rocblas_status rocsolver_sgetrf_npvt(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, rocblas_int *info)

GETRF_NPVT computes the LU factorization of a general m-by-n matrix A without partial pivoting.

(This is the blocked Level-3-BLAS version of the algorithm. An optimized internal implementation without rocBLAS calls could be executed with mid-size matrices if optimizations are enabled (default option). For more details, see the “Tuning rocSOLVER performance” section of the Library Design Guide).

The factorization has the form

\[ A = LU \]

where L is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and U is upper triangular (upper trapezoidal if m < n).

Note: Although this routine can offer better performance, Gaussian elimination without pivoting is not backward stable. If numerical accuracy is compromised, use the legacy-LAPACK-like API GETRF routines instead.

Parameters
  • [in] handle: rocblas_handle.

  • [in] m: rocblas_int. m >= 0.

    The number of rows of the matrix A.

  • [in] n: rocblas_int. n >= 0.

    The number of columns of the matrix A.

  • [inout] A: pointer to type. Array on the GPU of dimension lda*n.

    On entry, the m-by-n matrix A to be factored. On exit, the factors L and U from the factorization. The unit diagonal elements of L are not stored.

  • [in] lda: rocblas_int. lda >= m.

    Specifies the leading dimension of A.

  • [out] info: pointer to a rocblas_int on the GPU.

    If info = 0, successful exit. If info = j > 0, U is singular. U[j,j] is the first zero element in the diagonal. The factorization from this point might be incomplete.
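The difference between the blocked and unblocked variants can be sketched on the host with a right-looking blocked factorization. This is a plain C illustration, not the library's implementation: `lu_npvt_blocked_ref` and the block size parameter `nb` are hypothetical, column-major storage is assumed, and the sketch stops at the first zero diagonal element.

```c
/* Host-side sketch of a right-looking blocked LU without pivoting.
   A is column-major, m-by-n, leading dimension lda; block size nb >= 1.
   Returns 0, or the 1-based index of the first zero diagonal element. */
int lu_npvt_blocked_ref(int m, int n, double *A, int lda, int nb)
{
    int kmax = (m < n) ? m : n;
    for (int k0 = 0; k0 < kmax; k0 += nb) {
        int kb = (kmax - k0 < nb) ? (kmax - k0) : nb;
        int k1 = k0 + kb;

        /* 1. Unblocked factorization of the panel (rows k0..m-1, columns k0..k1-1). */
        for (int k = k0; k < k1; ++k) {
            double pivot = A[k + k * lda];
            if (pivot == 0.0)
                return k + 1;
            for (int i = k + 1; i < m; ++i)
                A[i + k * lda] /= pivot;
            for (int j = k + 1; j < k1; ++j)
                for (int i = k + 1; i < m; ++i)
                    A[i + j * lda] -= A[i + k * lda] * A[k + j * lda];
        }

        /* 2. Triangular solve: U12 = inv(L11) * A12 (L11 has unit diagonal). */
        for (int j = k1; j < n; ++j)
            for (int k = k0; k < k1; ++k)
                for (int i = k + 1; i < k1; ++i)
                    A[i + j * lda] -= A[i + k * lda] * A[k + j * lda];

        /* 3. Trailing update: A22 -= L21 * U12 (the Level-3 matrix product). */
        for (int j = k1; j < n; ++j)
            for (int k = k0; k < k1; ++k)
                for (int i = k1; i < m; ++i)
                    A[i + j * lda] -= A[i + k * lda] * A[k + j * lda];
    }
    return 0;
}
```

Steps 2 and 3 are where a blocked algorithm spends most of its work for large matrices, which is why it maps well to Level-3 BLAS operations. The packed result has the same layout as in the unblocked case.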

rocsolver_<type>getrf_npvt_batched()

rocblas_status rocsolver_zgetrf_npvt_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)
rocblas_status rocsolver_cgetrf_npvt_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)
rocblas_status rocsolver_dgetrf_npvt_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)
rocblas_status rocsolver_sgetrf_npvt_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)

GETRF_NPVT_BATCHED computes the LU factorization of a batch of general m-by-n matrices without partial pivoting.

(This is the blocked Level-3-BLAS version of the algorithm. An optimized internal implementation without rocBLAS calls could be executed with mid-size matrices if optimizations are enabled (default option). For more details, see the “Tuning rocSOLVER performance” section of the Library Design Guide).

The factorization of matrix \(A_i\) in the batch has the form

\[ A_i = L_iU_i \]

where \(L_i\) is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and \(U_i\) is upper triangular (upper trapezoidal if m < n).

Note: Although this routine can offer better performance, Gaussian elimination without pivoting is not backward stable. If numerical accuracy is compromised, use the legacy-LAPACK-like API GETRF_BATCHED routines instead.

Parameters
  • [in] handle: rocblas_handle.

  • [in] m: rocblas_int. m >= 0.

    The number of rows of all matrices A_i in the batch.

  • [in] n: rocblas_int. n >= 0.

    The number of columns of all matrices A_i in the batch.

  • [inout] A: array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n.

    On entry, the m-by-n matrices A_i to be factored. On exit, the factors L_i and U_i from the factorizations. The unit diagonal elements of L_i are not stored.

  • [in] lda: rocblas_int. lda >= m.

    Specifies the leading dimension of matrices A_i.

  • [out] info: pointer to rocblas_int. Array of batch_count integers on the GPU.

    If info[i] = 0, successful exit for factorization of A_i. If info[i] = j > 0, U_i is singular. U_i[j,j] is the first zero element in the diagonal. The factorization from this point might be incomplete.

  • [in] batch_count: rocblas_int. batch_count >= 0.

    Number of matrices in the batch.

rocsolver_<type>getrf_npvt_strided_batched()

rocblas_status rocsolver_zgetrf_npvt_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)
rocblas_status rocsolver_cgetrf_npvt_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)
rocblas_status rocsolver_dgetrf_npvt_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)
rocblas_status rocsolver_sgetrf_npvt_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)

GETRF_NPVT_STRIDED_BATCHED computes the LU factorization of a batch of general m-by-n matrices without partial pivoting.

(This is the blocked Level-3-BLAS version of the algorithm. An optimized internal implementation without rocBLAS calls could be executed with mid-size matrices if optimizations are enabled (default option). For more details, see the “Tuning rocSOLVER performance” section of the Library Design Guide).

The factorization of matrix \(A_i\) in the batch has the form

\[ A_i = L_iU_i \]

where \(L_i\) is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and \(U_i\) is upper triangular (upper trapezoidal if m < n).

Note: Although this routine can offer better performance, Gaussian elimination without pivoting is not backward stable. If numerical accuracy is compromised, use the legacy-LAPACK-like API GETRF_STRIDED_BATCHED routines instead.

Parameters
  • [in] handle: rocblas_handle.

  • [in] m: rocblas_int. m >= 0.

    The number of rows of all matrices A_i in the batch.

  • [in] n: rocblas_int. n >= 0.

    The number of columns of all matrices A_i in the batch.

  • [inout] A: pointer to type. Array on the GPU (the size depends on the value of strideA).

    On entry, the m-by-n matrices A_i to be factored. On exit, the factors L_i and U_i from the factorization. The unit diagonal elements of L_i are not stored.

  • [in] lda: rocblas_int. lda >= m.

    Specifies the leading dimension of matrices A_i.

  • [in] strideA: rocblas_stride.

    Stride from the start of one matrix A_i to the next one A_(i+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.

  • [out] info: pointer to rocblas_int. Array of batch_count integers on the GPU.

    If info[i] = 0, successful exit for factorization of A_i. If info[i] = j > 0, U_i is singular. U_i[j,j] is the first zero element in the diagonal. The factorization from this point might be incomplete.

  • [in] batch_count: rocblas_int. batch_count >= 0.

    Number of matrices in the batch.

3.4.2. Linear-systems solvers

rocsolver_<type>getri_npvt()

rocblas_status rocsolver_zgetri_npvt(rocblas_handle handle, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_int *info)
rocblas_status rocsolver_cgetri_npvt(rocblas_handle handle, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_int *info)
rocblas_status rocsolver_dgetri_npvt(rocblas_handle handle, const rocblas_int n, double *A, const rocblas_int lda, rocblas_int *info)
rocblas_status rocsolver_sgetri_npvt(rocblas_handle handle, const rocblas_int n, float *A, const rocblas_int lda, rocblas_int *info)

GETRI_NPVT inverts a general n-by-n matrix A using the LU factorization computed by GETRF_NPVT.

The inverse is computed by solving the linear system

\[ A^{-1}L = U^{-1} \]

where L is the lower triangular factor of A with unit diagonal elements, and U is the upper triangular factor.

Parameters
  • [in] handle: rocblas_handle.

  • [in] n: rocblas_int. n >= 0.

    The number of rows and columns of the matrix A.

  • [inout] A: pointer to type. Array on the GPU of dimension lda*n.

    On entry, the factors L and U of the factorization A = L*U returned by GETRF_NPVT. On exit, the inverse of A if info = 0; otherwise undefined.

  • [in] lda: rocblas_int. lda >= n.

    Specifies the leading dimension of A.

  • [out] info: pointer to a rocblas_int on the GPU.

    If info = 0, successful exit. If info = i > 0, U is singular. U[i,i] is the first zero pivot.
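The relation A^{-1} L = U^{-1} used above can be sketched on the host in two steps: invert U in place, then solve for the inverse column by column. This follows the classical unblocked LAPACK approach and is not necessarily how rocSOLVER implements the routine; the name `inv_from_lu_npvt_ref` and the explicit `work` array are assumptions of the sketch, as is column-major storage.

```c
/* Host-side sketch: overwrite the packed factors of A = L*U (no pivoting)
   with inv(A), using inv(A) * L = inv(U). work must hold n doubles.
   Returns 0, or the 1-based index of the first zero diagonal element
   of U, in which case A is left unchanged. */
int inv_from_lu_npvt_ref(int n, double *A, int lda, double *work)
{
    for (int j = 0; j < n; ++j)
        if (A[j + j * lda] == 0.0)
            return j + 1;

    /* Step 1: invert the upper triangular factor U in place. */
    for (int j = 0; j < n; ++j) {
        A[j + j * lda] = 1.0 / A[j + j * lda];
        double ajj = -A[j + j * lda];
        for (int i = 0; i < j; ++i) {
            double s = 0.0;
            for (int k = i; k < j; ++k)
                s += A[i + k * lda] * A[k + j * lda];
            A[i + j * lda] = s * ajj;
        }
    }

    /* Step 2: solve X * L = inv(U) for X, from the last column to the first. */
    for (int j = n - 1; j >= 0; --j) {
        for (int i = j + 1; i < n; ++i) {
            work[i] = A[i + j * lda];  /* save column j of L */
            A[i + j * lda] = 0.0;      /* column j now holds column j of inv(U) */
        }
        for (int k = j + 1; k < n; ++k)
            for (int i = 0; i < n; ++i)
                A[i + j * lda] -= A[i + k * lda] * work[k];
    }
    return 0;
}
```

For the matrix with rows (2, 4) and (4, 12), whose packed factors are 2, 2, 4, 4 in column-major order, the sketch returns the inverse with rows (1.5, -0.5) and (-0.5, 0.25).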

rocsolver_<type>getri_npvt_batched()

rocblas_status rocsolver_zgetri_npvt_batched(rocblas_handle handle, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)
rocblas_status rocsolver_cgetri_npvt_batched(rocblas_handle handle, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)
rocblas_status rocsolver_dgetri_npvt_batched(rocblas_handle handle, const rocblas_int n, double *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)
rocblas_status rocsolver_sgetri_npvt_batched(rocblas_handle handle, const rocblas_int n, float *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)

GETRI_NPVT_BATCHED inverts a batch of general n-by-n matrices using the LU factorization computed by GETRF_NPVT_BATCHED.

The inverse of matrix \(A_j\) in the batch is computed by solving the linear system

\[ A_j^{-1} L_j = U_j^{-1} \]

where \(L_j\) is the lower triangular factor of \(A_j\) with unit diagonal elements, and \(U_j\) is the upper triangular factor.

Parameters
  • [in] handle: rocblas_handle.

  • [in] n: rocblas_int. n >= 0.

    The number of rows and columns of all matrices A_j in the batch.

  • [inout] A: array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n.

    On entry, the factors L_j and U_j of the factorization A_j = L_j*U_j returned by GETRF_NPVT_BATCHED. On exit, the inverses of A_j if info[j] = 0; otherwise undefined.

  • [in] lda: rocblas_int. lda >= n.

    Specifies the leading dimension of matrices A_j.

  • [out] info: pointer to rocblas_int. Array of batch_count integers on the GPU.

    If info[j] = 0, successful exit for inversion of A_j. If info[j] = i > 0, U_j is singular. U_j[i,i] is the first zero pivot.

  • [in] batch_count: rocblas_int. batch_count >= 0.

    Number of matrices in the batch.

rocsolver_<type>getri_npvt_strided_batched()

rocblas_status rocsolver_zgetri_npvt_strided_batched(rocblas_handle handle, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)
rocblas_status rocsolver_cgetri_npvt_strided_batched(rocblas_handle handle, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)
rocblas_status rocsolver_dgetri_npvt_strided_batched(rocblas_handle handle, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)
rocblas_status rocsolver_sgetri_npvt_strided_batched(rocblas_handle handle, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)

GETRI_NPVT_STRIDED_BATCHED inverts a batch of general n-by-n matrices using the LU factorization computed by GETRF_NPVT_STRIDED_BATCHED.

The inverse of matrix \(A_j\) in the batch is computed by solving the linear system

\[ A_j^{-1} L_j = U_j^{-1} \]

where \(L_j\) is the lower triangular factor of \(A_j\) with unit diagonal elements, and \(U_j\) is the upper triangular factor.

Parameters
  • [in] handle: rocblas_handle.

  • [in] n: rocblas_int. n >= 0.

    The number of rows and columns of all matrices A_j in the batch.

  • [inout] A: pointer to type. Array on the GPU (the size depends on the value of strideA).

    On entry, the factors L_j and U_j of the factorization A_j = L_j*U_j returned by GETRF_NPVT_STRIDED_BATCHED. On exit, the inverses of A_j if info[j] = 0; otherwise undefined.

  • [in] lda: rocblas_int. lda >= n.

    Specifies the leading dimension of matrices A_j.

  • [in] strideA: rocblas_stride.

    Stride from the start of one matrix A_j to the next one A_(j+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.

  • [out] info: pointer to rocblas_int. Array of batch_count integers on the GPU.

    If info[j] = 0, successful exit for inversion of A_j. If info[j] = i > 0, U_j is singular. U_j[i,i] is the first zero pivot.

  • [in] batch_count: rocblas_int. batch_count >= 0.

    Number of matrices in the batch.

rocsolver_<type>getri_outofplace()

rocblas_status rocsolver_zgetri_outofplace(rocblas_handle handle, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_double_complex *C, const rocblas_int ldc, rocblas_int *info)
rocblas_status rocsolver_cgetri_outofplace(rocblas_handle handle, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_float_complex *C, const rocblas_int ldc, rocblas_int *info)
rocblas_status rocsolver_dgetri_outofplace(rocblas_handle handle, const rocblas_int n, double *A, const rocblas_int lda, rocblas_int *ipiv, double *C, const rocblas_int ldc, rocblas_int *info)
rocblas_status rocsolver_sgetri_outofplace(rocblas_handle handle, const rocblas_int n, float *A, const rocblas_int lda, rocblas_int *ipiv, float *C, const rocblas_int ldc, rocblas_int *info)

GETRI_OUTOFPLACE computes the inverse \(C = A^{-1}\) of a general n-by-n matrix A.

The inverse is computed by solving the linear system

\[ AC = I \]

where I is the identity matrix, and A is factorized as \(A = PLU\) as given by GETRF.

Parameters
  • [in] handle: rocblas_handle.

  • [in] n: rocblas_int. n >= 0.

    The number of rows and columns of the matrix A.

  • [in] A: pointer to type. Array on the GPU of dimension lda*n.

    The factors L and U of the factorization A = P*L*U returned by GETRF.

  • [in] lda: rocblas_int. lda >= n.

    Specifies the leading dimension of A.

  • [in] ipiv: pointer to rocblas_int. Array on the GPU of dimension n.

    The pivot indices returned by GETRF.

  • [out] C: pointer to type. Array on the GPU of dimension ldc*n.

    If info = 0, the inverse of A. Otherwise, undefined.

  • [in] ldc: rocblas_int. ldc >= n.

    Specifies the leading dimension of C.

  • [out] info: pointer to a rocblas_int on the GPU.

    If info = 0, successful exit. If info = i > 0, U is singular. U[i,i] is the first zero pivot.
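Solving AC = I from the factors of A = PLU can be sketched on the host as a row permutation of the identity followed by a forward and a back substitution per column. The sketch assumes LAPACK-style pivots, that is, ipiv holds 1-based indices and row i was interchanged with row ipiv[i]; this convention is not spelled out in this section, so treat it as an assumption. The name `inv_outofplace_ref` is hypothetical and column-major storage is assumed.

```c
/* Host-side sketch: C = inv(A) from the packed factors of A = P*L*U.
   Assumes LAPACK-style pivots: ipiv entries are 1-based, and the row with
   0-based index i was interchanged with row ipiv[i] - 1.
   Returns 0, or the 1-based index of the first zero diagonal element of U. */
int inv_outofplace_ref(int n, const double *A, int lda,
                       const int *ipiv, double *C, int ldc)
{
    for (int j = 0; j < n; ++j)
        if (A[j + j * lda] == 0.0)
            return j + 1;

    /* Start from the identity. */
    for (int j = 0; j < n; ++j)
        for (int i = 0; i < n; ++i)
            C[i + j * ldc] = (i == j) ? 1.0 : 0.0;

    /* Apply the recorded row interchanges in order: right-hand side of L*U*C = inv(P). */
    for (int i = 0; i < n; ++i) {
        int p = ipiv[i] - 1;
        if (p != i)
            for (int j = 0; j < n; ++j) {
                double t = C[i + j * ldc];
                C[i + j * ldc] = C[p + j * ldc];
                C[p + j * ldc] = t;
            }
    }

    for (int j = 0; j < n; ++j) {
        /* Forward substitution with the unit lower triangular L. */
        for (int k = 0; k < n; ++k)
            for (int i = k + 1; i < n; ++i)
                C[i + j * ldc] -= A[i + k * lda] * C[k + j * ldc];
        /* Back substitution with U. */
        for (int k = n - 1; k >= 0; --k) {
            C[k + j * ldc] /= A[k + k * lda];
            for (int i = 0; i < k; ++i)
                C[i + j * ldc] -= A[i + k * lda] * C[k + j * ldc];
        }
    }
    return 0;
}
```

Because A is only read, the factors remain available after the call, which is the practical difference from the in-place variant.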

rocsolver_<type>getri_outofplace_batched()

rocblas_status rocsolver_zgetri_outofplace_batched(rocblas_handle handle, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_double_complex *const C[], const rocblas_int ldc, rocblas_int *info, const rocblas_int batch_count)
rocblas_status rocsolver_cgetri_outofplace_batched(rocblas_handle handle, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_float_complex *const C[], const rocblas_int ldc, rocblas_int *info, const rocblas_int batch_count)
rocblas_status rocsolver_dgetri_outofplace_batched(rocblas_handle handle, const rocblas_int n, double *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, double *const C[], const rocblas_int ldc, rocblas_int *info, const rocblas_int batch_count)
rocblas_status rocsolver_sgetri_outofplace_batched(rocblas_handle handle, const rocblas_int n, float *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, float *const C[], const rocblas_int ldc, rocblas_int *info, const rocblas_int batch_count)

GETRI_OUTOFPLACE_BATCHED computes the inverse \(C_j = A_j^{-1}\) of a batch of general n-by-n matrices \(A_j\).

The inverse is computed by solving the linear system

\[ A_j C_j = I \]

where I is the identity matrix, and \(A_j\) is factorized as \(A_j = P_j L_j U_j\) as given by GETRF_BATCHED.

Parameters
  • [in] handle: rocblas_handle.

  • [in] n: rocblas_int. n >= 0.

    The number of rows and columns of all matrices A_j in the batch.

  • [in] A: array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n.

    The factors L_j and U_j of the factorization A_j = P_j*L_j*U_j returned by GETRF_BATCHED.

  • [in] lda: rocblas_int. lda >= n.

    Specifies the leading dimension of matrices A_j.

  • [in] ipiv: pointer to rocblas_int. Array on the GPU (the size depends on the value of strideP).

    The pivot indices returned by GETRF_BATCHED.

  • [in] strideP: rocblas_stride.

    Stride from the start of one vector ipiv_j to the next one ipiv_(j+1). There is no restriction for the value of strideP. Normal use case is strideP >= n.

  • [out] C: array of pointers to type. Each pointer points to an array on the GPU of dimension ldc*n.

    If info[j] = 0, the inverse of matrix A_j. Otherwise, undefined.

  • [in] ldc: rocblas_int. ldc >= n.

    Specifies the leading dimension of C_j.

  • [out] info: pointer to rocblas_int. Array of batch_count integers on the GPU.

    If info[j] = 0, successful exit for inversion of A_j. If info[j] = i > 0, U_j is singular. U_j[i,i] is the first zero pivot.

  • [in] batch_count: rocblas_int. batch_count >= 0.

    Number of matrices in the batch.

rocsolver_<type>getri_outofplace_strided_batched()

rocblas_status rocsolver_zgetri_outofplace_strided_batched(rocblas_handle handle, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_double_complex *C, const rocblas_int ldc, const rocblas_stride strideC, rocblas_int *info, const rocblas_int batch_count)
rocblas_status rocsolver_cgetri_outofplace_strided_batched(rocblas_handle handle, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_float_complex *C, const rocblas_int ldc, const rocblas_stride strideC, rocblas_int *info, const rocblas_int batch_count)
rocblas_status rocsolver_dgetri_outofplace_strided_batched(rocblas_handle handle, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, double *C, const rocblas_int ldc, const rocblas_stride strideC, rocblas_int *info, const rocblas_int batch_count)
rocblas_status rocsolver_sgetri_outofplace_strided_batched(rocblas_handle handle, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, float *C, const rocblas_int ldc, const rocblas_stride strideC, rocblas_int *info, const rocblas_int batch_count)

GETRI_OUTOFPLACE_STRIDED_BATCHED computes the inverse \(C_j = A_j^{-1}\) of a batch of general n-by-n matrices \(A_j\).

The inverse is computed by solving the linear system

\[ A_j C_j = I \]

where I is the identity matrix, and \(A_j\) is factorized as \(A_j = P_j L_j U_j\) as given by GETRF_STRIDED_BATCHED.

Parameters
  • [in] handle: rocblas_handle.

  • [in] n: rocblas_int. n >= 0.

    The number of rows and columns of all matrices A_j in the batch.

  • [in] A: pointer to type. Array on the GPU (the size depends on the value of strideA).

    The factors L_j and U_j of the factorization A_j = P_j*L_j*U_j returned by GETRF_STRIDED_BATCHED.

  • [in] lda: rocblas_int. lda >= n.

    Specifies the leading dimension of matrices A_j.

  • [in] strideA: rocblas_stride.

    Stride from the start of one matrix A_j to the next one A_(j+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.

  • [in] ipiv: pointer to rocblas_int. Array on the GPU (the size depends on the value of strideP).

    The pivot indices returned by GETRF_STRIDED_BATCHED.

  • [in] strideP: rocblas_stride.

    Stride from the start of one vector ipiv_j to the next one ipiv_(j+1). There is no restriction for the value of strideP. Normal use case is strideP >= n.

  • [out] C: pointer to type. Array on the GPU (the size depends on the value of strideC).

    If info[j] = 0, the inverse of matrix A_j. Otherwise, undefined.

  • [in] ldc: rocblas_int. ldc >= n.

    Specifies the leading dimension of C_j.

  • [in] strideC: rocblas_stride.

    Stride from the start of one matrix C_j to the next one C_(j+1). There is no restriction for the value of strideC. Normal use case is strideC >= ldc*n.

  • [out] info: pointer to rocblas_int. Array of batch_count integers on the GPU.

    If info[j] = 0, successful exit for inversion of A_j. If info[j] = i > 0, U_j is singular. U_j[i,i] is the first zero pivot.

  • [in] batch_count: rocblas_int. batch_count >= 0.

    Number of matrices in the batch.

rocsolver_<type>getri_npvt_outofplace()

rocblas_status rocsolver_zgetri_npvt_outofplace(rocblas_handle handle, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *C, const rocblas_int ldc, rocblas_int *info)
rocblas_status rocsolver_cgetri_npvt_outofplace(rocblas_handle handle, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *C, const rocblas_int ldc, rocblas_int *info)
rocblas_status rocsolver_dgetri_npvt_outofplace(rocblas_handle handle, const rocblas_int n, double *A, const rocblas_int lda, double *C, const rocblas_int ldc, rocblas_int *info)
rocblas_status rocsolver_sgetri_npvt_outofplace(rocblas_handle handle, const rocblas_int n, float *A, const rocblas_int lda, float *C, const rocblas_int ldc, rocblas_int *info)

GETRI_NPVT_OUTOFPLACE computes the inverse \(C = A^{-1}\) of a general n-by-n matrix A without partial pivoting.

The inverse is computed by solving the linear system

\[ AC = I \]

where I is the identity matrix, and A is factorized as \(A = LU\) as given by GETRF_NPVT.

Parameters
  • [in] handle: rocblas_handle.

  • [in] n: rocblas_int. n >= 0.

    The number of rows and columns of the matrix A.

  • [in] A: pointer to type. Array on the GPU of dimension lda*n.

    The factors L and U of the factorization A = L*U returned by GETRF_NPVT.

  • [in] lda: rocblas_int. lda >= n.

    Specifies the leading dimension of A.

  • [out] C: pointer to type. Array on the GPU of dimension ldc*n.

    If info = 0, the inverse of A. Otherwise, undefined.

  • [in] ldc: rocblas_int. ldc >= n.

    Specifies the leading dimension of C.

  • [out] info: pointer to a rocblas_int on the GPU.

    If info = 0, successful exit. If info = i > 0, U is singular. U[i,i] is the first zero pivot.
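
To make the relation \(AC = I\) and the meaning of info concrete, the following is a minimal CPU reference sketch in C. It is not the rocSOLVER implementation, which may use a different algorithm. It assumes double precision, column-major storage, and 0-based indexing inside the code; the function name ref_getri_npvt_outofplace and the IDX macro are hypothetical. Each column of C is obtained by a forward substitution with the unit lower triangular factor L followed by a back substitution with U.

```c
#include <assert.h>
#include <stddef.h>

/* Offset of element (i, j), 0-based, in a column-major matrix with leading
   dimension ld. */
#define IDX(i, j, ld) ((size_t)(j) * (size_t)(ld) + (size_t)(i))

/* Hypothetical CPU reference, not a rocSOLVER function.
   lu : n-by-n array holding the unpivoted LU factors: U on and above the
        diagonal, L strictly below it (the unit diagonal of L is implicit).
   c  : n-by-n output array; receives the inverse when the return value is 0.
   Returns the value the routine would store in info: 0 on success, or the
   1-based index i of the first zero pivot U[i,i]. */
int ref_getri_npvt_outofplace(int n, const double *lu, int lda,
                              double *c, int ldc)
{
    assert(n >= 0 && lda >= n && ldc >= n);

    /* Singularity check: stop at the first zero on the diagonal of U. */
    for (int i = 0; i < n; ++i)
        if (lu[IDX(i, i, lda)] == 0.0)
            return i + 1;

    for (int k = 0; k < n; ++k) {
        double *x = &c[IDX(0, k, ldc)]; /* column k of C */

        /* Right-hand side: column k of the identity matrix. */
        for (int i = 0; i < n; ++i)
            x[i] = (i == k) ? 1.0 : 0.0;

        /* Forward substitution: solve L*y = e_k (unit diagonal). */
        for (int i = 0; i < n; ++i)
            for (int p = 0; p < i; ++p)
                x[i] -= lu[IDX(i, p, lda)] * x[p];

        /* Back substitution: solve U*x = y. */
        for (int i = n - 1; i >= 0; --i) {
            for (int p = i + 1; p < n; ++p)
                x[i] -= lu[IDX(i, p, lda)] * x[p];
            x[i] /= lu[IDX(i, i, lda)];
        }
    }
    return 0;
}
```

For example, the matrix with rows (2, 1) and (4, 4) has the unpivoted factors L = [1 0; 2 1] and U = [2 1; 0 2], stored column by column as {2, 2, 1, 2}. Passing that array as lu with n = lda = ldc = 2 yields the inverse with rows (1, -0.25) and (-1, 0.5).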

rocsolver_<type>getri_npvt_outofplace_batched()

rocblas_status rocsolver_zgetri_npvt_outofplace_batched(rocblas_handle handle, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_double_complex *const C[], const rocblas_int ldc, rocblas_int *info, const rocblas_int batch_count)
rocblas_status rocsolver_cgetri_npvt_outofplace_batched(rocblas_handle handle, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_float_complex *const C[], const rocblas_int ldc, rocblas_int *info, const rocblas_int batch_count)
rocblas_status rocsolver_dgetri_npvt_outofplace_batched(rocblas_handle handle, const rocblas_int n, double *const A[], const rocblas_int lda, double *const C[], const rocblas_int ldc, rocblas_int *info, const rocblas_int batch_count)
rocblas_status rocsolver_sgetri_npvt_outofplace_batched(rocblas_handle handle, const rocblas_int n, float *const A[], const rocblas_int lda, float *const C[], const rocblas_int ldc, rocblas_int *info, const rocblas_int batch_count)

GETRI_NPVT_OUTOFPLACE_BATCHED computes the inverse \(C_j = A_j^{-1}\) of a batch of general n-by-n matrices \(A_j\) without partial pivoting.

The inverse is computed by solving the linear system

\[ A_j C_j = I \]

where I is the identity matrix, and \(A_j\) is supplied in factorized form \(A_j = L_j U_j\), as computed by GETRF_NPVT_BATCHED.

Parameters
  • [in] handle: rocblas_handle.

  • [in] n: rocblas_int. n >= 0.

    The number of rows and columns of all matrices A_j in the batch.

  • [in] A: array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n.

    The factors L_j and U_j of the factorization A_j = L_j*U_j returned by GETRF_NPVT_BATCHED.

  • [in] lda: rocblas_int. lda >= n.

    Specifies the leading dimension of matrices A_j.

  • [out] C: array of pointers to type. Each pointer points to an array on the GPU of dimension ldc*n.

    If info[j] = 0, the inverse of matrix A_j. Otherwise, undefined.

  • [in] ldc: rocblas_int. ldc >= n.

    Specifies the leading dimension of C_j.

  • [out] info: pointer to rocblas_int. Array of batch_count integers on the GPU.

    If info[j] = 0, successful exit for inversion of A_j. If info[j] = i > 0, U_j is singular. U_j[i,i] is the first zero pivot.

  • [in] batch_count: rocblas_int. batch_count >= 0.

    Number of matrices in the batch.
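
The batched variant takes A and C as arrays of pointers, one pointer per matrix. A common way to produce such a table is to carve it out of a single contiguous allocation. The sketch below shows only the pointer arithmetic, assuming double precision and matrices packed back to back with ld*n elements each; the helper name fill_batch_pointers is hypothetical. In real use base would be a device pointer, and this section does not say where the pointer table itself must reside, so that should be checked against the rocSOLVER and rocBLAS documentation.

```c
#include <stddef.h>

/* Hypothetical helper, not a rocSOLVER function.
   base        : start of a buffer holding batch_count matrices back to back.
   ld, n       : leading dimension and number of columns of each matrix.
   ptrs        : output table of batch_count pointers; ptrs[j] is set to the
                 start of matrix j. */
void fill_batch_pointers(double *base, int ld, int n, int batch_count,
                         double **ptrs)
{
    size_t matrix_elems = (size_t)ld * (size_t)n;

    for (int j = 0; j < batch_count; ++j)
        ptrs[j] = base + (size_t)j * matrix_elems;
}
```

The same helper can be used twice, once for the input factors and once for the output inverses, since A and C follow the same layout rules with lda and ldc respectively.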

rocsolver_<type>getri_npvt_outofplace_strided_batched()

rocblas_status rocsolver_zgetri_npvt_outofplace_strided_batched(rocblas_handle handle, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_double_complex *C, const rocblas_int ldc, const rocblas_stride strideC, rocblas_int *info, const rocblas_int batch_count)
rocblas_status rocsolver_cgetri_npvt_outofplace_strided_batched(rocblas_handle handle, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_float_complex *C, const rocblas_int ldc, const rocblas_stride strideC, rocblas_int *info, const rocblas_int batch_count)
rocblas_status rocsolver_dgetri_npvt_outofplace_strided_batched(rocblas_handle handle, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, double *C, const rocblas_int ldc, const rocblas_stride strideC, rocblas_int *info, const rocblas_int batch_count)
rocblas_status rocsolver_sgetri_npvt_outofplace_strided_batched(rocblas_handle handle, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, float *C, const rocblas_int ldc, const rocblas_stride strideC, rocblas_int *info, const rocblas_int batch_count)

GETRI_NPVT_OUTOFPLACE_STRIDED_BATCHED computes the inverse \(C_j = A_j^{-1}\) of a batch of general n-by-n matrices \(A_j\) without partial pivoting.

The inverse is computed by solving the linear system

\[ A_j C_j = I \]

where I is the identity matrix, and \(A_j\) is supplied in factorized form \(A_j = L_j U_j\), as computed by GETRF_NPVT_STRIDED_BATCHED.

Parameters
  • [in] handle: rocblas_handle.

  • [in] n: rocblas_int. n >= 0.

    The number of rows and columns of all matrices A_j in the batch.

  • [in] A: pointer to type. Array on the GPU (the size depends on the value of strideA).

    The factors L_j and U_j of the factorization A_j = L_j*U_j returned by GETRF_NPVT_STRIDED_BATCHED.

  • [in] lda: rocblas_int. lda >= n.

    Specifies the leading dimension of matrices A_j.

  • [in] strideA: rocblas_stride.

    Stride from the start of one matrix A_j to the next one A_(j+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n.

  • [out] C: pointer to type. Array on the GPU (the size depends on the value of strideC).

    If info[j] = 0, the inverse of matrix A_j. Otherwise, undefined.

  • [in] ldc: rocblas_int. ldc >= n.

    Specifies the leading dimension of C_j.

  • [in] strideC: rocblas_stride.

    Stride from the start of one matrix C_j to the next one C_(j+1). There is no restriction for the value of strideC. Normal use case is strideC >= ldc*n.

  • [out] info: pointer to rocblas_int. Array of batch_count integers on the GPU.

    If info[j] = 0, successful exit for inversion of A_j. If info[j] = i > 0, U_j is singular. U_j[i,i] is the first zero pivot.

  • [in] batch_count: rocblas_int. batch_count >= 0.

    Number of matrices in the batch.
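
In the strided batched layout, matrix j starts j*stride elements after the start of the buffer, which is why the array size depends on the stride. The following sketch illustrates that addressing and a buffer-size estimate for the normal use case stride >= ld*n. It assumes double precision, a non-negative stride, and uses int64_t in place of rocblas_stride; the helper names strided_matrix and strided_min_elems are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers, not rocSOLVER functions. */

/* Start of matrix j in a strided batch. */
double *strided_matrix(double *base, int64_t stride, int j)
{
    return base + (ptrdiff_t)((int64_t)j * stride);
}

/* Number of elements needed to hold batch_count matrices of ld*n elements
   each, assuming stride >= ld*n >= 0: the last matrix starts at
   (batch_count - 1)*stride and occupies ld*n elements. */
size_t strided_min_elems(int ld, int n, int64_t stride, int batch_count)
{
    if (batch_count <= 0)
        return 0;
    return (size_t)stride * (size_t)(batch_count - 1)
         + (size_t)ld * (size_t)n;
}
```

With lda = 4, n = 3 and batch_count = 3, a tightly packed batch (strideA = 12) needs 36 elements, while a padded batch with strideA = 14 needs 40.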