3.4. Lapack-like Functions¶

Other Lapack-like routines provided by rocSOLVER. These are divided into the following subcategories:

Triangular factorizations. Based on Gaussian elimination.
Linear-systems solvers. Based on triangular factorizations.

Note

Throughout the APIs’ descriptions, we use the following notations:

x[i] stands for the i-th element of vector x, while A[i,j] represents the element in the i-th row and j-th column of matrix A. Indices are 1-based, i.e. x[1] is the first element of x.
If X is a real vector or matrix, \(X^T\) indicates its transpose; if X is complex, then \(X^H\) represents its conjugate transpose. When X could be real or complex, we use X’ to indicate X transposed or X conjugate transposed, accordingly.
x_i \(=x_i\); we sometimes use both notations, \(x_i\) when displaying mathematical equations, and x_i in the text describing the function parameters.

3.4.1. Triangular factorizations¶

List of Lapack-like triangular factorizations

rocsolver_<type>getf2_npvt()
rocsolver_<type>getf2_npvt_batched()
rocsolver_<type>getf2_npvt_strided_batched()
rocsolver_<type>getrf_npvt()
rocsolver_<type>getrf_npvt_batched()
rocsolver_<type>getrf_npvt_strided_batched()

rocsolver_<type>getf2_npvt()¶

rocblas_status rocsolver_zgetf2_npvt(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_int *info)¶

rocblas_status rocsolver_cgetf2_npvt(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_int *info)¶

rocblas_status rocsolver_dgetf2_npvt(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, rocblas_int *info)¶

rocblas_status rocsolver_sgetf2_npvt(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, rocblas_int *info)¶

GETF2_NPVT computes the LU factorization of a general m-by-n matrix A without partial pivoting.

(This is the unblocked Level-2-BLAS version of the algorithm. An optimized internal implementation without rocBLAS calls could be executed with small and mid-size matrices if optimizations are enabled (default option). For more details, see the “Tuning rocSOLVER performance” section of the Library Design Guide).

The factorization has the form

\[ A = LU \]

where L is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and U is upper triangular (upper trapezoidal if m < n).

Note: Although this routine can offer better performance, Gaussian elimination without pivoting is not backward stable. If numerical accuracy is compromised, use the legacy-LAPACK-like API GETF2 routines instead.

Parameters

[in] handle: rocblas_handle.
[in] m: rocblas_int. m >= 0.
The number of rows of the matrix A.
[in] n: rocblas_int. n >= 0.
The number of columns of the matrix A.
[inout] A: pointer to type. Array on the GPU of dimension lda*n.
On entry, the m-by-n matrix A to be factored. On exit, the factors L and U from the factorization. The unit diagonal elements of L are not stored.
[in] lda: rocblas_int. lda >= m.
Specifies the leading dimension of A.
[out] info: pointer to a rocblas_int on the GPU.
If info = 0, successful exit. If info = j > 0, U is singular. U[j,j] is the first zero element in the diagonal. The factorization from this point might be incomplete.

rocsolver_<type>getf2_npvt_batched()¶

rocblas_status rocsolver_zgetf2_npvt_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)¶

rocblas_status rocsolver_cgetf2_npvt_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)¶

rocblas_status rocsolver_dgetf2_npvt_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)¶

rocblas_status rocsolver_sgetf2_npvt_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)¶

GETF2_NPVT_BATCHED computes the LU factorization of a batch of general m-by-n matrices without partial pivoting.

(This is the unblocked Level-2-BLAS version of the algorithm. An optimized internal implementation without rocBLAS calls could be executed with small and mid-size matrices if optimizations are enabled (default option). For more details, see the “Tuning rocSOLVER performance” section of the Library Design Guide).

The factorization of matrix \(A_i\) in the batch has the form

\[ A_i = L_iU_i \]

where \(L_i\) is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and \(U_i\) is upper triangular (upper trapezoidal if m < n).

Note: Although this routine can offer better performance, Gaussian elimination without pivoting is not backward stable. If numerical accuracy is compromised, use the legacy-LAPACK-like API GETF2_BATCHED routines instead.

Parameters

[in] handle: rocblas_handle.
[in] m: rocblas_int. m >= 0.
The number of rows of all matrices A_i in the batch.
[in] n: rocblas_int. n >= 0.
The number of columns of all matrices A_i in the batch.
[inout] A: array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n.
On entry, the m-by-n matrices A_i to be factored. On exit, the factors L_i and U_i from the factorizations. The unit diagonal elements of L_i are not stored.
[in] lda: rocblas_int. lda >= m.
Specifies the leading dimension of matrices A_i.
[out] info: pointer to rocblas_int. Array of batch_count integers on the GPU.
If info[i] = 0, successful exit for factorization of A_i. If info[i] = j > 0, U_i is singular. U_i[j,j] is the first zero element in the diagonal. The factorization from this point might be incomplete.
[in] batch_count: rocblas_int. batch_count >= 0.
Number of matrices in the batch.

rocsolver_<type>getf2_npvt_strided_batched()¶

rocblas_status rocsolver_zgetf2_npvt_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)¶

rocblas_status rocsolver_cgetf2_npvt_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)¶

rocblas_status rocsolver_dgetf2_npvt_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)¶

rocblas_status rocsolver_sgetf2_npvt_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)¶

GETF2_NPVT_STRIDED_BATCHED computes the LU factorization of a batch of general m-by-n matrices without partial pivoting.

(This is the unblocked Level-2-BLAS version of the algorithm. An optimized internal implementation without rocBLAS calls could be executed with small and mid-size matrices if optimizations are enabled (default option). For more details, see the “Tuning rocSOLVER performance” section of the Library Design Guide).

The factorization of matrix \(A_i\) in the batch has the form

\[ A_i = L_iU_i \]

where \(L_i\) is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and \(U_i\) is upper triangular (upper trapezoidal if m < n).

Note: Although this routine can offer better performance, Gaussian elimination without pivoting is not backward stable. If numerical accuracy is compromised, use the legacy-LAPACK-like API GETF2_STRIDED_BATCHED routines instead.

Parameters

[in] handle: rocblas_handle.
[in] m: rocblas_int. m >= 0.
The number of rows of all matrices A_i in the batch.
[in] n: rocblas_int. n >= 0.
The number of columns of all matrices A_i in the batch.
[inout] A: pointer to type. Array on the GPU (the size depends on the value of strideA).
On entry, the m-by-n matrices A_i to be factored. On exit, the factors L_i and U_i from the factorization. The unit diagonal elements of L_i are not stored.
[in] lda: rocblas_int. lda >= m.
Specifies the leading dimension of matrices A_i.
[in] strideA: rocblas_stride.
Stride from the start of one matrix A_i to the next one A_(i+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n
[out] info: pointer to rocblas_int. Array of batch_count integers on the GPU.
If info[i] = 0, successful exit for factorization of A_i. If info[i] = j > 0, U_i is singular. U_i[j,j] is the first zero element in the diagonal. The factorization from this point might be incomplete.
[in] batch_count: rocblas_int. batch_count >= 0.
Number of matrices in the batch.

rocsolver_<type>getrf_npvt()¶

rocblas_status rocsolver_zgetrf_npvt(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_int *info)¶

rocblas_status rocsolver_cgetrf_npvt(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_int *info)¶

rocblas_status rocsolver_dgetrf_npvt(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, rocblas_int *info)¶

rocblas_status rocsolver_sgetrf_npvt(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, rocblas_int *info)¶

GETRF_NPVT computes the LU factorization of a general m-by-n matrix A without partial pivoting.

(This is the blocked Level-3-BLAS version of the algorithm. An optimized internal implementation without rocBLAS calls could be executed with mid-size matrices if optimizations are enabled (default option). For more details, see the “Tuning rocSOLVER performance” section of the Library Design Guide).

The factorization has the form

\[ A = LU \]

where L is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and U is upper triangular (upper trapezoidal if m < n).

Note: Although this routine can offer better performance, Gaussian elimination without pivoting is not backward stable. If numerical accuracy is compromised, use the legacy-LAPACK-like API GETRF routines instead.

Parameters

[in] handle: rocblas_handle.
[in] m: rocblas_int. m >= 0.
The number of rows of the matrix A.
[in] n: rocblas_int. n >= 0.
The number of columns of the matrix A.
[inout] A: pointer to type. Array on the GPU of dimension lda*n.
On entry, the m-by-n matrix A to be factored. On exit, the factors L and U from the factorization. The unit diagonal elements of L are not stored.
[in] lda: rocblas_int. lda >= m.
Specifies the leading dimension of A.
[out] info: pointer to a rocblas_int on the GPU.
If info = 0, successful exit. If info = j > 0, U is singular. U[j,j] is the first zero element in the diagonal. The factorization from this point might be incomplete.

rocsolver_<type>getrf_npvt_batched()¶

rocblas_status rocsolver_zgetrf_npvt_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)¶

rocblas_status rocsolver_cgetrf_npvt_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)¶

rocblas_status rocsolver_dgetrf_npvt_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)¶

rocblas_status rocsolver_sgetrf_npvt_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)¶

GETRF_NPVT_BATCHED computes the LU factorization of a batch of general m-by-n matrices without partial pivoting.

(This is the blocked Level-3-BLAS version of the algorithm. An optimized internal implementation without rocBLAS calls could be executed with mid-size matrices if optimizations are enabled (default option). For more details, see the “Tuning rocSOLVER performance” section of the Library Design Guide).

The factorization of matrix \(A_i\) in the batch has the form

\[ A_i = L_iU_i \]

where \(L_i\) is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and \(U_i\) is upper triangular (upper trapezoidal if m < n).

Note: Although this routine can offer better performance, Gaussian elimination without pivoting is not backward stable. If numerical accuracy is compromised, use the legacy-LAPACK-like API GETRF_BATCHED routines instead.

Parameters

[in] handle: rocblas_handle.
[in] m: rocblas_int. m >= 0.
The number of rows of all matrices A_i in the batch.
[in] n: rocblas_int. n >= 0.
The number of columns of all matrices A_i in the batch.
[inout] A: array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n.
On entry, the m-by-n matrices A_i to be factored. On exit, the factors L_i and U_i from the factorizations. The unit diagonal elements of L_i are not stored.
[in] lda: rocblas_int. lda >= m.
Specifies the leading dimension of matrices A_i.
[out] info: pointer to rocblas_int. Array of batch_count integers on the GPU.
If info[i] = 0, successful exit for factorization of A_i. If info[i] = j > 0, U_i is singular. U_i[j,j] is the first zero element in the diagonal. The factorization from this point might be incomplete.
[in] batch_count: rocblas_int. batch_count >= 0.
Number of matrices in the batch.

rocsolver_<type>getrf_npvt_strided_batched()¶

rocblas_status rocsolver_zgetrf_npvt_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)¶

rocblas_status rocsolver_cgetrf_npvt_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)¶

rocblas_status rocsolver_dgetrf_npvt_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)¶

rocblas_status rocsolver_sgetrf_npvt_strided_batched(rocblas_handle handle, const rocblas_int m, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)¶

GETRF_NPVT_STRIDED_BATCHED computes the LU factorization of a batch of general m-by-n matrices without partial pivoting.

(This is the blocked Level-3-BLAS version of the algorithm. An optimized internal implementation without rocBLAS calls could be executed with mid-size matrices if optimizations are enabled (default option). For more details, see the “Tuning rocSOLVER performance” section of the Library Design Guide).

The factorization of matrix \(A_i\) in the batch has the form

\[ A_i = L_iU_i \]

where \(L_i\) is lower triangular with unit diagonal elements (lower trapezoidal if m > n), and \(U_i\) is upper triangular (upper trapezoidal if m < n).

Note: Although this routine can offer better performance, Gaussian elimination without pivoting is not backward stable. If numerical accuracy is compromised, use the legacy-LAPACK-like API GETRF_STRIDED_BATCHED routines instead.

Parameters

[in] handle: rocblas_handle.
[in] m: rocblas_int. m >= 0.
The number of rows of all matrices A_i in the batch.
[in] n: rocblas_int. n >= 0.
The number of columns of all matrices A_i in the batch.
[inout] A: pointer to type. Array on the GPU (the size depends on the value of strideA).
On entry, the m-by-n matrices A_i to be factored. On exit, the factors L_i and U_i from the factorization. The unit diagonal elements of L_i are not stored.
[in] lda: rocblas_int. lda >= m.
Specifies the leading dimension of matrices A_i.
[in] strideA: rocblas_stride.
Stride from the start of one matrix A_i to the next one A_(i+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n
[out] info: pointer to rocblas_int. Array of batch_count integers on the GPU.
If info[i] = 0, successful exit for factorization of A_i. If info[i] = j > 0, U_i is singular. U_i[j,j] is the first zero element in the diagonal. The factorization from this point might be incomplete.
[in] batch_count: rocblas_int. batch_count >= 0.
Number of matrices in the batch.

3.4.2. Linear-systems solvers¶

List of Lapack-like linear solvers

rocsolver_<type>getri_npvt()
rocsolver_<type>getri_npvt_batched()
rocsolver_<type>getri_npvt_strided_batched()
rocsolver_<type>getri_outofplace()
rocsolver_<type>getri_outofplace_batched()
rocsolver_<type>getri_outofplace_strided_batched()
rocsolver_<type>getri_npvt_outofplace()
rocsolver_<type>getri_npvt_outofplace_batched()
rocsolver_<type>getri_npvt_outofplace_strided_batched()

rocsolver_<type>getri_npvt()¶

rocblas_status rocsolver_zgetri_npvt(rocblas_handle handle, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_int *info)¶

rocblas_status rocsolver_cgetri_npvt(rocblas_handle handle, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_int *info)¶

rocblas_status rocsolver_dgetri_npvt(rocblas_handle handle, const rocblas_int n, double *A, const rocblas_int lda, rocblas_int *info)¶

rocblas_status rocsolver_sgetri_npvt(rocblas_handle handle, const rocblas_int n, float *A, const rocblas_int lda, rocblas_int *info)¶

GETRI_NPVT inverts a general n-by-n matrix A using the LU factorization computed by GETRF_NPVT.

The inverse is computed by solving the linear system

\[ A^{-1}L = U^{-1} \]

where L is the lower triangular factor of A with unit diagonal elements, and U is the upper triangular factor.

Parameters

[in] handle: rocblas_handle.
[in] n: rocblas_int. n >= 0.
The number of rows and columns of the matrix A.
[inout] A: pointer to type. Array on the GPU of dimension lda*n.
On entry, the factors L and U of the factorization A = L*U returned by
GETRF_NPVT. On exit, the inverse of A if info = 0; otherwise undefined.
[in] lda: rocblas_int. lda >= n.
Specifies the leading dimension of A.
[out] info: pointer to a rocblas_int on the GPU.
If info = 0, successful exit. If info = i > 0, U is singular. U[i,i] is the first zero pivot.

rocsolver_<type>getri_npvt_batched()¶

rocblas_status rocsolver_zgetri_npvt_batched(rocblas_handle handle, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)¶

rocblas_status rocsolver_cgetri_npvt_batched(rocblas_handle handle, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)¶

rocblas_status rocsolver_dgetri_npvt_batched(rocblas_handle handle, const rocblas_int n, double *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)¶

rocblas_status rocsolver_sgetri_npvt_batched(rocblas_handle handle, const rocblas_int n, float *const A[], const rocblas_int lda, rocblas_int *info, const rocblas_int batch_count)¶

GETRI_NPVT_BATCHED inverts a batch of general n-by-n matrices using the LU factorization computed by GETRF_NPVT_BATCHED.

The inverse of matrix \(A_j\) in the batch is computed by solving the linear system

\[ A_j^{-1} L_j = U_j^{-1} \]

where \(L_j\) is the lower triangular factor of \(A_j\) with unit diagonal elements, and \(U_j\) is the upper triangular factor.

Parameters

[in] handle: rocblas_handle.
[in] n: rocblas_int. n >= 0.
The number of rows and columns of all matrices A_j in the batch.
[inout] A: array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n.
On entry, the factors L_j and U_j of the factorization A = L_j*U_j returned by
GETRF_NPVT_BATCHED. On exit, the inverses of A_j if info[j] = 0; otherwise undefined.
[in] lda: rocblas_int. lda >= n.
Specifies the leading dimension of matrices A_j.
[out] info: pointer to rocblas_int. Array of batch_count integers on the GPU.
If info[j] = 0, successful exit for inversion of A_j. If info[j] = i > 0, U_j is singular. U_j[i,i] is the first zero pivot.
[in] batch_count: rocblas_int. batch_count >= 0.
Number of matrices in the batch.

rocsolver_<type>getri_npvt_strided_batched()¶

rocblas_status rocsolver_zgetri_npvt_strided_batched(rocblas_handle handle, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)¶

rocblas_status rocsolver_cgetri_npvt_strided_batched(rocblas_handle handle, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)¶

rocblas_status rocsolver_dgetri_npvt_strided_batched(rocblas_handle handle, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)¶

rocblas_status rocsolver_sgetri_npvt_strided_batched(rocblas_handle handle, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *info, const rocblas_int batch_count)¶

GETRI_NPVT_STRIDED_BATCHED inverts a batch of general n-by-n matrices using the LU factorization computed by GETRF_NPVT_STRIDED_BATCHED.

The inverse of matrix \(A_j\) in the batch is computed by solving the linear system

\[ A_j^{-1} L_j = U_j^{-1} \]

where \(L_j\) is the lower triangular factor of \(A_j\) with unit diagonal elements, and \(U_j\) is the upper triangular factor.

Parameters

[in] handle: rocblas_handle.
[in] n: rocblas_int. n >= 0.
The number of rows and columns of all matrices A_j in the batch.
[inout] A: pointer to type. Array on the GPU (the size depends on the value of strideA).
On entry, the factors L_j and U_j of the factorization A_j = L_j*U_j returned by
GETRF_NPVT_STRIDED_BATCHED. On exit, the inverses of A_j if info[j] = 0; otherwise undefined.
[in] lda: rocblas_int. lda >= n.
Specifies the leading dimension of matrices A_j.
[in] strideA: rocblas_stride.
Stride from the start of one matrix A_j to the next one A_(j+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n
[out] info: pointer to rocblas_int. Array of batch_count integers on the GPU.
If info[j] = 0, successful exit for inversion of A_j. If info[j] = i > 0, U_j is singular. U_j[i,i] is the first zero pivot.
[in] batch_count: rocblas_int. batch_count >= 0.
Number of matrices in the batch.

rocsolver_<type>getri_outofplace()¶

rocblas_status rocsolver_zgetri_outofplace(rocblas_handle handle, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_double_complex *C, const rocblas_int ldc, rocblas_int *info)¶

rocblas_status rocsolver_cgetri_outofplace(rocblas_handle handle, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_int *ipiv, rocblas_float_complex *C, const rocblas_int ldc, rocblas_int *info)¶

rocblas_status rocsolver_dgetri_outofplace(rocblas_handle handle, const rocblas_int n, double *A, const rocblas_int lda, rocblas_int *ipiv, double *C, const rocblas_int ldc, rocblas_int *info)¶

rocblas_status rocsolver_sgetri_outofplace(rocblas_handle handle, const rocblas_int n, float *A, const rocblas_int lda, rocblas_int *ipiv, float *C, const rocblas_int ldc, rocblas_int *info)¶

GETRI_OUTOFPLACE computes the inverse \(C = A^{-1}\) of a general n-by-n matrix A.

The inverse is computed by solving the linear system

\[ AC = I \]

where I is the identity matrix, and A is factorized as \(A = PLU\) as given by GETRF.

Parameters

[in] handle: rocblas_handle.
[in] n: rocblas_int. n >= 0.
The number of rows and columns of the matrix A.
[in] A: pointer to type. Array on the GPU of dimension lda*n.
The factors L and U of the factorization A = P*L*U returned by
GETRF.
[in] lda: rocblas_int. lda >= n.
Specifies the leading dimension of A.
[in] ipiv: pointer to rocblas_int. Array on the GPU of dimension n.
The pivot indices returned by
GETRF.
[out] C: pointer to type. Array on the GPU of dimension ldc*n.
If info = 0, the inverse of A. Otherwise, undefined.
[in] ldc: rocblas_int. ldc >= n.
Specifies the leading dimension of C.
[out] info: pointer to a rocblas_int on the GPU.
If info = 0, successful exit. If info = i > 0, U is singular. U[i,i] is the first zero pivot.

rocsolver_<type>getri_outofplace_batched()¶

rocblas_status rocsolver_zgetri_outofplace_batched(rocblas_handle handle, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_double_complex *const C[], const rocblas_int ldc, rocblas_int *info, const rocblas_int batch_count)¶

rocblas_status rocsolver_cgetri_outofplace_batched(rocblas_handle handle, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_float_complex *const C[], const rocblas_int ldc, rocblas_int *info, const rocblas_int batch_count)¶

rocblas_status rocsolver_dgetri_outofplace_batched(rocblas_handle handle, const rocblas_int n, double *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, double *const C[], const rocblas_int ldc, rocblas_int *info, const rocblas_int batch_count)¶

rocblas_status rocsolver_sgetri_outofplace_batched(rocblas_handle handle, const rocblas_int n, float *const A[], const rocblas_int lda, rocblas_int *ipiv, const rocblas_stride strideP, float *const C[], const rocblas_int ldc, rocblas_int *info, const rocblas_int batch_count)¶

GETRI_OUTOFPLACE_BATCHED computes the inverse \(C_j = A_j^{-1}\) of a batch of general n-by-n matrices \(A_j\).

The inverse is computed by solving the linear system

\[ A_j C_j = I \]

where I is the identity matrix, and \(A_j\) is factorized as \(A_j = P_j L_j U_j\) as given by GETRF_BATCHED.

Parameters

[in] handle: rocblas_handle.
[in] n: rocblas_int. n >= 0.
The number of rows and columns of all matrices A_j in the batch.
[in] A: array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n.
The factors L_j and U_j of the factorization A_j = P_j*L_j*U_j returned by
GETRF_BATCHED.
[in] lda: rocblas_int. lda >= n.
Specifies the leading dimension of matrices A_j.
[in] ipiv: pointer to rocblas_int. Array on the GPU (the size depends on the value of strideP).
The pivot indices returned by
GETRF_BATCHED.
[in] strideP: rocblas_stride.
Stride from the start of one vector ipiv_j to the next one ipiv_(i+j). There is no restriction for the value of strideP. Normal use case is strideP >= n.
[out] C: array of pointers to type. Each pointer points to an array on the GPU of dimension ldc*n.
If info[j] = 0, the inverse of matrices A_j. Otherwise, undefined.
[in] ldc: rocblas_int. ldc >= n.
Specifies the leading dimension of C_j.
[out] info: pointer to rocblas_int. Array of batch_count integers on the GPU.
If info[j] = 0, successful exit for inversion of A_j. If info[j] = i > 0, U_j is singular. U_j[i,i] is the first zero pivot.
[in] batch_count: rocblas_int. batch_count >= 0.
Number of matrices in the batch.

rocsolver_<type>getri_outofplace_strided_batched()¶

rocblas_status rocsolver_zgetri_outofplace_strided_batched(rocblas_handle handle, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_double_complex *C, const rocblas_int ldc, const rocblas_stride strideC, rocblas_int *info, const rocblas_int batch_count)¶

rocblas_status rocsolver_cgetri_outofplace_strided_batched(rocblas_handle handle, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, rocblas_float_complex *C, const rocblas_int ldc, const rocblas_stride strideC, rocblas_int *info, const rocblas_int batch_count)¶

rocblas_status rocsolver_dgetri_outofplace_strided_batched(rocblas_handle handle, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, double *C, const rocblas_int ldc, const rocblas_stride strideC, rocblas_int *info, const rocblas_int batch_count)¶

rocblas_status rocsolver_sgetri_outofplace_strided_batched(rocblas_handle handle, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_int *ipiv, const rocblas_stride strideP, float *C, const rocblas_int ldc, const rocblas_stride strideC, rocblas_int *info, const rocblas_int batch_count)¶

GETRI_OUTOFPLACE_STRIDED_BATCHED computes the inverse \(C_j = A_j^{-1}\) of a batch of general n-by-n matrices \(A_j\).

The inverse is computed by solving the linear system

\[ A_j C_j = I \]

where I is the identity matrix, and \(A_j\) is factorized as \(A_j = P_j L_j U_j\) as given by GETRF_STRIDED_BATCHED.

Parameters

[in] handle: rocblas_handle.
[in] n: rocblas_int. n >= 0.
The number of rows and columns of all matrices A_j in the batch.
[in] A: pointer to type. Array on the GPU (the size depends on the value of strideA).
The factors L_j and U_j of the factorization A_j = P_j*L_j*U_j returned by
GETRF_STRIDED_BATCHED.
[in] lda: rocblas_int. lda >= n.
Specifies the leading dimension of matrices A_j.
[in] strideA: rocblas_stride.
Stride from the start of one matrix A_j to the next one A_(j+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n
[in] ipiv: pointer to rocblas_int. Array on the GPU (the size depends on the value of strideP).
The pivot indices returned by
GETRF_STRIDED_BATCHED.
[in] strideP: rocblas_stride.
Stride from the start of one vector ipiv_j to the next one ipiv_(j+1). There is no restriction for the value of strideP. Normal use case is strideP >= n.
[out] C: pointer to type. Array on the GPU (the size depends on the value of strideC).
If info[j] = 0, the inverse of matrices A_j. Otherwise, undefined.
[in] ldc: rocblas_int. ldc >= n.
Specifies the leading dimension of C_j.
[in] strideC: rocblas_stride.
Stride from the start of one matrix C_j to the next one C_(j+1). There is no restriction for the value of strideC. Normal use case is strideC >= ldc*n
[out] info: pointer to rocblas_int. Array of batch_count integers on the GPU.
If info[j] = 0, successful exit for inversion of A_j. If info[j] = i > 0, U_j is singular. U_j[i,i] is the first zero pivot.
[in] batch_count: rocblas_int. batch_count >= 0.
Number of matrices in the batch.

rocsolver_<type>getri_npvt_outofplace()¶

rocblas_status rocsolver_zgetri_npvt_outofplace(rocblas_handle handle, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, rocblas_double_complex *C, const rocblas_int ldc, rocblas_int *info)¶

rocblas_status rocsolver_cgetri_npvt_outofplace(rocblas_handle handle, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, rocblas_float_complex *C, const rocblas_int ldc, rocblas_int *info)¶

rocblas_status rocsolver_dgetri_npvt_outofplace(rocblas_handle handle, const rocblas_int n, double *A, const rocblas_int lda, double *C, const rocblas_int ldc, rocblas_int *info)¶

rocblas_status rocsolver_sgetri_npvt_outofplace(rocblas_handle handle, const rocblas_int n, float *A, const rocblas_int lda, float *C, const rocblas_int ldc, rocblas_int *info)¶

GETRI_NPVT_OUTOFPLACE computes the inverse \(C = A^{-1}\) of a general n-by-n matrix A without partial pivoting.

The inverse is computed by solving the linear system

\[ AC = I \]

where I is the identity matrix, and A is factorized as \(A = LU\) as given by GETRF_NPVT.

Parameters

[in] handle: rocblas_handle.
[in] n: rocblas_int. n >= 0.
The number of rows and columns of the matrix A.
[in] A: pointer to type. Array on the GPU of dimension lda*n.
The factors L and U of the factorization A = L*U returned by
GETRF_NPVT.
[in] lda: rocblas_int. lda >= n.
Specifies the leading dimension of A.
[out] C: pointer to type. Array on the GPU of dimension ldc*n.
If info = 0, the inverse of A. Otherwise, undefined.
[in] ldc: rocblas_int. ldc >= n.
Specifies the leading dimension of C.
[out] info: pointer to a rocblas_int on the GPU.
If info = 0, successful exit. If info = i > 0, U is singular. U[i,i] is the first zero pivot.

rocsolver_<type>getri_npvt_outofplace_batched()¶

rocblas_status rocsolver_zgetri_npvt_outofplace_batched(rocblas_handle handle, const rocblas_int n, rocblas_double_complex *const A[], const rocblas_int lda, rocblas_double_complex *const C[], const rocblas_int ldc, rocblas_int *info, const rocblas_int batch_count)¶

rocblas_status rocsolver_cgetri_npvt_outofplace_batched(rocblas_handle handle, const rocblas_int n, rocblas_float_complex *const A[], const rocblas_int lda, rocblas_float_complex *const C[], const rocblas_int ldc, rocblas_int *info, const rocblas_int batch_count)¶

rocblas_status rocsolver_dgetri_npvt_outofplace_batched(rocblas_handle handle, const rocblas_int n, double *const A[], const rocblas_int lda, double *const C[], const rocblas_int ldc, rocblas_int *info, const rocblas_int batch_count)¶

rocblas_status rocsolver_sgetri_npvt_outofplace_batched(rocblas_handle handle, const rocblas_int n, float *const A[], const rocblas_int lda, float *const C[], const rocblas_int ldc, rocblas_int *info, const rocblas_int batch_count)¶

GETRI_NPVT_OUTOFPLACE_BATCHED computes the inverse \(C_j = A_j^{-1}\) of a batch of general n-by-n matrices \(A_j\) without partial pivoting.

The inverse is computed by solving the linear system

\[ A_j C_j = I \]

where I is the identity matrix, and \(A_j\) is factorized as \(A_j = L_j U_j\) as given by GETRF_NPVT_BATCHED.

Parameters

[in] handle: rocblas_handle.
[in] n: rocblas_int. n >= 0.
The number of rows and columns of all matrices A_j in the batch.
[in] A: array of pointers to type. Each pointer points to an array on the GPU of dimension lda*n.
The factors L_j and U_j of the factorization A_j = L_j*U_j returned by
GETRF_NPVT_BATCHED.
[in] lda: rocblas_int. lda >= n.
Specifies the leading dimension of matrices A_j.
[out] C: array of pointers to type. Each pointer points to an array on the GPU of dimension ldc*n.
If info[j] = 0, the inverse of matrices A_j. Otherwise, undefined.
[in] ldc: rocblas_int. ldc >= n.
Specifies the leading dimension of C_j.
[out] info: pointer to rocblas_int. Array of batch_count integers on the GPU.
If info[j] = 0, successful exit for inversion of A_j. If info[j] = i > 0, U_j is singular. U_j[i,i] is the first zero pivot.
[in] batch_count: rocblas_int. batch_count >= 0.
Number of matrices in the batch.

rocsolver_<type>getri_npvt_outofplace_strided_batched()¶

rocblas_status rocsolver_zgetri_npvt_outofplace_strided_batched(rocblas_handle handle, const rocblas_int n, rocblas_double_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_double_complex *C, const rocblas_int ldc, const rocblas_stride strideC, rocblas_int *info, const rocblas_int batch_count)¶

rocblas_status rocsolver_cgetri_npvt_outofplace_strided_batched(rocblas_handle handle, const rocblas_int n, rocblas_float_complex *A, const rocblas_int lda, const rocblas_stride strideA, rocblas_float_complex *C, const rocblas_int ldc, const rocblas_stride strideC, rocblas_int *info, const rocblas_int batch_count)¶

rocblas_status rocsolver_dgetri_npvt_outofplace_strided_batched(rocblas_handle handle, const rocblas_int n, double *A, const rocblas_int lda, const rocblas_stride strideA, double *C, const rocblas_int ldc, const rocblas_stride strideC, rocblas_int *info, const rocblas_int batch_count)¶

rocblas_status rocsolver_sgetri_npvt_outofplace_strided_batched(rocblas_handle handle, const rocblas_int n, float *A, const rocblas_int lda, const rocblas_stride strideA, float *C, const rocblas_int ldc, const rocblas_stride strideC, rocblas_int *info, const rocblas_int batch_count)¶

GETRI_NPVT_OUTOFPLACE_STRIDED_BATCHED computes the inverse \(C_j = A_j^{-1}\) of a batch of general n-by-n matrices \(A_j\) without partial pivoting.

The inverse is computed by solving the linear system

\[ A_j C_j = I \]

where I is the identity matrix, and \(A_j\) is factorized as \(A_j = L_j U_j\) as given by GETRF_NPVT_STRIDED_BATCHED.

Parameters

[in] handle: rocblas_handle.
[in] n: rocblas_int. n >= 0.
The number of rows and columns of all matrices A_j in the batch.
[in] A: pointer to type. Array on the GPU (the size depends on the value of strideA).
The factors L_j and U_j of the factorization A_j = L_j*U_j returned by
GETRF_NPVT_STRIDED_BATCHED.
[in] lda: rocblas_int. lda >= n.
Specifies the leading dimension of matrices A_j.
[in] strideA: rocblas_stride.
Stride from the start of one matrix A_j to the next one A_(j+1). There is no restriction for the value of strideA. Normal use case is strideA >= lda*n
[out] C: pointer to type. Array on the GPU (the size depends on the value of strideC).
If info[j] = 0, the inverse of matrices A_j. Otherwise, undefined.
[in] ldc: rocblas_int. ldc >= n.
Specifies the leading dimension of C_j.
[in] strideC: rocblas_stride.
Stride from the start of one matrix C_j to the next one C_(j+1). There is no restriction for the value of strideC. Normal use case is strideC >= ldc*n
[out] info: pointer to rocblas_int. Array of batch_count integers on the GPU.
If info[j] = 0, successful exit for inversion of A_j. If info[j] = i > 0, U_j is singular. U_j[i,i] is the first zero pivot.
[in] batch_count: rocblas_int. batch_count >= 0.
Number of matrices in the batch.