TI C6678 DSP evaluation module
In the News
- FLAME Project receives three year NSF Software Infrastructure for Sustained Innovation (SI2) grant
Nov. 14, 2011: Texas Instruments and UT Austin collaborate to deliver linear algebra library on TI's high performance multicore DSPs
- libflame attains full functionality on TI C66x Family of DSPs
"Look Mom, No Fortran!"
libflame can now be compiled without requiring a Fortran compiler.
(Fortran LAPACK interface still supported)- 24 Oct. 2011. With the new EVD and SVD solvers, libflame achieves most functionality of LAPACK
Contents
Objective
The objective of the FLAME project is to transform the development of dense linear algebra libraries from an art reserved for experts to a science that can be understood by novice and expert alike. Rather than being only a library, the project encompasses a new notation for expressing algorithms, a methodology for systematic derivation of algorithms, Application Program Interfaces (APIs) for representing the algorithms in code, and tools for mechanical derivation, implementation and analysis of algorithms and implementations.
The Library: libflame
Many visitors of this wiki will be primarily interested in the high-performance dense linear algebra library that has resulted from the FLAME project. Information on how to download and use this library can be found HERE.
libflame contains implementations of many operations that are provided by the BLAS and LAPACK libraries. However, not all FLAME implementions support every datatype. Also, in many cases, we use a different naming convention for our routine names. The following table summarizes which routines are supported within libflame and also provides their corresponding netlib name for reference. operation name netlib routine name libflame routine name FLAME/C FLASH GPU support type support lapack2flame support Elemental libflame routine prefix FLA_ FLASH_* FLASH_# Level-3 BLAS general matrix-matrix multiply ?gemm Gemm y y y sdcz N/A y hermitian matrix-matrix multiply ?hemm Hemm y y y sdcz N/A y hermitian rank-k update ?herk Herk y y y sdcz N/A y hermitian rank-2k update ?her2k Her2k y y y sdcz N/A y symmetric matrix-matrix multiply ?symm Symm y y y sdcz N/A y symmetric rank-k update ?syrk Syrk y y y sdcz N/A y symmetric rank-2k update ?syr2k Syr2k y y y sdcz N/A y triangular matrix multiply ?trmm Trmm y y y sdcz N/A y triangular solve with multiple right-hand sides ?trsm Trsm y y y sdcz N/A y LAPACK-level Cholesky factorization ?potrf Chol y y y sdcz sdcz y LU factorization with no pivoting ~ LU_nopiv y y y y LU factorization with partial pivoting ?getrf LU_piv y y y sdcz sdcz y LU factorization with incremental pivoting ~ LU_incpiv y sdcz N/A QR factorization (via UT Householder transforms) ?geqrf QR_UT y y y sdcz sdcz y QR factorization (via incremental UT Householder transforms) ~ QR_UT_inc y sdcz N/A LQ factorization (via UT Householder transforms) ?gelqf LQ_UT y y y sdcz sdcz y Up-and-Downdate Cholesky/QR factor ~ UDdate_UT y sdcz Up-and-Downdate Cholesky/QR factor (via incremental UT Householder-like transforms) ~ UDdate_UT_inc y sdcz N/A Triangular matrix inversion ?trtri Trinv y y y sdcz sdcz y Triangular-transpose matrix multiply ?lauum Ttmm y y y sdcz sdcz y SPD/HPD inversion ?potri+ SPDinv y y y sdcz sdcz y Triangular Sylvester equation solve ?trsyl^ Sylv y y y sdcz sdcz Triangular Lyapunov equation solve ~ Lyap y y y Reduction of Hermitian-positive definite eigenproblem to standard form [sd]sygst, [cz]hegst Eig_gest y y y sdcz sdcz y Reduction to upper Hessenberg form ?gehrd Hess_UT y sdcz sdcz Reduction to tridiagonal form [sd]sytrd, [cz]hetrd Tridiag_UT y sdcz sdcz y Reduction to bidiagonal form ?gebrd Bidiag_UT y sdcz sdcz y Symmetric/Hermitian Eigenvalue Decomposition [sd]syev, [cz]heev Hevd y dz y Generalized Symmetric/Hermitian Eigenvalue Decomposition [sd]sygvx, [cz]hegvx Soon! y Skew Symmetric/Hermitian Eigenvalue Decomposition ~ Soon! y Singular Value Decomposition ?gesvd Svd y dz y Notes: * Invocations of routines with the FLASH_ prefix call SuperMatrix by default. If SuperMatrix was not enabled at configure-time, or it was disabled at runtime with FLASH_Queue_disable(), then FLASH_ routines execute sequentially, though they will still use hierarchical storage.
Related publications
Field G. Van Zee. libflame: The Complete Reference. www.lulu.com, 2010.
Field G. Van Zee, Ernie Chan, Robert van de Geijn, Enrique S. Quintana-Orti, and Gregorio Quintana-Orti. "Introducing: The libflame Library for Dense Matrix Computations." IEEE Computing in Science & Engineering. 11(6):56--62, 2009.
Notation
The key insight that enables the FLAME methodology is a new, more stylized notation for expressing loop-based linear algebra algorithms. This notation closely resembles how algorithms are naturally illustrated with pictures.
Related publications
- Paolo Bientinesi, Enrique S. Quintana-Ortí, and Robert van de Geijn. "Representing Linear Algebra Algorithms in Code: The FLAME APIs," TOMS, 31(1):27-59, March 2005.
- Robert A. van de Geijn and Enrique S. Quintana-Ortí. The Science of Programming Matrix Computations. www.lulu.com. 2008.
Derivation
The FLAME project promotes the systematic derivation of loop-based algorithms hand-in-hand with the proof of their correctness. Key is the ability to identify the loop-invariant: the state to be maintained before and after each loop iteration, which then prescribes the loop-guard, the initialization before the loop, how to progress through the operand(s), and the updates. To derive the shown algorithm for LU factorization one fills in the below "worksheet". In the grey-shaded areas predicates appear that ensure the correctness of the algorithm.
Related publications
- John A. Gunnels, Fred G. Gustavson, Greg M. Henry, and Robert A. van de Geijn, "FLAME: Formal Linear Algebra Methods Environment," TOMS, 27(4):422-455, December 2001.
- Paolo Bientinesi, John A. Gunnels, Margaret E. Myers, Enrique S. Quintana-Ortí, and Robert van de Geijn, "The Science of Deriving Dense Linear Algebra Algorithms," TOMS, 31(1):1-26, March 2005.
- Robert A. van de Geijn and Enrique S. Quintana-Ortí. The Science of Programming Matrix Computations. www.lulu.com. 2008.
function [ A_out ] = LU_blk_var1( A, nb_alg )
[ ATL, ATR, ...
ABL, ABR ] = FLA_Part_2x2( A, ...
0, 0, 'FLA_TL');
while ( size( ATL, 1 ) < size( A, 1 ) )
b = min( size( ABR, 1 ), nb_alg );
[ A00, A01, A02, ...
A10, A11, A12, ...
A20, A21, A22 ] = ...
FLA_Repart_2x2_to_3x3( ATL, ATR, ...
ABL, ABR, b, b, 'FLA_BR');
%----------------------------------------------%
A01 = trilu( A00 ) \ A01;
A10 = A10 / triu( A00 );
A11 = A11 - A10 * A01;
A11 = LU_unb_var1( A11 );
%----------------------------------------------%
[ ATL, ATR, ...
ABL, ABR ] = ...
FLA_Cont_with_3x3_to_2x2( A00, A01, A02, ...
A10, A11, A12, ...
A20, A21, A22, ...
'FLA_TL');
end
A_out = [ ATL, ATR
ABL, ABR ];
return
APIs
A number of APIs have been defined fo representing the algorithms in different languages. These include
FLAME LaTeX commads for typesetting algorithms and worksheets. |
|
FLAME M-script (Matlab/Octave) API. |
|
FLAME/C |
FLAME API for the C programming language. |
FLASH |
Extension that allows matrices to be viewed (hierarchically) as submatrices |
FLARE |
The Formal Linear Algebra Recovery Environment (FLARE) adds algorithmic fault-tolerance to some FLAME implementations. |
Runtime system for scheduling tasks with submatrices to threads |
|
C++ library for distributed memory architectures by Jack Poulson |
Related publications
- Paolo Bientinesi, Enrique S. Quintana-Ortí, and Robert van de Geijn. "Representing Linear Algebra Algorithms in Code: The FLAME APIs," TOMS, 31(1):27-59, March 2005.
- Tze Meng Low, Kent Milfeld, Robert van de Geijn, and Field Van Zee. "Parallelizing FLAME Code with OpenMP Task Queues," FLAME Working Note #15. The University of Texas at Austin, Department of Computer Sciences. Technical Report TR-2004-50.
- John A. Gunnels, Daniel S. Katz, Enrique S. Quintana-Ortí, and Robert van de Geijn. "Fault-Tolerant High-Performance Matrix-Matrix Multiplication: Theory and Practice," The International Conference for Dependable Systems and Networks (DSN-2001), pp. 47-56, July 2-4, 2001.
- Robert A. van de Geijn and Enrique S. Quintana-Ortí. The Science of Programming Matrix Computations. www.lulu.com. 2008.
Downloads
Libraries and materials can be downloaded for those interested in trying the approach and resulting libraries.
Publications and Reference Materials
We maintain a number of documents and web pages.
Robert A. van de Geijn and Enrique S. Quintana-Orti. The Science of Programming Matrix Computations. www.lulu.com, 2008.
Field G. Van Zee. libflame: The Complete Reference. www.lulu.com, 2010.
- FLAME/C online reference materials
A listing of all routines that are part of the libflame library has been automatically compiled with Doxygen.
Information for Developers
libflame and related projects are maintained in a repository managed with subversion.
Contact us
Please e-mail us at flame@cs.utexas.edu , field@cs.utexas.edu , or rvdg@cs.utexas.edu





