attachment:FLAMEbanner.png

http://www.cs.utexas.edu/users/rvdg/Pictures/TI_DSP.png
TI C6678 DSP evaluation module

In the News



Objective

The objective of the FLAME project is to transform the development of dense linear algebra libraries from an art reserved for experts to a science that can be understood by novice and expert alike. Rather than being only a library, the project encompasses a new notation for expressing algorithms, a methodology for systematic derivation of algorithms, Application Program Interfaces (APIs) for representing the algorithms in code, and tools for mechanical derivation, implementation and analysis of algorithms and implementations.


http://www.lulu.com/content/5915632

The Library: libflame

Many visitors of this wiki will be primarily interested in the high-performance dense linear algebra library that has resulted from the FLAME project. Information on how to download and use this library can be found HERE.

libflame contains implementations of many operations that are provided by the BLAS and LAPACK libraries. However, not all FLAME implementions support every datatype. Also, in many cases, we use a different naming convention for our routine names. The following table summarizes which routines are supported within libflame and also provides their corresponding netlib name for reference.

operation name

netlib routine name

libflame routine name

FLAME/C

FLASH

GPU support

type support

lapack2flame support

Elemental

libflame routine prefix

FLA_

FLASH_*

FLASH_#

Level-3 BLAS

general matrix-matrix multiply

?gemm

Gemm

y

y

y

sdcz

N/A

y

hermitian matrix-matrix multiply

?hemm

Hemm

y

y

y

sdcz

N/A

y

hermitian rank-k update

?herk

Herk

y

y

y

sdcz

N/A

y

hermitian rank-2k update

?her2k

Her2k

y

y

y

sdcz

N/A

y

symmetric matrix-matrix multiply

?symm

Symm

y

y

y

sdcz

N/A

y

symmetric rank-k update

?syrk

Syrk

y

y

y

sdcz

N/A

y

symmetric rank-2k update

?syr2k

Syr2k

y

y

y

sdcz

N/A

y

triangular matrix multiply

?trmm

Trmm

y

y

y

sdcz

N/A

y

triangular solve with multiple right-hand sides

?trsm

Trsm

y

y

y

sdcz

N/A

y

LAPACK-level

Cholesky factorization

?potrf

Chol

y

y

y

sdcz

sdcz

y

LU factorization with no pivoting

~

LU_nopiv

y

y

y

y

LU factorization with partial pivoting

?getrf

LU_piv

y

y

y

sdcz

sdcz

y

LU factorization with incremental pivoting

~

LU_incpiv

y

sdcz

N/A

QR factorization (via UT Householder transforms)

?geqrf

QR_UT

y

y

y

sdcz

sdcz

y

QR factorization (via incremental UT Householder transforms)

~

QR_UT_inc

y

sdcz

N/A

LQ factorization (via UT Householder transforms)

?gelqf

LQ_UT

y

y

y

sdcz

sdcz

y

Up-and-Downdate Cholesky/QR factor

~

UDdate_UT

y

sdcz

Up-and-Downdate Cholesky/QR factor (via incremental UT Householder-like transforms)

~

UDdate_UT_inc

y

sdcz

N/A

Triangular matrix inversion

?trtri

Trinv

y

y

y

sdcz

sdcz

y

Triangular-transpose matrix multiply

?lauum

Ttmm

y

y

y

sdcz

sdcz

y

SPD/HPD inversion

?potri+

SPDinv

y

y

y

sdcz

sdcz

y

Triangular Sylvester equation solve

?trsyl^

Sylv

y

y

y

sdcz

sdcz

Triangular Lyapunov equation solve

~

Lyap

y

y

y

Reduction of Hermitian-positive definite eigenproblem to standard form

[sd]sygst, [cz]hegst

Eig_gest

y

y

y

sdcz

sdcz

y

Reduction to upper Hessenberg form

?gehrd

Hess_UT

y

sdcz

sdcz

Reduction to tridiagonal form

[sd]sytrd, [cz]hetrd

Tridiag_UT

y

sdcz

sdcz

y

Reduction to bidiagonal form

?gebrd

Bidiag_UT

y

sdcz

sdcz

y

Symmetric/Hermitian Eigenvalue Decomposition

[sd]syev, [cz]heev

Hevd

y

dz

y

Generalized Symmetric/Hermitian Eigenvalue Decomposition

[sd]sygvx, [cz]hegvx

Soon!

y

Skew Symmetric/Hermitian Eigenvalue Decomposition

~

Soon!

y

Singular Value Decomposition

?gesvd

Svd

y

dz

y

Notes:

  • y These routines are provided by libflame.
  • ? Expands to one of {sdcz}.
  • ~ These routines are not provided by LAPACK.
  • + The LAPACK routine ?potri() differs from FLA_SPDinv() and FLASH_SPDinv() in that ?potri() require the user to invoke the Cholesky factorization manually and then pass in the result as input, whereas the FLAME implementations perform the Cholesky factorization internally and automatically.
  • ^ LAPACK provides only an unblocked implementation of the triangular Sylvester equation solver. The lapack2flame compatibility interface maps invocations of ?trsyl() to the blocked implementation in libflame.
  • * Invocations of routines with the FLASH_ prefix call SuperMatrix by default. If SuperMatrix was not enabled at configure-time, or it was disabled at runtime with FLASH_Queue_disable(), then FLASH_ routines execute sequentially, though they will still use hierarchical storage.

  • # GPU support must be enabled at configure-time and then invoked with FLASH_Queue_enable_gpu().

Related publications


http://z.cs.utexas.edu/wiki/LA.wiki/LU_factorization

Notation

The key insight that enables the FLAME methodology is a new, more stylized notation for expressing loop-based linear algebra algorithms. This notation closely resembles how algorithms are naturally illustrated with pictures.

Related publications


http://z.cs.utexas.edu/wiki/LA.wiki/LU_factorization

http://www.lulu.com/content/1911788

Derivation

The FLAME project promotes the systematic derivation of loop-based algorithms hand-in-hand with the proof of their correctness. Key is the ability to identify the loop-invariant: the state to be maintained before and after each loop iteration, which then prescribes the loop-guard, the initialization before the loop, how to progress through the operand(s), and the updates. To derive the shown algorithm for LU factorization one fills in the below "worksheet". In the grey-shaded areas predicates appear that ensure the correctness of the algorithm.

Related publications


function [ A_out ] = LU_blk_var1( A, nb_alg )

  [ ATL, ATR, ...
    ABL, ABR ] = FLA_Part_2x2( A, ...
                               0, 0, 'FLA_TL');

  while ( size( ATL, 1 ) < size( A, 1 ) )

    b = min( size( ABR, 1 ), nb_alg );

    [ A00, A01, A02, ...
      A10, A11, A12, ...
      A20, A21, A22 ] = ...
      FLA_Repart_2x2_to_3x3( ATL, ATR, ...
                             ABL, ABR, b, b, 'FLA_BR');

    %----------------------------------------------%

    A01 = trilu( A00 ) \ A01;
    A10 = A10 / triu( A00 );
    A11 = A11 - A10 * A01;
    A11 = LU_unb_var1( A11 );

    %----------------------------------------------%

    [ ATL, ATR, ...
      ABL, ABR ] = ...
      FLA_Cont_with_3x3_to_2x2( A00, A01, A02, ...
                                A10, A11, A12, ...
                                A20, A21, A22, ...
                                'FLA_TL');
  end

  A_out = [ ATL, ATR
            ABL, ABR ];
  return 

APIs

A number of APIs have been defined fo representing the algorithms in different languages. These include

FLaTeX

FLAME LaTeX commads for typesetting algorithms and worksheets.

FLAME@lab

FLAME M-script (Matlab/Octave) API.

FLAME/C

FLAME API for the C programming language.

FLASH

Extension that allows matrices to be viewed (hierarchically) as submatrices

FLARE

The Formal Linear Algebra Recovery Environment (FLARE) adds algorithmic fault-tolerance to some FLAME implementations.

SuperMatrix

Runtime system for scheduling tasks with submatrices to threads

Elemental

C++ library for distributed memory architectures by Jack Poulson

Related publications


Downloads

Libraries and materials can be downloaded for those interested in trying the approach and resulting libraries.


Publications and Reference Materials

We maintain a number of documents and web pages.


Information for Developers

libflame and related projects are maintained in a repository managed with subversion.


Contact us

Please e-mail us at flame@cs.utexas.edu , field@cs.utexas.edu , or rvdg@cs.utexas.edu

FLAMEWiki: FrontPage (last edited 2013-04-24 04:59:48 by localhost)