\chapter{Developer Application Programming Interfaces}
\label{chapter:dev-apis}

This chapter describes APIs of interest to developers of \libflamens,
including advanced users seeking to extend existing functionality to suit
their own application.



\section{Locks}

\index{developer APIs!locks}

There are a few instances, most notably within the \libflame implementation
of SuperMatrix, where locks are needed to ensure that certain data structures
are updated synchronously by multiple threads.
\libflame abstracts the implementation of the locking mechanism from the user
by exporting a general interface that operates upon an internally defined
\flalock structure.
This structure contains all of the information needed to identify the actual
lock, regardless of how the implementation defines it.
\libflame will define its implementation in terms of whichever multithreading
interface is enabled.
See Section \ref{sec:configure-options} for more information on how to specify
the type of multithreading at configure-time.

Note that this API provides only basic locking functionality.
The functions do {\em not} return any status value, and thus the caller
cannot check whether a call succeeded.
This was done to simplify the implementation, and also because our primary
application, SuperMatrix, did not require the ability to recover from the
kinds of errors that might occur beyond the control of the user, such as
system errors.


\subsection{API}

% --- FLA_Lock_init() ----------------------------------------------------------

\begin{flaspec}
\begin{verbatim}
void FLA_Lock_init( FLA_Lock* lock_ptr );
\end{verbatim}
\purpose{
Initialize the lock structure pointed to by \lockptrns.
Upon successful return, the state of the lock becomes initialized and
unlocked.
}
\notes{
Attempting to initialize a lock that has already been initialized (and
not yet released) may result in undefined behavior.
}
\begin{checks}
\checkitem
\lockptr may not be \fnullns.
%\itemvsp
%\checkitem
%
\end{checks}
\begin{params}
\parameter{\flalockp}{lock\_ptr}{A pointer to an \flalock structure.}
\end{params}
\end{flaspec}

% --- FLA_Lock_acquire() -------------------------------------------------------

\begin{flaspec}
\begin{verbatim}
void FLA_Lock_acquire( FLA_Lock* lock_ptr );
\end{verbatim}
\purpose{
Attempt to acquire the lock pointed to by \lockptrns.
If the lock is unavailable (if its state is already locked), the call
blocks and returns only upon successful acquisition of the lock.
}
\begin{checks}
\checkitem
\lockptr may not be \fnullns.
%\itemvsp
%\checkitem
%
\end{checks}
\begin{params}
\parameter{\flalockp}{lock\_ptr}{A pointer to an \flalock structure.}
\end{params}
\end{flaspec}

% --- FLA_Lock_release() -------------------------------------------------------

\begin{flaspec}
\begin{verbatim}
void FLA_Lock_release( FLA_Lock* lock_ptr );
\end{verbatim}
\purpose{
Release the lock associated with the structure pointed to by \lockptrns.
}
\notes{
Attempting to release a lock that is uninitialized or that has not yet
been acquired may result in undefined behavior.
}
\begin{checks}
\checkitem
\lockptr may not be \fnullns.
%\itemvsp
%\checkitem
%
\end{checks}
\begin{params}
\parameter{\flalockp}{lock\_ptr}{A pointer to an \flalock structure.}
\end{params}
\end{flaspec}

% --- FLA_Lock_destroy() -------------------------------------------------------

\begin{flaspec}
\begin{verbatim}
void FLA_Lock_destroy( FLA_Lock* lock_ptr );
\end{verbatim}
\purpose{
Destroy the lock structure pointed to by \lockptrns.
This causes all system resources associated with the lock that had been
previously allocated by {\tt FLA\_Lock\_init()} to be freed.
Upon returning, the state of the lock becomes uninitialized.
}
\notes{
Attempting to destroy a lock that is currently in the locked state
may result in undefined behavior.
}
\begin{checks}
\checkitem
\lockptr may not be \fnullns.
%\itemvsp
%\checkitem
%
\end{checks}
\begin{params}
\parameter{\flalockp}{lock\_ptr}{A pointer to an \flalock structure.}
\end{params}
\end{flaspec}



\section{Memory management}

\index{developer APIs!memory allocation}

% --- FLA_malloc() -------------------------------------------------------------

\begin{flaspec}
\begin{verbatim}
void* FLA_malloc( size_t size );
\end{verbatim}
\purpose{
Request a pointer to \size bytes of heap-allocated memory from the system.
Note that a value of zero for \size will guarantee that \fnull is returned.
}
\notes{
The programmer is encouraged to use {\tt FLA\_malloc()} instead of calling
{\tt malloc()} directly.
Using {\tt FLA\_malloc()} and {\tt FLA\_free()} allows \libflame to output
via standard error the balance of allocations and releases when
{\tt FLA\_Finalize()} is called, which provides basic memory leak
detection.
}
\implnotes{
If \libflame was configured with {\tt --enable-memory-alignment={\em N}},
then memory will be allocated with {\tt posix\_memalign()}, using {\em N}
as the alignment factor.
Otherwise, {\tt malloc()} is used, which typically only guarantees
memory alignment at 8-byte boundaries.
}
\caveats{
If by chance {\tt malloc()} (or {\tt posix\_memalign()}) fails to allocate
the requested number of bytes, the library raises an abort signal
and the program is ended.
This may seem like overkill, and it probably is.
But it ensures that this situation does not go unreported.
Besides, in the unlikely event that {\tt malloc()} does return a \fnull
pointer, it is most likely because the memory heap is exhausted, which
would prevent most programs from running correctly (or at all).
}
\rvalue{
A \voidp pointer to a heap-allocated region of memory \size bytes long.
}
\begin{params}
\parameter{\sizet}{size}{The number of bytes to allocate.}
\end{params}
\end{flaspec}

% --- FLA_realloc() ------------------------------------------------------------

\begin{flaspec}
\begin{verbatim}
void* FLA_realloc( void* old_ptr, size_t size );
\end{verbatim}
\purpose{
Request a reallocation of previously-allocated memory such that the new
region is \size bytes in length and contains the original contents of the
region pointed to by {\tt old\_ptr}.
}
\implnotes{
This function does not guarantee adherence to the library-wide memory
alignment factor set during configuration via the
{\tt --enable-memory-alignment={\em N}} option, if it was given.
We fundamentally cannot implement our own version of {\tt realloc()}
in user-space because we cannot know how much memory was allocated
at {\tt old\_ptr}.
This information is needed if the original contents are to be copied
over to the new memory region.
Thus, {\tt FLA\_realloc()} is implemented with {\tt realloc()}, which
guarantees only 8-byte memory alignment on most systems.
}
\rvalue{
A \voidp pointer to a heap-allocated region of memory \size bytes long.
}
\begin{params}
\parameter{\voidp}{old\_ptr}{A pointer to the region of memory the user wishes to reallocate to \size bytes.}
\parameter{\sizet}{size}{The number of bytes to allocate for the new region.}
\end{params}
\end{flaspec}

% --- FLA_free() ---------------------------------------------------------------

\begin{flaspec}
\begin{verbatim}
void FLA_free( void* ptr );
\end{verbatim}
\purpose{
Release a heap-allocated region of memory back to the system.
Note that passing a value of \fnull for \ptr will cause {\tt FLA\_free()}
to return immediately without performing any action.
}
\notes{
The programmer is encouraged to use {\tt FLA\_free()} instead of calling
{\tt free()} directly.
Using {\tt FLA\_malloc()} and {\tt FLA\_free()} allows \libflame to output
via standard error the balance of allocations and releases when
{\tt FLA\_Finalize()} is called, which provides basic memory leak
detection.
}
\begin{params}
\parameter{\voidp}{ptr}{A pointer to a region of memory previously allocated by {\tt FLA\_malloc()} (or {\tt FLA\_realloc()}).}
\end{params}
\end{flaspec}



\section{Object creation}

\index{developer APIs!object creation}

% --- FLA_Obj_create_ext() -----------------------------------------------------

\begin{flaspec}
\begin{verbatim}
FLA_Error FLA_Obj_create_ext( FLA_Datatype datatype, FLA_Elemtype elemtype, dim_t m,
                              dim_t n, FLA_Obj* obj );
\end{verbatim}
\purpose{
Create a new object using an extended FLASH-aware interface.
Upon returning, \obj points to a valid heap-allocated $ m \by n $ object.
}
\notes{
The total size of the underlying allocated array depends on the value of
\elemtypens.
If the elements are requested to be \flascalarns, then the size of each
element is determined by the value of \datatypens.
If the elements are of type \flamatrixns, then each element is allocated
to store an \flaobjns.
}
\rvalue{
\flasuccess
}
\begin{params}
\parameter{\fladatatype}{datatype}{A constant corresponding to the numerical datatype requested.}
\parameter{\flaelemtype}{elemtype}{A constant corresponding to the object element type requested.}
\parameter{\dimt}{m}{The number of rows to be created in the new object.}
\parameter{\dimt}{n}{The number of columns to be created in the new object.}
\parminout{\flaobjp}{obj}{A pointer to an uninitialized \flaobjns.}
                         {A pointer to a new \flaobj parameterized by {\tt m}, {\tt n},
                          \elemtypens, and \datatypens.}
\end{params}
\end{flaspec}



\section{SuperMatrix}
\label{sec:sm-dev}

\index{developer APIs!SuperMatrix}

% --- FLASH_Queue_init() -------------------------------------------------------

\begin{flaspec}
\begin{verbatim}
void FLASH_Queue_init( void );
\end{verbatim}
\purpose{
Initialize SuperMatrix.
This function is normally called from within {\tt FLA\_Init()}.
}
\end{flaspec}

% --- FLASH_Queue_finalize() ---------------------------------------------------

\begin{flaspec}
\begin{verbatim}
void FLASH_Queue_finalize( void );
\end{verbatim}
\purpose{
Finalize SuperMatrix.
This function is normally called from within {\tt FLA\_Finalize()}.
}
\end{flaspec}

% --- FLASH_Queue_get_thread_id() ----------------------------------------------

\begin{flaspec}
\begin{verbatim}
unsigned int FLASH_Queue_get_thread_id( void );
\end{verbatim}
\purpose{
Query the ID number of the calling thread.
The thread ID ranges from zero to $ t - 1 $ where $ t $ is the total number of
SuperMatrix threads, equal to the unsigned integer returned by
{\tt FLASH\_Queue\_get\_num\_threads()}.
}
\devnotes{
This function is not yet implemented.
}
\rvalue{
An unsigned integer representing the thread ID number of the calling thread.
}
\end{flaspec}

% --- FLASH_Queue_get_num_write_blocks() ---------------------------------------
%
%\begin{flaspec}
%\begin{verbatim}
%unsigned int FLASH_Queue_get_num_write_blocks( void );
%\end{verbatim}
%\purpose{
%Query the number of unique blocks (associated with the current set of enqueued
%tasks) that are scheduled to be overwritten during the computation.
%}
%\devnotes{
%This function is not used by the implementation.
%}
%\rvalue{
%An unsigned integer representing the number of blocks to be overwritten.
%}
%\end{flaspec}



\section{Control trees}
\label{sec:cntl-trees}


\subsection{Motivation}

\index{control trees!motivation}

While \libflame was in its early stages of development, we encountered a basic
and recurring problem:
coding blocked FLAME/C algorithms forced us to statically specify the
implementation used when invoking algorithmic subproblems.
Sometimes we chose to hard-code a function call to an unblocked implementation
that was itself coded with the FLAME/C API.
Other times we chose to invoke external BLAS and/or unblocked LAPACK
implementations via FLAME wrappers.
And still other situations, such as those encountered when using FLASH and
algorithms-by-blocks, call for us to invoke yet another blocked
routine, creating more than one level of ``recursion'' in the overall
algorithm.

The problem becomes even harder to avoid.
Consider that not all situations call for using the same set of algorithmic
variants.
Perhaps for small standalone instances of {\sc syrk}, variant 5 works
well, but when the {\sc syrk} operation is a subproblem within a larger
Cholesky factorization, then variant 2 is more appropriate.
How can we handle both of these cases while statically coding the
choice of implementation for blocked algorithm subproblems?
The most straightforward and naive solution would be to duplicate the
code as many times as there are different situations.
It is not difficult to see that this would quickly become a nightmare for
the maintainers of the library.

So, in summary, we wish to code our algorithms in such a manner that a
subproblem operation is specified {\em without} binding it to a
particular implementation, and in such a manner that we need to
maintain only {\em one} copy of each blocked algorithm.



\subsection{The solution in \libflame}

\index{control trees!description}

The aforementioned problem is addressed in \libflame using a technique we
call {\em control trees}.
The general idea is simple: encode control information {\em a priori}
into a tree structure that is passed into algorithmic subproblems and
decoded by internal functions that remain hidden from the user-level API.
This approach has five separate but related aspects, all of which require
us to extend the original FLAME/C API in some way:

\begin{itemize}
\item
{\bf Define control tree node structures and API.}
Our solution is built upon a tree structure where internal nodes
encapsulate information used by blocked algorithms and leaf nodes
specify external implementations.
Specifically, all individual control tree node structures are defined
with blocksize and variant fields, which allow us to specify which
algorithmic variant to execute and what blocksize to use within that
algorithm.
Each structure will also have a matrix type field which will allow
us to handle both flat and arbitrary depth hierarchical matrices.
We define a control tree structure for each operation that we wish
to support (\flasyrkt for {\sc syrk}, \flacholt for {\sc chol}, etc.).
These structures contain the standard three fields and also some number
of fields which may contain pointers to child nodes.
The number and types of these fields depend on the operation for which
the control tree is being defined, but in general, the set of fields
present will be enough to handle all algorithmic variants provided by
\libflamens.
Lastly, we must define an API that allows the programmer to easily
create individual control tree structures and build trees up from child
nodes.
\item
{\bf Control trees created at library startup.}
Control trees are created dynamically via the structure creation
interface and stored on the heap.
The default set used internally by \libflame is instantiated at the time
that the library is initialized, with one class of trees being configured
for flat matrices, and another for hierarchical matrices.
Within each class, several different trees may be created for a given
operation, depending on the desired execution characteristics.
A different tree may be created for problem sizes deemed to be ``small''
and those considered to be ``large'', which would vary as a function
of the blocksize and cache size.
Alternatively, if the tree is structured properly, the same tree
may be used for all problem sizes and shapes with only a very small
additional cost in overhead incurred from the extra levels of blocked
algorithms.
Pointers to the root nodes of the trees are stored in global variable
space.
The default set of control trees is destroyed at library shutdown, in
{\tt FLA\_Finalize()}.
\item
{\bf Control tree is selected within operation front-ends.}
The default set of control trees is used within operation ``front-end''
routines.
These routines are defined as user-level computational routines for Level-3
BLAS and LAPACK-level operations that are designed for ``global'' problems,
not subproblems.
In other words, front-ends are for end users only and are not called
internally by \libflame developers.
Examples of front-end routines are: \flasyrkns, \flatrsmns, \flacholns, and
\flaspdinvns.
The root node pointers are accessed via {\tt extern} declarations within the
files that define the front-end routines.
There, the appropriate tree is selected, if more than one exists, and the
root node pointer is passed down into the operation's internal ``back-end''.
\item
{\bf Internal back-ends handle parameters and decode trees.}
Internal back-end functions must be defined in order to parse and decode
the control tree for a particular operation.
The implementation handles cases where (1) computation should execute
immediately, such as for flat matrices or for the leaf matrices of
hierarchical matrices when SuperMatrix is disabled, (2) more recursion
is necessary, in order to handle arbitrary depth hierarchical matrices,
and (3) the library should enqueue tasks for parallel execution via
SuperMatrix.
The back-end also parses the parameters defined by the operation, such
as \sidens, \uplons, and \trans arguments, and then the appropriate
variant or external implementation is called depending on the \variant
field of the control tree node.
In the context of flat matrices, a \variant field equal to \flasubproblem
refers to a node that induces execution of an external implementation.
For hierarchical matrices, the \flasubproblem variant refers to nodes that
may or may not cause further recursion, and reuse of the control tree, to
reach the leaf levels of the hierarchy.
Otherwise, if a blocked algorithmic variant is called, the control tree is
passed on.
\item
{\bf Algorithmic subproblems invoke internal back-ends.}
The modified blocked algorithms feature a \cntl control tree pointer
argument instead of the integer \nbalg argument.
Recall that the pointer will refer to a control tree structure which
contains all the information necessary to specify the execution of the
blocked algorithmic variant, including the blocksize.
The statement which computes the blocksize is replaced by one which
uses a new routine, {\tt FLA\_Determine\_blocksize()}.
The subproblems are replaced with corresponding calls to the internal
back-end routine for the operation in question.
C preprocessor macros are used to access the fields within the \cntl
argument, specifically to extract the blocksize argument and the
pointers to the child nodes of the current control tree node.
The child nodes are passed in as the last argument to the internal
back-ends, and recursion continues.
\end{itemize}



\subsection{Structure fields}

There are three fields common to every control tree, regardless of its type.

\input{figs/50-cntl-tree-structs}

\begin{itemize}
\item
{\tt matrix\_type}.
The \matrixtype field denotes the type of matrices, flat or
hierarchical, that the control tree is to assume.
A matrix type of \flahier allows hierarchical matrices, where the
matrices may be of arbitrary depth.
A matrix type of \flaflat implies that the matrix operands will be flat
matrix objects.
Though \libflame objects contain the \elemtype field in each node in the
matrix hierarchy, the \matrixtype field is needed because the control trees
for flat and hierarchical matrices differ.
Control trees for flat matrices explicitly prescribe the execution for every
level of algorithmic recursion.
However, control trees for hierarchical matrices must allow recursion to
an arbitrary depth.
Thus, we leave it to the internal back-end to detect when the leaves of the
hierarchy are reached and stop recursion accordingly.
Note that the \matrixtype field of every node in the control tree must be
identical.
\item
{\tt variant}.
The \variant field specifies which algorithmic variant should execute.
Valid values are \flasubproblem and \flablockedvariantone
through \flablockedvarianttwentyns.
Note that the semantic meaning of \flasubproblem differs depending on whether
the tree is configured for flat or hierarchical matrices.
For control trees of matrix type \flaflat, \flasubproblem refers to a node
that invokes the operation's external implementation.
For trees of matrix type \flahier, \flasubproblem refers to a node that
causes the back-end to either recurse further if the bottom of the hierarchy
has not yet been reached, or execute (or enqueue) the subproblem otherwise.
Native unblocked FLAME algorithm implementations are not yet supported.
\item
{\tt blocksize}.
The \blocksize field specifies a structure which contains four blocksize
values.
This allows the user to specify different blocksizes depending on the
numerical datatype being used.
Blocksize structures should be created with {\tt FLA\_Blocksize\_create()}.
\end{itemize}

Control tree nodes also contain fields which are unique to the operation
type.
Specifically, a control tree type is made unique by the number and type
of sub-tree fields it contains.
Each control tree structure has at least one pointer to another control
tree node.
These child nodes contain the relevant information for executing
the operation's algorithmic subproblems.
For example, the \flasyrkt type allows for two child nodes, one for a
{\sc syrk} subproblem and one for a {\sc gemm} subproblem.
Figure \ref{fig:cntl-tree-structs} lists the code defining the
structure for the \flasyrkt type along with a sample of structures
for other operations supported in \libflamens.

There are two circumstances under which you may leave a subproblem
field uninitialized.
\begin{itemize}
\item
{\bf The algorithmic variant does not need the full set of subproblems.}
Some algorithmic variants only use a subset of the subproblem fields made
available in the control tree structure.
For example, variants 5 and 6 of the {\sc syrk} algorithm only invoke
smaller {\sc syrk} subproblems, and thus do not need to perform any
{\sc gemm} subproblems.
In this case, the \subgemm field is not referenced by the runtime system
and thus it may safely be initialized to \fnullns.
\item
{\bf The control tree node's variant field is \flasubproblemns.}
The nodes at the ``leaves'' of the tree should contain a \variant field
with a special value: \flasubproblemns.
Every flat matrix control tree contains at least one leaf node that
indicates that blocked algorithm recursion should stop, and an
external implementation should be invoked for that node.
Likewise, every hierarchical matrix control tree contains at least one
``recurse'' node where further recursion in the algorithm-by-blocks is
performed if the matrix hierarchy requires it.
In both cases, {\em none} of the subproblem fields are referenced, and thus
they may all be safely initialized to \fnullns.
\end{itemize}

When a control tree contains more than one subproblem field for a given
operation type, the order of the subproblems matters.
Consider the {\sc symm} operation and its corresponding control tree
structure.
All ten variants of {\sc symm} contain one {\sc symm} subproblem, and
blocked variants 1 through 8 contain two {\sc gemm} subproblems.
So control tree nodes for blocked variants 1 through 8
must also initialize both {\sc gemm} fields.
But which {\sc gemm} field corresponds to which {\sc gemm} subproblem
instance in the {\sc symm} algorithm?
In situations like this, the mapping is simple: the \subgemmo field
corresponds to the {\sc gemm} subproblem that occurs first in the {\sc symm}
algorithm, while the \subgemmtw field corresponds to the {\sc gemm}
subproblem that occurs second.
This rule for disambiguating subproblems of identical operation types
must be observed for all cases where the algorithm contains more than
one subproblem of a particular operation.


\subsection{Control tree API}

\index{developer APIs!control trees}

First, we present the interface for creating and manipulating blocksize
structures, which are a necessary component of control tree nodes.

\subsubsection{Blocksize structures}

% --- FLA_Blocksize_create() ---------------------------------------------------

\begin{flaspec}
\begin{verbatim}
fla_blocksize_t* FLA_Blocksize_create( dim_t b_s,
                                       dim_t b_d,
                                       dim_t b_c,
                                       dim_t b_z );
\end{verbatim}
\purpose{
Create a structure containing a set of four blocksizes, one for each of the
numerical datatypes supported by \libflamens, and initialize the structure
fields according to the function arguments.
}
\ifacenotes{
Though the interface allows the programmer to set blocksize values for all
four datatypes, it is permissible to assign meaningful values to only the
fields that correspond to the datatypes that will be used by the application.
}
\notes{
Blocksize structures are allocated on the heap and should be released
with {\tt FLA\_Blocksize\_free()} when they are no longer needed.
}
\begin{checks}
\checkitem
None of the blocksize arguments may be zero.
%\itemvsp
%\checkitem
%
\end{checks}
\rvalue{
A pointer to a heap-allocated \flablocksizet structure.
}
\begin{params}
\parameter{\dimt}{b\_s}{The blocksize to use for single precision real data.}
\parameter{\dimt}{b\_d}{The blocksize to use for double precision real data.}
\parameter{\dimt}{b\_c}{The blocksize to use for single precision complex data.}
\parameter{\dimt}{b\_z}{The blocksize to use for double precision complex data.}
\end{params}
\end{flaspec}

% --- FLA_Blocksize_set() ------------------------------------------------------

\begin{flaspec}
\begin{verbatim}
void FLA_Blocksize_set( fla_blocksize_t* bp,
                        dim_t b_s,
                        dim_t b_d,
                        dim_t b_c,
                        dim_t b_z );
\end{verbatim}
\purpose{
Set the individual fields in an existing blocksize structure.
}
\ifacenotes{
Providing a value of zero for one of the blocksize arguments causes the
function to leave the existing value unchanged for that particular
blocksize field.
}
\begin{checks}
\checkitem
\bp may not be \fnullns.
%\itemvsp
%\checkitem
%
\end{checks}
\begin{params}
\parameter{\flablocksizet}{bp}{A pointer to an existing blocksize structure.}
\parameter{\dimt}{b\_s}{The blocksize to use for single precision real data.}
\parameter{\dimt}{b\_d}{The blocksize to use for double precision real data.}
\parameter{\dimt}{b\_c}{The blocksize to use for single precision complex data.}
\parameter{\dimt}{b\_z}{The blocksize to use for double precision complex data.}
\end{params}
\end{flaspec}

% --- FLA_Blocksize_scale() ----------------------------------------------------

\begin{flaspec}
\begin{verbatim}
void FLA_Blocksize_scale( fla_blocksize_t* bp,
                          double factor );
\end{verbatim}
\purpose{
Scale the individual fields of an existing blocksize structure.
}
\implnotes{
The scaling occurs as follows: the blocksize fields are typecast to
\cdoublens, then multiplied by the scaling {\tt factor}, and finally
typecast back to \cint before being stored to the blocksize structure.
}
\begin{checks}
\checkitem
\bp may not be \fnullns.
%\itemvsp
%\checkitem
%
\end{checks}
\begin{params}
\parameter{\flablocksizet}{bp}{A pointer to an existing blocksize structure.}
\parameter{\cdouble}{factor}{The scaling factor to apply to the blocksize values.}
\end{params}
\end{flaspec}

% --- FLA_Blocksize_create_copy() ----------------------------------------------

\begin{flaspec}
\begin{verbatim}
fla_blocksize_t* FLA_Blocksize_create_copy( fla_blocksize_t* bp );
\end{verbatim}
\purpose{
Create a copy of an existing blocksize structure.
}
\begin{checks}
\checkitem
\bp may not be \fnullns.
%\itemvsp
%\checkitem
%
\end{checks}
\rvalue{
A pointer to a heap-allocated \flablocksizet structure.
}
\begin{params}
\parameter{\flablocksizet}{bp}{A pointer to an existing blocksize structure.}
\end{params}
\end{flaspec}

% --- FLA_Blocksize_free() -----------------------------------------------------

\begin{flaspec}
\begin{verbatim}
void FLA_Blocksize_free( fla_blocksize_t* bp );
\end{verbatim}
\purpose{
Release the memory allocated to a blocksize structure.
}
\notes{
{\tt FLA\_Blocksize\_free()} should only be used with pointers to blocksize
structures that were allocated with {\tt FLA\_Blocksize\_create()} or
{\tt FLA\_Blocksize\_create\_copy()}.
}
\begin{checks}
\checkitem
\bp may not be \fnullns.
%\itemvsp
%\checkitem
%
\end{checks}
\begin{params}
\parameter{\flablocksizet}{bp}{A pointer to an existing blocksize structure.}
\end{params}
\end{flaspec}

% --- FLA_Blocksize_extract() --------------------------------------------------

\begin{flaspec}
\begin{verbatim}
dim_t FLA_Blocksize_extract( FLA_Datatype datatype,
                             fla_blocksize_t* bp );
\end{verbatim}
\purpose{
Extract the value associated with a specific numerical datatype from a
blocksize structure.
}
\begin{checks}
\checkitem
The value of \datatype must refer to a floating-point datatype.
\itemvsp
\checkitem
\bp may not be \fnullns.
\end{checks}
\rvalue{
An unsigned integer value of type \dimtns.
}
\begin{params}
\parameter{\fladatatype}{datatype}{A constant corresponding to the numerical datatype requested.}
\parameter{\flablocksizet}{bp}{A pointer to an existing blocksize structure.}
\end{params}
\end{flaspec}

% --- FLA_Query_blocksizes() ---------------------------------------------------

\begin{flaspec}
\begin{verbatim}
fla_blocksize_t* FLA_Query_blocksizes( FLA_Dimension dim_tag );
\end{verbatim}
\purpose{
Query the library for a reasonable set of blocksizes.
The user must specify how the blocksizes are chosen by specifying a
dimension {\em tag}.
Valid tag values are \fladimensionmns, \fladimensionkns,
\fladimensionnns, and \fladimensionminns.
If \libflame was configured with {\tt --enable-goto-interfaces},
the first three values correspond to architecture-specific blocksizes
associated with the $ m $, $ k $, and $ n $ dimensions of the inner-most
matrix-matrix multiplication kernel in GotoBLAS.
Otherwise, these three constants are associated with default values that
may not be optimal.
The last tag, \fladimensionminns, will cause {\tt FLA\_Query\_blocksizes()}
to return the smallest of the $ m $, $ k $, and $ n $ blocksizes.
If unsure, use \fladimensionminns.
}
\notes{
This function dynamically allocates memory for the structure in which
the blocksizes are returned.
It is the user's responsibility to deallocate this structure with
{\tt FLA\_Blocksize\_free()} when it is no longer needed.
}
\rvalue{
A pointer to a heap-allocated \flablocksizet structure.
}
\begin{params}
\parameter{\fladimension}{dim\_tag}{A constant specifying how to choose the blocksize.}
\end{params}
\end{flaspec}

% --- FLA_Query_blocksize() ----------------------------------------------------

\begin{flaspec}
\begin{verbatim}
dim_t FLA_Query_blocksize( FLA_Datatype  datatype,
                           FLA_Dimension dim_tag );
\end{verbatim}
\purpose{
Query the library for a reasonable blocksize for a specific datatype.
The behavior of this function is similar to that of
{\tt FLA\_Query\_blocksizes()},
except that only a single \dimt scalar (for the datatype in question)
is returned instead of a pointer to an entire \flablocksizet structure.
}
\notes{
The values returned by this function are the same as those attainable
by calling {\tt FLA\_Query\_blocksizes()} and then using
{\tt FLA\_Blocksize\_extract()} to extract the blocksize for \datatypens.
}
\begin{checks}
\checkitem
The value of \datatype must refer to a floating-point datatype.
%\itemvsp
%\checkitem
%
\end{checks}
\rvalue{
An unsigned integer value of type \dimtns.
}
\begin{params}
\parameter{\fladatatype}{datatype}{A constant corresponding to the numerical datatype requested.}
\parameter{\fladimension}{dim\_tag}{A constant specifying how to choose the blocksize.}
\end{params}
\end{flaspec}

% --- FLA_Determine_blocksize() ------------------------------------------------

\begin{flaspec}
\begin{verbatim}
dim_t FLA_Determine_blocksize( FLA_Obj          A_unproc,
                               FLA_Quadrant     to_dir,
                               fla_blocksize_t* bp );
\end{verbatim}
\purpose{
Determine the blocksize given the contents of a blocksize structure and
the current state of the matrix partitioning.
If the blocksize is larger than the dimension of {\tt A\_unproc},
the dimension of {\tt A\_unproc} is returned instead.
In this case, the dimension in question, be it the length or width, is
determined by the value of {\tt to\_dir}.
Specifically, if {\tt to\_dir} denotes vertical movement, the length of
{\tt A\_unproc} is used, and for lateral movement, the width is used.
If {\tt to\_dir} denotes diagonal movement, then the minimum dimension
(as would be returned by {\tt FLA\_Obj\_min\_dim()}) is used.
}
\begin{checks}
\checkitem
\bp may not be \fnullns.
%\itemvsp
%\checkitem
%
\end{checks}
\rvalue{
An unsigned integer value of type \dimtns.
}
\begin{params}
\parameter{\flaobj}{A\_unproc}{An \flaobj view into the unprocessed portion of a matrix being tracked by a blocked algorithm.}
\parameter{\flaquadrant}{to\_dir}{The direction in which the algorithm is moving through the parent matrix of {\tt A\_unproc}.}
\parameter{\flablocksizet}{bp}{A pointer to an existing blocksize structure.}
\end{params}
\end{flaspec}




The remainder of this subsection describes the functions that create and
initialize control tree nodes for each of the supported linear algebra
operations.
These functions share the same first three arguments, which correspond
to the fields described in the previous subsection.
Note that you should always invoke these interfaces for the leaves of
the trees first, and then use the pointers returned from those routines
in the initialization of higher internal nodes.
Simply put, you cannot create a non-leaf node until you have created
and initialized its child nodes.



\subsubsection{Level-3 BLAS operations}

% --- FLA_Cntl_gemm_obj_create() -----------------------------------------------

\begin{flaspec}
\begin{verbatim}
fla_gemm_t* FLA_Cntl_gemm_obj_create( FLA_Matrix_type  matrix_type,
                                      int              variant,
                                      fla_blocksize_t* blocksize,
                                      fla_gemm_t*      sub_gemm );
\end{verbatim}
\purpose{
Create a structure representing a node in a control tree for a general
matrix-matrix multiplication ({\sc gemm}) operation and initialize its
fields according to the function arguments.
}
\notes{
If \variant is \flasubproblemns, none of the pointer arguments
are used and thus they may be safely set to \fnullns.
%The \blocksize and \subgemm arguments may be \fnull if \variant is
%\flasubproblemns.
}
\begin{checks}
\checkitem
If \variant is not \flasubproblemns, then it must be one of
\flablockedvariantone through \flablockedvariantsixns.
%\itemvsp
%\checkitem
%
\end{checks}
\rvalue{
A pointer to a heap-allocated \flagemmt structure.
}
\begin{params}
\parameter{\matrixtype}{matrix\_type}{The type of matrix (flat or hierarchical) to support in the control tree in which the node will be used.}
\parameter{\int}{variant}{A constant value indicating the choice of variant for executing the computation associated with the control tree node being created.}
\parameter{\flablocksizet}{blocksize}{A pointer to a blocksize structure to be used for the node being created.}
\parameter{\flagemmt}{sub\_gemm}{A pointer to the node to be used for the {\sc gemm} subproblem.}
\end{params}
\end{flaspec}

% --- FLA_Cntl_hemm_obj_create() -----------------------------------------------

\begin{flaspec}
\begin{verbatim}
fla_hemm_t* FLA_Cntl_hemm_obj_create( FLA_Matrix_type  matrix_type,
                                      int              variant,
                                      fla_blocksize_t* blocksize,
                                      fla_hemm_t*      sub_hemm,
                                      fla_gemm_t*      sub_gemm1,
                                      fla_gemm_t*      sub_gemm2 );
\end{verbatim}
\purpose{
Create a structure representing a node in a control tree for a Hermitian
matrix-matrix multiplication ({\sc hemm}) operation and initialize its
fields according to the function arguments.
}
\notes{
If \variant is \flasubproblemns, none of the pointer arguments
are used and thus they may be safely set to \fnullns.
Even if \variant specifies a blocked variant, some algorithms contain
fewer subproblems and thus do not use every subproblem field
argument.
In such cases, these arguments may be safely set to \fnullns.
Please refer to the blocked algorithmic variant implementations
to determine which subproblem fields are unused.
%If \variant is \flablockedvariantnine or \flablockedvarianttenns, the
%\subgemmo and \subgemmtw arguments may be \fnullns.
%If \variant is \flasubproblemns, the \blocksizens, \subhemmns,
%\subgemmons, and \subgemmtw arguments may be \fnullns.
}
\begin{checks}
\checkitem
If \variant is not \flasubproblemns, then it must be one of
\flablockedvariantone through \flablockedvarianttenns.
%\itemvsp
%\checkitem
%
\end{checks}
\rvalue{
A pointer to a heap-allocated \flahemmt structure.
}
\begin{params}
\parameter{\matrixtype}{matrix\_type}{The type of matrix (flat or hierarchical) to support in the control tree in which the node will be used.}
\parameter{\int}{variant}{A constant value indicating the choice of variant for executing the computation associated with the control tree node being created.}
\parameter{\flablocksizet}{blocksize}{A pointer to a blocksize structure to be used for the node being created.}
\parameter{\flahemmt}{sub\_hemm}{A pointer to the node to be used for the {\sc hemm} subproblem.}
\parameter{\flagemmt}{sub\_gemm1}{A pointer to the node to be used for the first {\sc gemm} subproblem.}
\parameter{\flagemmt}{sub\_gemm2}{A pointer to the node to be used for the second {\sc gemm} subproblem.}
\end{params}
\end{flaspec}

% --- FLA_Cntl_herk_obj_create() -----------------------------------------------

\begin{flaspec}
\begin{verbatim}
fla_herk_t* FLA_Cntl_herk_obj_create( FLA_Matrix_type  matrix_type,
                                      int              variant,
                                      fla_blocksize_t* blocksize,
                                      fla_herk_t*      sub_herk,
                                      fla_gemm_t*      sub_gemm );
\end{verbatim}
\purpose{
Create a structure representing a node in a control tree for a Hermitian
rank-k update ({\sc herk}) operation and initialize its
fields according to the function arguments.
}
\notes{
If \variant is \flasubproblemns, none of the pointer arguments
are used and thus they may be safely set to \fnullns.
Even if \variant specifies a blocked variant, some algorithms contain
fewer subproblems and thus do not use every subproblem field
argument.
In such cases, these arguments may be safely set to \fnullns.
Please refer to the blocked algorithmic variant implementations
to determine which subproblem fields are unused.
%If \variant is \flablockedvariantfive or \flablockedvariantsixns, the
%\subgemm argument may be \fnullns.
%If \variant is \flasubproblemns, the \blocksizens, \subherkns,
%and \subgemm arguments may be \fnullns.
}
\begin{checks}
\checkitem
If \variant is not \flasubproblemns, then it must be one of
\flablockedvariantone through \flablockedvariantsixns.
%\itemvsp
%\checkitem
%
\end{checks}
\rvalue{
A pointer to a heap-allocated \flaherkt structure.
}
\begin{params}
\parameter{\matrixtype}{matrix\_type}{The type of matrix (flat or hierarchical) to support in the control tree in which the node will be used.}
\parameter{\int}{variant}{A constant value indicating the choice of variant for executing the computation associated with the control tree node being created.}
\parameter{\flablocksizet}{blocksize}{A pointer to a blocksize structure to be used for the node being created.}
\parameter{\flaherkt}{sub\_herk}{A pointer to the node to be used for the {\sc herk} subproblem.}
\parameter{\flagemmt}{sub\_gemm}{A pointer to the node to be used for the {\sc gemm} subproblem.}
\end{params}
\end{flaspec}

% --- FLA_Cntl_her2k_obj_create() ----------------------------------------------

\begin{flaspec}
\begin{verbatim}
fla_her2k_t* FLA_Cntl_her2k_obj_create( FLA_Matrix_type  matrix_type,
                                        int              variant,
                                        fla_blocksize_t* blocksize,
                                        fla_her2k_t*     sub_her2k,
                                        fla_gemm_t*      sub_gemm1,
                                        fla_gemm_t*      sub_gemm2 );
\end{verbatim}
\purpose{
Create a structure representing a node in a control tree for a Hermitian
rank-2k update ({\sc her2k}) operation and initialize its
fields according to the function arguments.
}
\notes{
If \variant is \flasubproblemns, none of the pointer arguments
are used and thus they may be safely set to \fnullns.
Even if \variant specifies a blocked variant, some algorithms contain
fewer subproblems and thus do not use every subproblem field
argument.
In such cases, these arguments may be safely set to \fnullns.
Please refer to the blocked algorithmic variant implementations
to determine which subproblem fields are unused.
%If \variant is \flablockedvariantnine or \flablockedvarianttenns, the
%\subgemmo and \subgemmtw arguments may be \fnullns.
%If \variant is \flasubproblemns, the \blocksizens, \subhertkns,
%\subgemmons, and \subgemmtw arguments may be \fnullns.
}
\begin{checks}
\checkitem
If \variant is not \flasubproblemns, then it must be one of
\flablockedvariantone through \flablockedvarianttenns.
%\itemvsp
%\checkitem
%
\end{checks}
\rvalue{
A pointer to a heap-allocated \flahertkt structure.
}
\begin{params}
\parameter{\matrixtype}{matrix\_type}{The type of matrix (flat or hierarchical) to support in the control tree in which the node will be used.}
\parameter{\int}{variant}{A constant value indicating the choice of variant for executing the computation associated with the control tree node being created.}
\parameter{\flablocksizet}{blocksize}{A pointer to a blocksize structure to be used for the node being created.}
\parameter{\flahertkt}{sub\_her2k}{A pointer to the node to be used for the {\sc her2k} subproblem.}
\parameter{\flagemmt}{sub\_gemm1}{A pointer to the node to be used for the first {\sc gemm} subproblem.}
\parameter{\flagemmt}{sub\_gemm2}{A pointer to the node to be used for the second {\sc gemm} subproblem.}
\end{params}
\end{flaspec}

% --- FLA_Cntl_symm_obj_create() -----------------------------------------------

\begin{flaspec}
\begin{verbatim}
fla_symm_t* FLA_Cntl_symm_obj_create( FLA_Matrix_type  matrix_type,
                                      int              variant,
                                      fla_blocksize_t* blocksize,
                                      fla_symm_t*      sub_symm,
                                      fla_gemm_t*      sub_gemm1,
                                      fla_gemm_t*      sub_gemm2 );
\end{verbatim}
\purpose{
Create a structure representing a node in a control tree for a symmetric
matrix-matrix multiplication ({\sc symm}) operation and initialize its
fields according to the function arguments.
}
\notes{
If \variant is \flasubproblemns, none of the pointer arguments
are used and thus they may be safely set to \fnullns.
Even if \variant specifies a blocked variant, some algorithms contain
fewer subproblems and thus do not use every subproblem field
argument.
In such cases, these arguments may be safely set to \fnullns.
Please refer to the blocked algorithmic variant implementations
to determine which subproblem fields are unused.
%If \variant is \flablockedvariantnine or \flablockedvarianttenns, the
%\subgemmo and \subgemmtw arguments may be \fnullns.
%If \variant is \flasubproblemns, the \blocksizens, \subsymmns,
%\subgemmons, and \subgemmtw arguments may be \fnullns.
}
\begin{checks}
\checkitem
If \variant is not \flasubproblemns, then it must be one of
\flablockedvariantone through \flablockedvarianttenns.
%\itemvsp
%\checkitem
%
\end{checks}
\rvalue{
A pointer to a heap-allocated \flasymmt structure.
}
\begin{params}
\parameter{\matrixtype}{matrix\_type}{The type of matrix (flat or hierarchical) to support in the control tree in which the node will be used.}
\parameter{\int}{variant}{A constant value indicating the choice of variant for executing the computation associated with the control tree node being created.}
\parameter{\flablocksizet}{blocksize}{A pointer to a blocksize structure to be used for the node being created.}
\parameter{\flasymmt}{sub\_symm}{A pointer to the node to be used for the {\sc symm} subproblem.}
\parameter{\flagemmt}{sub\_gemm1}{A pointer to the node to be used for the first {\sc gemm} subproblem.}
\parameter{\flagemmt}{sub\_gemm2}{A pointer to the node to be used for the second {\sc gemm} subproblem.}
\end{params}
\end{flaspec}

% --- FLA_Cntl_syrk_obj_create() -----------------------------------------------

\begin{flaspec}
\begin{verbatim}
fla_syrk_t* FLA_Cntl_syrk_obj_create( FLA_Matrix_type  matrix_type,
                                      int              variant,
                                      fla_blocksize_t* blocksize,
                                      fla_syrk_t*      sub_syrk,
                                      fla_gemm_t*      sub_gemm );
\end{verbatim}
\purpose{
Create a structure representing a node in a control tree for a symmetric
rank-k update ({\sc syrk}) operation and initialize its
fields according to the function arguments.
}
\notes{
If \variant is \flasubproblemns, none of the pointer arguments
are used and thus they may be safely set to \fnullns.
Even if \variant specifies a blocked variant, some algorithms contain
fewer subproblems and thus do not use every subproblem field
argument.
In such cases, these arguments may be safely set to \fnullns.
Please refer to the blocked algorithmic variant implementations
to determine which subproblem fields are unused.
%If \variant is \flablockedvariantfive or \flablockedvariantsixns, the
%\subgemm argument may be \fnullns.
%If \variant is \flasubproblemns, the \blocksizens, \subherkns,
%and \subgemm arguments may be \fnullns.
}
\begin{checks}
\checkitem
If \variant is not \flasubproblemns, then it must be one of
\flablockedvariantone through \flablockedvariantsixns.
%\itemvsp
%\checkitem
%
\end{checks}
\rvalue{
A pointer to a heap-allocated \flasyrkt structure.
}
\begin{params}
\parameter{\matrixtype}{matrix\_type}{The type of matrix (flat or hierarchical) to support in the control tree in which the node will be used.}
\parameter{\int}{variant}{A constant value indicating the choice of variant for executing the computation associated with the control tree node being created.}
\parameter{\flablocksizet}{blocksize}{A pointer to a blocksize structure to be used for the node being created.}
\parameter{\flasyrkt}{sub\_syrk}{A pointer to the node to be used for the {\sc syrk} subproblem.}
\parameter{\flagemmt}{sub\_gemm}{A pointer to the node to be used for the {\sc gemm} subproblem.}
\end{params}
\end{flaspec}

% --- FLA_Cntl_syr2k_obj_create() ----------------------------------------------

\begin{flaspec}
\begin{verbatim}
fla_syr2k_t* FLA_Cntl_syr2k_obj_create( FLA_Matrix_type  matrix_type,
                                        int              variant,
                                        fla_blocksize_t* blocksize,
                                        fla_syr2k_t*     sub_syr2k,
                                        fla_gemm_t*      sub_gemm1,
                                        fla_gemm_t*      sub_gemm2 );
\end{verbatim}
\purpose{
Create a structure representing a node in a control tree for a symmetric
rank-2k update ({\sc syr2k}) operation and initialize its
fields according to the function arguments.
}
\notes{
If \variant is \flasubproblemns, none of the pointer arguments
are used and thus they may be safely set to \fnullns.
Even if \variant specifies a blocked variant, some algorithms contain
fewer subproblems and thus do not use every subproblem field
argument.
In such cases, these arguments may be safely set to \fnullns.
Please refer to the blocked algorithmic variant implementations
to determine which subproblem fields are unused.
%If \variant is \flablockedvariantnine or \flablockedvarianttenns, the
%\subgemmo and \subgemmtw arguments may be \fnullns.
%If \variant is \flasubproblemns, the \blocksizens, \subsyrtkns,
%\subgemmons, and \subgemmtw arguments may be \fnullns.
}
\begin{checks}
\checkitem
If \variant is not \flasubproblemns, then it must be one of
\flablockedvariantone through \flablockedvarianttenns.
%\itemvsp
%\checkitem
%
\end{checks}
\rvalue{
A pointer to a heap-allocated \flasyrtkt structure.
}
\begin{params}
\parameter{\matrixtype}{matrix\_type}{The type of matrix (flat or hierarchical) to support in the control tree in which the node will be used.}
\parameter{\int}{variant}{A constant value indicating the choice of variant for executing the computation associated with the control tree node being created.}
\parameter{\flablocksizet}{blocksize}{A pointer to a blocksize structure to be used for the node being created.}
\parameter{\flasyrtkt}{sub\_syr2k}{A pointer to the node to be used for the {\sc syr2k} subproblem.}
\parameter{\flagemmt}{sub\_gemm1}{A pointer to the node to be used for the first {\sc gemm} subproblem.}
\parameter{\flagemmt}{sub\_gemm2}{A pointer to the node to be used for the second {\sc gemm} subproblem.}
\end{params}
\end{flaspec}

% --- FLA_Cntl_trmm_obj_create() -----------------------------------------------

\begin{flaspec}
\begin{verbatim}
fla_trmm_t* FLA_Cntl_trmm_obj_create( FLA_Matrix_type  matrix_type,
                                      int              variant,
                                      fla_blocksize_t* blocksize,
                                      fla_trmm_t*      sub_trmm,
                                      fla_gemm_t*      sub_gemm );
\end{verbatim}
\purpose{
Create a structure representing a node in a control tree for a triangular
matrix-matrix multiplication ({\sc trmm}) operation and initialize its
fields according to the function arguments.
}
\notes{
If \variant is \flasubproblemns, none of the pointer arguments
are used and thus they may be safely set to \fnullns.
Even if \variant specifies a blocked variant, some algorithms contain
fewer subproblems and thus do not use every subproblem field
argument.
In such cases, these arguments may be safely set to \fnullns.
Please refer to the blocked algorithmic variant implementations
to determine which subproblem fields are unused.
%If \variant is \flablockedvariantthree or \flablockedvariantfourns, the
%\subgemm argument may be \fnullns.
%If \variant is \flasubproblemns, the \blocksizens, \subtrmmns,
%and \subgemm arguments may be \fnullns.
}
\begin{checks}
\checkitem
If \variant is not \flasubproblemns, then it must be one of
\flablockedvariantone through \flablockedvariantfourns.
%\itemvsp
%\checkitem
%
\end{checks}
\rvalue{
A pointer to a heap-allocated \flatrmmt structure.
}
\begin{params}
\parameter{\matrixtype}{matrix\_type}{The type of matrix (flat or hierarchical) to support in the control tree in which the node will be used.}
\parameter{\int}{variant}{A constant value indicating the choice of variant for executing the computation associated with the control tree node being created.}
\parameter{\flablocksizet}{blocksize}{A pointer to a blocksize structure to be used for the node being created.}
\parameter{\flatrmmt}{sub\_trmm}{A pointer to the node to be used for the {\sc trmm} subproblem.}
\parameter{\flagemmt}{sub\_gemm}{A pointer to the node to be used for the {\sc gemm} subproblem.}
\end{params}
\end{flaspec}

% --- FLA_Cntl_trsm_obj_create() -----------------------------------------------

\begin{flaspec}
\begin{verbatim}
fla_trsm_t* FLA_Cntl_trsm_obj_create( FLA_Matrix_type  matrix_type,
                                      int              variant,
                                      fla_blocksize_t* blocksize,
                                      fla_trsm_t*      sub_trsm,
                                      fla_gemm_t*      sub_gemm );
\end{verbatim}
\purpose{
Create a structure representing a node in a control tree for a triangular
solve with multiple right-hand sides ({\sc trsm}) operation and initialize its
fields according to the function arguments.
}
\notes{
If \variant is \flasubproblemns, none of the pointer arguments
are used and thus they may be safely set to \fnullns.
Even if \variant specifies a blocked variant, some algorithms contain
fewer subproblems and thus do not use every subproblem field
argument.
In such cases, these arguments may be safely set to \fnullns.
Please refer to the blocked algorithmic variant implementations
to determine which subproblem fields are unused.
%If \variant is \flablockedvariantthree or \flablockedvariantfourns, the
%\subgemm argument may be \fnullns.
%If \variant is \flasubproblemns, the \blocksizens, \subtrsmns,
%and \subgemm arguments may be \fnullns.
}
\begin{checks}
\checkitem
If \variant is not \flasubproblemns, then it must be one of
\flablockedvariantone through \flablockedvariantfourns.
%\itemvsp
%\checkitem
%
\end{checks}
\rvalue{
A pointer to a heap-allocated \flatrsmt structure.
}
\begin{params}
\parameter{\matrixtype}{matrix\_type}{The type of matrix (flat or hierarchical) to support in the control tree in which the node will be used.}
\parameter{\int}{variant}{A constant value indicating the choice of variant for executing the computation associated with the control tree node being created.}
\parameter{\flablocksizet}{blocksize}{A pointer to a blocksize structure to be used for the node being created.}
\parameter{\flatrsmt}{sub\_trsm}{A pointer to the node to be used for the {\sc trsm} subproblem.}
\parameter{\flagemmt}{sub\_gemm}{A pointer to the node to be used for the {\sc gemm} subproblem.}
\end{params}
\end{flaspec}



\subsubsection{LAPACK-level operations}


% --- FLA_Cntl_chol_obj_create() -----------------------------------------------

\begin{flaspec}
\begin{verbatim}
fla_chol_t* FLA_Cntl_chol_obj_create( FLA_Matrix_type  matrix_type,
                                      int              variant,
                                      fla_blocksize_t* blocksize,
                                      fla_chol_t*      sub_chol,
                                      fla_syrk_t*      sub_syrk,
                                      fla_herk_t*      sub_herk,
                                      fla_trsm_t*      sub_trsm,
                                      fla_gemm_t*      sub_gemm );
\end{verbatim}
\purpose{
Create a structure representing a node in a control tree for a Cholesky
factorization ({\sc chol}) operation and initialize its
fields according to the function arguments.
}
\notes{
If \variant is \flasubproblemns, none of the pointer arguments
are used and thus they may be safely set to \fnullns.
Even if \variant specifies a blocked variant, some algorithms contain
fewer subproblems and thus do not use every subproblem field
argument.
In such cases, these arguments may be safely set to \fnullns.
Please refer to the blocked algorithmic variant implementations
to determine which subproblem fields are unused.
}
\begin{checks}
\checkitem
If \variant is not \flasubproblemns, then it must be one of
\flablockedvariantone through \flablockedvariantthreens.
%\itemvsp
%\checkitem
%
\end{checks}
\rvalue{
A pointer to a heap-allocated \flacholt structure.
}
\begin{params}
\parameter{\matrixtype}{matrix\_type}{The type of matrix (flat or hierarchical) to support in the control tree in which the node will be used.}
\parameter{\int}{variant}{A constant value indicating the choice of variant for executing the computation associated with the control tree node being created.}
\parameter{\flablocksizet}{blocksize}{A pointer to a blocksize structure to be used for the node being created.}
\parameter{\flacholt}{sub\_chol}{A pointer to the node to be used for the {\sc chol} subproblem.}
\parameter{\flasyrkt}{sub\_syrk}{A pointer to the node to be used for the {\sc syrk} subproblem.}
\parameter{\flaherkt}{sub\_herk}{A pointer to the node to be used for the {\sc herk} subproblem.}
\parameter{\flatrsmt}{sub\_trsm}{A pointer to the node to be used for the {\sc trsm} subproblem.}
\parameter{\flagemmt}{sub\_gemm}{A pointer to the node to be used for the {\sc gemm} subproblem.}
\end{params}
\end{flaspec}

% --- FLA_Cntl_lu_obj_create() -------------------------------------------------

\begin{flaspec}
\begin{verbatim}
fla_lu_t* FLA_Cntl_lu_obj_create( FLA_Matrix_type  matrix_type,
                                  int              variant,
                                  fla_blocksize_t* blocksize,
                                  fla_lu_t*        sub_lu,
                                  fla_gemm_t*      sub_gemm1,
                                  fla_gemm_t*      sub_gemm2,
                                  fla_gemm_t*      sub_gemm3,
                                  fla_trsm_t*      sub_trsm1,
                                  fla_trsm_t*      sub_trsm2 );
\end{verbatim}
\purpose{
Create a structure representing a node in a control tree for an LU
factorization ({\sc lu}) operation and initialize its
fields according to the function arguments.
}
\notes{
If \variant is \flasubproblemns, none of the pointer arguments
are used and thus they may be safely set to \fnullns.
Even if \variant specifies a blocked variant, some algorithms contain
fewer subproblems and thus do not use every subproblem field
argument.
In such cases, these arguments may be safely set to \fnullns.
Please refer to the blocked algorithmic variant implementations
to determine which subproblem fields are unused.
}
\begin{checks}
\checkitem
If \variant is not \flasubproblemns, then it must be one of
\flablockedvariantone through \flablockedvariantthreens.
%\itemvsp
%\checkitem
%
\end{checks}
\rvalue{
A pointer to a heap-allocated \flalut structure.
}
\begin{params}
\parameter{\matrixtype}{matrix\_type}{The type of matrix (flat or hierarchical) to support in the control tree in which the node will be used.}
\parameter{\int}{variant}{A constant value indicating the choice of variant for executing the computation associated with the control tree node being created.}
\parameter{\flablocksizet}{blocksize}{A pointer to a blocksize structure to be used for the node being created.}
\parameter{\flalut}{sub\_lu}{A pointer to the node to be used for the {\sc lu} subproblem.}
\parameter{\flagemmt}{sub\_gemm1}{A pointer to the node to be used for the first {\sc gemm} subproblem.}
\parameter{\flagemmt}{sub\_gemm2}{A pointer to the node to be used for the second {\sc gemm} subproblem.}
\parameter{\flagemmt}{sub\_gemm3}{A pointer to the node to be used for the third {\sc gemm} subproblem.}
\parameter{\flatrsmt}{sub\_trsm1}{A pointer to the node to be used for the first {\sc trsm} subproblem.}
\parameter{\flatrsmt}{sub\_trsm2}{A pointer to the node to be used for the second {\sc trsm} subproblem.}
\end{params}
\end{flaspec}

% --- FLA_Cntl_qrut_obj_create() -----------------------------------------------

\begin{flaspec}
\begin{verbatim}
fla_qrut_t* FLA_Cntl_qrut_obj_create( FLA_Matrix_type  matrix_type,
                                      int              variant,
                                      fla_blocksize_t* blocksize,
                                      fla_qrut_t*      sub_qrut,
                                      fla_trmm_t*      sub_trmm1,
                                      fla_trmm_t*      sub_trmm2,
                                      fla_gemm_t*      sub_gemm1,
                                      fla_gemm_t*      sub_gemm2,
                                      fla_trsm_t*      sub_trsm,
                                      fla_axpy_t*      sub_axpy,
                                      fla_copy_t*      sub_copy );
\end{verbatim}
\purpose{
Create a structure representing a node in a control tree for a QR
factorization via the UT transform ({\sc qrut}) operation and initialize its
fields according to the function arguments.
}
\notes{
If \variant is \flasubproblemns, none of the pointer arguments
are used and thus they may be safely set to \fnullns.
Even if \variant specifies a blocked variant, some algorithms contain
fewer subproblems and thus do not use every subproblem field
argument.
In such cases, these arguments may be safely set to \fnullns.
Please refer to the blocked algorithmic variant implementations
to determine which subproblem fields are unused.
}
\begin{checks}
\checkitem
If \variant is not \flasubproblemns, then it must be
\flablockedvariantonens.
%\itemvsp
%\checkitem
%
\end{checks}
\rvalue{
A pointer to a heap-allocated \flaqrutt structure.
}
\begin{params}
\parameter{\matrixtype}{matrix\_type}{The type of matrix (flat or hierarchical) to support in the control tree in which the node will be used.}
\parameter{\int}{variant}{A constant value indicating the choice of variant for executing the computation associated with the control tree node being created.}
\parameter{\flablocksizet}{blocksize}{A pointer to a blocksize structure to be used for the node being created.}
\parameter{\flaqrutt}{sub\_qrut}{A pointer to the node to be used for the {\sc qrut} subproblem.}
\parameter{\flatrmmt}{sub\_trmm1}{A pointer to the node to be used for the first {\sc trmm} subproblem.}
\parameter{\flatrmmt}{sub\_trmm2}{A pointer to the node to be used for the second {\sc trmm} subproblem.}
\parameter{\flagemmt}{sub\_gemm1}{A pointer to the node to be used for the first {\sc gemm} subproblem.}
\parameter{\flagemmt}{sub\_gemm2}{A pointer to the node to be used for the second {\sc gemm} subproblem.}
\parameter{\flatrsmt}{sub\_trsm}{A pointer to the node to be used for the {\sc trsm} subproblem.}
\parameter{{\tt fla\_axpy\_t*}}{sub\_axpy}{A pointer to the node to be used for the {\sc axpy} subproblem.}
\parameter{{\tt fla\_copy\_t*}}{sub\_copy}{A pointer to the node to be used for the {\sc copy} subproblem.}
\end{params}
\end{flaspec}

% --- FLA_Cntl_lqut_obj_create() -----------------------------------------------

\begin{flaspec}
\begin{verbatim}
fla_lqut_t* FLA_Cntl_lqut_obj_create( FLA_Matrix_type  matrix_type,
                                      int              variant,
                                      fla_blocksize_t* blocksize,
                                      fla_lqut_t*      sub_lqut,
                                      fla_trmm_t*      sub_trmm1,
                                      fla_trmm_t*      sub_trmm2,
                                      fla_gemm_t*      sub_gemm1,
                                      fla_gemm_t*      sub_gemm2,
                                      fla_trsm_t*      sub_trsm,
                                      fla_axpy_t*      sub_axpy,
                                      fla_copy_t*      sub_copy );
\end{verbatim}
\purpose{
Create a structure representing a node in a control tree for an LQ
factorization via the UT transform ({\sc lqut}) operation and initialize its
fields according to the function arguments.
}
\notes{
If \variant is \flasubproblemns, none of the pointer arguments
are used and thus they may be safely set to \fnullns.
Even if \variant specifies a blocked variant, some algorithms contain
fewer subproblems and thus do not use every subproblem field
argument.
In such cases, these arguments may be safely set to \fnullns.
Please refer to the blocked algorithmic variant implementations
to determine which subproblem fields are unused.
}
\begin{checks}
\checkitem
If \variant is not \flasubproblemns, then it must be
\flablockedvariantonens.
%\itemvsp
%\checkitem
%
\end{checks}
\rvalue{
A pointer to a heap-allocated \flalqutt structure.
}
\begin{params}
\parameter{\matrixtype}{matrix\_type}{The type of matrix (flat or hierarchical) to support in the control tree in which the node will be used.}
\parameter{\int}{variant}{A constant value indicating the choice of variant for executing the computation associated with the control tree node being created.}
\parameter{\flablocksizet}{blocksize}{A pointer to a blocksize structure to be used for the node being created.}
\parameter{\flalqutt}{sub\_lqut}{A pointer to the node to be used for the {\sc lqut} subproblem.}
\parameter{\flatrmmt}{sub\_trmm1}{A pointer to the node to be used for the first {\sc trmm} subproblem.}
\parameter{\flatrmmt}{sub\_trmm2}{A pointer to the node to be used for the second {\sc trmm} subproblem.}
\parameter{\flagemmt}{sub\_gemm1}{A pointer to the node to be used for the first {\sc gemm} subproblem.}
\parameter{\flagemmt}{sub\_gemm2}{A pointer to the node to be used for the second {\sc gemm} subproblem.}
\parameter{\flatrsmt}{sub\_trsm}{A pointer to the node to be used for the {\sc trsm} subproblem.}
\parameter{{\tt fla\_axpy\_t*}}{sub\_axpy}{A pointer to the node to be used for the {\sc axpy} subproblem.}
\parameter{{\tt fla\_copy\_t*}}{sub\_copy}{A pointer to the node to be used for the {\sc copy} subproblem.}
\end{params}
\end{flaspec}

% --- FLA_Cntl_trinv_obj_create() ----------------------------------------------

\begin{flaspec}
\begin{verbatim}
fla_trinv_t* FLA_Cntl_trinv_obj_create( FLA_Matrix_type  matrix_type,
                                        int              variant,
                                        fla_blocksize_t* blocksize,
                                        fla_trinv_t*     sub_trinv,
                                        fla_trmm_t*      sub_trmm,
                                        fla_trsm_t*      sub_trsm1,
                                        fla_trsm_t*      sub_trsm2,
                                        fla_gemm_t*      sub_gemm );
\end{verbatim}
\purpose{
Create a structure representing a node in a control tree for a triangular
matrix inversion ({\sc trinv}) operation and initialize its
fields according to the function arguments.
}
\notes{
If \variant is \flasubproblemns, none of the pointer arguments
are used and thus they may be safely set to \fnullns.
Even if \variant specifies a blocked variant, some algorithms contain
fewer subproblems and thus do not use every subproblem field
argument.
In such cases, these arguments may be safely set to \fnullns.
Please refer to the blocked algorithmic variant implementations
to determine which subproblem fields are unused.
}
\begin{checks}
\checkitem
If \variant is not \flasubproblemns, then it must be one of
\flablockedvariantone through \flablockedvariantfourns.
%\itemvsp
%\checkitem
%
\end{checks}
\rvalue{
A pointer to a heap-allocated \flatrinvt structure.
}
\begin{params}
\parameter{\matrixtype}{matrix\_type}{The type of matrix (flat or hierarchical) to support in the control tree in which the node will be used.}
\parameter{\int}{variant}{A constant value indicating the choice of variant for executing the computation associated with the control tree node being created.}
\parameter{\flablocksizet}{blocksize}{A pointer to a blocksize structure to be used for the node being created.}
\parameter{\flatrinvt}{sub\_trinv}{A pointer to the node to be used for the {\sc trinv} subproblem.}
\parameter{\flatrmmt}{sub\_trmm}{A pointer to the node to be used for the {\sc trmm} subproblem.}
\parameter{\flatrsmt}{sub\_trsm1}{A pointer to the node to be used for the first {\sc trsm} subproblem.}
\parameter{\flatrsmt}{sub\_trsm2}{A pointer to the node to be used for the second {\sc trsm} subproblem.}
\parameter{\flagemmt}{sub\_gemm}{A pointer to the node to be used for the {\sc gemm} subproblem.}
\end{params}
\end{flaspec}

% --- FLA_Cntl_spdinv_obj_create() ---------------------------------------------

\begin{flaspec}
\begin{verbatim}
fla_spdinv_t* FLA_Cntl_spdinv_obj_create( FLA_Matrix_type  matrix_type,
                                          int              variant,
                                          fla_blocksize_t* blocksize,
                                          fla_chol_t*      sub_chol,
                                          fla_trinv_t*     sub_trinv,
                                          fla_ttmm_t*      sub_ttmm );
\end{verbatim}
\purpose{
Create a structure representing a node in a control tree for a symmetric
(or Hermitian) positive definite matrix inversion ({\sc spdinv}) operation
and initialize its fields according to the function arguments.
}
\notes{
Since {\sc spdinv} is implemented as the sequence of its three constituent
suboperations, {\sc chol}, {\sc trinv}, and {\sc ttmm} without any matrix
partitioning at the {\sc spdinv} level, the \variant field is not used.
Also, the {\sc spdinv} front-end interprets the \blocksize field as the
cutoff at which to switch from external routines to internal \libflame
variants.
}
%\begin{checks}
%\checkitem
%If \variant is not \flasubproblemns, then it must be one of
%%\itemvsp
%%\checkitem
%%
%\end{checks}
\rvalue{
A pointer to a heap-allocated \flaspdinvt structure.
}
\begin{params}
\parameter{\matrixtype}{matrix\_type}{The type of matrix (flat or hierarchical) to support in the control tree in which the node will be used.}
\parameter{\int}{variant}{Not referenced.}
\parameter{\flablocksizet}{blocksize}{A pointer to a blocksize structure to be used for the node being created.
Note that the front-end interprets these values as the cutoffs at which to switch from external implementations to internal \libflame variants.}
\parameter{\flacholt}{sub\_chol}{A pointer to the node to be used for the {\sc chol} suboperation.}
\parameter{\flatrinvt}{sub\_trinv}{A pointer to the node to be used for the {\sc trinv} suboperation.}
\parameter{\flattmmt}{sub\_ttmm}{A pointer to the node to be used for the {\sc ttmm} suboperation.}
\end{params}
\end{flaspec}

% --- FLA_Cntl_hess_obj_create() -----------------------------------------------
%
%\begin{flaspec}
%\begin{verbatim}
%fla_hess_t* FLA_Cntl_hess_obj_create( FLA_Matrix_type  matrix_type,
%                                      int              variant,
%                                      fla_blocksize_t* blocksize,
%                                      fla_hess_t*      sub_hess,
%                                      fla_trmm_t*      sub_trmm1,
%                                      fla_trmm_t*      sub_trmm2,
%                                      fla_trmm_t*      sub_trmm3,
%                                      fla_trmm_t*      sub_trmm4,
%                                      fla_gemm_t*      sub_gemm1,
%                                      fla_gemm_t*      sub_gemm2,
%                                      fla_gemm_t*      sub_gemm3 );
%\end{verbatim}
%\purpose{
%Create a structure representing a node in a control tree for a reduction
%to upper Hessenberg form ({\sc hess}) operation and initialize its
%fields according to the function arguments.
%}
%\notes{
%If \variant is \flasubproblemns, none of the pointer arguments
%are used and thus they may be safely set to \fnullns.
%Even if \variant specifies a blocked variant, some algorithms contain
%fewer subproblems and thus do not use every subproblem field
%argument.
%In such cases, these arguments may be safely set to \fnullns.
%Please refer to the blocked algorithmic variant implementations
%to determine which subproblem fields are unused.
%}
%\devnotes{
%The algorithmic variant implementations for reduction to upper
%Hessenberg form do not exist.
%If this operation is needed, use the external wrapper routine
%{\tt FLA\_Hess\_blk\_external()}.
%}
%\begin{checks}
%\checkitem
%If \variant must be \flasubproblem since no blocked variants
%for the operation exist yet.
%%\itemvsp
%%\checkitem
%%
%\end{checks}
%\rvalue{
%A pointer to a heap-allocated \flahesst structure.
%}
%\begin{params}
%\parameter{\matrixtype}{matrix\_type}{The type of matrix (flat or hierarchical) to support in the control tree in which the node will be used.}
%\parameter{\int}{variant}{A constant value indicating the choice of variant for executing the computation associated with the control tree node being created.}
%\parameter{\flablocksizet}{blocksize}{A pointer to a blocksize structure to be used for the node being created.}
%\parameter{\flacholt}{sub\_hess}{A pointer to the node to be used for the {\sc hess} subproblem.}
%\parameter{\flasyrkt}{sub\_trmm1}{A pointer to the node to be used for the first {\sc trmm} subproblem.}
%\parameter{\flasyrkt}{sub\_trmm2}{A pointer to the node to be used for the second {\sc trmm} subproblem.}
%\parameter{\flasyrkt}{sub\_trmm3}{A pointer to the node to be used for the third {\sc trmm} subproblem.}
%\parameter{\flasyrkt}{sub\_trmm4}{A pointer to the node to be used for the fourth {\sc trmm} subproblem.}
%\parameter{\flatrsmt}{sub\_gemm1}{A pointer to the node to be used for the first {\sc gemm} subproblem.}
%\parameter{\flatrsmt}{sub\_gemm2}{A pointer to the node to be used for the second {\sc gemm} subproblem.}
%\parameter{\flatrsmt}{sub\_gemm3}{A pointer to the node to be used for the third {\sc gemm} subproblem.}
%\end{params}
%\end{flaspec}

% --- FLA_Cntl_ttmm_obj_create() -----------------------------------------------

\begin{flaspec}
\begin{verbatim}
fla_ttmm_t* FLA_Cntl_ttmm_obj_create( FLA_Matrix_type  matrix_type,
                                      int              variant,
                                      fla_blocksize_t* blocksize,
                                      fla_ttmm_t*      sub_ttmm,
                                      fla_syrk_t*      sub_syrk,
                                      fla_herk_t*      sub_herk,
                                      fla_trmm_t*      sub_trmm,
                                      fla_gemm_t*      sub_gemm );
\end{verbatim}
\purpose{
Create a structure representing a node in a control tree for a
triangular-transpose
matrix multiply ({\sc ttmm}) operation and initialize its
fields according to the function arguments.
}
\notes{
If \variant is \flasubproblemns, none of the pointer arguments
are used and thus they may be safely set to \fnullns.
Even if \variant specifies a blocked variant, some algorithms contain
fewer subproblems and thus do not use every subproblem field
argument.
In such cases, these arguments may be safely set to \fnullns.
Please refer to the blocked algorithmic variant implementations
to determine which subproblem fields are unused.
}
\begin{checks}
\checkitem
If \variant is not \flasubproblemns, then it must be one of
\flablockedvariantone through \flablockedvariantthreens.
%\itemvsp
%\checkitem
%
\end{checks}
\rvalue{
A pointer to a heap-allocated \flattmmt structure.
}
\begin{params}
\parameter{\matrixtype}{matrix\_type}{The type of matrix (flat or hierarchical) to support in the control tree in which the node will be used.}
\parameter{\int}{variant}{A constant value indicating the choice of variant for executing the computation associated with the control tree node being created.}
\parameter{\flablocksizet}{blocksize}{A pointer to a blocksize structure to be used for the node being created.}
\parameter{\flattmmt}{sub\_ttmm}{A pointer to the node to be used for the {\sc ttmm} subproblem.}
\parameter{\flasyrkt}{sub\_syrk}{A pointer to the node to be used for the {\sc syrk} subproblem.}
\parameter{{\tt fla\_herk\_t*}}{sub\_herk}{A pointer to the node to be used for the {\sc herk} subproblem.}
\parameter{\flatrmmt}{sub\_trmm}{A pointer to the node to be used for the {\sc trmm} subproblem.}
\parameter{\flagemmt}{sub\_gemm}{A pointer to the node to be used for the {\sc gemm} subproblem.}
\end{params}
\end{flaspec}

% --- FLA_Cntl_sylv_obj_create() -----------------------------------------------

\begin{flaspec}
\begin{verbatim}
fla_sylv_t* FLA_Cntl_sylv_obj_create( FLA_Matrix_type  matrix_type,
                                      int              variant,
                                      fla_blocksize_t* blocksize,
                                      fla_sylv_t*      sub_sylv1,
                                      fla_sylv_t*      sub_sylv2,
                                      fla_sylv_t*      sub_sylv3,
                                      fla_gemm_t*      sub_gemm1,
                                      fla_gemm_t*      sub_gemm2,
                                      fla_gemm_t*      sub_gemm3,
                                      fla_gemm_t*      sub_gemm4,
                                      fla_gemm_t*      sub_gemm5,
                                      fla_gemm_t*      sub_gemm6,
                                      fla_gemm_t*      sub_gemm7,
                                      fla_gemm_t*      sub_gemm8 );
\end{verbatim}
\purpose{
Create a structure representing a node in a control tree for a triangular
Sylvester equation solve ({\sc sylv}) operation and initialize its
fields according to the function arguments.
}
\notes{
If \variant is \flasubproblemns, none of the pointer arguments
are used and thus they may be safely set to \fnullns.
Even if \variant specifies a blocked variant, some algorithms contain
fewer subproblems and thus do not use every subproblem field
argument.
In such cases, these arguments may be safely set to \fnullns.
Please refer to the blocked algorithmic variant implementations
to determine which subproblem fields are unused.
}
\begin{checks}
\checkitem
If \variant is not \flasubproblemns, then it must be one of
\flablockedvariantone through \flablockedvarianteightteenns.
%\itemvsp
%\checkitem
%
\end{checks}
\rvalue{
A pointer to a heap-allocated \flasylvt structure.
}
\begin{params}
\parameter{\matrixtype}{matrix\_type}{The type of matrix (flat or hierarchical) to support in the control tree in which the node will be used.}
\parameter{\int}{variant}{A constant value indicating the choice of variant for executing the computation associated with the control tree node being created.}
\parameter{\flablocksizet}{blocksize}{A pointer to a blocksize structure to be used for the node being created.}
\parameter{\flasylvt}{sub\_sylv1}{A pointer to the node to be used for the first {\sc sylv} subproblem.}
\parameter{\flasylvt}{sub\_sylv2}{A pointer to the node to be used for the second {\sc sylv} subproblem.}
\parameter{\flasylvt}{sub\_sylv3}{A pointer to the node to be used for the third {\sc sylv} subproblem.}
\parameter{\flagemmt}{sub\_gemm1}{A pointer to the node to be used for the first {\sc gemm} subproblem.}
\parameter{\flagemmt}{sub\_gemm2}{A pointer to the node to be used for the second {\sc gemm} subproblem.}
\parameter{\flagemmt}{sub\_gemm3}{A pointer to the node to be used for the third {\sc gemm} subproblem.}
\parameter{\flagemmt}{sub\_gemm4}{A pointer to the node to be used for the fourth {\sc gemm} subproblem.}
\parameter{\flagemmt}{sub\_gemm5}{A pointer to the node to be used for the fifth {\sc gemm} subproblem.}
\parameter{\flagemmt}{sub\_gemm6}{A
pointer to the node to be used for the sixth {\sc gemm} subproblem.}
\parameter{\flagemmt}{sub\_gemm7}{A pointer to the node to be used for the seventh {\sc gemm} subproblem.}
\parameter{\flagemmt}{sub\_gemm8}{A pointer to the node to be used for the eighth {\sc gemm} subproblem.}
\end{params}
\end{flaspec}




\subsubsection{Miscellaneous operations}

% --- FLA_Cntl_swap_obj_create() -----------------------------------------------

\begin{flaspec}
\begin{verbatim}
fla_swap_t* FLA_Cntl_swap_obj_create( FLA_Matrix_type  matrix_type,
                                      int              variant,
                                      fla_blocksize_t* blocksize,
                                      fla_swap_t*      sub_swap );
\end{verbatim}
\purpose{
Create a structure representing a node in a control tree for a matrix swap
({\sc swap}) operation and initialize its fields according to the function
arguments.
}
\notes{
If \variant is \flasubproblemns, none of the pointer arguments
are used and thus they may be safely set to \fnullns.
}
\begin{checks}
\checkitem
If \variant is not \flasubproblemns, then it must be either
\flablockedvariantone or \flablockedvarianttwons.
%\itemvsp
%\checkitem
%
\end{checks}
\rvalue{
A pointer to a heap-allocated \flaswapts structure.
}
\begin{params}
\parameter{\matrixtype}{matrix\_type}{The type of matrix (flat or hierarchical) to support in the control tree in which the node will be used.}
\parameter{\int}{variant}{A constant value indicating the choice of variant for executing the computation associated with the control tree node being created.}
\parameter{\flablocksizet}{blocksize}{A pointer to a blocksize structure to be used for the node being created.}
\parameter{\flaswapt}{sub\_swap}{A pointer to the node to be used for the {\sc swap} subproblem.}
\end{params}
\end{flaspec}

% --- FLA_Cntl_transpose_obj_create() ------------------------------------------

\begin{flaspec}
\begin{verbatim}
fla_transpose_t* FLA_Cntl_transpose_obj_create( FLA_Matrix_type  matrix_type,
                                                int              variant,
                                                fla_blocksize_t* blocksize,
                                                fla_transpose_t* sub_trans,
                                                fla_swap_t*      sub_swap );
\end{verbatim}
\purpose{
Create a structure representing a node in a control tree for a matrix
transposition ({\sc transpose}) operation and initialize its
fields according to the function arguments.
}
\notes{
If \variant is \flasubproblemns, none of the pointer arguments
are used and thus they may be safely set to \fnullns.
}
\begin{checks}
\checkitem
If \variant is not \flasubproblemns, then it must be either
\flablockedvariantone or \flablockedvarianttwons.
%\itemvsp
%\checkitem
%
\end{checks}
\rvalue{
A pointer to a heap-allocated \flatransposet structure.
}
\begin{params}
\parameter{\matrixtype}{matrix\_type}{The type of matrix (flat or hierarchical) to support in the control tree in which the node will be used.}
\parameter{\int}{variant}{A constant value indicating the choice of variant for executing the computation associated with the control tree node being created.}
\parameter{\flablocksizet}{blocksize}{A pointer to a blocksize structure to be used for the node being created.}
\parameter{\flatransposet}{sub\_trans}{A pointer to the node to be used for the {\sc transpose} subproblem.}
\parameter{\flaswapt}{sub\_swap}{A pointer to the node to be used for the {\sc swap} subproblem.}
\end{params}
\end{flaspec}

% --- FLA_Cntl_obj_free() ------------------------------------------------------

\begin{flaspec}
\begin{verbatim}
void FLA_Cntl_obj_free( void* cntl );
\end{verbatim}
\purpose{
Release the memory allocated for a structure representing a node in a control
tree.
}
\notes{
{\tt FLA\_Cntl\_obj\_free()} should only be used with pointers to control tree
structures that were allocated with the {\tt FLA\_Cntl\_*\_obj\_create()} routines.
}
\begin{params}
\parameter{\voidp}{cntl}{A pointer to the node to be freed.}
\end{params}
\end{flaspec}




\subsection{Default control trees}

The default control trees are created when \libflame is initialized via
{\tt FLA\_Init()}.
The subroutines in which the actual creation and initialization take
place are named according to the operation name and execution type.
For example, control trees for a Cholesky factorization that is to be
executed sequentially with conventional storage are initialized
in the subroutine {\tt FLA\_Chol\_cntl\_init()}.
Likewise, control trees for a triangular matrix inversion that is to be
executed sequentially with hierarchical storage are initialized
in {\tt FLASH\_Trinv\_cntl\_init()}.
Figure \ref{fig:cntl-init} shows examples of these routines for
the Cholesky factorization with hierarchical storage, which gives the
reader an idea of how control trees should be initialized.

\input{figs/50-cntl-init}




\subsection{Operation front-ends}

Once the library has been initialized, the default set of control trees
is ready to use.
Figure \ref{fig:cntl-front-ends} shows examples of some front-end
routines found in \libflamens.
This illustrates how control trees are used at the highest level.

\input{figs/50-cntl-front-ends}




\subsection{Internal back-ends}

\index{developer APIs!internal back-ends}

The \libflame front-ends and algorithmic variant implementations both
directly invoke internal back-end functions.
It is here that the control tree is decoded and used to determine
how execution will proceed with respect to variant and execution
type.
Figure \ref{fig:cntl-internal-back-ends} shows the internal routines
for Cholesky factorization.

\input{figs/50-cntl-internal-back-ends}

Interfaces for the supported operation back-ends follow.



\subsubsection{Level-3 BLAS operations}

% --- FLA_Gemm_internal() ------------------------------------------------------

\begin{flaspec}
\begin{verbatim}
void FLA_Gemm_internal( FLA_Trans transa, FLA_Trans transb, FLA_Obj alpha,
                        FLA_Obj A, FLA_Obj B, FLA_Obj beta, FLA_Obj C,
                        fla_gemm_t* cntl );
\end{verbatim}
\purpose{
Perform a {\sc gemm} operation on $ A $, $ B $, and $ C $ according to the
parameters specified by control tree node \cntlns.
}
\begin{checks}
\checkitem
\cntl must not be \fnullns.
%\itemvsp
%\checkitem
%
\end{checks}
\moreinfo{
This function's interface is similar to that of \flagemmns.
Please see the description for \flagemm for further details.
}
\end{flaspec}

% --- FLA_Hemm_internal() ------------------------------------------------------

\begin{flaspec}
\begin{verbatim}
void FLA_Hemm_internal( FLA_Side side, FLA_Uplo uplo, FLA_Obj alpha,
                        FLA_Obj A, FLA_Obj B, FLA_Obj beta, FLA_Obj C,
                        fla_hemm_t* cntl );
\end{verbatim}
\purpose{
Perform a {\sc hemm} operation on $ A $, $ B $, and $ C $ according to the
parameters specified by control tree node \cntlns.
}
\begin{checks}
\checkitem
\cntl must not be \fnullns.
%\itemvsp
%\checkitem
%
\end{checks}
\moreinfo{
This function's interface is similar to that of \flahemmns.
Please see the description for \flahemm for further details.
}
\end{flaspec}

% --- FLA_Herk_internal() ------------------------------------------------------

\begin{flaspec}
\begin{verbatim}
void FLA_Herk_internal( FLA_Uplo uplo, FLA_Trans trans, FLA_Obj alpha,
                        FLA_Obj A, FLA_Obj beta, FLA_Obj C,
                        fla_herk_t* cntl );
\end{verbatim}
\purpose{
Perform a {\sc herk} operation on $ A $ and $ C $ according to the
parameters specified by control tree node \cntlns.
}
\begin{checks}
\checkitem
\cntl must not be \fnullns.
%\itemvsp
%\checkitem
%
\end{checks}
\moreinfo{
This function's interface is similar to that of \flaherkns.
Please see the description for \flaherk for further details.
}
\end{flaspec}

% --- FLA_Her2k_internal() -----------------------------------------------------

\begin{flaspec}
\begin{verbatim}
void FLA_Her2k_internal( FLA_Uplo uplo, FLA_Trans trans, FLA_Obj alpha,
                         FLA_Obj A, FLA_Obj B, FLA_Obj beta, FLA_Obj C,
                         fla_her2k_t* cntl );
\end{verbatim}
\purpose{
Perform a {\sc her2k} operation on $ A $, $ B $, and $ C $ according to the
parameters specified by control tree node \cntlns.
}
\begin{checks}
\checkitem
\cntl must not be \fnullns.
%\itemvsp
%\checkitem
%
\end{checks}
\moreinfo{
This function's interface is similar to that of \flahertkns.
Please see the description for \flahertk for further details.
}
\end{flaspec}

% --- FLA_Symm_internal() ------------------------------------------------------

\begin{flaspec}
\begin{verbatim}
void FLA_Symm_internal( FLA_Side side, FLA_Uplo uplo, FLA_Obj alpha,
                        FLA_Obj A, FLA_Obj B, FLA_Obj beta, FLA_Obj C,
                        fla_symm_t* cntl );
\end{verbatim}
\purpose{
Perform a {\sc symm} operation on $ A $, $ B $, and $ C $ according to the
parameters specified by control tree node \cntlns.
}
\begin{checks}
\checkitem
\cntl must not be \fnullns.
%\itemvsp
%\checkitem
%
\end{checks}
\moreinfo{
This function's interface is similar to that of \flasymmns.
Please see the description for \flasymm for further details.
}
\end{flaspec}

% --- FLA_Syrk_internal() ------------------------------------------------------

\begin{flaspec}
\begin{verbatim}
void FLA_Syrk_internal( FLA_Uplo uplo, FLA_Trans trans, FLA_Obj alpha,
                        FLA_Obj A, FLA_Obj beta, FLA_Obj C,
                        fla_syrk_t* cntl );
\end{verbatim}
\purpose{
Perform a {\sc syrk} operation on $ A $ and $ C $ according to the
parameters specified by control tree node \cntlns.
}
\begin{checks}
\checkitem
\cntl must not be \fnullns.
%\itemvsp
%\checkitem
%
\end{checks}
\moreinfo{
This function's interface is similar to that of \flasyrkns.
Please see the description for \flasyrk for further details.
}
\end{flaspec}

% --- FLA_Syr2k_internal() -----------------------------------------------------

\begin{flaspec}
\begin{verbatim}
void FLA_Syr2k_internal( FLA_Uplo uplo, FLA_Trans trans, FLA_Obj alpha,
                         FLA_Obj A, FLA_Obj B, FLA_Obj beta, FLA_Obj C,
                         fla_syr2k_t* cntl );
\end{verbatim}
\purpose{
Perform a {\sc syr2k} operation on $ A $, $ B $, and $ C $ according to the
parameters specified by control tree node \cntlns.
}
\begin{checks}
\checkitem
\cntl must not be \fnullns.
%\itemvsp
%\checkitem
%
\end{checks}
\moreinfo{
This function's interface is similar to that of \flasyrtkns.
Please see the description for \flasyrtk for further details.
}
\end{flaspec}

% --- FLA_Trmm_internal() ------------------------------------------------------

\begin{flaspec}
\begin{verbatim}
void FLA_Trmm_internal( FLA_Side side, FLA_Uplo uplo, FLA_Trans trans,
                        FLA_Diag diag, FLA_Obj alpha, FLA_Obj A, FLA_Obj B,
                        fla_trmm_t* cntl );
\end{verbatim}
\purpose{
Perform a {\sc trmm} operation on $ A $ and $ B $ according to the
parameters specified by control tree node \cntlns.
}
\begin{checks}
\checkitem
\cntl must not be \fnullns.
%\itemvsp
%\checkitem
%
\end{checks}
\moreinfo{
This function's interface is similar to that of \flatrmmns.
Please see the description for \flatrmm for further details.
}
\end{flaspec}

% --- FLA_Trsm_internal() ------------------------------------------------------

\begin{flaspec}
\begin{verbatim}
void FLA_Trsm_internal( FLA_Side side, FLA_Uplo uplo, FLA_Trans trans, FLA_Diag diag,
                        FLA_Obj alpha, FLA_Obj A, FLA_Obj B,
                        fla_trsm_t* cntl );
\end{verbatim}
\purpose{
Perform a {\sc trsm} operation on $ A $ and $ B $ according to the
parameters specified by control tree node \cntlns.
}
\begin{checks}
\checkitem
\cntl must not be \fnullns.
%\itemvsp
%\checkitem
%
\end{checks}
\moreinfo{
This function's interface is similar to that of \flatrsmns.
Please see the description for \flatrsm for further details.
}
\end{flaspec}




\subsubsection{LAPACK operations}

% --- FLA_Chol_internal() ------------------------------------------------------

\begin{flaspec}
\begin{verbatim}
FLA_Error FLA_Chol_internal( FLA_Uplo uplo, FLA_Obj A, fla_chol_t* cntl );
\end{verbatim}
\purpose{
Perform a {\sc chol} operation on $ A $ according to the
parameters specified by control tree node \cntlns.
}
\begin{checks}
\checkitem
\cntl must not be \fnullns.
%\itemvsp
%\checkitem
%
\end{checks}
\moreinfo{
This function's interface is similar to that of \flacholns.
Please see the description for \flachol for further details.
}
\end{flaspec}

% --- FLA_Trinv_internal() -----------------------------------------------------

\begin{flaspec}
\begin{verbatim}
FLA_Error FLA_Trinv_internal( FLA_Uplo uplo, FLA_Diag diag, FLA_Obj A,
                              fla_trinv_t* cntl );
\end{verbatim}
\purpose{
Perform a {\sc trinv} operation on $ A $ according to the
parameters specified by control tree node \cntlns.
}
\begin{checks}
\checkitem
\cntl must not be \fnullns.
%\itemvsp
%\checkitem
%
\end{checks}
\moreinfo{
This function's interface is similar to that of \flatrinvns.
Please see the description for \flatrinv for further details.
}
\end{flaspec}

% --- FLA_Ttmm_internal() ------------------------------------------------------

\begin{flaspec}
\begin{verbatim}
void FLA_Ttmm_internal( FLA_Uplo uplo, FLA_Obj A, fla_ttmm_t* cntl );
\end{verbatim}
\purpose{
Perform a {\sc ttmm} operation on $ A $ according to the
parameters specified by control tree node \cntlns.
}
\begin{checks}
\checkitem
\cntl must not be \fnullns.
%\itemvsp
%\checkitem
%
\end{checks}
\moreinfo{
This function's interface is similar to that of \flattmmns.
Please see the description for \flattmm for further details.
}
\end{flaspec}

% --- FLA_SPDinv_internal() ----------------------------------------------------

\begin{flaspec}
\begin{verbatim}
void FLA_SPDinv_internal( FLA_Uplo uplo, FLA_Obj A, fla_spdinv_t* cntl );
\end{verbatim}
\purpose{
Perform a {\sc spdinv} operation on $ A $ according to the
parameters specified by control tree node \cntlns.
}
\begin{checks}
\checkitem
\cntl must not be \fnullns.
%\itemvsp
%\checkitem
%
\end{checks}
\moreinfo{
This function's interface is similar to that of \flaspdinvns.
Please see the description for \flaspdinv for further details.
}
\end{flaspec}

% --- FLA_Hess_internal() ------------------------------------------------------
%
%\begin{flaspec}
%\begin{verbatim}
%void FLA_Hess_internal( FLA_Obj A, FLA_Obj t, int ilo, int ihi,
%                        fla_hess_t* cntl );
%\end{verbatim}
%\purpose{
%Perform a {\sc hess} operation on $ A $ according to the
%parameters specified by control tree node \cntlns.
%}
%\begin{checks}
%\checkitem
%\cntl must not be \fnullns.
%%\itemvsp
%%\checkitem
%%
%\end{checks}
%\moreinfo{
%This function's interface is similar to that of \flahessns.
%Please see the description for \flahess for further details.
%}
%\end{flaspec}

% --- FLA_LU_nopiv_internal() --------------------------------------------------

\begin{flaspec}
\begin{verbatim}
FLA_Error FLA_LU_nopiv_internal( FLA_Obj A, fla_lu_t* cntl );
\end{verbatim}
\purpose{
Perform a {\sc lunopiv} operation on $ A $ according to the
parameters specified by control tree node \cntlns.
}
\begin{checks}
\checkitem
\cntl must not be \fnullns.
%\itemvsp
%\checkitem
%
\end{checks}
\moreinfo{
This function's interface is similar to that of \flalunopivns.
Please see the description for \flalunopiv for further details.
}
\end{flaspec}

% --- FLA_LU_piv_internal() ----------------------------------------------------

\begin{flaspec}
\begin{verbatim}
FLA_Error FLA_LU_piv_internal( FLA_Obj A, FLA_Obj p, fla_lu_t* cntl );
\end{verbatim}
\purpose{
Perform a {\sc lupiv} operation on $ A $ according to the
parameters specified by control tree node \cntlns.
}
\begin{checks}
\checkitem
\cntl must not be \fnullns.
%\itemvsp
%\checkitem
%
\end{checks}
\moreinfo{
This function's interface is similar to that of {\tt FLA\_LU\_piv()}.
Please see the description for {\tt FLA\_LU\_piv()} for further details.
}
\end{flaspec}

% --- FLA_QR_UT_internal() -----------------------------------------------------

\begin{flaspec}
\begin{verbatim}
void FLA_QR_UT_internal( FLA_Obj A, FLA_Obj T, fla_qrut_t* cntl );
\end{verbatim}
\purpose{
Perform a {\sc qrut} operation on $ A $ according to the
parameters specified by control tree node \cntlns.
}
\begin{checks}
\checkitem
\cntl must not be \fnullns.
%\itemvsp
%\checkitem
%
\end{checks}
\moreinfo{
This function's interface is similar to that of \flaqrutns.
Please see the description for \flaqrut for further details.
}
\end{flaspec}

% --- FLA_LQ_UT_internal() -----------------------------------------------------

\begin{flaspec}
\begin{verbatim}
void FLA_LQ_UT_internal( FLA_Obj A, FLA_Obj T, fla_lq_t* cntl );
\end{verbatim}
\purpose{
Perform a {\sc lqut} operation on $ A $ according to the
parameters specified by control tree node \cntlns.
}
\begin{checks}
\checkitem
\cntl must not be \fnullns.
%\itemvsp
%\checkitem
%
\end{checks}
\moreinfo{
This function's interface is similar to that of \flalqutns.
Please see the description for \flalqut for further details.
}
\end{flaspec}

% --- FLA_Sylv_internal() ------------------------------------------------------

\begin{flaspec}
\begin{verbatim}
void FLA_Sylv_internal( FLA_Trans transa, FLA_Trans transb, FLA_Obj isgn,
                        FLA_Obj A, FLA_Obj B, FLA_Obj C, FLA_Obj scale,
                        fla_sylv_t* cntl );
\end{verbatim}
\purpose{
Perform a {\sc sylv} operation on $ A $, $ B $, and $ C $ according to the
parameters specified by control tree node \cntlns.
}
\begin{checks}
\checkitem
\cntl must not be \fnullns.
%\itemvsp
%\checkitem
%
\end{checks}
\moreinfo{
This function's interface is similar to that of \flasylvns.
Please see the description for \flasylv for further details.
}
\end{flaspec}




\subsection{Algorithmic variants}

The algorithmic variants in \libflame are coded differently than earlier
incarnations of FLAME/C.
Figure \ref{fig:noncntl-cntl-chol-code} illustrates these differences
for the blocked FLAME/C implementation of algorithmic variant 3 of
Cholesky factorization.
The top-left code example shows what the code might look like when
the programmer used unblocked FLAME variants to perform subproblems.
The top-right code is similar, except that it uses blocked variants.
Note that this code uses the same algorithmic blocksize for its subproblems
as it does for matrix partitioning within the Cholesky algorithm.
The bottom-left code shows how the FLAME group used to present
its codes during lectures and presentations, where the routines
{\tt FLA\_Chol()}, {\tt FLA\_Trsm()}, and {\tt FLA\_Syrk()} were
wrappers to external implementations of those operations.\footnote{
Notice that this class of function names is now reserved for the user-level
front-end interfaces documented in Sections \ref{sec:blas3-front-ends} and
\ref{sec:lapack-front-ends}, and the corresponding external routines are now
explicitly named {\tt FLA\_Chol\_unb\_external()},
{\tt FLA\_Trsm\_external()}, and {\tt FLA\_Syrk\_external()}.
}
Finally, the bottom-right code shows how algorithms are now coded within
\libflamens, using control trees.
The most notable difference between this code and the others is
that the subproblems invoke internal back-end routines for the
operation in question rather than statically specifying an unblocked,
blocked, or external implementation.
%All of the details of the implementation for a particular subproblem
%are specified within its corresponding control tree node.

%A partial but representative list of algorithmic variants is given in Section
%\ref{sec:algorithmic-variants}.
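
The dispatch idea described above can be sketched in a few lines of
self-contained C.
This is only an illustration under stated assumptions: the types and
functions below ({\tt cntl\_node\_t}, {\tt chol\_internal()},
{\tt chol\_unb()}, {\tt chol\_blk\_var3()}, {\tt chol\_demo\_max\_diff()})
are hypothetical stand-ins invented for this sketch and are not part of
the \libflame API, and the matrix is a plain column-major array rather
than an \flaobjns.
The point is that the blocked variant never names an implementation for its
Cholesky subproblem; it calls the internal routine, which consults the
control tree node.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for a control tree node; these are
   NOT libflame's actual types or function names. */
typedef enum { VARIANT_UNBLOCKED, VARIANT_BLOCKED3 } variant_t;

typedef struct cntl_node {
    variant_t               variant;   /* implementation to run at this level */
    size_t                  blocksize; /* used only by the blocked variant    */
    const struct cntl_node* sub_chol;  /* node for the Cholesky subproblem    */
} cntl_node_t;

/* Element (i,j) of a column-major matrix with leading dimension ld. */
#define ELEM(a, ld, i, j) ((a)[(size_t)(j) * (ld) + (size_t)(i)])

int chol_internal(double* a, size_t n, size_t ld, const cntl_node_t* cntl);

/* Newton iteration for sqrt(x), x > 0 of moderate size; avoids libm. */
static double sqrt_newton(double x)
{
    double r = (x > 1.0) ? x : 1.0;
    for (int it = 0; it < 60; ++it) r = 0.5 * (r + x / r);
    return r;
}

/* Unblocked lower Cholesky; returns 0 on success, -1 on a non-positive pivot. */
int chol_unb(double* a, size_t n, size_t ld)
{
    for (size_t j = 0; j < n; ++j) {
        double d = ELEM(a, ld, j, j);
        for (size_t k = 0; k < j; ++k) d -= ELEM(a, ld, j, k) * ELEM(a, ld, j, k);
        if (d <= 0.0) return -1;
        d = sqrt_newton(d);
        ELEM(a, ld, j, j) = d;
        for (size_t i = j + 1; i < n; ++i) {
            double s = ELEM(a, ld, i, j);
            for (size_t k = 0; k < j; ++k) s -= ELEM(a, ld, i, k) * ELEM(a, ld, j, k);
            ELEM(a, ld, i, j) = s / d;
        }
    }
    return 0;
}

/* Blocked variant 3 (right-looking). The A11 subproblem is delegated to
   chol_internal() with the sub-node, not to a hard-coded implementation. */
int chol_blk_var3(double* a, size_t n, size_t ld, const cntl_node_t* cntl)
{
    size_t b = cntl->blocksize;
    if (b == 0) return -1;
    for (size_t p = 0; p < n; p += b) {
        size_t  nb  = (n - p < b) ? n - p : b;
        size_t  m2  = n - p - nb;              /* rows of A21, order of A22 */
        double* a11 = &ELEM(a, ld, p, p);
        if (chol_internal(a11, nb, ld, cntl->sub_chol) != 0) return -1;
        if (m2 == 0) break;
        double* a21 = &ELEM(a, ld, p + nb, p);
        double* a22 = &ELEM(a, ld, p + nb, p + nb);
        /* A21 := A21 * inv( L11^T ) */
        for (size_t j = 0; j < nb; ++j)
            for (size_t i = 0; i < m2; ++i) {
                double s = ELEM(a21, ld, i, j);
                for (size_t k = 0; k < j; ++k)
                    s -= ELEM(a21, ld, i, k) * ELEM(a11, ld, j, k);
                ELEM(a21, ld, i, j) = s / ELEM(a11, ld, j, j);
            }
        /* A22 := A22 - A21 * A21^T (lower triangle only) */
        for (size_t j = 0; j < m2; ++j)
            for (size_t i = j; i < m2; ++i)
                for (size_t k = 0; k < nb; ++k)
                    ELEM(a22, ld, i, j) -= ELEM(a21, ld, i, k) * ELEM(a21, ld, j, k);
    }
    return 0;
}

/* Internal back-end: choose the implementation named by the node. */
int chol_internal(double* a, size_t n, size_t ld, const cntl_node_t* cntl)
{
    switch (cntl->variant) {
    case VARIANT_UNBLOCKED: return chol_unb(a, n, ld);
    case VARIANT_BLOCKED3:  return chol_blk_var3(a, n, ld, cntl);
    }
    return -1;
}

/* Factor one small SPD matrix through a two-level tree and through a
   single unblocked node; return the largest difference in the lower
   triangles, or -1.0 if either factorization fails. */
double chol_demo_max_diff(void)
{
    static const double spd[16] = { 10, 2, 3, 1,   2, 9, 1, 2,
                                     3, 1, 8, 2,   1, 2, 2, 7 };
    double a[16], b[16], max_diff = 0.0;
    const cntl_node_t leaf = { VARIANT_UNBLOCKED, 0, NULL };
    const cntl_node_t root = { VARIANT_BLOCKED3, 2, &leaf };
    for (int i = 0; i < 16; ++i) a[i] = b[i] = spd[i];
    if (chol_internal(a, 4, 4, &root) != 0) return -1.0;
    if (chol_internal(b, 4, 4, &leaf) != 0) return -1.0;
    for (size_t j = 0; j < 4; ++j)
        for (size_t i = j; i < 4; ++i) {
            double d = ELEM(a, 4, i, j) - ELEM(b, 4, i, j);
            if (d < 0.0) d = -d;
            if (d > max_diff) max_diff = d;
        }
    return max_diff;
}
```

In this sketch, changing how the subproblem is solved means building a
different tree (for example, pointing {\tt sub\_chol} at another blocked
node with a smaller blocksize); the code of {\tt chol\_blk\_var3()} itself
does not change.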

\input{figs/50-noncntl-cntl-chol-code}




\section{Parameter and error checking}

\index{developer APIs!parameter and error checking}




\subsection{Linear algebra parameters}

% --- FLA_Check_valid_side() ---------------------------------------------------

\begin{flaspec}
\begin{verbatim}
FLA_Error FLA_Check_valid_side( FLA_Side side );
\end{verbatim}
\purpose{
Confirm that \side is one of the following values defined for the
\flaside type:
\flaleftns, \flarightns, \flatopns, \flabottomns.
}
\rvalue{
\flasuccess if \side is valid;
\flainvalidside otherwise.
}
%\begin{params}
%\parameter{\flaside}{side}{A constant value to check against.}
%\end{params}
\end{flaspec}

% --- FLA_Check_valid_uplo() ---------------------------------------------------

\begin{flaspec}
\begin{verbatim}
FLA_Error FLA_Check_valid_uplo( FLA_Uplo uplo );
\end{verbatim}
\purpose{
Confirm that \uplo is one of the following values defined for the
\flauplo type:
\flalowertriangularns, \flauppertriangularns.
}
\rvalue{
\flasuccess if \uplo is valid;
\flainvaliduplo otherwise.
}
%\begin{params}
%\parameter{\flauplo}{uplo}{A constant value to check against.}
%\end{params}
\end{flaspec}

% --- FLA_Check_valid_trans() --------------------------------------------------

\begin{flaspec}
\begin{verbatim}
FLA_Error FLA_Check_valid_trans( FLA_Trans trans );
\end{verbatim}
\purpose{
Confirm that \trans is one of the following values defined for the
\flatrans type:
\flanotransposens, \flatransposens, \flaconjtransposens, \flaconjnotransposens.
}
\rvalue{
\flasuccess if \trans is valid;
\flainvalidtrans otherwise.
}
%\begin{params}
%\parameter{\flatrans}{trans}{A constant value to check against.}
%\end{params}
\end{flaspec}

% --- FLA_Check_valid_real_trans() ---------------------------------------------

\begin{flaspec}
\begin{verbatim}
FLA_Error FLA_Check_valid_real_trans( FLA_Trans trans );
\end{verbatim}
\purpose{
Confirm that \trans is either \flanotranspose or \flatransposens.
}
\notes{
This check is typically used with \trans arguments that are expected to be
applied to real matrices.
}
\rvalue{
\flasuccess if the \trans argument is either \flanotranspose or
\flatransposens;
\flainvalidrealtrans otherwise.
}
\end{flaspec}

% --- FLA_Check_valid_complex_trans() ------------------------------------------

\begin{flaspec}
\begin{verbatim}
FLA_Error FLA_Check_valid_complex_trans( FLA_Trans trans );
\end{verbatim}
\purpose{
Confirm that \trans is either \flanotranspose or \flaconjtransposens.
}
\notes{
This check is typically used with \trans arguments that are expected to be
applied to Hermitian matrices.
}
\rvalue{
\flasuccess if the \trans argument is either \flanotranspose or
\flaconjtransposens;
\flainvalidcomplextrans otherwise.
}
\end{flaspec}

% --- FLA_Check_valid_blas_trans() ---------------------------------------------

\begin{flaspec}
\begin{verbatim}
FLA_Error FLA_Check_valid_blas_trans( FLA_Trans trans );
\end{verbatim}
\purpose{
Confirm that \trans is either \flanotransposens, \flatransposens, or
\flaconjtransposens.
}
\notes{
This check is typically used with \trans arguments that are expected to be
applied to general matrices.
Valid values correspond to those supported by the BLAS interface, and
thus \flaconjnotranspose is not allowed.
}
\rvalue{
\flasuccess if the \trans argument is either \flanotransposens,
\flatransposens, or \flaconjtransposens;
\flainvalidblastrans otherwise.
}
\end{flaspec}

% --- FLA_Check_valid_diag() ---------------------------------------------------

\begin{flaspec}
\begin{verbatim}
FLA_Error FLA_Check_valid_diag( FLA_Diag diag );
\end{verbatim}
\purpose{
Confirm that \diag is one of the following values defined for the
\fladiag type:
\flanonunitdiagns, \flaunitdiagns, \flazerodiagns.
}
\rvalue{
\flasuccess if \diag is valid;
\flainvaliddiag otherwise.
}
%\begin{params}
%\parameter{\fladiag}{diag}{A constant value to check against.}
%\end{params}
\end{flaspec}

% --- FLA_Check_valid_conj() ---------------------------------------------------

\begin{flaspec}
\begin{verbatim}
FLA_Error FLA_Check_valid_conj( FLA_Conj conj );
\end{verbatim}
\purpose{
Confirm that \conj is one of the following values defined for the
\flaconj type:
\flanoconjugatens, \flaconjugatens.
}
\rvalue{
\flasuccess if \conj is valid;
\flainvalidconj otherwise.
}
%\begin{params}
%\parameter{\flaconj}{conj}{A constant value to check against.}
%\end{params}
\end{flaspec}

% --- FLA_Check_valid_direct() -------------------------------------------------

\begin{flaspec}
\begin{verbatim}
FLA_Error FLA_Check_valid_direct( FLA_Direct direct );
\end{verbatim}
\purpose{
Confirm that \direct is one of the following values defined for the
\fladirect type:
\flaforwardns, \flabackwardns.
}
\rvalue{
\flasuccess if \direct is valid;
\flainvaliddirect otherwise.
}
%\begin{params}
%\parameter{\fladirect}{direct}{A constant value to check against.}
%\end{params}
\end{flaspec}

% --- FLA_Check_valid_storev() -------------------------------------------------

\begin{flaspec}
\begin{verbatim}
FLA_Error FLA_Check_valid_storev( FLA_Store storev );
\end{verbatim}
\purpose{
Confirm that \storev is one of the following values defined for the
\flastore type:
\flacolumnwisens, \flarowwisens.
}
\rvalue{
\flasuccess if \storev is valid;
\flainvalidstorev otherwise.
}
%\begin{params}
%\parameter{\flastore}{storev}{A constant value to check against.}
%\end{params}
\end{flaspec}

% --- FLA_Check_valid_quadrant() -----------------------------------------------

\begin{flaspec}
\begin{verbatim}
FLA_Error FLA_Check_valid_quadrant( FLA_Quadrant quadrant );
\end{verbatim}
\purpose{
Confirm that \quadrant is one of the following values defined for the
\flaquadrant type:
\flatlns, \flatrns, \flablns, \flabrns.
}
\rvalue{
\flasuccess if \quadrant is valid;
\flainvalidquadrant otherwise.
}
\end{flaspec}




\subsection{Datatypes}

% --- FLA_Check_valid_datatype() -----------------------------------------------

\begin{flaspec}
\begin{verbatim}
FLA_Error FLA_Check_valid_datatype( FLA_Datatype datatype );
\end{verbatim}
\purpose{
Confirm that \datatype is one of the following values defined for the
\fladatatype type:
\flafloatns, \fladoublens, \flacomplexns, \fladoublecomplexns,
\flaintns, \flaconstantns.
}
\rvalue{
\flasuccess if \datatype is valid;
\flainvaliddatatype otherwise.
}
%\begin{params}
%\parameter{\fladatatype}{datatype}{A constant value to check against.}
%\end{params}
\end{flaspec}

% --- FLA_Check_valid_object_datatype() ----------------------------------------

\begin{flaspec}
\begin{verbatim}
FLA_Error FLA_Check_valid_object_datatype( FLA_Obj A );
\end{verbatim}
\purpose{
Confirm that the datatype of $ A $ is one of the following values
defined for the \fladatatype type:
\flaintns, \flafloatns, \fladoublens, \flacomplexns, \fladoublecomplexns,
\flaconstantns.
}
\rvalue{
\flasuccess if the datatype of $ A $ is valid;
\flainvaliddatatype otherwise.
}
%\begin{params}
%\parameter{\flaobj}{A}{An \flaobj to check.}
%\end{params}
\end{flaspec}

% --- FLA_Check_floating_datatype() --------------------------------------------

\begin{flaspec}
\begin{verbatim}
FLA_Error FLA_Check_floating_datatype( FLA_Datatype datatype );
\end{verbatim}
\purpose{
Confirm that \datatype refers to one of the following floating point
type values defined for \fladatatypens:
\flafloatns, \fladoublens, \flacomplexns, \fladoublecomplexns,
\flaconstantns.
}
\notes{
Though it is a distinct type, \flaconstant is polymorphic and thus may
be considered floating point for the purposes of this function.
}
\rvalue{
\flasuccess if \datatype is floating point;
\flainvalidfloatingdatatype otherwise.
}
%\begin{params}
%\parameter{\fladatatype}{datatype}{A constant value to check against.}
%\end{params}
\end{flaspec}

% --- FLA_Check_int_datatype() -------------------------------------------------

\begin{flaspec}
\begin{verbatim}
FLA_Error FLA_Check_int_datatype( FLA_Datatype datatype );
\end{verbatim}
\purpose{
Confirm that \datatype refers to one of the following integer
type values defined for \fladatatypens:
\flaintns, \flaconstantns.
}
\notes{
Though it is a distinct type, \flaconstant is polymorphic and thus may
be considered integer for the purposes of this function.
}
\rvalue{
\flasuccess if \datatype is integer;
\flainvalidintegerdatatype otherwise.
}
%\begin{params}
%\parameter{\fladatatype}{datatype}{A constant value to check against.}
%\end{params}
\end{flaspec}

% --- FLA_Check_real_datatype() ------------------------------------------------

\begin{flaspec}
\begin{verbatim}
FLA_Error FLA_Check_real_datatype( FLA_Datatype datatype );
\end{verbatim}
\purpose{
Confirm that \datatype refers to one of the following real numerical
type values defined for \fladatatypens:
\flafloatns, \fladoublens, \flaconstantns.
}
\notes{
Though it is a distinct type, \flaconstant is polymorphic and thus may
be considered real for the purposes of this function.
}
\rvalue{
\flasuccess if \datatype is real;
\flainvalidrealdatatype otherwise.
}
%\begin{params}
%\parameter{\fladatatype}{datatype}{A constant value to check against.}
%\end{params}
\end{flaspec}

% --- FLA_Check_complex_datatype() ---------------------------------------------

\begin{flaspec}
\begin{verbatim}
FLA_Error FLA_Check_complex_datatype( FLA_Datatype datatype );
\end{verbatim}
\purpose{
Confirm that \datatype refers to one of the following complex numerical
type values defined for \fladatatypens:
\flacomplexns, \fladoublecomplexns, \flaconstantns.
}
\notes{
Though it is a distinct type, \flaconstant is polymorphic and thus may
be considered complex for the purposes of this function.
}
\rvalue{
\flasuccess if \datatype is complex;
\flainvalidcomplexdatatype otherwise.
}
%\begin{params}
%\parameter{\fladatatype}{datatype}{A constant value to check against.}
%\end{params}
\end{flaspec}

% --- FLA_Check_nonconstant_datatype() -----------------------------------------

\begin{flaspec}
\begin{verbatim}
FLA_Error FLA_Check_nonconstant_datatype( FLA_Datatype datatype );
\end{verbatim}
\purpose{
Confirm that \datatype is one of the following non-constant values
defined for the \fladatatype type:
\flafloatns, \fladoublens, \flacomplexns, \fladoublecomplexns,
\flaintns.
}
\notes{
This function is similar to {\tt FLA\_Check\_valid\_datatype()},
except that it does not allow \flaconstantns.
}
\rvalue{
\flasuccess if \datatype specifies a non-constant datatype;
\flainvalidnonconstantdatatype otherwise.
}
\end{flaspec}

% --- FLA_Check_floating_object() ----------------------------------------------

\begin{flaspec}
\begin{verbatim}
FLA_Error FLA_Check_floating_object( FLA_Obj A );
\end{verbatim}
\purpose{
Confirm that the datatype of $ A $ is one of the following floating point
type values defined for \fladatatypens:
\flafloatns, \fladoublens, \flacomplexns, \fladoublecomplexns,
\flaconstantns.
}
\notes{
Though it is a distinct type, \flaconstant is polymorphic and thus may
be considered floating point for the purposes of this function.
}
\rvalue{
\flasuccess if the datatype of $ A $ is floating point;
\flaobjectnotfloatingpoint otherwise.
}
%\begin{params}
%\parameter{\flaobj}{A}{An \flaobj to check.}
%\end{params}
\end{flaspec}

% --- FLA_Check_int_object() ---------------------------------------------------

\begin{flaspec}
\begin{verbatim}
FLA_Error FLA_Check_int_object( FLA_Obj A );
\end{verbatim}
\purpose{
Confirm that the datatype of $ A $ is one of the following integer
type values defined for \fladatatypens:
\flaintns, \flaconstantns.
}
\notes{
Though it is a distinct type, \flaconstant is polymorphic and thus may
be considered integer for the purposes of this function.
}
\rvalue{
\flasuccess if the datatype of $ A $ is integer;
\flaobjectnotinteger otherwise.
}
%\begin{params}
%\parameter{\flaobj}{A}{An \flaobj to check.}
%\end{params}
\end{flaspec}

% --- FLA_Check_real_object() --------------------------------------------------

\begin{flaspec}
\begin{verbatim}
FLA_Error FLA_Check_real_object( FLA_Obj A );
\end{verbatim}
\purpose{
Confirm that the datatype of $ A $ is one of the following real numerical
type values defined for \fladatatypens:
\flafloatns, \fladoublens, \flaconstantns.
}
\notes{
Though it is a distinct type, \flaconstant is polymorphic and thus may
be considered real for the purposes of this function.
}
\rvalue{
\flasuccess if the datatype of $ A $ is real;
\flaobjectnotreal otherwise.
}
%\begin{params}
%\parameter{\flaobj}{A}{An \flaobj to check.}
%\end{params}
\end{flaspec}

% --- FLA_Check_complex_object() -----------------------------------------------

\begin{flaspec}
\begin{verbatim}
FLA_Error FLA_Check_complex_object( FLA_Obj A );
\end{verbatim}
\purpose{
Confirm that the datatype of $ A $ is one of the following complex numerical
type values defined for \fladatatypens:
\flacomplexns, \fladoublecomplexns, \flaconstantns.
}
\notes{
Though it is a distinct type, \flaconstant is polymorphic and thus may
be considered complex for the purposes of this function.
}
\rvalue{
\flasuccess if the datatype of $ A $ is complex;
\flaobjectnotcomplex otherwise.
}
%\begin{params}
%\parameter{\flaobj}{A}{An \flaobj to check.}
%\end{params}
\end{flaspec}

% --- FLA_Check_nonconstant_object() -------------------------------------------

\begin{flaspec}
\begin{verbatim}
FLA_Error FLA_Check_nonconstant_object( FLA_Obj A );
\end{verbatim}
\purpose{
Confirm that the datatype of $ A $ is one of the following non-constant values
defined for the \fladatatype type:
\flaintns, \flafloatns, \fladoublens, \flacomplexns, \fladoublecomplexns.
}
\rvalue{
\flasuccess if the datatype of $ A $ is a non-constant datatype;
\flaobjectnotnonconstant otherwise.
}
\end{flaspec}

% --- FLA_Check_identical_object_datatype() ------------------------------------

\begin{flaspec}
\begin{verbatim}
FLA_Error FLA_Check_identical_object_datatype( FLA_Obj A, FLA_Obj B );
\end{verbatim}
\purpose{
Confirm that $ A $ and $ B $ have identical datatypes.
}
\notes{
This function enforces literal equality between the datatype fields of
$ A $ and $ B $.
}
\rvalue{
\flasuccess if $ A $ and $ B $ have identical datatypes;
\flaobjectdatatypesnotequal otherwise.
}
\end{flaspec}

% --- FLA_Check_consistent_object_datatype() -----------------------------------

\begin{flaspec}
\begin{verbatim}
FLA_Error FLA_Check_consistent_object_datatype( FLA_Obj A, FLA_Obj B );
\end{verbatim}
\purpose{
Confirm that the datatype of $ A $ is consistent with the datatype of $ B $.
}
\notes{
This function is similar to {\tt FLA\_Check\_identical\_object\_datatype()},
except that it considers objects of datatype \flaconstant to be consistent
with all other datatypes.
}
\rvalue{
\flasuccess if the datatype of $ A $ is consistent with the datatype of $ B $;
\flainconsistentdatatypes otherwise.
}
\end{flaspec}

% --- FLA_Check_consistent_datatype() ------------------------------------------

\begin{flaspec}
\begin{verbatim}
FLA_Error FLA_Check_consistent_datatype( FLA_Datatype datatype, FLA_Obj A );
\end{verbatim}
\purpose{
Confirm that \datatype is consistent with the datatype of $ A $.
}
\notes{
This function is similar to {\tt FLA\_Check\_consistent\_object\_datatype()},
except that it takes one datatype value and one object as its arguments
instead of two objects.
}
\rvalue{
\flasuccess if \datatype is consistent with the datatype of $ A $;
\flainconsistentdatatypes otherwise.
}
\end{flaspec}

% --- FLA_Check_identical_object_precision() -----------------------------------

\begin{flaspec}
\begin{verbatim}
FLA_Error FLA_Check_identical_object_precision( FLA_Obj A, FLA_Obj B );
\end{verbatim}
\purpose{
Confirm that the numerical precision of the datatype of $ A $ matches the
numerical precision of the datatype of $ B $.
}
\notes{
The function first verifies that both $ A $ and $ B $ are floating point
objects.
If one or both objects are not floating point, \flaobjectnotfloatingpoint
is returned.
}
\rvalue{
\flasuccess if the datatype precision of $ A $ is equal to the datatype
precision of $ B $;
\flainconsistentobjectprecision otherwise.
}
%\begin{params}
%\parameter{\flaobj}{A}{An \flaobj to check.}
%\parameter{\flaobj}{B}{An \flaobj to check.}
%\end{params}
\end{flaspec}

% --- FLA_Check_conj_and_datatype() --------------------------------------------

\begin{flaspec}
\begin{verbatim}
FLA_Error FLA_Check_conj_and_datatype( FLA_Conj conj, FLA_Obj A );
\end{verbatim}
\purpose{
Confirm, if $ A $ is real, that \conj is not \flaconjugatens.
}
\rvalue{
\flasuccess if the \conj argument is not in conflict with the complexness of $ A $;
\flainvalidconjgivendatatype otherwise.
}
\end{flaspec}

% --- FLA_Check_conj1_trans_and_datatype() --------------------------------------

\begin{flaspec}
\begin{verbatim}
FLA_Error FLA_Check_conj1_trans_and_datatype( FLA_Trans trans, FLA_Obj A );
\end{verbatim}
\purpose{
Confirm, if $ A $ is real, that \trans is neither \flaconjtranspose
nor \flaconjnotransposens.
}
\rvalue{
\flasuccess if the \trans argument is not in conflict with the complexness of $ A $;
\flainvalidtransgivendatatype otherwise.
}
\end{flaspec}




\subsection{Element types}

% --- FLA_Check_valid_elemtype() -----------------------------------------------

\begin{flaspec}
\begin{verbatim}
FLA_Error FLA_Check_valid_elemtype( FLA_Elemtype elemtype );
\end{verbatim}
\purpose{
Confirm that \elemtype is one of the following values defined for the
\flaelemtype type:
\flascalarns, \flamatrixns.
}
\rvalue{
\flasuccess if \elemtype is valid;
\flainvalidelemtype otherwise.
}
\end{flaspec}

% --- FLA_Check_object_scalar_elemtype() ---------------------------------------

\begin{flaspec}
\begin{verbatim}
FLA_Error FLA_Check_object_scalar_elemtype( FLA_Obj A );
\end{verbatim}
\purpose{
Confirm that the element type of $ A $ is \flascalarns.
}
\rvalue{
\flasuccess if the element type of $ A $ is \flascalarns;
\flaobjectnotscalarelemtype otherwise.
}
\end{flaspec}

% --- FLA_Check_object_matrix_elemtype() ---------------------------------------

\begin{flaspec}
\begin{verbatim}
FLA_Error FLA_Check_object_matrix_elemtype( FLA_Obj A );
\end{verbatim}
\purpose{
Confirm that the element type of $ A $ is \flamatrixns.
}
\rvalue{
\flasuccess if the element type of $ A $ is \flamatrixns;
\flaobjectnotmatrixelemtype otherwise.
}
\end{flaspec}




\subsection{Object dimensions}

% --- FLA_Check_square() -------------------------------------------------------

\begin{flaspec}
\begin{verbatim}
FLA_Error FLA_Check_square( FLA_Obj A );
\end{verbatim}
\purpose{
Confirm that $ A $ is square.
}
\rvalue{
\flasuccess if $ A $ is square;
\flaobjectnotsquare otherwise.
}
\end{flaspec}

% --- FLA_Check_if_scalar() ----------------------------------------------------

\begin{flaspec}
\begin{verbatim}
FLA_Error FLA_Check_if_scalar( FLA_Obj A );
\end{verbatim}
\purpose{
Confirm that $ A $ is a scalar (i.e., that $ A $ is $ 1 \by 1 $).
}
\rvalue{
\flasuccess if $ A $ is a scalar;
\flaobjectnotscalar otherwise.
}
\end{flaspec}

% --- FLA_Check_if_vector() ----------------------------------------------------

\begin{flaspec}
\begin{verbatim}
FLA_Error FLA_Check_if_vector( FLA_Obj A );
\end{verbatim}
\purpose{
Confirm that $ A $ is a vector (i.e., that $ A $ is $ n \by 1 $ or $ 1 \by n $
for $ n \ge 0 $).
}
\rvalue{
\flasuccess if $ A $ is a vector;
\flaobjectnotvector otherwise.
}
\end{flaspec}

% --- FLA_Check_conformal_dims() -----------------------------------------------

\begin{flaspec}
\begin{verbatim}
FLA_Error FLA_Check_conformal_dims( FLA_Trans trans, FLA_Obj A, FLA_Obj B );
\end{verbatim}
\purpose{
Confirm that $ A $ and $ B $ have conformal dimensions.
If \trans is \flatranspose or \flaconjtransposens, then the function
confirms that $ A $ and $ B^T $ have conformal dimensions.
}
\rvalue{
\flasuccess if $ A $ and $ B $ have conformal dimensions;
\flanonconformaldimensions otherwise.
}
\end{flaspec}

% --- FLA_Check_matrix_matrix_dims() -------------------------------------------

\begin{flaspec}
\begin{verbatim}
FLA_Error FLA_Check_matrix_matrix_dims( FLA_Trans transa, FLA_Trans transb,
                                        FLA_Obj A, FLA_Obj B, FLA_Obj C );
\end{verbatim}
\purpose{
Confirm that $ A $, $ B $, and $ C $ have conformal dimensions suitable
for a matrix-matrix operation of the form $ C := C + AB $ where
$ C $ is $ m \by n $ and $ A $ and $ B $ are $ m \by k $ and $ k \by n $,
respectively, after optional transpositions, per \transa and \transbns.
}
\rvalue{
\flasuccess if $ A $, $ B $, and $ C $ have conformal dimensions;
\flanonconformaldimensions otherwise.
}
\end{flaspec}

% --- FLA_Check_matrix_vector_dims() -------------------------------------------

\begin{flaspec}
\begin{verbatim}
FLA_Error FLA_Check_matrix_vector_dims( FLA_Trans trans, FLA_Obj A, FLA_Obj x, FLA_Obj y );
\end{verbatim}
\purpose{
Confirm that $ A $, $ x $, and $ y $ have conformal dimensions suitable
for a matrix-vector operation of the form $ y := y + Ax $ where
$ A $ is optionally transposed, per \transns.
}
\rvalue{
\flasuccess if $ A $, $ x $, and $ y $ have conformal dimensions;
\flanonconformaldimensions otherwise.
}
\end{flaspec}

% --- FLA_Check_equal_vector_lengths() -----------------------------------------

\begin{flaspec}
\begin{verbatim}
FLA_Error FLA_Check_equal_vector_lengths( FLA_Obj x, FLA_Obj y );
\end{verbatim}
\purpose{
Confirm that $ x $ and $ y $, which are assumed to be vectors, have equal
lengths.
}
\notes{
This function works as expected if one or both arguments are row vectors.
That is, ``length'' for the purposes of this function refers to the length
of the vector, not the number of rows in the object.
}
\rvalue{
\flasuccess if $ x $ and $ y $ have equal lengths;
\flaunequalvectorlengths otherwise.
}
\end{flaspec}

% --- FLA_Check_vector_length() ------------------------------------------------

\begin{flaspec}
\begin{verbatim}
FLA_Error FLA_Check_vector_length( FLA_Obj x, dim_t expected_length );
\end{verbatim}
\purpose{
Confirm that $ x $, which is assumed to be a column vector, is of length
{\tt expected\_length}.
}
\notes{
This function checks only the number of rows of $ x $.
Therefore, this function will not work as expected with row vectors.
}
\rvalue{
\flasuccess if the number of rows in $ x $ is {\tt expected\_length};
\flainvalidvectorlength otherwise.
}
\end{flaspec}

% --- FLA_Check_vector_length_min() --------------------------------------------

\begin{flaspec}
\begin{verbatim}
FLA_Error FLA_Check_vector_length_min( FLA_Obj x, dim_t min_length );
\end{verbatim}
\purpose{
Confirm that $ x $, which is assumed to be a column vector, is at least
{\tt min\_length} in length.
}
\notes{
This function checks only the number of rows of $ x $.
Therefore, this function will not work as expected with row vectors.
}
\rvalue{
\flasuccess if the number of rows in $ x $ is at least {\tt min\_length};
\flavectorlengthbelowmin otherwise.
}
\end{flaspec}

% --- FLA_Check_object_dims() --------------------------------------------------

\begin{flaspec}
\begin{verbatim}
FLA_Error FLA_Check_object_dims( FLA_Trans trans, dim_t m, dim_t n, FLA_Obj A );
\end{verbatim}
\purpose{
Confirm that matrix $ A $ is $ m \by n $.
If \trans is \flatranspose or \flaconjtransposens, then the function
will instead confirm that $ A $ is $ n \by m $.
}
\rvalue{
\flasuccess if the dimensions of $ A $ are identical to those specified;
\flaspecifiedobjdimmismatch otherwise.
}
\end{flaspec}

% --- FLA_Check_submatrix_dims_and_offset() ------------------------------------

\begin{flaspec}
\begin{verbatim}
FLA_Error FLA_Check_submatrix_dims_and_offset( int m, int n, int i, int j, FLA_Obj A );
\end{verbatim}
\purpose{
Confirm that the $ m \by n $ submatrix that has its top-left element located
at row-column offset $ (i,j) $ (indexed from zero) does not exceed the bounds
of matrix $ A $.
In other words, the following constraints are enforced:
\begin{itemize}
\item $ i \le m(A) $
\itemvsp
\item $ j \le n(A) $
\itemvsp
\item $ i + m \le m(A) $
\itemvsp
\item $ j + n \le n(A) $
\end{itemize}
where $ m( A ) $ and $ n( A ) $ denote the number of rows and number of
columns in $ A $, respectively.
}
\notes{
Strictly speaking, only the last two constraints are needed.
However, we first check against the first two constraints to allow us to
distinguish between situations where the offsets are invalid (in which case
the values of the submatrix dimensions are moot), and situations where the
offsets are valid, but the submatrix dimensions place the submatrix beyond
the bounds of $ A $.
}
\rvalue{
\flasuccess if the specified submatrix is within the bounds of $ A $;
otherwise, \flainvalidsubmatrixoffset if one of the first two constraints
is violated, and \flainvalidsubmatrixdims if the first two constraints
are met but one of the last two constraints is violated.
}
\end{flaspec}

% --- FLA_Check_adjacent_objects_2x2() -----------------------------------------

\begin{flaspec}
\begin{verbatim}
FLA_Error FLA_Check_adjacent_objects_2x2( FLA_Obj ATL, FLA_Obj ATR,
                                          FLA_Obj ABL, FLA_Obj ABR );
\end{verbatim}
\purpose{
Confirm that views $ A_{TL} $, $ A_{TR} $, $ A_{BL} $, and $ A_{BR} $ are
vertically and horizontally aligned and adjacent, and that views that are
vertically or horizontally adjacent have dimensions appropriately
matched for them to form quadrants of a single larger view of the
base object.
}
\rvalue{
\flasuccess if the four object views are aligned and adjacent, and have
matching dimensions;
otherwise, one of the following, depending on the mismatch:
\begin{itemize}
\item \flaobjectsnotverticallyadj
\itemvsp
\item \flaobjectsnotverticallyaligned
\itemvsp
\item \flaobjectsnothorizontallyadj
\itemvsp
\item \flaobjectsnothorizontallyaligned
\itemvsp
\item \flaadjacentobjectdimmismatch
\end{itemize}
}
\end{flaspec}

% --- FLA_Check_adjacent_objects_2x1() -----------------------------------------

\begin{flaspec}
\begin{verbatim}
FLA_Error FLA_Check_adjacent_objects_2x1( FLA_Obj AT,
                                          FLA_Obj AB );
\end{verbatim}
\purpose{
Confirm that views $ A_T $ and $ A_B $ are vertically aligned and adjacent,
and that the views have column dimensions appropriately matched for them to
form a vertical panel of a single larger view of the base object.
}
\rvalue{
\flasuccess if the two object views are vertically aligned and adjacent,
and have matching column dimensions;
otherwise, one of the following, depending on the mismatch:
\begin{itemize}
\item \flaobjectsnotverticallyadj
\itemvsp
\item \flaobjectsnotverticallyaligned
\itemvsp
\item \flaadjacentobjectdimmismatch
\end{itemize}
}
\end{flaspec}

% --- FLA_Check_adjacent_objects_1x2() -----------------------------------------

\begin{flaspec}
\begin{verbatim}
FLA_Error FLA_Check_adjacent_objects_1x2( FLA_Obj AL, FLA_Obj AR );
\end{verbatim}
\purpose{
Confirm that views $ A_L $ and $ A_R $ are horizontally aligned and adjacent,
and that the views have row dimensions appropriately matched for them to
form a horizontal panel of a single larger view of the base object.
}
\rvalue{
\flasuccess if the two object views are horizontally aligned and adjacent,
and have matching row dimensions;
otherwise, one of the following, depending on the mismatch:
\begin{itemize}
\item \flaobjectsnothorizontallyadj
\itemvsp
\item \flaobjectsnothorizontallyaligned
\itemvsp
\item \flaadjacentobjectdimmismatch
\end{itemize}
}
\end{flaspec}




\subsection{UNIX file I/O}

% --- FLA_Check_file_descriptor() ----------------------------------------------

\begin{flaspec}
\begin{verbatim}
FLA_Error FLA_Check_file_descriptor( int fd );
\end{verbatim}
\purpose{
Confirm that \fdns, which is assumed to have been returned from
the UNIX/Linux {\tt open()} function, is a valid file descriptor.
}
\notes{
The UNIX/Linux {\tt open()} function returns $ -1 $ when it fails to
successfully open a file.
}
\rvalue{
\flasuccess if \fd is valid;
\flaopenreturnederror otherwise.
}
\end{flaspec}

% --- FLA_Check_close_result() -------------------------------------------------

\begin{flaspec}
\begin{verbatim}
FLA_Error FLA_Check_close_result( int r_val );
\end{verbatim}
\purpose{
Confirm that \rvalns, which is assumed to have been returned
from {\tt close()}, does not indicate that an error has occurred.
}
\notes{
The UNIX/Linux {\tt close()} function returns $ -1 $ when it fails to
successfully close a file.
}
\rvalue{
\flasuccess if \rval indicates no error;
\flaclosereturnederror otherwise.
}
\end{flaspec}

% --- FLA_Check_lseek_result() -------------------------------------------------

\begin{flaspec}
\begin{verbatim}
FLA_Error FLA_Check_lseek_result( off_t requested_offset, int r_val );
\end{verbatim}
\purpose{
Confirm that \rvalns, which is assumed to have been returned
from {\tt lseek()}, does not indicate that an error has occurred.
It is further assumed that {\tt requested\_offset} was the byte offset
passed to {\tt lseek()} when {\tt lseek()} returned \rvalns.
}
\notes{
The UNIX/Linux {\tt lseek()} function returns the resulting byte offset
relative to the beginning of the file.
If {\tt lseek()} is unsuccessful, it returns $ -1 $.
}
\rvalue{
\flasuccess if \rval indicates no error;
\flalseekreturnederror otherwise.
}
\end{flaspec}

% --- FLA_Check_read_result() --------------------------------------------------

\begin{flaspec}
\begin{verbatim}
FLA_Error FLA_Check_read_result( size_t requested_size, ssize_t r_val );
\end{verbatim}
\purpose{
Confirm that \rvalns, which is assumed to have been returned
from {\tt read()}, does not indicate that an error has occurred.
}
\notes{
Under normal circumstances, the UNIX/Linux {\tt read()} function
returns the number of bytes successfully read into the destination
buffer, where zero indicates no bytes were read due to attempting to
read at end of file.
It is not considered an error for this return value to be less than
the requested number of bytes.
If an actual error occurs, {\tt read()} returns $ -1 $.
Currently {\tt FLA\_Check\_read\_result()} only returns an error
code if {\tt read()} returns $ -1 $.
Thus, the {\tt requested\_size} argument is not referenced, but may be
used in the future to provide warnings.
}
\rvalue{
\flasuccess if \rval indicates no error;
\flareadreturnederror otherwise.
}
\end{flaspec}

% --- FLA_Check_write_result() -------------------------------------------------

\begin{flaspec}
\begin{verbatim}
FLA_Error FLA_Check_write_result( size_t requested_size, ssize_t r_val );
\end{verbatim}
\purpose{
Confirm that \rvalns, which is assumed to have been returned
from {\tt write()}, does not indicate that an error has occurred.
}
\notes{
Under normal circumstances, the UNIX/Linux {\tt write()} function
returns the number of bytes successfully written from the source
buffer, where zero indicates no bytes were written.
It is not considered an error for this return value to be less than
the requested number of bytes.
If an actual error occurs, {\tt write()} returns $ -1 $.
Currently {\tt FLA\_Check\_write\_result()} only returns an error
code if {\tt write()} returns $ -1 $.
Thus, the {\tt requested\_size} argument is not referenced, but may be
used in the future to provide warnings.
}
\rvalue{
\flasuccess if \rval indicates no error;
\flawritereturnederror otherwise.
}
\end{flaspec}

% --- FLA_Check_unlink_result() ------------------------------------------------

\begin{flaspec}
\begin{verbatim}
FLA_Error FLA_Check_unlink_result( int r_val );
\end{verbatim}
\purpose{
Confirm that \rvalns, which is assumed to have been returned
from {\tt unlink()}, does not indicate that an error has occurred.
}
\notes{
The UNIX/Linux {\tt unlink()} function returns $ -1 $ when it fails to
successfully delete a file from the filesystem.
}
\rvalue{
\flasuccess if \rval indicates no error;
\flaunlinkreturnederror otherwise.
}
\end{flaspec}




\subsection{Operation-specific errors}

% --- FLA_Check_chol_failure() -------------------------------------------------

\begin{flaspec}
\begin{verbatim}
FLA_Error FLA_Check_chol_failure( FLA_Error r_val );
\end{verbatim}
\purpose{
Confirm that \rvalns, which is assumed to have been
returned from an LAPACK-compatible Cholesky factorization routine,
does not indicate that the input matrix was found not to be symmetric
positive definite (or Hermitian positive definite).
}
\notes{
For the purposes of this function, ``LAPACK-compatible'' means the
Cholesky factorization routine returns the row/column offset (indexing
from one) of the diagonal entry that was found to be negative.
}
\rvalue{
\flasuccess if \rval is valid;
\flacholfailedmatrixnotspd otherwise.
}
\end{flaspec}

% --- FLA_Check_block_householder_transform() ----------------------------------

\begin{flaspec}
\begin{verbatim}
FLA_Error FLA_Check_block_householder_transform( FLA_Store storev,
                                                 FLA_Obj A, FLA_Obj S );
\end{verbatim}
\purpose{
Confirm that the dimensions of matrix $ A $ are compatible with the dimensions
of a block Householder matrix $ S $.
Specifically, if \storev is \flacolumnwisens, then the number of columns of
$ A $ must match the order of $ S $.
Otherwise, if \storev is \flarowwisens, then the number of rows of $ A $
must match the order of $ S $.
}
\rvalue{
\flasuccess if the dimensions of $ A $ and $ S $ are compatible;
\flablockhousehdimmismatch otherwise.
}
\end{flaspec}

% --- FLA_Check_hess_indices() -------------------------------------------------
%
%\begin{flaspec}
%\begin{verbatim}
%FLA_Error FLA_Check_hess_indices( FLA_Obj A, int ilo, int ihi );
%\end{verbatim}
%\purpose{
%Confirm that the indices \ilo and \ihi are reasonable values for a
%reduction to upper Hessenberg operation.
%}
%\notes{
%Any of the following conditions will cause the function to return
%an \flainvalidhessenbergindices error value:
%\begin{itemize}
%\item $ n( A ) = 0 $ and $ \ilo \neq 1 $ and $ \ihi \neq 1 $
%\itemvsp
%\item $ \ilo < 1 $ or $ n( A ) < \ilo $
%\itemvsp
%\item $ \ihi < 1 $ or $ n( A ) < \ihi $
%\itemvsp
%\item $ \ihi < \ilo $
%\end{itemize}
%where $ n( A ) $ denotes the number of columns in matrix $ A $.
%}
%\rvalue{
%\flasuccess if \ilo and \ihi are valid indices for a reduction to upper Hessenberg operation;
%\flainvalidhessenbergindices otherwise.
%}
%\end{flaspec}

% --- FLA_Check_sylv_matrix_dims() ---------------------------------------------

\begin{flaspec}
\begin{verbatim}
FLA_Error FLA_Check_sylv_matrix_dims( FLA_Obj A, FLA_Obj B, FLA_Obj C );
\end{verbatim}
\purpose{
Confirm that $ A $, $ B $, and $ C $ have conformal dimensions suitable
for a triangular Sylvester equation solve of the form $ AX + XB = C $ where
$ A $ and $ B $ are $ m \by m $ and $ n \by n $, respectively, and $ X $ and
$ C $ are $ m \by n $.
}
\rvalue{
\flasuccess if $ A $, $ B $, and $ C $ have conformal dimensions;
\flanonconformaldimensions otherwise.
}
\end{flaspec}

% --- FLA_Check_valid_isgn_value() ---------------------------------------------

\begin{flaspec}
\begin{verbatim}
FLA_Error FLA_Check_valid_isgn_value( FLA_Obj isgn );
\end{verbatim}
\purpose{
Confirm that \isgn is either \flaone or \flaminusonens.
}
\notes{
This function currently compares \isgn against \flaone and \flaminusone
with the function {\tt FLA\_Obj\_is()}, which creates a stronger constraint
than just comparing the {\em values} contained within the objects.
}
\rvalue{
\flasuccess if \isgn is valid;
\flainvalidside otherwise.
}
\end{flaspec}




\subsection{Other system errors}

% --- FLA_Check_pthread_create_result() ----------------------------------------

\begin{flaspec}
\begin{verbatim}
FLA_Error FLA_Check_pthread_create_result( int r_val );
\end{verbatim}
\purpose{
Confirm that \rvalns, which is assumed to have been
returned from the POSIX {\tt pthread\_create()} function, does not indicate
that an error has occurred.
}
\rvalue{
\flasuccess if \rval indicates no error;
\flapthreadcreatereturnederror otherwise.
}
\end{flaspec}

% --- FLA_Check_pthread_join_result() ------------------------------------------

\begin{flaspec}
\begin{verbatim}
FLA_Error FLA_Check_pthread_join_result( int r_val );
\end{verbatim}
\purpose{
Confirm that \rvalns, which is assumed to have been
returned from the POSIX {\tt pthread\_join()} function, does not indicate
that an error has occurred.
}
\rvalue{
\flasuccess if \rval indicates no error;
\flapthreadjoinreturnederror otherwise.
}
\end{flaspec}

% --- FLA_Check_malloc_pointer() -----------------------------------------------

\begin{flaspec}
\begin{verbatim}
FLA_Error FLA_Check_malloc_pointer( void* ptr );
\end{verbatim}
\purpose{
Confirm that \ptrns, which is assumed to have been returned by a memory
allocation function such as {\tt malloc()}, is not a \fnull pointer.
}
\notes{
This routine is similar in behavior to {\tt FLA\_Check\_null\_pointer()}.
The only difference is that this function is used specifically to check
the validity of pointers returned by {\tt malloc()}, and thus is set up
to return a {\tt malloc()}-specific error message.
}
\rvalue{
\flasuccess if \ptr is not \fnullns;
\flamallocreturnednullpointer otherwise.
}
\end{flaspec}

% --- FLA_Check_posix_memalign_failure() ---------------------------------------

\begin{flaspec}
\begin{verbatim}
FLA_Error FLA_Check_posix_memalign_failure( int r_val );
\end{verbatim}
\purpose{
Confirm that \rvalns, which is assumed to have been returned by
the POSIX {\tt posix\_memalign()} function, does not indicate that an error
has occurred.
}
\notes{
Unlike {\tt malloc()}, {\tt posix\_memalign()} returns an integer
value to indicate success (zero) or failure (a non-zero value), and the
pointer to the requested memory region is obtained from the function by
providing the pointer's address as an argument, which allows
{\tt posix\_memalign()} to set the pointer value directly.
}
\rvalue{
\flasuccess if \rval is valid;
\flaposixmemalignfailed otherwise.
}
\end{flaspec}




\subsection{Misc. errors}

% --- FLA_Check_valid_pivot_type() ---------------------------------------------

\begin{flaspec}
\begin{verbatim}
FLA_Error FLA_Check_valid_pivot_type( FLA_Pivot_type ptype );
\end{verbatim}
\purpose{
Confirm that \ptype is one of the following values defined for the
\flapivottype type:
\flanativepivotsns, \flalapackpivotsns.
}
\rvalue{
\flasuccess if \ptype is valid;
\flainvalidconj otherwise.
}
\end{flaspec}

% --- FLA_Check_pivot_vector_length() ------------------------------------------

\begin{flaspec}
\begin{verbatim}
FLA_Error FLA_Check_pivot_vector_length( FLA_Obj ipiv );
\end{verbatim}
\purpose{
Confirm that the number of rows in \ipiv is less than or equal to
the current length of the internal pivoting work buffer.
If the length of \ipiv exceeds the current length of the work buffer,
then the work buffer is reallocated to the length of \ipivns.
The length of this internal buffer begins as
{\tt FLA\_MAX\_LU\_PIVOT\_LENGTH}.
If a reallocation takes place, the function confirms that the
reallocated pointer is not \fnullns.
}
\rvalue{
\flasuccess if no reallocation was needed, or if a reallocation took place
and the returned pointer is valid;
\flanullpointer otherwise.
}
\end{flaspec}

% --- FLA_Check_divide_by_zero() -----------------------------------------------

\begin{flaspec}
\begin{verbatim}
FLA_Error FLA_Check_divide_by_zero( FLA_Obj alpha );
\end{verbatim}
\purpose{
Confirm that $ \alpha $, which is assumed to be a potential denominator
in a future floating point division operation, is non-zero.
}
\rvalue{
\flasuccess if $ \alpha $ is non-zero;
\fladividebyzero otherwise.
}
\end{flaspec}

% --- FLA_Check_blocksize_value() ----------------------------------------------

\begin{flaspec}
\begin{verbatim}
FLA_Error FLA_Check_blocksize_value( dim_t b );
\end{verbatim}
\purpose{
Confirm that $ b $ is a valid blocksize.
}
\notes{
Since \dimt is a type of unsigned integer, the only invalid value that $ b $
might take on is zero.
Thus, this function simply confirms that $ b $ is non-zero.
}
\rvalue{
\flasuccess if $ b $ is valid (non-zero);
\flainvalidblocksizevalue otherwise.
}
\end{flaspec}

% --- FLA_Check_blocksize_object() ---------------------------------------------

\begin{flaspec}
\begin{verbatim}
FLA_Error FLA_Check_blocksize_object( FLA_Datatype datatype, fla_blocksize_t* bp );
\end{verbatim}
\purpose{
Confirm that the blocksize field associated with \datatype that resides
within the structure pointed to by \bp is valid.
}
\notes{
Similar to {\tt FLA\_Check\_blocksize\_value()}, this function only confirms
that the blocksize value is non-zero.
}
\rvalue{
\flasuccess if the blocksize field associated with \datatype is valid
(non-zero);
\flainvalidblocksizevalue otherwise.
}
\end{flaspec}

% --- FLA_Check_null_pointer() -------------------------------------------------

\begin{flaspec}
\begin{verbatim}
FLA_Error FLA_Check_null_pointer( void* ptr );
\end{verbatim}
\purpose{
Confirm that \ptr is not a \fnull pointer.
}
\rvalue{
\flasuccess if \ptr is not \fnullns;
\flanullpointer otherwise.
}
\end{flaspec}

% --- FLA_Check_base_buffer_mismatch() -----------------------------------------

\begin{flaspec}
\begin{verbatim}
FLA_Error FLA_Check_base_buffer_mismatch( FLA_Obj A, FLA_Obj B );
\end{verbatim}
\purpose{
Confirm that views $ A $ and $ B $ refer to the same underlying
object.
}
\notes{
This check is performed by comparing the addresses of the views' base
objects.
}
\rvalue{
\flasuccess if $ A $ and $ B $ refer to the same underlying object;
\flaobjectbasebuffermismatch otherwise.
}
\end{flaspec}

% --- FLA_Check_num_threads() --------------------------------------------------

\begin{flaspec}
\begin{verbatim}
FLA_Error FLA_Check_num_threads( unsigned int n_threads );
\end{verbatim}
\purpose{
Confirm that \nthreads is at least one.
}
\notes{
Since \nthreads is declared as an {\tt unsigned int}, the only
invalid value that \nthreads may take on is zero.
}
\rvalue{
\flasuccess if \nthreads is at least one;
\flaencounterednonpositiventhreads otherwise.
}
\end{flaspec}

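The checks described above share one pattern: each inspects a single value and maps it to an error code.
The following standalone C sketch imitates two of them, based only on the behavior documented here for {\tt FLA\_Check\_read\_result()} and {\tt FLA\_Check\_num\_threads()}.
It does not call or reproduce \libflame code.
The {\tt sketch\_} names and the enumeration values are hypothetical stand-ins chosen for illustration, and the header {\tt sys/types.h} is assumed to provide {\tt ssize\_t} as it does on POSIX systems.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>  /* ssize_t (POSIX) */

/* Hypothetical stand-ins for FLA_Error codes; the library's real values differ. */
typedef enum {
    SKETCH_SUCCESS = 0,
    SKETCH_READ_RETURNED_ERROR,
    SKETCH_NONPOSITIVE_NTHREADS
} sketch_error;

/* Imitates the documented behavior of FLA_Check_read_result(): only a
   return value of -1 from read() is treated as an error. A short read,
   including zero bytes at end of file, is accepted, so requested_size
   is not referenced. */
sketch_error sketch_check_read_result(size_t requested_size, ssize_t r_val)
{
    (void)requested_size;  /* unused; kept to mirror the documented signature */
    return (r_val == -1) ? SKETCH_READ_RETURNED_ERROR : SKETCH_SUCCESS;
}

/* Imitates the documented behavior of FLA_Check_num_threads(): the
   argument is unsigned, so zero is the only value that can be rejected. */
sketch_error sketch_check_num_threads(unsigned int n_threads)
{
    return (n_threads == 0) ? SKETCH_NONPOSITIVE_NTHREADS : SKETCH_SUCCESS;
}

/* Small usage demonstration; aborts if an expectation does not hold. */
void sketch_demo(void)
{
    assert(sketch_check_read_result(512, 100) == SKETCH_SUCCESS);  /* short read */
    assert(sketch_check_read_result(512, 0) == SKETCH_SUCCESS);    /* end of file */
    assert(sketch_check_read_result(512, -1) == SKETCH_READ_RETURNED_ERROR);
    assert(sketch_check_num_threads(4) == SKETCH_SUCCESS);
    assert(sketch_check_num_threads(0) == SKETCH_NONPOSITIVE_NTHREADS);
}
```

A caller would typically pass the raw return value of the system call straight to the check and act on the resulting code, which keeps the interpretation of each system call's failure convention in one place.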