1      SUBROUTINE PCMAX1( N, AMAX, INDX, X, IX, JX, DESCX, INCX )
2*
3*  -- ScaLAPACK auxiliary routine (version 1.7) --
4*     University of Tennessee, Knoxville, Oak Ridge National Laboratory,
5*     and University of California, Berkeley.
6*     May 25, 2001
7*
8*     .. Scalar Arguments ..
9      INTEGER            INDX, INCX, IX, JX, N
10      COMPLEX            AMAX
11*     ..
12*     .. Array Arguments ..
13      INTEGER            DESCX( * )
14      COMPLEX            X( * )
15*     ..
16*
17*  Purpose
18*  =======
19*
20*  PCMAX1 computes the global index of the maximum element in absolute
21*  value of a distributed vector sub( X ). The global index is returned
22*  in INDX and the value is returned in AMAX,
23*
24*  where sub( X ) denotes X(IX:IX+N-1,JX) if INCX = 1,
25*                         X(IX,JX:JX+N-1) if INCX = M_X.
26*
27*  Notes
28*  =====
29*
30*  Each global data object is described by an associated description
31*  vector.  This vector stores the information required to establish
32*  the mapping between an object element and its corresponding process
33*  and memory location.
34*
35*  Let A be a generic term for any 2D block cyclicly distributed array.
36*  Such a global array has an associated description vector DESCA.
37*  In the following comments, the character _ should be read as
38*  "of the global array".
39*
40*  NOTATION        STORED IN      EXPLANATION
41*  --------------- -------------- --------------------------------------
42*  DTYPE_A(global) DESCA( DTYPE_ )The descriptor type.  In this case,
43*                                 DTYPE_A = 1.
44*  CTXT_A (global) DESCA( CTXT_ ) The BLACS context handle, indicating
45*                                 the BLACS process grid A is distribu-
46*                                 ted over. The context itself is glo-
47*                                 bal, but the handle (the integer
48*                                 value) may vary.
49*  M_A    (global) DESCA( M_ )    The number of rows in the global
50*                                 array A.
51*  N_A    (global) DESCA( N_ )    The number of columns in the global
52*                                 array A.
53*  MB_A   (global) DESCA( MB_ )   The blocking factor used to distribute
54*                                 the rows of the array.
55*  NB_A   (global) DESCA( NB_ )   The blocking factor used to distribute
56*                                 the columns of the array.
57*  RSRC_A (global) DESCA( RSRC_ ) The process row over which the first
58*                                 row of the array A is distributed.
59*  CSRC_A (global) DESCA( CSRC_ ) The process column over which the
60*                                 first column of the array A is
61*                                 distributed.
62*  LLD_A  (local)  DESCA( LLD_ )  The leading dimension of the local
63*                                 array.  LLD_A >= MAX(1,LOCr(M_A)).
64*
65*  Let K be the number of rows or columns of a distributed matrix,
66*  and assume that its process grid has dimension p x q.
67*  LOCr( K ) denotes the number of elements of K that a process
68*  would receive if K were distributed over the p processes of its
69*  process column.
70*  Similarly, LOCc( K ) denotes the number of elements of K that a
71*  process would receive if K were distributed over the q processes of
72*  its process row.
73*  The values of LOCr() and LOCc() may be determined via a call to the
74*  ScaLAPACK tool function, NUMROC:
75*          LOCr( M ) = NUMROC( M, MB_A, MYROW, RSRC_A, NPROW ),
76*          LOCc( N ) = NUMROC( N, NB_A, MYCOL, CSRC_A, NPCOL ).
77*  An upper bound for these quantities may be computed by:
78*          LOCr( M ) <= ceil( ceil(M/MB_A)/NPROW )*MB_A
79*          LOCc( N ) <= ceil( ceil(N/NB_A)/NPCOL )*NB_A
80*
81*  Because vectors may be viewed as a subclass of matrices, a
82*  distributed vector is considered to be a distributed matrix.
83*
84*  When the result of a vector-oriented PBLAS call is a scalar, it will
85*  be made available only within the scope which owns the vector(s)
86*  being operated on.  Let X be a generic term for the input vector(s).
87*  Then, the processes which receive the answer will be (note that if
88*  an operation involves more than one vector, the processes which re-
89*  ceive the result will be the union of the following calculation for
90*  each vector):
91*
92*  If N = 1, M_X = 1 and INCX = 1, then one can't determine if a process
93*  row or process column owns the vector operand, therefore only the
94*  process of coordinate {RSRC_X, CSRC_X} receives the result;
95*
96*  If INCX = M_X, then sub( X ) is a vector distributed over a process
97*  row. Each process part of this row receives the result;
98*
99*  If INCX = 1, then sub( X ) is a vector distributed over a process
100*  column. Each process part of this column receives the result;
101*
102*  Based on PCAMAX from Level 1 PBLAS. The change is to use the
103*  'genuine' absolute value.
104*
105*  The serial version was contributed to LAPACK by Nick Higham for use
106*  with CLACON.
107*
108*  Arguments
109*  =========
110*
111*  N       (global input) pointer to INTEGER
112*          The number of components of the distributed vector sub( X ).
113*          N >= 0.
114*
115*  AMAX    (global output) pointer to REAL
116*          The absolute value of the largest entry of the distributed
117*          vector sub( X ) only in the scope of sub( X ).
118*
119*  INDX    (global output) pointer to INTEGER
120*          The global index of the element of the distributed vector
121*          sub( X ) whose real part has maximum absolute value.
122*
123*  X       (local input) COMPLEX array containing the local
124*          pieces of a distributed matrix of dimension of at least
125*              ( (JX-1)*M_X + IX + ( N - 1 )*abs( INCX ) )
126*          This array contains the entries of the distributed vector
127*          sub( X ).
128*
129*  IX      (global input) INTEGER
130*          The row index in the global array X indicating the first
131*          row of sub( X ).
132*
133*  JX      (global input) INTEGER
134*          The column index in the global array X indicating the
135*          first column of sub( X ).
136*
137*  DESCX   (global and local input) INTEGER array of dimension DLEN_.
138*          The array descriptor for the distributed matrix X.
139*
140*  INCX    (global input) INTEGER
141*          The global increment for the elements of X. Only two values
142*          of INCX are supported in this version, namely 1 and M_X.
143*          INCX must not be zero.
144*
145*  =====================================================================
146*
147*     .. Parameters ..
148      INTEGER            BLOCK_CYCLIC_2D, CSRC_, CTXT_, DLEN_, DTYPE_,
149     $                   LLD_, MB_, M_, NB_, N_, RSRC_
150      PARAMETER          ( BLOCK_CYCLIC_2D = 1, DLEN_ = 9, DTYPE_ = 1,
151     $                     CTXT_ = 2, M_ = 3, N_ = 4, MB_ = 5, NB_ = 6,
152     $                     RSRC_ = 7, CSRC_ = 8, LLD_ = 9 )
153      COMPLEX            ZERO
154      PARAMETER          ( ZERO = ( 0.0E+0, 0.0E+0 ) )
155*     ..
156*     .. Local Scalars ..
157      CHARACTER          CBTOP, CCTOP, RBTOP, RCTOP
158      INTEGER            ICOFF, ICTXT, IDUMM, IIX, IROFF, IXCOL, IXROW,
159     $                   JJX, LCINDX, LDX, MAXPOS, MYCOL, MYROW, NP,
160     $                   NPCOL, NPROW, NQ
161*     ..
162*     .. Local Arrays ..
163      COMPLEX            WORK( 2 )
164*     ..
165*     .. External Subroutines ..
166      EXTERNAL           BLACS_GRIDINFO, CCOMBAMAX1, CGAMX2D,
167     $                   IGEBR2D, IGEBS2D, INFOG2L, PCTREECOMB,
168     $                   PB_TOPGET
169*     ..
170*     .. External Functions ..
171      LOGICAL            LSAME
172      INTEGER            ICMAX1, INDXL2G, NUMROC
173      EXTERNAL           ICMAX1, INDXL2G, NUMROC
174*     ..
175*     .. Intrinsic Functions ..
176      INTRINSIC          ABS, CMPLX, MOD, NINT, REAL
177*     ..
178*     .. Executable Statements ..
179*
180*     Get grid parameters
181*
182      ICTXT = DESCX( CTXT_ )
183      CALL BLACS_GRIDINFO( ICTXT, NPROW, NPCOL, MYROW, MYCOL )
184*
185*     Quick return if possible.
186*
187      INDX = 0
188      AMAX = ZERO
189      IF( N.LE.0 )
190     $   RETURN
191*
192*     Retrieve local information for vector X.
193*
194      LDX = DESCX( LLD_ )
195      CALL INFOG2L( IX, JX, DESCX, NPROW, NPCOL, MYROW, MYCOL, IIX, JJX,
196     $              IXROW, IXCOL )
197*
198      IF( INCX.EQ.1 .AND. DESCX( M_ ).EQ.1 .AND. N.EQ.1 ) THEN
199         INDX = JX
200         AMAX = X( IIX+(JJX-1)*LDX )
201         RETURN
202      END IF
203*
204*     Find the maximum value and its index
205*
206      IF( INCX.EQ.DESCX( M_ ) ) THEN
207*
208         IF( MYROW.EQ.IXROW ) THEN
209*
210            ICOFF = MOD( JX-1, DESCX( NB_ ) )
211            NQ = NUMROC( N+ICOFF, DESCX( NB_ ), MYCOL, IXCOL, NPCOL )
212            IF( MYCOL.EQ.IXCOL )
213     $         NQ = NQ-ICOFF
214*
215            CALL PB_TOPGET( ICTXT, 'Broadcast', 'Rowwise', RBTOP )
216*
217            IF( LSAME( RBTOP, ' ' ) ) THEN
218*
219               IF( NQ.GT.0 ) THEN
220                  LCINDX = JJX-1+ICMAX1( NQ, X( IIX+(JJX-1)*LDX ), LDX )
221                  WORK( 1 ) = X( IIX+(LCINDX-1)*LDX )
222                  WORK( 2 ) = CMPLX( REAL( INDXL2G( LCINDX,
223     $              DESCX( NB_ ), MYCOL, DESCX( CSRC_ ), NPCOL ) ) )
224               ELSE
225                  WORK( 1 ) = ZERO
226                  WORK( 2 ) = ZERO
227               END IF
228*
229               CALL PCTREECOMB( ICTXT, 'Row', 2, WORK, -1, MYCOL,
230     $                          CCOMBAMAX1 )
231*
232               AMAX = WORK( 1 )
233               IF( AMAX.EQ.ZERO ) THEN
234                  INDX = JX
235               ELSE
236                  INDX = NINT( REAL( WORK( 2 ) ) )
237               END IF
238*
239            ELSE
240*
241               CALL PB_TOPGET( ICTXT, 'Combine', 'Rowwise', RCTOP )
242*
243               IF( NQ.GT.0 ) THEN
244                  LCINDX = JJX-1+ICMAX1( NQ, X( IIX+(JJX-1)*LDX ), LDX )
245                  AMAX = X( IIX + (LCINDX-1)*LDX )
246               ELSE
247                  AMAX = ZERO
248               END IF
249*
250*              Find the maximum value
251*
252               CALL CGAMX2D( ICTXT, 'Rowwise', RCTOP, 1, 1, AMAX, 1,
253     $                       IDUMM, MAXPOS, 1, -1, MYROW )
254*
255               IF( AMAX.NE.ZERO ) THEN
256*
257*                 Broadcast corresponding global index
258*
259                  IF( MYCOL.EQ.MAXPOS ) THEN
260                     INDX = INDXL2G( LCINDX, DESCX( NB_ ), MYCOL,
261     $                               DESCX( CSRC_ ), NPCOL )
262                     CALL IGEBS2D( ICTXT, 'Rowwise', RBTOP, 1, 1, INDX,
263     $                             1 )
264                  ELSE
265                     CALL IGEBR2D( ICTXT, 'Rowwise', RBTOP, 1, 1, INDX,
266     $                             1, MYROW, MAXPOS )
267                  END IF
268*
269               ELSE
270*
271                  INDX = JX
272*
273               END IF
274*
275            END IF
276*
277         END IF
278*
279      ELSE
280*
281         IF( MYCOL.EQ.IXCOL ) THEN
282*
283            IROFF = MOD( IX-1, DESCX( MB_ ) )
284            NP = NUMROC( N+IROFF, DESCX( MB_ ), MYROW, IXROW, NPROW )
285            IF( MYROW.EQ.IXROW )
286     $         NP = NP-IROFF
287*
288            CALL PB_TOPGET( ICTXT, 'Broadcast', 'Columnwise', CBTOP )
289*
290            IF( LSAME( CBTOP, ' ' ) ) THEN
291*
292               IF( NP.GT.0 ) THEN
293                  LCINDX = IIX-1+ICMAX1( NP, X( IIX+(JJX-1)*LDX ), 1 )
294                  WORK( 1 ) = X( LCINDX + (JJX-1)*LDX )
295                  WORK( 2 ) = CMPLX( REAL( INDXL2G( LCINDX,
296     $              DESCX( MB_ ), MYROW, DESCX( RSRC_ ), NPROW ) ) )
297               ELSE
298                  WORK( 1 ) = ZERO
299                  WORK( 2 ) = ZERO
300               END IF
301*
302               CALL PCTREECOMB( ICTXT, 'Column', 2, WORK, -1, MYCOL,
303     $                          CCOMBAMAX1 )
304*
305               AMAX = WORK( 1 )
306               IF( AMAX.EQ.ZERO ) THEN
307                  INDX = IX
308               ELSE
309                  INDX = NINT( REAL( WORK( 2 ) ) )
310               END IF
311*
312            ELSE
313*
314               CALL PB_TOPGET( ICTXT, 'Combine', 'Columnwise', CCTOP )
315*
316               IF( NP.GT.0 ) THEN
317                  LCINDX = IIX-1+ICMAX1( NP, X( IIX+(JJX-1)*LDX ), 1 )
318                  AMAX = X( LCINDX + (JJX-1)*LDX )
319               ELSE
320                  AMAX = ZERO
321               END IF
322*
323*              Find the maximum value
324*
325               CALL CGAMX2D( ICTXT, 'Columnwise', CCTOP, 1, 1, AMAX, 1,
326     $                       MAXPOS, IDUMM, 1, -1, MYCOL )
327*
328               IF( AMAX.NE.ZERO ) THEN
329*
330*                 Broadcast corresponding global index
331*
332                  IF( MYROW.EQ.MAXPOS ) THEN
333                     INDX = INDXL2G( LCINDX, DESCX( MB_ ), MYROW,
334     $                               DESCX( RSRC_ ), NPROW )
335                     CALL IGEBS2D( ICTXT, 'Columnwise', CBTOP, 1, 1,
336     $                             INDX, 1 )
337                  ELSE
338                     CALL IGEBR2D( ICTXT, 'Columnwise', CBTOP, 1, 1,
339     $                             INDX, 1, MAXPOS, MYCOL )
340                  END IF
341*
342               ELSE
343*
344                  INDX = IX
345*
346               END IF
347*
348            END IF
349*
350         END IF
351*
352      END IF
353*
354      RETURN
355*
356*     End of PCMAX1
357*
358      END
359*
360      SUBROUTINE CCOMBAMAX1 ( V1, V2 )
361*
362*  -- ScaLAPACK auxiliary routine (version 1.7) --
363*     University of Tennessee, Knoxville, Oak Ridge National Laboratory,
364*     and University of California, Berkeley.
365*     May 1, 1997
366*
367*     .. Array Arguments ..
368      COMPLEX            V1( 2 ), V2( 2 )
369*     ..
370*
371*  Purpose
372*  =======
373*
374*  CCOMBAMAX1 finds the element having maximum real part absolute
375*  value as well as its corresponding globl index.
376*
377*  Arguments
378*  =========
379*
380*  V1        (local input/local output) COMPLEX array of
381*            dimension 2.  The first maximum absolute value element and
382*            its global index. V1(1) = AMAX, V1(2) = INDX.
383*
384*  V2        (local input) COMPLEX array of dimension 2.
385*            The second maximum absolute value element and its global
386*            index. V2(1) = AMAX, V2(2) = INDX.
387*
388*  =====================================================================
389*
390*     .. Intrinsic Functions ..
391      INTRINSIC          ABS, REAL
392*     ..
393*     .. Executable Statements ..
394*
395      IF( ABS( REAL( V1( 1 ) ) ).LT.ABS( REAL( V2( 1 ) ) ) ) THEN
396         V1( 1 ) = V2( 1 )
397         V1( 2 ) = V2( 2 )
398      END IF
399*
400      RETURN
401*
402*     End of CCOMBAMAX1
403*
404      END
405