1      SUBROUTINE PZMAX1( N, AMAX, INDX, X, IX, JX, DESCX, INCX )
2*
3*  -- ScaLAPACK auxiliary routine (version 1.7) --
4*     University of Tennessee, Knoxville, Oak Ridge National Laboratory,
5*     and University of California, Berkeley.
6*     May 1, 1997
7*
8*     .. Scalar Arguments ..
9      INTEGER            INDX, INCX, IX, JX, N
10      COMPLEX*16         AMAX
11*     ..
12*     .. Array Arguments ..
13      INTEGER            DESCX( * )
14      COMPLEX*16         X( * )
15*     ..
16*
17*  Purpose
18*  =======
19*
20*  PZMAX1 computes the global index of the maximum element in absolute
21*  value of a distributed vector sub( X ). The global index is returned
22*  in INDX and the value is returned in AMAX,
23*
24*  where sub( X ) denotes X(IX:IX+N-1,JX) if INCX = 1,
25*                         X(IX,JX:JX+N-1) if INCX = M_X.
26*
27*  Notes
28*  =====
29*
30*  Each global data object is described by an associated description
31*  vector.  This vector stores the information required to establish
32*  the mapping between an object element and its corresponding process
33*  and memory location.
34*
35*  Let A be a generic term for any 2D block cyclicly distributed array.
36*  Such a global array has an associated description vector DESCA.
37*  In the following comments, the character _ should be read as
38*  "of the global array".
39*
40*  NOTATION        STORED IN      EXPLANATION
41*  --------------- -------------- --------------------------------------
42*  DTYPE_A(global) DESCA( DTYPE_ )The descriptor type.  In this case,
43*                                 DTYPE_A = 1.
44*  CTXT_A (global) DESCA( CTXT_ ) The BLACS context handle, indicating
45*                                 the BLACS process grid A is distribu-
46*                                 ted over. The context itself is glo-
47*                                 bal, but the handle (the integer
48*                                 value) may vary.
49*  M_A    (global) DESCA( M_ )    The number of rows in the global
50*                                 array A.
51*  N_A    (global) DESCA( N_ )    The number of columns in the global
52*                                 array A.
53*  MB_A   (global) DESCA( MB_ )   The blocking factor used to distribute
54*                                 the rows of the array.
55*  NB_A   (global) DESCA( NB_ )   The blocking factor used to distribute
56*                                 the columns of the array.
57*  RSRC_A (global) DESCA( RSRC_ ) The process row over which the first
58*                                 row of the array A is distributed.
59*  CSRC_A (global) DESCA( CSRC_ ) The process column over which the
60*                                 first column of the array A is
61*                                 distributed.
62*  LLD_A  (local)  DESCA( LLD_ )  The leading dimension of the local
63*                                 array.  LLD_A >= MAX(1,LOCr(M_A)).
64*
65*  Let K be the number of rows or columns of a distributed matrix,
66*  and assume that its process grid has dimension p x q.
67*  LOCr( K ) denotes the number of elements of K that a process
68*  would receive if K were distributed over the p processes of its
69*  process column.
70*  Similarly, LOCc( K ) denotes the number of elements of K that a
71*  process would receive if K were distributed over the q processes of
72*  its process row.
73*  The values of LOCr() and LOCc() may be determined via a call to the
74*  ScaLAPACK tool function, NUMROC:
75*          LOCr( M ) = NUMROC( M, MB_A, MYROW, RSRC_A, NPROW ),
76*          LOCc( N ) = NUMROC( N, NB_A, MYCOL, CSRC_A, NPCOL ).
77*  An upper bound for these quantities may be computed by:
78*          LOCr( M ) <= ceil( ceil(M/MB_A)/NPROW )*MB_A
79*          LOCc( N ) <= ceil( ceil(N/NB_A)/NPCOL )*NB_A
80*
81*  Because vectors may be viewed as a subclass of matrices, a
82*  distributed vector is considered to be a distributed matrix.
83*
84*  When the result of a vector-oriented PBLAS call is a scalar, it will
85*  be made available only within the scope which owns the vector(s)
86*  being operated on.  Let X be a generic term for the input vector(s).
87*  Then, the processes which receive the answer will be (note that if
88*  an operation involves more than one vector, the processes which re-
89*  ceive the result will be the union of the following calculation for
90*  each vector):
91*
92*  If N = 1, M_X = 1 and INCX = 1, then one can't determine if a process
93*  row or process column owns the vector operand, therefore only the
94*  process of coordinate {RSRC_X, CSRC_X} receives the result;
95*
96*  If INCX = M_X, then sub( X ) is a vector distributed over a process
97*  row. Each process part of this row receives the result;
98*
99*  If INCX = 1, then sub( X ) is a vector distributed over a process
100*  column. Each process part of this column receives the result;
101*
102*  Based on PZAMAX from Level 1 PBLAS. The change is to use the
103*  'genuine' absolute value.
104*
105*  The serial version was contributed to LAPACK by Nick Higham for use
106*  with ZLACON.
107*
108*  Arguments
109*  =========
110*
111*  N       (global input) pointer to INTEGER
112*          The number of components of the distributed vector sub( X ).
113*          N >= 0.
114*
115*  AMAX    (global output) pointer to DOUBLE PRECISION
116*          The absolute value of the largest entry of the distributed
117*          vector sub( X ) only in the scope of sub( X ).
118*
119*  INDX    (global output) pointer to INTEGER
120*          The global index of the element of the distributed vector
121*          sub( X ) whose real part has maximum absolute value.
122*
123*  X       (local input) COMPLEX*16 array containing the local
124*          pieces of a distributed matrix of dimension of at least
125*              ( (JX-1)*M_X + IX + ( N - 1 )*abs( INCX ) )
126*          This array contains the entries of the distributed vector
127*          sub( X ).
128*
129*  IX      (global input) INTEGER
130*          The row index in the global array X indicating the first
131*          row of sub( X ).
132*
133*  JX      (global input) INTEGER
134*          The column index in the global array X indicating the
135*          first column of sub( X ).
136*
137*  DESCX   (global and local input) INTEGER array of dimension DLEN_.
138*          The array descriptor for the distributed matrix X.
139*
140*  INCX    (global input) INTEGER
141*          The global increment for the elements of X. Only two values
142*          of INCX are supported in this version, namely 1 and M_X.
143*          INCX must not be zero.
144*
145*  =====================================================================
146*
147*     .. Parameters ..
148      INTEGER            BLOCK_CYCLIC_2D, CSRC_, CTXT_, DLEN_, DTYPE_,
149     $                   LLD_, MB_, M_, NB_, N_, RSRC_
150      PARAMETER          ( BLOCK_CYCLIC_2D = 1, DLEN_ = 9, DTYPE_ = 1,
151     $                     CTXT_ = 2, M_ = 3, N_ = 4, MB_ = 5, NB_ = 6,
152     $                     RSRC_ = 7, CSRC_ = 8, LLD_ = 9 )
153      COMPLEX*16         ZERO
154      PARAMETER          ( ZERO = ( 0.0D+0, 0.0D+0 ) )
155*     ..
156*     .. Local Scalars ..
157      CHARACTER          CBTOP, CCTOP, RBTOP, RCTOP
158      INTEGER            ICOFF, ICTXT, IDUMM, IIX, IROFF, IXCOL, IXROW,
159     $                   JJX, LCINDX, LDX, MAXPOS, MYCOL, MYROW, NP,
160     $                   NPCOL, NPROW, NQ
161*     ..
162*     .. Local Arrays ..
163      COMPLEX*16         WORK( 2 )
164*     ..
165*     .. External Subroutines ..
166      EXTERNAL           BLACS_GRIDINFO, IGEBR2D, IGEBS2D, INFOG2L,
167     $                   PB_TOPGET, PZTREECOMB, ZCOMBAMAX1, ZGAMX2D
168*     ..
169*     .. External Functions ..
170      LOGICAL            LSAME
171      INTEGER            IZMAX1, INDXL2G, NUMROC
172      EXTERNAL           IZMAX1, INDXL2G, NUMROC
173*     ..
174*     .. Intrinsic Functions ..
175      INTRINSIC          ABS, DBLE, DCMPLX, MOD, NINT
176*     ..
177*     .. Executable Statements ..
178*
179*     Get grid parameters
180*
181      ICTXT = DESCX( CTXT_ )
182      CALL BLACS_GRIDINFO( ICTXT, NPROW, NPCOL, MYROW, MYCOL )
183*
184*     Quick return if possible.
185*
186      INDX = 0
187      AMAX = ZERO
188      IF( N.LE.0 )
189     $   RETURN
190*
191*     Retrieve local information for vector X.
192*
193      LDX = DESCX( LLD_ )
194      CALL INFOG2L( IX, JX, DESCX, NPROW, NPCOL, MYROW, MYCOL, IIX, JJX,
195     $              IXROW, IXCOL )
196*
197      IF( INCX.EQ.1 .AND. DESCX( M_ ).EQ.1 .AND. N.EQ.1 ) THEN
198         INDX = JX
199         AMAX = X( IIX+(JJX-1)*LDX )
200         RETURN
201      END IF
202*
203*     Find the maximum value and its index
204*
205      IF( INCX.EQ.DESCX( M_ ) ) THEN
206*
207         IF( MYROW.EQ.IXROW ) THEN
208*
209            ICOFF = MOD( JX-1, DESCX( NB_ ) )
210            NQ = NUMROC( N+ICOFF, DESCX( NB_ ), MYCOL, IXCOL, NPCOL )
211            IF( MYCOL.EQ.IXCOL )
212     $         NQ = NQ-ICOFF
213*
214            CALL PB_TOPGET( ICTXT, 'Broadcast', 'Rowwise', RBTOP )
215*
216            IF( LSAME( RBTOP, ' ' ) ) THEN
217*
218               IF( NQ.GT.0 ) THEN
219                  LCINDX = JJX-1+IZMAX1( NQ, X( IIX+(JJX-1)*LDX ), LDX )
220                  WORK( 1 ) = X( IIX+(LCINDX-1)*LDX )
221                  WORK( 2 ) = DCMPLX( DBLE( INDXL2G( LCINDX,
222     $              DESCX( NB_ ), MYCOL, DESCX( CSRC_ ), NPCOL ) ) )
223               ELSE
224                  WORK( 1 ) = ZERO
225                  WORK( 2 ) = ZERO
226               END IF
227*
228               CALL PZTREECOMB( ICTXT, 'Row', 2, WORK, -1, MYCOL,
229     $                          ZCOMBAMAX1 )
230*
231               AMAX = WORK( 1 )
232               IF( AMAX.EQ.ZERO ) THEN
233                  INDX = JX
234               ELSE
235                  INDX = NINT( DBLE( WORK( 2 ) ) )
236               END IF
237*
238            ELSE
239*
240               CALL PB_TOPGET( ICTXT, 'Combine', 'Rowwise', RCTOP )
241*
242               IF( NQ.GT.0 ) THEN
243                  LCINDX = JJX-1+IZMAX1( NQ, X( IIX+(JJX-1)*LDX ), LDX )
244                  AMAX = X( IIX + (LCINDX-1)*LDX )
245               ELSE
246                  AMAX = ZERO
247               END IF
248*
249*              Find the maximum value
250*
251               CALL ZGAMX2D( ICTXT, 'Rowwise', RCTOP, 1, 1, AMAX, 1,
252     $                       IDUMM, MAXPOS, 1, -1, MYROW )
253*
254               IF( AMAX.NE.ZERO ) THEN
255*
256*                 Broadcast corresponding global index
257*
258                  IF( MYCOL.EQ.MAXPOS ) THEN
259                     INDX = INDXL2G( LCINDX, DESCX( NB_ ), MYCOL,
260     $                               DESCX( CSRC_ ), NPCOL )
261                     CALL IGEBS2D( ICTXT, 'Rowwise', RBTOP, 1, 1, INDX,
262     $                             1 )
263                  ELSE
264                     CALL IGEBR2D( ICTXT, 'Rowwise', RBTOP, 1, 1, INDX,
265     $                             1, MYROW, MAXPOS )
266                  END IF
267*
268               ELSE
269*
270                  INDX = JX
271*
272               END IF
273*
274            END IF
275*
276         END IF
277*
278      ELSE
279*
280         IF( MYCOL.EQ.IXCOL ) THEN
281*
282            IROFF = MOD( IX-1, DESCX( MB_ ) )
283            NP = NUMROC( N+IROFF, DESCX( MB_ ), MYROW, IXROW, NPROW )
284            IF( MYROW.EQ.IXROW )
285     $         NP = NP-IROFF
286*
287            CALL PB_TOPGET( ICTXT, 'Broadcast', 'Columnwise', CBTOP )
288*
289            IF( LSAME( CBTOP, ' ' ) ) THEN
290*
291               IF( NP.GT.0 ) THEN
292                  LCINDX = IIX-1+IZMAX1( NP, X( IIX+(JJX-1)*LDX ), 1 )
293                  WORK( 1 ) = X( LCINDX + (JJX-1)*LDX )
294                  WORK( 2 ) = DCMPLX( DBLE( INDXL2G( LCINDX,
295     $              DESCX( MB_ ), MYROW, DESCX( RSRC_ ), NPROW ) ) )
296               ELSE
297                  WORK( 1 ) = ZERO
298                  WORK( 2 ) = ZERO
299               END IF
300*
301               CALL PZTREECOMB( ICTXT, 'Column', 2, WORK, -1, MYCOL,
302     $                          ZCOMBAMAX1 )
303*
304               AMAX = WORK( 1 )
305               IF( AMAX.EQ.ZERO ) THEN
306                  INDX = IX
307               ELSE
308                  INDX = NINT( DBLE( WORK( 2 ) ) )
309               END IF
310*
311            ELSE
312*
313               CALL PB_TOPGET( ICTXT, 'Combine', 'Columnwise', CCTOP )
314*
315               IF( NP.GT.0 ) THEN
316                  LCINDX = IIX-1+IZMAX1( NP, X( IIX+(JJX-1)*LDX ), 1 )
317                  AMAX = X( LCINDX + (JJX-1)*LDX )
318               ELSE
319                  AMAX = ZERO
320               END IF
321*
322*              Find the maximum value
323*
324               CALL ZGAMX2D( ICTXT, 'Columnwise', CCTOP, 1, 1, AMAX, 1,
325     $                       MAXPOS, IDUMM, 1, -1, MYCOL )
326*
327               IF( AMAX.NE.ZERO ) THEN
328*
329*                 Broadcast corresponding global index
330*
331                  IF( MYROW.EQ.MAXPOS ) THEN
332                     INDX = INDXL2G( LCINDX, DESCX( MB_ ), MYROW,
333     $                               DESCX( RSRC_ ), NPROW )
334                     CALL IGEBS2D( ICTXT, 'Columnwise', CBTOP, 1, 1,
335     $                             INDX, 1 )
336                  ELSE
337                     CALL IGEBR2D( ICTXT, 'Columnwise', CBTOP, 1, 1,
338     $                             INDX, 1, MAXPOS, MYCOL )
339                  END IF
340*
341               ELSE
342*
343                  INDX = IX
344*
345               END IF
346*
347            END IF
348*
349         END IF
350*
351      END IF
352*
353      RETURN
354*
355*     End of PZMAX1
356*
357      END
358*
359      SUBROUTINE ZCOMBAMAX1 ( V1, V2 )
360*
361*  -- ScaLAPACK auxiliary routine (version 1.7) --
362*     University of Tennessee, Knoxville, Oak Ridge National Laboratory,
363*     and University of California, Berkeley.
364*     May 1, 1997
365*
366*     .. Array Arguments ..
367      COMPLEX*16         V1( 2 ), V2( 2 )
368*     ..
369*
370*  Purpose
371*  =======
372*
373*  ZCOMBAMAX1 finds the element having maximum real part absolute
374*  value as well as its corresponding globl index.
375*
376*  Arguments
377*  =========
378*
379*  V1        (local input/local output) COMPLEX*16 array of
380*            dimension 2.  The first maximum absolute value element and
381*            its global index. V1(1) = AMAX, V1(2) = INDX.
382*
383*  V2        (local input) COMPLEX*16 array of dimension 2.
384*            The second maximum absolute value element and its global
385*            index. V2(1) = AMAX, V2(2) = INDX.
386*
387*  =====================================================================
388*
389*     .. Intrinsic Functions ..
390      INTRINSIC          ABS, DBLE
391*     ..
392*     .. Executable Statements ..
393*
394      IF( ABS( DBLE( V1( 1 ) ) ).LT.ABS( DBLE( V2( 1 ) ) ) ) THEN
395         V1( 1 ) = V2( 1 )
396         V1( 2 ) = V2( 2 )
397      END IF
398*
399      RETURN
400*
401*     End of ZCOMBAMAX1
402*
403      END
404