Name            Date         Size       #Lines  LOC
..              03-May-2022  -
barrier/        07-May-2022  -          1,198   996
catman/         07-May-2022  -          2,123   1,626
common/         07-May-2022  -          1,374   943
dsh/            07-May-2022  -          2,773   2,398
dtop/           07-May-2022  -          2,128   1,883
dvt/            07-May-2022  -          1,472   1,258
html/man/       07-May-2022  -          17      15
jsd/            07-May-2022  -          2,025   1,703
pcp/            07-May-2022  -          1,433   1,246
regress/        07-May-2022  -          15      8
rvt/            07-May-2022  -          7,481   6,052
tools/          07-May-2022  -          1,087   991
CHANGES         27-Feb-2008  7.4 KiB    211     140
INSTALL         24-May-2005  710        21      15
Makefile.am     22-May-2007  121        3       1
Makefile.in     03-May-2022  18.4 KiB   602     523
README          27-Feb-2008  4.5 KiB    104     81
README-DVT      02-Nov-2003  838        18      13
TODO            12-Dec-2005  1.2 KiB    42      25
aclocal.m4      27-Feb-2008  31.5 KiB   875     782
autogen.sh      04-Oct-2004  4.3 KiB    153     129
clusterit.spec  27-Feb-2008  2.2 KiB    92      80
config.h.in     27-Feb-2008  4.4 KiB    170     113
configure       03-May-2022  251.3 KiB  9,089   7,599
configure.ac    27-Feb-2008  1.8 KiB    72      62
depcomp         04-Oct-2004  14.8 KiB   527     335
install-sh      04-Oct-2004  9.3 KiB    326     189
missing         04-Oct-2004  10.4 KiB   361     270

README

$Id: README,v 1.11 2008/02/27 19:35:46 garbled Exp $

Welcome to clusterit-2.5!

This is a collection of clustering tools, to turn your ordinary
everyday pile of UNIX workstations into a speedy parallel beast.

To get started quickly, please read the file INSTALL.

Initially this work was based on IBM's PSSP, and copied heavily from
the ideas there.  It's also lightly based on the work pioneered in
GLUnix.  However, I've decided to simplify it in some ways and make it
more complex in others:

GLUnix is a monstrosity.  It allows better control over the
individual nodes, and much better load sharing.  However, I'm convinced
a lot of the speed advantages of having a parallel cluster are lost to
the incredible overhead of running the GLUnix master and daemon services
on a host.  GLUnix does, however, offer a real parallel programming
environment, something which is totally beyond the scope of this package.

PSSP is also a very powerful set of tools.  Not much more than a bunch
of staples written in Perl, they provide an incredible tool for tying
an unwieldy number of UNIX machines into one fast demon of an MPP.

The advantage of both systems is central control of a large number of
machines.  Unfortunately, they both have drawbacks, as does my solution.

What my solution provides:

*Fast* parallel execution of remote commands.
	C vs. Perl.  You do the math.

Heterogeneous cluster makeup.
	This makes it very easy to administer a large number of machines
of varying architectures and operating systems.  Because my tools are
completely architecture independent, it is possible to dsh commands out
to machines that aren't even running the same OS!  This can be useful for a
variety of mass administration tasks an admin may have to undertake.

Choice of authentication.
	IBM forces you to use Kerberos 4 for authentication on the SPs.
This is actually fine for a closed environment like an SP, but for something
to be run on just a stack of otherwise useful boxes, you need more freedom.
This suite allows you to do whatever you like: ssh, Kerberos, .rhosts.
Whatever suits your security and speed requirements best.

Sequential node, and random node execution
	The idea here is that these dsh-like programs allow you to do something
akin to load balanced scripting.  For example, one could set up an NFS shared
build directory, and issue the command
make -j4 CC='seq gcc'
which would execute a build in parallel on 4 nodes in your cluster, assigning
processes to each node in sequence.  The run command is equivalent to saying:
"I don't care where you run, just run and tell me how things turned out."

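As a rough illustration of the "assign processes to each node in sequence"
idea above, the sketch below hands out nodes in round-robin order in plain
sh.  It is not how seq itself is implemented; the node names and the
pick_node helper are made up for the example.

```sh
#!/bin/sh
# Hypothetical node list; a real cluster would supply its own host names.
NODES="node1 node2 node3 node4"

# pick_node N: print the node that job number N (counting from 0) lands on
# when nodes are handed out in sequence and the list wraps around.
pick_node() {
    job=$1
    # Load the node names into the positional parameters.
    set -- $NODES
    # Wrap the job number around the list, then shift to that node.
    shift $(( job % $# ))
    echo "$1"
}

# Six jobs over four nodes: jobs 4 and 5 wrap back to node1 and node2.
for job in 0 1 2 3 4 5; do
    echo "job $job -> $(pick_node "$job")"
done
```

With a shared build directory, each compile job started by make would land
on the next node in the list in the same way.
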
Job Scheduled Shell:

The jsd/jsh pair of programs was specifically designed for parallel
compiling.  The idea is that the user sets up a benchmark program of some
sort, which is executed by the jsd program.  This benchmark then ranks
the machines in the cluster by performance.  When the jsh command is run,
the fastest available machine will be given the command to execute.  At the
same time, jsd marks that node as in use, and refuses to give other
commands to it until it completes.  In this way, you can avoid the
problem where a single slower machine tends to accumulate much work
because it isn't finishing quickly enough.  It also tends to favor the
fastest machine in a cluster, giving it most of the work in a parallel
compile.

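The scheduling rule described above (rank nodes by benchmark, hand each
command to the fastest node that is not busy) can be sketched in a few lines
of sh.  This is only a single-process model of the idea, not jsd's actual
code or protocol; the benchmark numbers, node names, and the claim_node and
release_node helpers are all invented for the example.

```sh
#!/bin/sh
# Hypothetical benchmark results, one "seconds node" pair per line.
# Lower is faster.  Sorting numerically gives the ranking, fastest first.
RANKED=$(printf '%s\n' "12 slowbox" "3 fastbox" "7 midbox" |
    sort -n | awk '{ print $2 }')
BUSY=""

# claim_node: set CLAIMED to the fastest node that is not busy and mark it
# busy.  Returns 1 (and leaves CLAIMED empty) if every node is in use.
claim_node() {
    CLAIMED=""
    for node in $RANKED; do
        case " $BUSY " in
            *" $node "*) continue ;;   # already running a job, skip it
        esac
        BUSY="$BUSY $node"
        CLAIMED=$node
        return 0
    done
    return 1
}

# release_node NAME: mark NAME as free again once its job has finished.
release_node() {
    remaining=""
    for node in $BUSY; do
        [ "$node" = "$1" ] || remaining="$remaining $node"
    done
    BUSY=$remaining
}

claim_node && echo "first job  -> $CLAIMED"
claim_node && echo "second job -> $CLAIMED"
release_node fastbox
claim_node && echo "third job  -> $CLAIMED"
```

The first and third jobs both go to fastbox, which shows the bias toward the
fastest machine mentioned above: as soon as it is free, it gets the next
command.
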
Barrier sync for shell scripting.
	This is a new idea.  The barrier mechanism consists of a daemon run on
a host, and a client which can be used to barrier sync with.  An example of use
would be:

#!/bin/sh
do something
barrier -h host -k token -s 5
do something else

Then, you would dsh the execution of this script to your hosts.  The barrier
makes sure that all hosts have completed the first "something" before they
continue on to the next something.  The -s is the level of parallelism for
the script, i.e. how many processes to wait for before continuing.

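To make the barrier semantics concrete, here is a small local simulation in
sh.  It does not use the barrier client or daemon at all: a scratch
directory stands in for the daemon, three background shell functions stand
in for three hosts, and every name in it (wait_at_barrier, worker, run_demo)
is hypothetical.  It also assumes mktemp is available.

```sh
#!/bin/sh
# wait_at_barrier ID COUNT: check in as worker ID, then block until COUNT
# workers have checked in.  BARRIER_DIR stands in for the barrier daemon.
wait_at_barrier() {
    id=$1
    need=$2
    : > "$BARRIER_DIR/$id"
    while :; do
        # Count check-in files by loading them into the positional parameters.
        set -- "$BARRIER_DIR"/*
        [ "$#" -ge "$need" ] && break
        sleep 1
    done
}

# worker ID: do the first step, wait for all three workers, do the next step.
worker() {
    echo "worker $1: first something" >> "$LOG"
    wait_at_barrier "$1" 3
    echo "worker $1: next something" >> "$LOG"
}

# run_demo: start three workers in parallel and print what they logged.
run_demo() {
    BARRIER_DIR=$(mktemp -d)
    LOG=$(mktemp)
    worker 1 &
    worker 2 &
    worker 3 &
    wait
    cat "$LOG"
    rm -rf "$BARRIER_DIR" "$LOG"
}

run_demo
```

The order of workers within each step can vary from run to run, but no
"next something" line is logged until all three "first something" lines have
been written, which is the guarantee the barrier provides.
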
dvt:

This is a parallel interactive execution environment.  The user is given
windows for each host in the cluster, and a central management window.
Keystrokes typed in the central management window will be relayed to all
of the subordinate windows.  This allows the user to vi a file on 20
machines simultaneously, for example.  You can also select a window and
use it like a normal xterm to perform actions on just that host.

What my solution does not provide:

A parallel programming API
	Use MPI, or PVM, or whatever for that.  That's outside the scope of
this suite.

Please visit the ClusterIt homepage for more information:
http://clusterit.sourceforge.net/
Tim Rightnour <root@garbled.net>

README-DVT

I've had reports of people having trouble running dvt on various machines
and clusters.  IMHO, this is probably the second most useful piece of
clusterit, so hopefully these hints will get you started.  If not, feel
free to email me at root@garbled.net for help.  I'm usually nice.

1) Make sure that rvt is in your path, and is executable.  Try running it
manually just to make sure it works.

2) rvt/dvt default to using rlogin.  You probably want to set this to ssh.
If you aren't sure this is working, or don't want to deal with ssh, one trick
I usually use is to set the RLOGIN_CMD environment variable to "telnet" and
just log in manually.

3) Make sure you have your display set, and all that fun stuff.  If you can't
run an xterm, dvt/rvt won't work either.

If none of this helps, send me a note; perhaps I can help you with it.
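
The three hints above can be turned into a quick pre-flight check.  The
following is a minimal sketch in sh, not something shipped with clusterit:
the check_dvt_env function and its messages are made up, and it relies only
on the rvt program name and the RLOGIN_CMD and DISPLAY variables mentioned
in the hints.

```sh
#!/bin/sh
# check_dvt_env: print one line per issue found.  Returns non-zero if rvt is
# missing or no display is set; an unset RLOGIN_CMD is only reported as a note.
check_dvt_env() {
    problems=0
    # Hint 1: rvt must be in the path and executable.
    if ! command -v rvt >/dev/null 2>&1; then
        echo "rvt not found in PATH"
        problems=$((problems + 1))
    fi
    # Hint 2: rvt/dvt default to rlogin unless told otherwise.
    if [ -z "${RLOGIN_CMD:-}" ]; then
        echo "note: RLOGIN_CMD is unset, so rlogin will be used"
    fi
    # Hint 3: without a display, neither xterm nor dvt/rvt can open a window.
    if [ -z "${DISPLAY:-}" ]; then
        echo "DISPLAY is not set"
        problems=$((problems + 1))
    fi
    [ "$problems" -eq 0 ]
}

# Use ssh instead of the rlogin default, then run the check.
RLOGIN_CMD=ssh
export RLOGIN_CMD
check_dvt_env || echo "fix the items above before starting dvt"
```

Setting RLOGIN_CMD to "telnet" instead works the same way if you would
rather log in by hand.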