The Center has a 16-processor IBM SP-2 distributed memory message passing multicomputer and an 8-processor IBM J-40 shared memory multiprocessor. The following, among others, are available on these machines.
This document provides some basic information on using the CPDC parallel machines. Please read it carefully before trying to use these machines. It is by no means complete and does not attempt to explain everything in full detail. Since these are still developmental machines, some details here may not match the existing environment. If you find anything missing, or have comments or suggestions, please send me mail. We will try to keep this file up to date with the latest information.
Make sure that the following are in your PATH:
/usr/local/bin
/usr/bin
/usr/sbin
/usr/ucb
/usr/lpp/X11/bin
/u/loadl/bin
/usr/local/mpich/bin
/usr/lpp/xlC/bin
Make sure that the following are in your MANPATH:
/usr/local/man
/usr/share/man
/usr/local/mpich/man
/u/loadl/man
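One way to set both variables is sketched below. It assumes a Bourne-compatible login shell (sh, ksh or bash), which this page does not confirm for your account; csh users would need the equivalent setenv commands. The directories are simply appended to whatever is already set, and the lines would typically go in your ~/.profile.

```shell
# Append the directories listed above to PATH and MANPATH.
PATH=${PATH}:/usr/local/bin:/usr/bin:/usr/sbin:/usr/ucb:/usr/lpp/X11/bin:/u/loadl/bin:/usr/local/mpich/bin:/usr/lpp/xlC/bin

# ${MANPATH:+...} avoids a leading colon when MANPATH was previously empty.
MANPATH=${MANPATH:+${MANPATH}:}/usr/local/man:/usr/share/man:/usr/local/mpich/man:/u/loadl/man

export PATH MANPATH
```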
Also, remember to have a .rhosts file listing all the SP-2 nodes, blueser and aixdev.
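As a rough sketch, such a file could be generated with a short loop. Several things here are assumptions rather than facts from this page: that the 16 nodes are named speth01 through speth16 (only speth01 to speth04 appear in this document), that blueser and aixdev live in the same cpdc.ece.nwu.edu domain, and that your login name is the same on every machine. The sketch writes to a hypothetical file rhosts.sample so that nothing in your home directory is overwritten; check the names and then copy it to ~/.rhosts yourself.

```shell
# Login name to trust from each host (falls back to id if LOGNAME is unset).
user=${LOGNAME:-$(id -un)}
out=rhosts.sample          # hypothetical file name; review before installing

: > "$out"                 # start with an empty file

# One "host user" line per SP-2 node. The speth01..speth16 naming is an assumption.
i=1
while [ "$i" -le 16 ]; do
    printf 'speth%02d.cpdc.ece.nwu.edu %s\n' "$i" "$user" >> "$out"
    i=$((i + 1))
done

# The two other machines named above; their domain is assumed.
printf 'blueser.cpdc.ece.nwu.edu %s\n' "$user" >> "$out"
printf 'aixdev.cpdc.ece.nwu.edu %s\n' "$user" >> "$out"

# A .rhosts file should not be readable or writable by anyone but its owner.
chmod 600 "$out"
```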
For MPI programs, check out the online documentation, which has links to various MPI web pages. The Hitchhiker's guide has a number of example MPI programs. If you are using the native IBM MPI compiler, you can compile MPI programs with the following commands.
mpcc -o prog prog.c
mpCC -o prog prog.C
mpxlf -o prog prog.f
If you are using mpich, use mpicc and mpif77. There is no mpiCC, however. For compiling C++ programs, you will need to use a C++ compiler and link the mpich library explicitly. For more information, refer to the man pages.
To compile HPF programs, use xlhpf90 (Standard High Performance Fortran compiler) or xlhpf (High Performance Fortran compiler with F77 behavior). The relevant libraries are -L/usr/lpp/xlhpf/lib, -lxlf90, -lxlf, -lm and -lc, and some of the relevant options are -qhpf, -qfree=f90, -qflag=l:l and -qreport=hpflist. You may need to set the environment variable LIBPATH to the library directories.
To compile F90 programs, use xlf90. Some of the relevant libraries are -lxlf90, -lxlf, -lm and -lc, and a relevant option is -qfree=f90. You may need to set the environment variable LIBPATH to the library directories.
To compile an HPF program using PGI's compiler, use pghpf. Do a "man pghpf" to learn more about this compiler.
To run an MPI program on blueser compiled using the mpich library, use mpirun. Refer to the man page for details.
To run an MPI program compiled using IBM's native compiler, use poe. To submit a job on SP-2 nodes, use LoadLeveler. Check out the LoadLeveler documentation for details.
If you are not using the loadleveler for submission of a job, you need to have a host.list file to inform poe about the set of processors you want to use. For instance, to run a program on 4 processors, you need to have a host.list file containing a list of at least 4 processors like
speth01.cpdc.ece.nwu.edu
speth02.cpdc.ece.nwu.edu
speth03.cpdc.ece.nwu.edu
speth04.cpdc.ece.nwu.edu
Then issue the command:
prog -procs 4 -euidevice css0 -euilib us -spname spcws
However, interactive use of the SP-2 nodes in this way has been disabled for the time being. At present, you have to build your job either with the xloadl interface or with a command file. Here is a sample command file.
#
# @ executable = /usr/bin/poe
# @ class = small
# @ arguments = full path of your program together with parameters -euidevice css0 -euilib us -spname spcws
# @ job_type = parallel
# @ requirements = (Adapter == "hps_user")
# @ output = full path of output file, if any
# @ error = full path of error file, if any
# @ min_processors = 4
# @ max_processors = 8
# @ queue
Submit this script using llsubmit. You may use llq to query the job status, llsummary to return job resource information, and llcancel to cancel your job.
LoadLeveler now recognizes different job classes and user groups, and users must specify these in their job scripts for a job to run successfully.
Job classes are used to identify resources needed by a job. Five job classes are recognized:
small : This class is intended for very small jobs that take less than 2 minutes to execute.
small-10 : This class is intended for small jobs that take less than 10 minutes to execute.
medium-60 : This class is intended for jobs that take less than 1 hour to finish.
medium : This class is intended for average jobs that run for less than 4 hours.
large : This class is intended for jobs that finish within 12 hours.
Read the note below regarding the restrictions on the use of these classes.
In increasing order of priority, the job classes are large, medium and small.
To specify a medium class in your job script:
# @ class = medium
If you do not specify a class in your job script, it defaults to small. Also, a job that exceeds the time limit for its class will be aborted. For example, a medium class job will be aborted if it has not terminated after 4 hours.
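Because a job is aborted once it passes its class limit, it helps to pick the smallest class that still covers your expected run time. The helper below is a hypothetical sketch (pick_class is not a LoadLeveler command) that maps an estimate in whole minutes onto the five classes listed above. It treats every limit as strict, including the 12-hour limit for large, which is a conservative assumption.

```shell
# Hypothetical helper: print the smallest job class whose time limit
# is greater than the expected run time (given in whole minutes).
pick_class() {
    minutes=$1
    if   [ "$minutes" -lt 2 ];   then echo small
    elif [ "$minutes" -lt 10 ];  then echo small-10
    elif [ "$minutes" -lt 60 ];  then echo medium-60
    elif [ "$minutes" -lt 240 ]; then echo medium
    elif [ "$minutes" -lt 720 ]; then echo large
    else
        echo "no class allows a run of $minutes minutes" >&2
        return 1
    fi
}

# Print the directive to paste into a job command file for a 45-minute run.
echo "# @ class = $(pick_class 45)"
```

When estimating, leave some headroom below the limit, since a job that overruns is aborted rather than moved to a larger class.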
Three user groups are recognized:
students : This group is for people who have class accounts.
research : This group is for CPDC researchers.
usr : This is the default group.
You can specify the students group in the job script with:
# @ group = students
A job that does not specify a group defaults to the usr group.
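Putting the class and group directives together with the sample command file, a complete job script might look like the sketch below. The file name myjob.cmd and the /path/to/your/... entries are placeholders, not real paths; the remaining keywords are copied from the sample command file earlier on this page.

```shell
# Write a sample job command file. The file name and the program and
# output paths are placeholders; substitute your own full paths.
cat > myjob.cmd <<'EOF'
#
# @ executable = /usr/bin/poe
# @ class = medium
# @ group = research
# @ arguments = /path/to/your/prog -euidevice css0 -euilib us -spname spcws
# @ job_type = parallel
# @ requirements = (Adapter == "hps_user")
# @ output = /path/to/your/prog.out
# @ error = /path/to/your/prog.err
# @ min_processors = 4
# @ max_processors = 8
# @ queue
EOF

# Then, on a machine where LoadLeveler is installed:
#   llsubmit myjob.cmd    # submit the job
#   llq                   # check its status
```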
At present, no user is allowed to run more than 3 jobs at a time. If you submit more than 3 jobs, you will find that only 3 of them are in the queue (I) and the extra jobs are not queued (NQ). The extra jobs will be queued once the previous ones exit.
There is also a limit on the total number of jobs that can be submitted at any one time. If you find that LoadLeveler is not accepting any more jobs, please wait for some jobs to finish and then try again.
***** Important Note *****
During the day (8 a.m. to 8 p.m.), a maximum of 8 processors can be used for classes
medium-60, medium and large. Any job with these classes and number of
processors more than 8 will automatically have the number of processors
reduced to 8. Use of processors/classes is unrestricted at any other time.
As a consideration to other users, submit only ONE medium-60, medium or
large job during the day.
As usual, anyone seen violating these will observe undesirable side-effects.
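To make the daytime rule concrete, here is a small hypothetical helper (effective_procs is not part of LoadLeveler) that prints how many processors a job would actually get. It assumes "day" means from 08:00 up to, but not including, 20:00, and it takes the hour as an argument so the rule can be checked for any time.

```shell
# Hypothetical helper: print the number of processors a job would get
# under the daytime rule.
# Arguments: job class, requested processors, hour of day (0-23).
effective_procs() {
    class=$1
    procs=$2
    hour=$3
    case $class in
        medium-60|medium|large)
            # Only these three classes are capped, and only during the day.
            if [ "$hour" -ge 8 ] && [ "$hour" -lt 20 ] && [ "$procs" -gt 8 ]; then
                procs=8
            fi
            ;;
    esac
    echo "$procs"
}

effective_procs large 16 14    # daytime, capped class: prints 8
effective_procs large 16 22    # night, no cap: prints 16
effective_procs small 16 14    # small classes are not capped: prints 16
```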
Please send mail to staff if you encounter any problem with the IBM J-40 or the IBM SP-2.
Last Modified : Tuesday, 14-Apr-98 15:53:21