ITS/Grid Computing 
California Polytechnic State University


Crash course for the impatient

This is a quick guide for users who are familiar with the Linux environment and will be running their own code rather than a commercial software package.

This guide points you in the right direction to get your code running, without detailed explanations.

After logging on successfully, run the command below to create all the files needed for this guide:

$ run_quick

You will see the following files:

  • hi.c
  • hello++.cc
  • fpi.f
  • cpi.c
  • pbs.sh
  • mpirun.sh

Now, to test your environment, compile the following files:

$ mpicc -o cpi cpi.c
$ mpif77 -o fpi fpi.f

The mpicc and mpif77 commands compile parallel (non-serial) C and Fortran code, respectively. Parallel code spreads its work across multiple nodes for greater processing power, using the Message Passing Interface (MPI) library.

Similarly:

$ gcc -o hi hi.c
$ g++ -o hello++ hello++.cc

will compile serial C and C++ code, which runs on a single processor and does not use MPI.
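Before compiling, it can help to confirm that the four compilers used above are actually on your PATH. The following is a minimal sketch; the function name check_tools is made up for this guide and is not a cluster command.

```shell
#!/bin/bash
# Report whether each named tool can be found on the PATH.
# check_tools is a hypothetical helper, not part of MPI or PBS.
check_tools() {
    local tool
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "$tool: found"
        else
            echo "$tool: missing"
        fi
    done
}

# The compilers used in this guide.
check_tools mpicc mpif77 gcc g++
```

If any tool is reported as missing, the compile commands above will fail with "command not found".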

Feel free to look at the "cpi.c" and/or "fpi.f" source code. Parallel programming is not difficult to learn. For more on this subject:

http://www.cs.rit.edu/~ncs/parallel.html.

The Portable Batch System (PBS)

PBS is the system that will do all the work for you. You will use PBS to submit and obtain information about your job. While there are a number of commands available, most users will need to be familiar with only two commands, qsub and qstat.

There is a tool available to aid in the creation of a PBS file.
It is a script called PBSwriter.sh.
To run it just type PBSwriter.sh, and answer the questions.

  • qsub
    This is the command to submit your job to the system. A job is a shell script that will run your program. It is in this shell script that you set the attributes that control how your program runs.

    Simple script (pbs.sh):

    #!/bin/bash
    #
    #SIMPLE SCRIPT
    #
    #PBS -l nodes=2,walltime=12:00
    #PBS -N simple
    #
    #cd into the directory where I typed qsub
    cd $PBS_O_WORKDIR
    #run it
    ./hi
    #

    Lines whose first characters are #PBS are called directives. Directives are parameters to "qsub", and they must precede any executable line in the script.

    A table of PBS directives is shown at the bottom of this page.

    In this sample, the directive uses the "-l" (resource list) flag, which specifies resources. Here we ask for 2 nodes and a maximum run time of 12 minutes: walltime is read as [[hh:]mm:]ss, so "12:00" means 12 minutes, and 12 hours would be written "12:00:00".
    The "-N" flag sets a name for your job (otherwise the script name is used).
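PBS reads a walltime value as [[hh:]mm:]ss, so a two-field value such as 12:00 is minutes and seconds. The sketch below shows one way to check how a value will be interpreted before you submit; the function name walltime_to_seconds is made up for this guide and is not part of PBS.

```shell
#!/bin/bash
# Convert a PBS walltime string ([[hh:]mm:]ss) to a number of seconds.
# walltime_to_seconds is a hypothetical helper, not a PBS command.
walltime_to_seconds() {
    local parts p total=0
    IFS=: read -r -a parts <<< "$1"      # split the value on ':'
    for p in "${parts[@]}"; do
        # 10# forces base 10 so fields like "08" are not read as octal
        total=$(( total * 60 + 10#$p ))
    done
    echo "$total"
}

walltime_to_seconds 12:00       # prints 720 (12 minutes)
walltime_to_seconds 12:00:00    # prints 43200 (12 hours)
```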

    It is good practice to include the "cd $PBS_O_WORKDIR" command in all your scripts. A job otherwise starts in your home directory; with this command it runs in the directory where you invoked "qsub", so relative paths such as "./hi" resolve correctly and any files your program writes are placed there.

    Now run the script:

    $ qsub pbs.sh

    When you submit your job, qsub prints a message of the form "XXX.mgt.cluster.com", where XXX is your job ID. With this number you can monitor the status of your job using the "qstat" command (impatient? Read "qstat" below).

    The output of this job will be placed in the directory where you called "qsub". You will see two new files in your current directory. Their names have the form:

    <job_name>.eXXX and <job_name>.oXXX

    where <job_name> is the name set with "-N" (or the script name if "-N" is not used). In our case the files are "simple.eXXX" and "simple.oXXX" (XXX is the job ID).

    The standard output of our "hi" program will be placed in the ".oXXX" file.
    If an error occurs, the error message (standard error) will be placed in the ".eXXX" file, and the ".oXXX" file may be empty.
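The naming rule for the two output files can be sketched as a small helper that takes the job name and the identifier printed by qsub. The function name job_output_files is made up for this guide; the identifier "123.mgt.cluster.com" is only an example value.

```shell
#!/bin/bash
# Print the names of the stdout and stderr files PBS creates for a job.
# $1: job name (the -N value, or the script name if -N is not set)
# $2: identifier printed by qsub, e.g. "123.mgt.cluster.com"
# job_output_files is a hypothetical helper, not a PBS command.
job_output_files() {
    local name=$1
    local id=${2%%.*}      # keep only the part before the first dot
    echo "$name.o$id $name.e$id"
}

job_output_files simple 123.mgt.cluster.com   # prints: simple.o123 simple.e123
```

In a real session you would capture the identifier at submission time, for example with jobid=$(qsub pbs.sh), and pass "$jobid" as the second argument.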

    A better script (mpirun.sh):

    #!/bin/bash
    #
    #BETTER SCRIPT
    #
    #PBS -l nodes=8
    #PBS -l walltime=30:00
    #PBS -j oe
    #PBS -o vars.out
    #PBS -N vars_mpi
    #
    #How many procs do I have?
    NN=`cat $PBS_NODEFILE | wc -l`
    echo "Processors received = "$NN
    #
    echo "script running on host `hostname`"
    #
    #cd into the directory where I typed qsub
    cd $PBS_O_WORKDIR
    echo
    echo "PBS NODE FILE"
    cat $PBS_NODEFILE
    echo
    #run it
    mpirun -machinefile $PBS_NODEFILE -np $NN cpi
    #

    In this better script we use the "-j" (join) flag, which merges the error and output streams into a single file, and the "-o" flag, which names that output file "vars.out".

    This script also introduces the mpirun command, which runs parallel code. Both parameters passed to mpirun are required: "-machinefile <nodes_to_run>" names the file listing the nodes to run on (here "$PBS_NODEFILE", that is, all nodes assigned to the job), and "-np" specifies the number of processes to start (here "$NN", one per available processor).
    Lastly, the "-N" flag names our job; this is the name that appears in the qstat listing for that job.
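The processor count in the script comes from counting lines in the node file. The sketch below does the same count on a made-up node file, assuming (as the script above does) that the file holds one line per allocated processor. The function name summarise_nodefile and the host names node01 and node02 are hypothetical.

```shell
#!/bin/bash
# Summarise a PBS-style node file, assumed to hold one line per processor.
# summarise_nodefile is a hypothetical helper, not a PBS command.
summarise_nodefile() {
    local file=$1
    local procs=$(( $(wc -l < "$file") ))            # total processor slots
    local nodes=$(( $(sort -u "$file" | wc -l) ))    # distinct host names
    echo "procs=$procs nodes=$nodes"
}

# Demonstration with a made-up node file: two hosts, two processors each.
demo=$(mktemp)
printf '%s\n' node01 node01 node02 node02 > "$demo"
summarise_nodefile "$demo"      # prints: procs=4 nodes=2
rm -f "$demo"
```

Inside a running job you would call summarise_nodefile "$PBS_NODEFILE" instead of using a temporary file.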


  • qstat
    This command lists all jobs currently running or queued.
    For in-depth usage, consult the qstat man page.
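If you would rather wait for a job than run qstat by hand, a polling loop is one option. This is a minimal sketch: the function name wait_for_job is made up for this guide, and it assumes that "qstat <job ID>" exits with a non-zero status once the server no longer knows the job. That assumption may not hold on servers configured to keep completed jobs listed for a while, so check the qstat man page on your system.

```shell
#!/bin/bash
# Wait until qstat no longer reports the given job.
# $1: job identifier as printed by qsub
# $2: seconds to sleep between checks (default 30)
# wait_for_job is a hypothetical helper, not a PBS command.
# Assumption: qstat returns non-zero when the job is unknown to the server.
wait_for_job() {
    local id=$1 interval=${2:-30}
    while qstat "$id" >/dev/null 2>&1; do
        sleep "$interval"
    done
    echo "job $id is no longer listed"
}
```

A typical use would be jobid=$(qsub mpirun.sh) followed by wait_for_job "$jobid", after which the output file can be inspected.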

    PBS Directives

    A PBS directive is placed on a line that begins with the keyword #PBS.
    Here is the list of the most used directives:

    Directive
        Description

    #PBS -S /bin/csh
        Set the login shell for the batch job and assign its value to the variable $SHELL of the execution environment. The value is also assigned to the job attribute Shell_Path_List.

    #PBS -l ncpus=1,mem=256mb
        Request 1 CPU and 256 MB of memory for this job. PBS will compute the number of nodes that satisfies this request, then set the environment variable NCPUS to twice the number of nodes.

    #PBS -l file=50gb
        Request 50 GB of storage space for this job.

    #PBS -l walltime=hh:mm:ss
        Request that this job run for at most hh hours, mm minutes, and ss seconds.

    #PBS -l nice=n
        Set the job nice value to 20 + n and its scheduling priority to 20 - n. The range of legal values of n is -20 <= n < 20, e.g., #PBS -l nice=-20 sets the highest priority (40). It is recommended that users do not use this option or, if they do, that they specify n >= 5.

    #PBS -v var_list
        Instruct PBS to export the variables in var_list from the job submission environment to the batch job environment, e.g., #PBS -v DISPLAY.

    #PBS -V
        Export all environment variables from the job submission environment to the batch job environment.

    #PBS -N Job_name
        Give a name to the job. The default name is the name of the job script.

    #PBS -j oe
        Direct that the standard error stream and the standard output stream of the job be merged, intermixed, as standard output.

    #PBS -k oe
        Direct PBS to keep the spooled stdout stream (-k o), the stderr stream (-k e), or both (-k oe) on the execution host, placing them in the user's home directory on the execution host in the files job_name.o# and job_name.e#.

    #PBS -m abe
        Instruct PBS to send e-mail to the job owner when the job begins (b), ends (e), or aborts (a). To send e-mail to someone other than the job owner, use the -M option to qsub.

    #PBS -m n
        Instruct PBS never to send e-mail to the job owner.

    #PBS -o path_name
    #PBS -e path_name
        Define the path to be used for the standard output and standard error streams of the job. If path_name is a relative path, PBS assumes the absolute path is $PBS_O_WORKDIR/path_name. The output and error paths are assigned to the job attributes Output_Path and Error_Path respectively.

    #PBS -a date_time
        Declare the time after which the job is eligible for execution, where date_time has the form:
            [[[[CC]YY]MM]DD]hhmm[.SS]
        For example:
            qsub -a 0008231036.27 gauss.pbs
        makes the job eligible for execution at 10:36:27 on August 23, 2000.

    #PBS -c c=num_min
        Specify the time interval for checkpointing the job, in minutes. For example, to checkpoint the job every 30 minutes:
            #PBS -c c=30

    #PBS -W depend=rules
        Specify job dependencies, e.g., #PBS -W depend=after:jobID. The right-hand side is a comma-separated list of dependency rules, where a rule has the syntax:
            type[:argument[:argument]][,type...]
        For example, to specify that this job may be scheduled only after jobs 100 and 101 complete successfully:
            #PBS -W depend=afterok:100:101
        To specify that this job may start only after job 100 has started execution:
            #PBS -W depend=after:100
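Dependency rules are often built from job IDs captured at submission time. The sketch below assembles a rule in the type:argument:argument form described above; the function name depend_rule is made up for this guide and is not part of PBS.

```shell
#!/bin/bash
# Build a dependency rule of the form type:id[:id...].
# $1: dependency type (for example afterok or after)
# remaining arguments: one or more job IDs
# depend_rule is a hypothetical helper, not a PBS command.
depend_rule() {
    local type=$1
    shift
    local IFS=:
    echo "$type:$*"      # "$*" joins the job IDs using ':' (first char of IFS)
}

depend_rule afterok 100 101     # prints: afterok:100:101
```

On the command line the result could be passed to qsub as -W depend="$(depend_rule afterok "$first_id")", where first_id holds the identifier printed by an earlier qsub call.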

     

     




    ITS/Grid Computing
    California Polytechnic State University
    San Luis Obispo, Ca 93407
    jburdett@calpoly.edu