MPH1 and MPH2: Multi-Component Multi-Executable and Multi-Component Single-Executable

Unified Communication Module for Each Mode

This is an implementation of the two modes with a unified MPH_all communication module, contained in the file "mph.F". A common Makefile is shared across the different modes of application and machine architectures.

You can get the source code as a tar file or access the individual files listed below:

MPH.outline: brief introduction

MPH2b.design: extra explanation

README: this file

mph.F: three separate MPH_xxxx modules

Makefile: shared makefile.

mph1: subdirectory for MPH1 component model source code examples

mph2a: subdirectory for MPH2a component model source code examples

mph2b: subdirectory for MPH2b component model source code examples

To compile:
==========

Note: Vince Wayland of NCAR contributed the SGI and Compaq support.

    The shared "Makefile" detects the machine architecture and compiles
    appropriately for IBM, SGI and Compaq. Depending on which mode of MPH
    you want to use, you can generate different executables by typing
    "make mph1", "make mph2a" or "make mph2b" (or "gmake ...", depending
    on your machine).

    Before you compile, note that on the NERSC IBM the CFOPTS setting uses
    $TMPDIR to work around a problem with F90 modules on the GPFS file
    system. If your file system is compatible with F90 modules, you will
    probably want to define "CFOPTS=".

To run:
======

    After compiling, you will have executables ("pop", "ccm" and "cpl"
    for mph1, and "master" for mph2a and mph2b) in the corresponding
    subdirectory. Each sample subdirectory also includes batch scripts and
    sample output.

Go to that directory first, and then:

    1) to run on NERSC and NCAR IBM SP interactively:
        a) % unsetenv MP_TASKS_PER_NODE
        b) % setenv pop_out_env pop.log
            % setenv ccm_out_env ccm.log
            % setenv cpl_out_env cpl.log
        c) make sure the following command is entered on ONE LINE:
            for mph1:
            % poe -pgmmodel mpmd -cmdfile tasklist -nodes 6 -procs 9
                -stdoutmode ordered -infolevel 2 >& output &
            for mph2a and mph2b:
            % poe master -nodes 5 -procs 9 -stdoutmode ordered
                -infolevel 2 >& output &

        to run on IBM SP with batch script:
        % llsubmit script
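
The "tasklist" file passed to poe with -cmdfile is not reproduced in this README. As a minimal sketch, assuming the usual POE MPMD convention of one executable name per line in task-rank order, and assuming the 6/2/1 split for pop, ccm and cpl that the SGI and Compaq examples use, its contents could be generated as follows. The sketch is written in POSIX sh rather than csh, and the helper names repeat_line and make_tasklist are hypothetical, not part of MPH.

```shell
#!/bin/sh
# repeat_line NAME COUNT: print NAME on COUNT separate lines.
repeat_line() {
    i=0
    while [ "$i" -lt "$2" ]; do
        echo "$1"
        i=$((i + 1))
    done
}

# make_tasklist NPOP NCCM NCPL: print one executable name per MPI task,
# in rank order (pop first, then ccm, then cpl).
make_tasklist() {
    repeat_line pop "$1"
    repeat_line ccm "$2"
    repeat_line cpl "$3"
}

# Assumed split: 6 pop + 2 ccm + 1 cpl = 9 tasks, matching "-procs 9".
make_tasklist 6 2 1
```

Redirecting the output to a file (make_tasklist 6 2 1 > tasklist) gives a nine-line file; the total number of lines should match the value given to -procs.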

    2) to run on NERSC CRAY T3E interactively:
        a) % setenv pop_out_env pop.log
            % setenv ccm_out_env ccm.log
            % setenv cpl_out_env cpl.log
        b) mph1 cannot be run, since there is no MPMD mechanism on the T3E.
            for mph2a and mph2b:
            % mpprun -n 9 master >& output &

        to run on NERSC CRAY T3E with batch script:
        % cqsub run.t3e
        and the script "run.t3e" looks like:
            #!/bin/csh
            #QSUB -q debug
            #QSUB -l mpp_t=300    # Maximum residency time (for parallel jobs).
            #QSUB -l mpp_p=9      # Maximum PEs needed (for parallel jobs).
            cd $HOME/MPH_all2/mph2a
            ja                    # Turn on Job Accounting
            setenv ccm_out_env mph2a_ccm.log
            setenv pop_out_env mph2a_pop.log
            setenv cpl_out_env mph2a_cpl.log
            mpprun -n 9 master > output
            ja -s                 # Print Job Accounting Summary

    3) to run on NCAR SGI interactively:
        a) % setenv pop_out_env pop.log
            % setenv ccm_out_env ccm.log
            % setenv cpl_out_env cpl.log
        b) for mph1:
            % mpirun -p "[%g]" -np 6 pop : -np 2 ccm : -np 1 cpl > output.a
            for mph2a and mph2b:
            % mpirun -p "[%g]" -np 9 master > output.a

        It is straightforward to build an NQE batch script around these
        commands; the difficulty is getting it to run through any of the
        queues here. The tasklist file is not needed (and possibly cannot
        be used) on the SGI.

    4) to run on NCAR Compaq with batch script:
        You need two shell scripts, run.dec and runscript.dec.
        (Thanks to Dan Anderson & Bill Celmaster in NCAR).
        As with the SGI O2K script, the concern is to get all of the
        processes running at the same time.

        run.dec:
        #! /bin/csh
        prun -n9 -t runscript.dec

        for mph1, "runscript.dec" looks like this:
        #! /bin/csh
        #RMS_NODEID __ the node ID of the node this process is running on
        #RMS_NPROCS __ total number of prun processes spawned
        #RMS_RANK __ which prun process of the RMS_NPROCS processes

        echo "node= " $RMS_NODEID " Process number= " $RMS_RANK " of " $RMS_NPROCS
        if ($RMS_RANK >= 0 && $RMS_RANK <= 5) pop & # 6 procs for pop
        if ($RMS_RANK >= 6 && $RMS_RANK <= 7) ccm & # 2 procs for ccm
        if ($RMS_RANK == 8) cpl & # 1 proc for cpl
        exit

        for mph2a and mph2b, "runscript.dec" looks like this:
        #! /bin/csh
        #RMS_NODEID __ the node ID of the node this process is running on
        #RMS_NPROCS __ total number of prun processes spawned
        #RMS_RANK __ which prun process of the RMS_NPROCS processes

        echo "node= " $RMS_NODEID " Process number= " $RMS_RANK " of " $RMS_NPROCS
        master &
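
The rank-to-executable mapping in the mph1 "runscript.dec" can be checked in isolation before a job is submitted. The following is a minimal sketch in POSIX sh (not csh) that reuses the rank ranges from that script; the function name component_for_rank is hypothetical, and the sketch only prints the assignment instead of launching anything.

```shell
#!/bin/sh
# component_for_rank RANK: print the executable assigned to RANK.
# Ranges copied from the mph1 runscript.dec: 0-5 pop, 6-7 ccm, 8 cpl.
component_for_rank() {
    rank=$1
    if [ "$rank" -ge 0 ] && [ "$rank" -le 5 ]; then
        echo pop
    elif [ "$rank" -ge 6 ] && [ "$rank" -le 7 ]; then
        echo ccm
    elif [ "$rank" -eq 8 ]; then
        echo cpl
    else
        # Any other rank has no component in the 9-process layout.
        echo "rank $rank has no component" >&2
        return 1
    fi
}

# Dry run: list the assignment for all 9 ranks without launching anything.
r=0
while [ "$r" -lt 9 ]; do
    echo "rank $r -> $(component_for_rank "$r")"
    r=$((r + 1))
done
```

If the processor split changes, the ranges here and the "-n9" argument to prun in "run.dec" have to change together.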

To use:
======

    Users need to "use MPH_all" in the application codes, and invoke
    the appropriate "MPH_setup_..." function call for the MPH mode
    used. You can use the "MPH_help" call to get the corresponding
    information. It also lists the inquiry functions available for that mode.

    Each component maintains its own output in a separate file, whose name
    is defined by an environment variable set either on the command line or
    in the batch run script. This assumes that local processor 0 of each
    component is responsible for most of the output; other occasional writes
    from all the components are stored in one combined standard output file.

    This is accomplished by having processor rank 0 of each component call
    the subroutine "MPH_redirect_output" with the model name as argument.
    IBM and SGI can do the output redirection with the help of the system
    function "getenv" or "pxfgetenv". Compaq cannot do this. The T3E is
    able to create the correct output files using "pxfgetenv", but only
    output written with "write(6,*)" is redirected, not output written
    with "write(*,*)", since there * is equal to unit 101, which is
    permanently tied to the non-redirectable stdout.
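
The run examples in this README set pop_out_env, ccm_out_env and cpl_out_env, which suggests that the output file for a component is looked up under the variable name <model name>_out_env. The following minimal sketch, in POSIX sh rather than csh, illustrates only that assumed naming convention. MPH itself does the lookup in Fortran inside "MPH_redirect_output"; the function name log_file_for is hypothetical, and the README does not say what happens when the variable is unset, so the sketch simply returns a non-zero status in that case.

```shell
#!/bin/sh
# log_file_for NAME: print the value of the environment variable
# NAME_out_env (pattern inferred from the setenv examples; an assumption).
log_file_for() {
    case $1 in
        *[!A-Za-z0-9_]*|'') return 2 ;;   # refuse names unsafe for eval
    esac
    eval "value=\${${1}_out_env:-}"
    [ -n "$value" ] || return 1           # variable unset or empty
    printf '%s\n' "$value"
}

# sh equivalent of the csh command: setenv pop_out_env pop.log
pop_out_env=pop.log
export pop_out_env

log_file_for pop
```

A check like this before launching a job can catch a component whose log variable was never set, in which case its output would not go to a file of its own.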

For more information, please go to web page:
http://www.nersc.gov/research/SCG/acpi/MPH

Last modified February 16, 2001.
