MCS572 UIC HP9000/800 V2250
Parallel Processor (The borg)
User's Local Guide

version 0.95
29 November 1998


Professor Floyd B. Hanson

Mail address:

Office address:

E-Mail address:

Hanson World Wide WEB Home Page:

UIC Fall 1998 Course:

MCS 572 Class World Wide WEB Home Page:




Introduction: HP9000 Overview.

This User's Local Guide is intended to be a sufficient, hands-on introduction to the UIC Academic Computing and Communications Center (ACCC) HP9000/800 V2250 parallel processor called the "borg". A picture of this V-Class Exemplar Computer Server and of its Star Trek namesake can be found at the web link below:

The UIC HP9000/800 V2250 has 14 parallel processors, collected into 7 pairs of 2 processors each on a chip. The local machine name is the `borg' and its full internet address is `borg.cc.uic.edu'. The HP9000/800 V2250 model denotes the Hewlett-Packard V-Class Exemplar Computer Server. The ACCC has the following brief information on the HP9000 borg:

Parallel Processor Configuration.

The basic computer model is a Shared-Memory Processor (SMP) or Symmetric Multi-Processor (SMP too). The HP9000 V-Class Computer Server is synchronized by many control units: (See also

particularly Figures 1-1 (Functional Block Diagram of a V-Class System) and 1-2 (ERAC Interconnection).)

The HP9000 system is configured in the attached figure:

The processors and memory are connected by an 8×8 Hyperplane CrossBar Switch using the 4 ERACs so that each processor is connected to all 32 memory banks. Each ERAC has a bandwidth of 1.9 GB/s (GigaBytes per second), so that the overall bandwidth of the set of 4 ERACs is 15 GB/s. A diagram of the CrossBar switch is linked below:

Processing Units.

The HP9000 parallel processor uses Hewlett-Packard PA-RISC PA-8200 processor chips at a 240 MHz clock rate (4.17 nanosecond clock time) with a vendor-promised peak performance of 960 MegaFlops (MF = million floating point operations per second), i.e., almost 1 GF or 1 billion floating point operations per second. The `800' and `2250' in the HP9000/800 V2250 model number correspond to the 8200 chip number. The CPU is based on the HP 64-bit PA-RISC 2.0 architecture. The term RISC denotes `Reduced Instruction Set Computer', in contrast to the formerly widely used CISC or `Complex Instruction Set Computer' found in many PCs. See

Memory Organization.

The HP9000 is a shared-memory symmetric multiprocessor (SMP).

Physical Memory:  The physical memory or RAM (Random Access Memory) or global memory on the borg totals 4 GB (4 GigaBytes or 4 × 1024^3 Bytes) of storage distributed over 32 Memory Banks with 64 MB (64 × 1024^2 Bytes) each. The partitioning of the Memory (RAM) into 32 memory banks is called "32-Way Interleaving" and allows for simultaneous memory access to different banks, which would not be possible if the entire memory were treated as one memory unit. Each array is stored over as many memory banks as possible. Each EMAC controls a set of 4 Memory Banks with a total of 256 MB of memory. The physical memory has a latency (delay) of approximately 500 nanoseconds. The physical memory holds executing programs and data (called "Coherent Memory"), but also holds much system information. See also

particularly Figures 2-3 (Coherent Memory Space Layout) and 2-4 (Conceptual Layout of Physical Memory of a Fully Populated System).

Virtual Memory:  Virtual memory is available through the use of memory units called pages, permitting the execution of programs much larger than can fit in physical memory by keeping only some pages in physical memory at a time. The compiler generates addresses in the much larger virtual memory, but only those pages currently in use are mapped into physical memory. The HP PA-RISC 8200 translates both 32-bit and 64-bit virtual and physical memory addresses into 64-bit addresses. The Translation Lookaside Buffer (TLB) is the hardware device that helps translate or map virtual memory references to physical memory pages to permit valid memory accesses. See also:

Cache:  In addition, each processor has 2 MB of fast data cache memory and 2 MB of fast instruction cache memory, making a total of 4 MB each. Cache memory is generally smaller, faster and closer to the CPU than physical memory. (See also

Hard Disk Memory:  Hard disk memory consists of 8 × 9 GB Ultra SCSI disks, making 72 GB total. On these disks your user directories, operating system, other system software, and application software are stored. Hard disk memory is generally bigger, slower and farther away from the CPU than physical memory. Hence, the memory hierarchy is the order from cache to physical to hard disk memory. Users can use up to 50 MB of hard disk memory in their home directory (`/homes/home7/[userid]'). The 50 MB is a soft quota and 75 MB is the hard quota, presumably for temporary storage. Users also have a scratch directory in `/scratch/[userid]' of up to 500 MB, but scratch memory is temporary and volatile, subject to erasure by systems administration. Users can check their quotas by the command:

Input-Output Connections.

A number of PCI (Peripheral Component Interconnect) bus cards connect the processors and physical memory to the Exemplar I/O (Input/Output) Subsystem with a 240 MBytes/second channel bandwidth. This I/O Subsystem is connected to the network by a Fast-Ethernet switch, which makes a high speed connection at 200 Megabits per second to the UIC ADNii (Academic Network 2), through which a user would typically log on to the UIC Exemplar. Also, an ATM (Asynchronous Transfer Mode, not Automatic Teller Machine) network switch provides a Backup Interface when enabled.

Machine Benchmark Performance.

The UIC HP9000 borg has an aggregate peak performance of about 2,979 MF, or about 3.0 GigaFlops (GF = billion (10^9) floating point operations per second, so 3.0 GF = 3,000 MF), for the 1000×1000 linear algebra system solve using 8 processors, according to UIC ACCC borg performance data and comparison with the Old borg:

{Yet no HP9000 is powerful enough to appear on the current Top500 Computer Report (report is big)

Note that the units of the Hockney asymptotic performance rates, the actual maximum Rmax and the vendor peak Rpeak, are MegaFlops, while Nmax is the size of the matrix solved at the rate Rmax and N1/2 is the matrix size at which half of Rmax is achieved.

Another benchmark using the NASA NAS BT Simulated Computational Fluid Dynamics Application does not list any HP9000. Gunter's benchmark site:

The HP9000 would be classified as a mini-supercomputer, not quite a supercomputer which denotes the class of the most powerful machines. In fact,

only mentions that Supercomputing Technologies are used.

However, it offers local access and many of the parallel constructs found on massively parallel machines.

Load Share Facility (LSF).

The Load Share Facility (LSF) is used to handle the parallel scheduling of user programs at UIC and you are requested to use it, although there are other methods that can be used on HP9000s. There is a UIC borg webpage briefly describing LSF Commands at

and a very large (over 200 pages or 13 MB) official guide from Platform Computing Corporation, accessible by ftp (file transfer protocol: see below) to the borg using your borg account at the location:

However, this user guide and the brief UIC LSF description should be sufficient to start with.

The basic execution envelope command is lsrun. The UIC Systems Administration requests that you run your computer jobs with lsrun, which is found in the directory path:

which is a virtual link to the physical command path:

as are most LSF commands. A typical use would be of the form:

where "[executable]" is your executable file run in the background (&) and "[output_file]" is your output file where output is redirected (>&).

Other LSF commands are given below in the section:

Operating System (HP-UX).

The HP9000 operating system is called HP-UX, which is a parallel extension of the Hewlett-Packard version of Unix called UX or HP-UX. The current version (01 October 1998) is HP-UX 11.0 Operating System. It is binary compatible with the scalar HP-UX Unix used on Hewlett-Packard workstations. See also:

Login and File Transfer.

TELNET:  Access to the `borg' is by either the universal TCP/IP Internet Protocol:

command, which can be abbreviated when access is from a `cc.uic.edu' machine such as the student server `icarus', as preferred by the Computer Center, or by the Unix commands: which carry your accessing terminal emulation through much better but are restricted to Unix to Unix connections; `ssh' is the Secure-Shell version that encrypts your password to protect it.

FTP:  File transfer to the `borg' is best accomplished by the universal TCP/IP Internet Protocol command:

which from a Computer Center machine like `icarus' can be abbreviated as `ftp borg', or from the new `borg' to a remote machine: to your home or lab computer.

Caution:  If CMS is your home computer system, then you are restricted to file transfers only from CMS to the new `borg' while you are logged onto CMS while accessing the new `borg'. This is because CMS accounts are single user accounts restricted to single access only. That is, FTP to and Telnet from CMS would count as a double access, so it is not permitted in CMS. This is very different from Unix, which is a multiuser system allowing multiple access, i.e., you can even log into the new `borg' several times simultaneously, but you also degrade your own performance.

Programming Languages and Compilers.

The usual optimizing or parallelizing compilers on the HP9000 are the HP Fortran 90 compiler f90 found in the new `borg' directory location:

and the HP-UX optimizing C compiler cc, found in the new `borg' directory location:

The F90 options

FORTRAN:  A typical format for compiling and linking an HP Fortran program named `[program].f' with parallel optimizations {`+O3'} in the background {last `&'}, provided the PATH variable is set properly to the system value, is

where `lsrun' is the `borg' Load Share run command. See also:


where both compiler information listings {`+list'} and error messages {`>&'} are redirected into the file `[program].LIST', timer utilities are permitted {`+U77'}, while the executable module is renamed `[executable-file]'. It is assumed that the user replaces the square brackets and their contents by actual legal names.

On the `borg' enter

In summary, the optimizing Fortran-Compilers are

See also,

C:  A similar format is used to compile and link C code, provided the PATH variable is properly set to the system value:

where `lsrun' is the `borg' Load Share run command, and where compiler information listings and error messages {`>&'} are redirected into the file `[program].LIST', enabled by the `+L' option, while the executable module is renamed `[executable-file]' and `-lm' allows access to the functions of the math library {functions declared by `#include <math.h>' in the code will not link without the `-lm' option in the `cc' command}. See also:

C-Version Caution:  There are multiple versions of the C-compiler. The `cc' HP C Compiler is the one in `/opt/ansic/bin/cc', and, in fact, the HP-UX Standard C Compiler in `/bin/cc' is actually a link to the optimizing one, which is what you want to happen.

There is also the GNU (Glad its Not Unix public domain) Project version `gcc' in

but it is not yet enabled.

In summary, the C-Compilers are

Use those optimizing compilers found in the `/opt' directory.

C++:  Note: neither c++ nor g++ seems to be currently implemented, but the optimizing ANSI C++ (aCC) is available. The `aCC' compiler takes some of the same options as `cc'.

See also:

Shell Caution:  If you have trouble with the above compiler command syntax, it may be that you have your borg account shell set as the Korn shell (ksh) rather than the user standard C-shell (csh). You can change your shell for the current session by the command

or you can change the shell permanently to a C-shell for which this guide is written by the command:

where `[borg_user_name]' must be replaced by your actual borg account name, but you must also log out and log back in again to activate the shell change. Using a C-shell should make your access much easier, while the Korn shell (ksh) or generic shell (sh) would be more powerful if you were doing computer systems work rather than scientific computing. New student accounts are set up with a Korn shell (ksh) in the ACCC template, probably because that is what the systems people use. See the SHELL PATH section below or consult the ACCC page:

Editors and Emulations.

There are several editors on the Exemplar (borg.cc.uic.edu), but their ease of use depends on the particular terminal emulation used to access the Exemplar.

XTERM:  The default terminal emulation is MIT's X Window System, which is typically used with access from Sun Unix or IBM AIX workstations, permitting the windowing interface of these workstations to pass through to the remote Exemplar.

TERM vt100:  However, if you are not using X-Windows to access the `borg', then it is suggested that you use the DEC `vt100' terminal emulation by entering the Unix command

on the `borg' command line, or better replace your `set xterm' line in your Login resource configuration files `.login' for C-Shell (or `.profile' for K-Shell) on the `borg'.

VI:  If you are also going to be using the usual Unix visual editor `vi', see the following manual (man) pages:

along with the vt100 emulation, then you will want to replace your Edit Resource Configuration file `.exrc', which by default template is for the `emacs' {"Ensures Maximally, Almost, Carpal-tunnel Syndrome" {%>)}; an editor favored by computer systems people}, by the following version:

In fact, since certain execution commands force the vt100 emulation to revert back to `xterm', a better fix would be to redefine the `vi' editor command in your C-Shell Resource Configuration file `.cshrc', by trying the command:

by adding the following line among the aliases:

Using the write and quit command:

within vi and activating the change by entering on the `borg' command line the source execution command:

Then the editing command:

should work as usual.

THE:  There is also The Hesseling Editor `the', which is like the CMS xedit command, so use the manual command:

for more information.

PICO:  In addition, there is the `pico' editor from the `pine' mail program, but `pine' itself is not on the `borg' since you are permitted only to send mail from the `borg' but not to it. To use `pico', it is suggested that you use the simple display system `pilot', mentioned below, which uses `pico' to implement its `Edit' submenu command. There are man (manual) pages accessible by the commands:

Manual and File Paths.

On the borg, many needed files are hidden away deep in the directory structure, so that you must specify the appropriate paths in your .login or .cshrc if using the C-Shell (or .profile if using the K-Shell (ksh)) so that you do not have to do it with each use. Hence, C-Shell users will need to add the following lines to their .login file:

Users can also use the forms suggested on the ACCC hpborg.html:

These forms are so that you inherit the system file path $PATH and manual path $MANPATH as initializations for the C-SHELL (form is different if you use the K or Bourne-Shells), and make sure that you have the paths for the optimizing f90, cc, cpp (C Preprocessor) and lsrun commands. You may even append custom paths to the system paths.

Load Share Program Execution.

lsrun:  On the `borg' at UIC, a custom environment command called `lsrun' greatly simplifies the execution of programs (the full path is `/usr/local/lsf/bin/lsrun'). The Computer Center requires use of `lsrun'. A typical format for executing a module named `[executable-file]' in the background {last `&'} is

where both calculated output and error messages {`>&'} are redirected into the file `[output-file]'.

Users can also use lsrun for the compile/load step as well.

Pilot: Simple File Display & Execute Browser.

If you are used to using the Pine Mail Program, then instead of the bare HP-UX command line you might prefer to use the Pine-like simple file display and execute browser system, called by typing in HP-UX the command:

which is fairly self explanatory using the `Arrow Keys' to move around the file display and using commands like

The browser displays a file list at the top of the screen, with the bottom few lines indicating the available commands just listed above.

Launch:  The "Launch (l)" command uses the Bourne Shell (sh) instead of the simple user C-Shell (csh), so the above Compile/Link (f90 or cc) or Load Share Run Executable (lsrun) commands must be modified slightly for `sh' syntax that differs from `csh' syntax. Basically, you need to

Here, the "2>&1" means merge standard error (2=stderr) with standard output (1=stdout, here redirected to the file "[output]"). This is not as simple as the csh form but is more precise, which is why systems people prefer sh or other shells and users usually prefer the simpler csh. {Note: Standard input (0=stdin) can also be used.}

On the borg type:

  1. pilot {except do not type any quotes!}

  2. l {to Launch (pilot for execute) command}

  3. Either:

  4. lsrun pgm > pgm.output 2>&1
{Some more Hints:

Edit:  The Edit command within Pilot (really Pico) uses EMACS-like ctrl commands like `^Key', which means pressing `Ctrl' and `Key' together (or at least holding `Ctrl' and then pressing the LOWER CASE `Key'):


Shell Scripts for Running Programs.

Shell scripts for the borg Command Line or the pilot Launch (l) command have been placed in the borg directory `/homes/home7/hanson/'. They can be used to compile and execute programs without concern for which shell is being used, since the shell selection is forced in the script.

Here is a directory listing of the files:

They execute on the `borg' command line or with the `pilot' Launch (l) command. ACCC consultant said new student accounts are set up with the K-Shell ksh. Here is a procedure for using these scripts:

  1. On your borg account, you can just copy them from my home directory since they should be readable to all: type

    {"*sc" is UNIX short hand for all the script files and the "." denotes your current directory where you are assumed to want the files.}

  2. You need to make the files executable (x) for you the user (u): type

  3. These files assume that you have a work file "pgm.f" or "pgm.c": type

  4. The script functions:

  5. On the Command line, for example, just enter {Caution: you have to wait for the compile to finish before the run.}

  6. Using `pilot' with the Launch (`l') command, for example, enter:

UIC borg HP9000 Accounts.

A personal account on the `new borg' is available to any faculty member who requests one and to any student who has obtained a faculty member's sponsorship permission. Note that the UIC HP9000 is not intended to be used for general needs such as email, news reading, or Web browsing; its resources should be used only for computationally intensive needs. If you wish to obtain an account please send email to

with the Subject: Include your netid, your telephone number, and a brief description of your intended usage. Students must also provide the name and email address of the faculty member who is sponsoring the account, i.e., students should send in the message the following items:

(For MCS 572 students, the Intended Usage is MCS 572 Introduction to Supercomputing, with Faculty Sponsor Prof. F. Hanson)


  • This mini-local-guide is meant to indicate ``what works'' and what is ``useful'' for beginning users. The guide also gives alternate methods for access from UNIX systems.

    HP and UX are trademarks of Hewlett-Packard Company. UNIX is a trademark of X/Open Company, Inc.


    Background References

    This guide is intended to be self contained, but users who want further information can consult the following sources (you can just click on the highlighted topics to access them if you are surfing the World Wide Web):

    1. Professor F. B. Hanson, MCS 572 Introduction to Supercomputing Home Page, {provides a large variety of links to useful supercomputing information}.

    2. UIC Computer Center (ACCC=Academic Computing and Communications Center, formerly ADN),

    3. Hewlett-Packard Company,

    4. James F. Kerrigan, Migrating to Fortran 90, O'Reilly & Associates, Inc., Sebastopol, CA, 1993.
      {See also Prof. Hanson's online Web Publications Collection on Fortran accessible from the class homepage.}

    5. J. Peek, T. O'Reilly, M. Loukides, and others, UNIX Power Tools, O'Reilly & Associates, Inc., Sebastopol, CA, 1993. {See also Prof. Hanson's online Web Publications Collection on UNIX accessible from the class homepage.}

    6. E. Cutler, D. Gilly, and T. O'Reilly (Editors), The X-Window System in a Nutshell, O'Reilly & Associates, Inc., Sebastopol, CA, 1992.

    7. man [command] (CR), when invoked in a UNIX-like system such as HP-UX UNIX, produces an on-line listing of the manual pages on the command [command], or similar function {Caution: make sure you have your `MANPATH' meta variable set so your manual path inherits the large number of borg system paths}.

    8. Consultation concerning problems related to using the UIC HP9000 can be obtained from Professor Hanson {718 SEO, X3-2142, hanson@uic.edu, contact by email is best}. For those in the MCS572 Class, they should contact Professor Hanson first, but in an emergency they should contact the Computer Center Consultants at systems@uic.edu with Subject: `Borg Problem'.


    Annotated UIC HP9000 Sample Session

    The login procedure depends on your local method of accessing the HP9000 from UIC, but the best access is from a Unix type system since the UIC HP9000 operating system is HP-UX, which is substantially Unix and it is to the user's advantage to use Unix to Unix communication. If you do not now have a Unix account you should try to get one from your department's Unix system or from the UIC Computer Center graduate student Unix (Sun Solaris Unix) server called `icarus'. Unix workstations are available in many science and engineering departments. Communication using the IBM mainframe suffers from horrible terminal emulation problems, so that you should avoid it if you can. If that does not work out or is not practical see Professor Hanson about other alternatives.

    UNIX Access:  The ACCC preferred remote telnet command:

    However, the Unix remote login command also works:

    and is best since it passes your local terminal emulation better from the local session through to the remote UIC one and you do not have to enter your login name as with `telnet' and can proceed directly to the `password' step below.

    PCLAB Access:  Access by `telnet' TCP/IP command also works for PCLab PCs, or CMS, as well as in Unix, with the format:

    {Caution:  From a UNIX operating system, it is essential to use lower case; borg.cc.uic.edu is the full Internet name for the UIC HP9000, but the shortened name borg also works from `cc.uic.edu' machines; the corresponding Internet Number of the new `borg' is `128.248.155.55' (the old borg is `128.248.100.55') and is more basic, since the Internet Name is derived from the number and the number may work when the UIC computer domain name server (DNS) is down; the new `borg' should respond with:}

    LOGGING OUT:  You can end this session at any time you have a `borg: ' prompt by entering:

    or pressing the `Ctrl' control key and 'd' key simultaneously (i.e., `ctrl-d', but you can hold down the `Ctrl-key' and then press the `d-key').

    FILE COMMANDS:  You can check what the name of your `borg' home directory (file system) is by the Unix ``print working directory'' command:

     

    VI SETUP:  If you are going to be using the Unix `vi' visual editor rather than the `emacs' editor or an X-Windows editor via `xterm', then you might want to set your terminal environment to the standard `vt100' terminal emulation by the command:

    Along with this change to `vt100' (or `vt132'), you should first save your old Unix edit (`EX') resource configuration file `.exrc' using the Unix copy `cp' command:

    and then change to a `vt100' friendly one (if permissions permit from Professor Hanson's account) by

    or by using the web link with a web browser at your accessing computer to get a copy:

    although you might have to use the File Transfer Protocol `ftp' to get it to the `borg', for example by Anonymous FTP:

    FTP will be described below in Section on FTP.

     

    FORTRAN SESSION:  For a sample session for compiling and executing a Fortran Program, you can get a copy of the MCS572 `borg' starter problem via the web and transfer it to the borg:

    or by `cp' copy command:

    or by Anonymous FTP:

    {Note:  if this Anonymous FTP method is used, then the getting of `.exrc' above and `start.f' here can be combined.}

     

    For an example of compiling and linking the MCS572 Fortran based `borg' starter problem (assuming `start.f' has been transferred to your `borg' account), enter:

    C SESSION: 

    You can run a sample HP C program by transferring to the `borg' a starter C code version:

    or by `cp' copy command:

    or by Anonymous FTP:

    {In order to compile and link this C code, enter:}


    ftp File Transfers with UIC/HP-UX

    The FTP file transfer protocol is the fastest method of file transfer between UIC and UIC HP-UX, because it uses a fast internet communication link.

    {Caution: FTP involving UIC CMS should be initiated from CMS, because you cannot have multiple write links to the same CMS disk (multiple read links should be OK). There are no similar problems using a UNIX system, because UNIX is a multi-user system.}

    ftp File Transfers at the UIC HP9000

    At the UIC HP9000 you can transfer files between the HP9000 and UICVM or UNIX, even the NPACI/SDSC Cray T90, NPACI/SDSC Cray T3E or NCSA Cray Origin if you have an account there. The `ftp' command on UNIX is very much like the `ftp' command in HP-UX.

    In order to transfer a file from HP-UX to UIC, enter the commands:

    For Transfer to UIC Unix:

    For Transfer to UIC CMS:

    ``borg: '' ftp uicvm.cc.uic.edu (CR)

    or

    ``borg: '' ftp 128.248.2.50 (CR)

    {This command allows you to enter the FTP communication system that uses the same lines and protocol as Telnet, but essentially only allows file transfer. The Internet numbers are more reliable. In the HP-UX to UICVM FTP connection, you will be prompted for your CMS user-id and your CMS password:}

    ``Connected to uicvm.uic.edu'' {If you do not get connected but end up in FTP you can try 'open 128.248.2.50' without restarting in HP-UX again.}

    ``Name(uicvm.cc.uic.edu:u[default-id]):'' [CMS-user-id like `[userid]'] (CR)

    ``Password(uicvm.cc.uic.edu:[CMS-user-id]):'' [CMS password] (CR) {If successful, then:} {If you make a mistake with either your password or username, you can enter `user (CR)' after the ``ftp>'' prompt to restart. At the HP-UX FTP ``ftp>'' prompt (it differs from the IBM FTP prompt), you can issue FTP commands:}

    ``ftp >'' help [FTP-command] (CR) {This `help' command gives brief information on, or a definition of, the command `[FTP-command]'; `help', alone, will display a list of FTP commands; `?' is a brief alias for `help'.}

    ``ftp >'' ls (CR) {Either `ls' or `dir' list the current contents of the remote directory if you need more information. `pwd' displays the remote (HP-UX here) working directory.}

    ``ftp>'' ls *.fortran (CR) {This example causes the listing of Fortran files on your CMS disk, with the wild-card `*' standing for any filename. Similarly, use `ls *.f (CR)' in UNIX.}

    ``ftp >'' put [HP-UX-fn.ext] [UICVM-fn.ft.fm] (CR)

    or {The put command stores the local (HP-UX) file on the remote (UICVM or UNIX) system in an FTP session started from HP-UX. `send' is an alias for `put', while `mput [HP-UX-files] (CR)' is used to send multiple files to UIC with similar names.}

    ``ftp >'' get [UICVM-fn.ft.fm] [HP-UX-fn.ext] (CR) {The GET command transfers a file from UIC to HP-UX. If successful, you should get messages like this:} {`recv' is an alias of `get', and `mget [UIC-files] (CR)' is used to receive multiple files quickly with wild-cards, but the file names will be the same as they are on the remote machine. Caution: `mget' stores the CMS file into HP-UX with an upper case name, unlike `get', which stores into HP-UX properly with a lower case name. You can transfer some more files if you want, changing the directory if needed, or you can quit FTP by using the command:}

    ``ftp >'' quit (CR) {You can also use `bye' to exit FTP, except in IBM FTP. Either will get you back to the HP-UX (UNIX+) shell with prompt ``u* *%''.}

    ``borg: ''


    ftp File Transfers from UIC UNIX

    The file transfer protocol program from a UNIX session is similar to file transfers from the UIC HP-UX sessions, because both have UNIX or extended UNIX operating systems, as discussed in the last section.


    ftp File Transfers from the UIC PC Labs

    File transfer protocol (ftp) on a PC Lab PC may not be practical for most users, due to lack of permanent storage. Transfer between CMS or UNIX and the HP9000 may be more practical when you are accessing them using `telnet' from the PCs. The nearest Xerox PostScript printer to 2249f is SEL2263, while others are SEL2265, SEL2058, SEO308 and elsewhere. However, if the PC is your favorite medium, then use it as in the above HP9000 or Unix subsections.


    ftp File Transfers at UIC/CMS

    At UIC the IBMNET version of FTP is used on CMS and it uses the following commands:

    An alternate method of sending files is to use the CMS NOTE command and to read in the CMS file using the CMS GET command. This BITNET file transfer method can produce variable results, because CMS SENDFILE does not work for this purpose, BITNET expects a blank first line in the message, and the transfer depends on all the computer links between here and there.


    Execution of HP Fortran (f90) or HP C (cc)

    Compilation:

     

    Linking/Loading Step Only:

     

    Execution Step:


    Example 1: Execution using the Terminal for Input and Output

    As practice, you can lsrun any source program that you have transported to HP-UX. {If available, the simple code `convert.f'

          program convert 
    code: convert from debug fortran  cogs, slightly modified.  
    change:  input & output is to & from terminal, input at prompt.  
    Caution: compile, load, and execute in HP-UX using the three commands:  
    command:   f90 -o convert convert.f
    command:   convert
          real a(999) 
          write(*,*) 'input any integer less than 1000:'
          read(*,*) i 
          a(i) = float(i) 
          write(*,6000) a(i) 
    6000  format(' floating point representation: ',e13.5)
          write(*,*) 'What happens when you exceed array bound of 999?'
          stop
          end
    

    Can also be obtained on UICVM CMS using}

    and

    {This simple-minded `convert.f' program uses the terminal as undeclared input and output units, corresponding respectively to the `read(*,*)' and `write(*,[n])' statements, without specifying an `open' statement. Be sure to do this in your temporary directory, which you can change to by using `cd $TMP'. The source `convert.f' is executed with the 3 commands:}

    {To rerun the same code without recompiling, merely enter `convert' again.}


    Example 2: Execution using Input and Output with Files

    The second example uses data files for both input unit 5 and output unit 6, as well as the UNIX Fortran seconds timer `second()'.

    {The source is a modified version of the old USER'S GUIDE craytest code. A copy can be obtained via a web browser:

    or a copy can be obtained by Anonymous FTP:

    or

    An old Cray copy also resides in Hanson's public CMS disk. Use your own copy or get Hanson's CMS copy by using HP-UX `ftp' to your own UICVM CMS account and when in FTP, enter

    {Your output will be in `tempt.output' and you can list it with the command:}

    ``borg:'' cat tempt.output (CR) {Your output should look something like:}

    {If you wish to re-run the program with a different number, then enter}

    ``borg: '' ex tempt.data (CR) {again, or}

    ``borg: '' !ex (CR) {and enter within EX the subcommand}

    ``:'' 1c (CR)

    5000 (CR)

    . {to change `500' to `5000' and to end the change subcommand, while to end `ex' enter after the ``:'' prompt:}

    ``:'' wq (CR) {and then enter}

    ``borg: '' lsrun tempt (CR) {and}

    ``borg: '' cat tempt.output (CR) {again. When you are done with `tempt.f', remove all the files from HP-UX that you do not need, using:}

    ``borg: '' rm tempt (CR) {for example, but especially the big executables like `tempt'.}


    Modifications for C: Compile and Execution with C

    For information on C language programs use the HP-UX commands:
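As a C counterpart to `convert.f' in Example 1, the following is a minimal sketch, not code from the course. The names `convert_store', `convert_run' and `NMAX' are hypothetical. Unlike the Fortran original, it checks the index before storing, which answers the question the Fortran code poses about exceeding the array bound of 999.

```c
#include <stdio.h>

#define NMAX 999   /* same bound as "real a(999)" in convert.f */

/* Store (float)i in a[i-1] when 1 <= i <= NMAX.
   Returns 0 on success and -1 when i is outside the array bounds. */
int convert_store(float a[], int i)
{
    if (i < 1 || i > NMAX)
        return -1;
    a[i - 1] = (float)i;   /* C arrays are 0-based, Fortran's are 1-based */
    return 0;
}

/* Prompt on `out', read one integer from `in', and print its floating
   point representation.  Returns 0 on success, -1 on bad input. */
int convert_run(FILE *in, FILE *out)
{
    static float a[NMAX];
    int i;

    fprintf(out, "input any integer less than 1000:\n");
    if (fscanf(in, "%d", &i) != 1)
        return -1;
    if (convert_store(a, i) != 0) {
        fprintf(out, "index %d exceeds array bound of %d\n", i, NMAX);
        return -1;
    }
    fprintf(out, " floating point representation: %13.5e\n", a[i - 1]);
    return 0;
}
```

To use the terminal for input and output as in Example 1, call `convert_run(stdin, stdout)' from `main'. Assuming the source is saved as `convert.c', a compile command by analogy with the Fortran one would be `cc -o convert convert.c'.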


    HP-UX Specific UNIX Commands.


    HP-UX f90 Combined Fortran Compile and Load Command

    On the `borg' enter

    Note:  It is much better to use makefiles for such commands.


    HP-UX cc Combined C Compile and Load Command

    The current optimizing HP-UX C compiler is `/usr/convex/cc6.5/cc'.


    HP-UX ld Load Command

    See `man ld' for more information. Note that the linker or loader `ld' works for both Fortran and C code object files.


    HP-UX Parallel Information Functions

    The HP9000 has several Parallel Information Functions that permit finding the number of processors with threads (parallel execution streams running on parallel processors), the number of threads, and similar information. The type declarations and usage of these functions are best illustrated by the following HP Fortran `f90' code fragments:

    Fortran Example: 

    c ............  deleted nonrelevant code
    Check:  Parallel Information Functions 
    Caution:  Avoid "()" args in declaration statements as in Programmer's Guide.
    Check:  Output (write) statements for function name meanings.
    Code:  Typical Parallel Information Function Declaration Statements:
          integer num_procs
          integer num_threads
          integer num_nodes
          integer num_node_threads
          integer my_thread
          integer my_node
          integer level_of_parallelism
    Code:  Typical Parallel Information Output Variable Declaration Statements:
          integer nproc,nth,nnode,nnodeth,myth,mynode,levpar,vlevpar(n)
    c ............ deleted nonrelevant code 
    Code:  Typical Parallel Information Function Output Statements: 
          write(6,*) 'Parallel Information Function Output:'
    606   format(1x,a,' = ',i3)
    Caution: Deadlock could result if num_procs, etc., are used in I/O operations,
    continued: i.e., do not use Parallel Information Functions directly in writes.
          nproc=num_procs()
          write(6,606) 'Number Processors with Threads',nproc
          nth=num_threads()
          write(6,606) 'Number Threads',nth
          nnode=num_nodes()
          write(6,606) 'Number HyperNodes',nnode
          nnodeth=num_node_threads()
          write(6,606) 'Number Threads on HyperNodes',nnodeth
          myth=my_thread()
          write(6,606) 'My Thread ID',myth
          mynode=my_node()
          write(6,606) 'My Hypernode ID',mynode
          levpar=level_of_parallelism()
          write(6,606) 'Level of Parallelism',levpar
    Code:  Output Table Explanation of Allowed Levels of Parallelism 0 to 9:
          write(6,*) 'where level 0 means Not Parallel'
          write(6,*) 'where level 1 means Asymmetric Thread Parallelism'
          write(6,*) 'where level 2 means Node Parallelism'
          write(6,*) 'where level 3 means Node Parallelism plus level 1'
          write(6,*) 'where level 4 means Thread Parallelism'
          write(6,*) 'where level 5 means Thread Parallelism plus level 1'
          write(6,*) 'where level 6 means Thread and Node Parallelism'
          write(6,*) 'where level 7 means Thread Parallelism plus levels',
         &           '1 & 2'
          write(6,*) 'where level 8 means Single Dim. Thread Parallelism'
          write(6,*) 'where level 9 means Single Dim. Thread Parallelism',
         &           'plus level 1'
    c ........ deleted nonrelevant code:
    Code:  Example of Use of "level_of_parallelism()" in a loop:
          do 5 i=1,n
          .........     
    CodeTechnique: vector form is used to be less likely to hinder loop parallelism:
          vlevpar(i)=level_of_parallelism()
          .........     
     5    continue            
          write(6,606) 'Level of Parallelism: do 5',vlevpar(n)
    c.............
          .......
    

    C Example (untested): 

    /* Include Parallel Information Required Header File: */
    #include 
    
    /* Parallel Information Functions: */
        int num_procs(void);
        int num_threads(void);
        int num_nodes(void);
        int num_node_threads(void);
        int my_thread(void);
        int my_node(void);
        int level_of_parallelism(void);
    
    /* Parallel Information Output Variables (can't use directly in printf): */
        int nproc,nth,nnode,nnodeth,myth,mynode,levpar,vlevpar[n];
    
    /* Parallel Information Function Output: */
        printf("\n Parallel Information Function Output:\n");
        nproc=num_procs();
        printf("\n Number Processors with Threads =%3d\n",nproc);
        nth=num_threads();
        printf("\n Number Threads =%3d\n",nth);
        nnode=num_nodes();
        printf("\n Number HyperNodes =%3d\n",nnode);
        nnodeth=num_node_threads();
        printf("\n Number Threads on HyperNodes =%3d\n",nnodeth);
        myth=my_thread();
        printf("\n My Thread ID =%8d\n",myth);
        mynode=my_node();
        printf("\n My Hypernode ID =%8d\n",mynode);
        levpar=level_of_parallelism();
        printf("\n Level of Parallelism =%3d\n",levpar);
        printf("\n where level 0 means Not Parallel,\n");
        printf("\n where level 1 means Asymmetric Thread Parallelism,\n");
        printf("\n where level 2 means Node Parallelism,\n");
        printf("\n where level 3 means Node Parallelism plus level 1,\n");
        printf("\n where level 4 means Thread Parallelism,\n");
        printf("\n where level 5 means Thread Parallelism plus level 1,\n");
        printf("\n where level 6 means Thread and Node Parallelism,\n");
        printf("\n where level 7 means Thread Parallelism plus levels 1 & 2,\n");
        printf("\n where level 8 means Single Dim. Thread Parallelism,\n");
        printf("\n where level 9 means Single Dim. Thread Parallelism +level 1,\n");
    /* ....deleted nonrelevant code .... */
        /* do 5 i loop */
        for (i = 1; i <= n; ++i) {
    /* ....deleted nonrelevant code .... */
            vlevpar[i-1]=level_of_parallelism();  /* 0-based C index */
            /* do 5 j loop */
            for (j = 1; j <= 3; ++j) {
    /* L5: */   box[i-1][j-1] = j * (float)1.5 * i / ((float)n+1.);
            }
        }
        printf("\n Level of Parallelism =%3d\n",vlevpar[n-1]);
    /* ....deleted nonrelevant code .... */
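The ten `printf' lines that explain the levels can be folded into a small lookup, so that only the description of the level actually returned is printed. This is a sketch only. The function name `level_description' and the table `level_names' are hypothetical, and the strings are copied from the output statements above.

```c
/* Descriptions of the levels 0 to 9, copied from the table printed above. */
static const char *const level_names[10] = {
    "Not Parallel",
    "Asymmetric Thread Parallelism",
    "Node Parallelism",
    "Node Parallelism plus level 1",
    "Thread Parallelism",
    "Thread Parallelism plus level 1",
    "Thread and Node Parallelism",
    "Thread Parallelism plus levels 1 & 2",
    "Single Dim. Thread Parallelism",
    "Single Dim. Thread Parallelism plus level 1"
};

/* Map a value returned by level_of_parallelism() to its description.
   Values outside 0..9 are reported as unknown rather than indexing
   past the end of the table. */
const char *level_description(int level)
{
    if (level < 0 || level > 9)
        return "Unknown level";
    return level_names[level];
}
```

A typical use, following the caution about not calling the functions directly in output statements, would be `levpar=level_of_parallelism();' followed by `printf("%s\n", level_description(levpar));'.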
    


    HP-UX mpa Execution Attribute Modifying Command

    WARNING: THIS SECTION IS FOR THE OLD SPP1200 BORG, BUT WILL BE REVISED WHEN SUITABLE REPLACEMENT IS FOUND FOR THE NEW BORG

    Some command options cannot be entered directly in the compile, load or execution commands, but the `mpa' attribute-modifying command permits specifying such things as the number of processors or threads for an executable module file.


    HP-UX Debugging and Performance Commands


    HP-UX Network Queueing System (NQS)

    For more information about batch processing with NQS, check the man pages for


    f90 Fortran Extensions

    HP Fortran90 (f90) contains extensions beyond `f77' Fortran. Fortran90 is best used from within the HP Fortran optimizing compiler `f90'. (Although you can not put the `-f90' option on the `f90' command, the optimization report for `f90 +O3 +list' shows that `f90' is indeed the implicit default even if not requested; it may be turned off with the `-nof90' option.)

    The `f90' compiler is part of the Pacific-Sierra Research Corporation VAST optimizing compiler, which is used by most optimizing compilers, and it works along with the `vf90' translator; both are found in the directory `/usr/convex/vast90/' {CAUTION: There is no `vf90' license on the `borg' now and `f90' needs `vf90'}.

    For optimization, it is recommended that your f90 program aid the f90 parallel optimization model, i.e.,


    f90 Compiler Options

    See also Section

    and Subsection

    Also see the appropriate sections, `man cc' for items on HP Standard C.


    f90 Miscellaneous Extensions

    Fortran 90 Array Notation and Array Sections:  HP Fortran f90 allows most Fortran90 extensions for arrays, making array statements like

    Array Constructors:  Fortran 90 array constructors permit initialization of vectors and arrays by enclosing data separated by commas with all data enclosed between `(/' and `/)' delimiters. For example (assuming proper dimensioning):

    Caution:  Constructors cannot currently be used in print statement arguments.}


    Fortran90 Array Construction Functions

    Many of the Fortran90 Array Functions that have been available on the Connection Machine are now available for the `f90' compiler version 9.5. However, a few functions take their arguments in a different order on the HP than on the Connection Machine.


    Fortran90 Array Reduction Functions

    The reduction functions reduce the input array to a scalar output or, when a dimension argument is given, to an array of one lower rank.


    Fortran90 Array Manipulation Functions

    The manipulation functions rearrange the elements of the target matrix. See `cm-guide.tex' on `CMS getdisk hanson' for examples. However, the arguments may be in a different order on the Connection Machine.


    Fortran90 Array Location Functions

    The location functions find the location of elements of the target matrix. See `cm-guide.tex' on `CMS getdisk hanson' for examples. However, the arguments may be in a different order on the Connection Machine and the Connection Machine also has the location functions `firstloc', `lastloc' and `project'.


    Fortran90 Array Matrix Multiply Functions

    The matrix multiply functions compute the matrix products of the target matrices. See `cm-guide.tex' on `CMS getdisk hanson' for examples.


    Fortran90 Array Functions TEST CODE

    The following f90 code contains examples of use of many of the Fortran90 array intrinsic functions mentioned above. There are some subtle differences:

    1. Intrinsic statement is needed for all f90 intrinsics within f90 codes.
    2. Constructors of the form b=(/1,2,3/) will now work with the f90 compiler.
    3. Fortran90 array intrinsics used within f90 will take no auxiliary markers or keywords like "dim=" or "mask=".
    4. f90 version ?.? (Fall 1996) is the default f90 and the command is in `/bin/convex/fc9.5/fc'. (Note the unix command `which f90' displays the link `/usr/convex/bin/f90' which should be in your C-Shell path, but this eventually links to `/bin/convex/fc9.5/fc')
    5. Similarly, cc version 6.5 (Fall 1996) is the default cc and the command is in `/bin/convex/cc6.5/cc'. (Note the unix command `which cc' displays the link `/usr/convex/bin/cc', which should be in your C-Shell path, but this eventually links to `/bin/convex/cc6.5/cc')
    6. array sections can not be used in print statements: NOT print*,b(1:3)
    7. How do you sum an entire array only subject to a mask, but with no dimension restrictions?
      If  b =  1  3  5            logical mask=b.gt.3
               2  4  6
      
      then   s3=sum(b,1,mask)  or  s2=sum(b,2,mask) work when 
      
      real s3(3),s2(2)
      
      but   isum=sum(b,mask)  or  isum=sum(b,,mask) or isum=sum(b,:,mask) do NOT work.
      
      That is, how do you enter a scalar dim for the whole array?
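What the whole-array masked sum in item 7 is meant to compute can be sketched in C with explicit loops. This is an illustration only, not a tested fix for the `f90' compiler on the borg. The names `masked_sum_all' and `masked_col_sums' are hypothetical. It also shows that adding up the per-column masked sums (the `s3' that does work) gives the same total, which suggests summing `s3' a second time as one possible workaround.

```c
#define NROW 2
#define NCOL 3

/* Sum of every element of b that is greater than threshold, i.e. the
   whole-array sum subject to the mask b.gt.threshold. */
int masked_sum_all(int b[NROW][NCOL], int threshold)
{
    int i, j, total = 0;
    for (i = 0; i < NROW; ++i)
        for (j = 0; j < NCOL; ++j)
            if (b[i][j] > threshold)
                total += b[i][j];
    return total;
}

/* Column sums of the elements greater than threshold, corresponding to
   s3 = sum(b,1,mask) in the Fortran text. */
void masked_col_sums(int b[NROW][NCOL], int threshold, int s3[NCOL])
{
    int i, j;
    for (j = 0; j < NCOL; ++j) {
        s3[j] = 0;
        for (i = 0; i < NROW; ++i)
            if (b[i][j] > threshold)
                s3[j] += b[i][j];
    }
}
```

For the array `b' of item 7 with threshold 3, the column sums are 0, 4 and 11, and the whole-array masked sum is 4+5+6 = 15.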
      
    Here is a sample code with many examples, heavily commented and followed by the actual output run on borg.cc.uic.edu using the commands

    borg:    f90 +O3 +U77 +list -o f90test f90test.f >& f90test.LIST&
    borg:    lsrun f90test >& f90test.output&
    
    %%%%%%%%%%% begin f90test.f %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
    
          program f90test
    code98:  use drand48() pseudo random number generator since random_number bad
    code97:  update by removing old comments to cmfortran
    code96:  retest=f90test.f redone on borg = convex spp1200/xa-16
    cf90er: => convex fc9.5 flagged for error.
          integer, parameter :: m = 6
          integer, parameter :: n = 4
          integer :: i,j
          integer :: seedval
          integer, dimension(2) :: s2, ctr1, ctr2, ctr3, b2 ,cr1 ,cr2
          integer, dimension(3) :: s3 ,at ,ar1 ,ar2 ,br1 ,br2
          integer, dimension(4) :: as(4)
          integer, dimension(2,2) :: c ,bi
          integer, dimension(2,3) :: b, a ,cu
          integer, dimension(3,2) :: ct
          integer, dimension(3,4) :: cs
          integer, dimension(4,3) :: cst
          logical, dimension(2,3) :: test
          logical, dimension(64,64) :: inmask
          double precision, parameter :: tol = 0.5e-5
          integer, parameter :: niter = 5000
          double precision :: diffav
          double precision, dimension(8,8) :: us
          double precision, dimension(64,64) :: u , du
          double precision :: drand48
          double precision, dimension(m,n) :: uniran
          double precision, dimension(n,m) :: truniran
          intrinsic  sum,maxval,minval,product
         & ,dot_product,matmul,transpose
         & ,cshift,eoshift,spread,random_number
          data b/1,2,3,4,5,6/     !replace constructors initialization
          data as/2,3,4,5/
          data at/2,3,4/
    c --------------------Array Constructors:
           b(1,1:3) = (/1, 3, 5/)  ! initialize first row, along dimension 2.
           b(2,1:3) = (/2, 4, 6/)  ! initialize second row, along dimension 2.
          print*,'Note: constructors like "(/1,2/)" allowed in fc9.5'
          br1 = b(1,:)
          br2 = b(2,:)
          print60,br1,br2
    60    format(' b(2,3)'/(3i3))
    c --------------------Sum Function sum:
          isum = sum(b) ! => isum = 21; i.e., Front-End scalar.
          print61,' isum=sum(b)=',isum
    61    format(1x,a36,i4)
          isum = sum(b(:,1:3:2)) ! => isum = 14; sole ':' means all values '1:2'.
          print61,' isum = sum("b(:,1:3:2)")=',isum
          bi=b(:,1:3:2)
          isum=sum(bi)
          print61,' isum = sum("b(:,1:3:2)")=',isum
          print*,'CAUTION: "dim=", etc., markers= NOT allowed in intrinsics'
          s2 = sum(b,2) ! redeclared with the correct array section shape.
          print62,' s2 = sum(b,2)=',s2  ! => s2 = (/9,12/), row sums
    62    format(1x,a32,2i3)
          s3 = sum(b,1)  ! => s3 = (/3,7,11/); column sums.
          print63,' s3 = sum(b,1)=',s3
    63    format(1x,a32,3i3)
          print*,'CAUTION:  "mask=" marker= STILL not allowed either.'
          s3 = sum(b,1,b.gt.3) ! => s3 = (/0,4,11/); i.e., conditional col sum
          print63,' s3 = sum(b,1,"b.gt.3") =',s3  
          test=b.gt.3
          s3 = sum(b,1,test) ! => s3 = (/0,4,11/); i.e., conditional col sum
          print63,' s3 = sum(b,1,"b.gt.3") =',s3  
          s2 = sum(b,2,test) ! => s2 = (/5,10/); i.e., conditional row sum
          print62,' s2 = sum(b,2,b.gt.3) =',s2  
    cf8er:isum = sum(b,0,test) ! => isum = 15; i.e., add only elements
    cf8er:print61,' isum = sum(b,0,b.gt.3) =',isum ! that are greater than three.
          print*,' CAUTION:  If "sum(array[dim[,mask]])", CANT use zero (0)'
         &      ,' for [dim] for whole array when there is a mask.'
    c --------------------Maximum Value Function maxval:
          imax = maxval(b) ! => imax = 6; array maximum value.
          print61,' imax = maxval(b)=',imax
          s3 = maxval(b,1) ! => s3 = (/2,4,6/); column maximums.
          print63,' s3 = maxval(b,1)=',s3
          s2 = maxval(b,2) ! => s2 = (/5,6/); row maximums.
          print62,' s2 = maxval(b,2)=',s2
    c --------------------Minimum Value Function minval:
          imin = minval(b) ! => imin = 1; array minimum value.
          print61,' imin = minval(b)=',imin
    c --------------------Product Function product:
          s2 = product(b,2) ! => s2 = (/15,48/); products of row elements.
          print62,' s2 = product(b,2)=',s2
    c --------------------Dot Product Function dot_product:
          idot = dot_product(br1,br2) ! => idot = 44; dot product of row
          print61,' idot = dot_product(b(1,:),b(2,:))=',idot ! vectors of b.
          print*,' CAUTION:  Array syntax not allowed in actual arguments.'
    c --------------------Matrix Multiplication Function matmul:
          ! assuming array b of the previous section.
          ![Ans] = matmul([Array_1],[Array_2]) ! computes matrix multiplication
                                               ! of two rank two matrices.
          c = matmul(b(:,1:2),b(:,2:3)) ! => c(1,:)=(/15,23/);c(2,:)=(/22,34/).
          c=transpose(c)
          print623,'c=matmul(b(:,1:2),b(:,2:3))=',c
    623   format(1x,a36/(2i3))
          ![Ans] = transpose([Array]) ! transforms an array to its transpose.
          ct = transpose(b) ! => ct(1,:)=(/1,2/);ct(2,:)=(/3,4/);ct(3,:)=(/5,6/).
          ctr1 = ct(1,:)
          ctr2 = ct(2,:)
          ctr3 = ct(3,:)
          print623,'ct = transpose(b)=',ctr1,ctr2,ctr3
    c --------------------Circular Shift Function cshift:
            ! assume b is again initialized as
            !        b =  1 3 5
            !             2 4 6
          a = cshift(b,1,2)  ! => a = 3 5 1
                             !        4 6 2
    cshift  EG1:
          ar1 = a(1,:)
          ar2 = a(2,:)
          print633,'a = cshift(a,1,2)=',ar1,ar2
    633   format(1x,a36/(3i3))
        ! i.e., b(i,(j+shift) "mod" n) -> a(i,j) for j=1:2, etc.;
        ! nonstandard modulus fn: 0 "mod" n = n; 1 "mod" n = 1; ...;  n "mod" n = n
        ! i.e., the result is computed from shifting subscript in specified
            ! dimension of the source array by the specified shift.
          a = cshift(b,-1,2)  ! => a = 5 1 3
                              !        6 2 4
    cshift  EG2:
          ar1 = a(1,:)
          ar2 = a(2,:)
          print633,'a = cshift(b,-1,2)=',ar1,ar2
            ! i.e., b(i,(j+shift) "mod" n) -> a(i,j) for j=2:3, etc.
    cshift  EG3:
          s2(1) = 1
          s2(2) = 2
          a = cshift(b,s2,2)  ! a = 3 5 1
                              !     6 2 4
            ! i.e., an array-valued shift, or shift per row.
          ar1 = a(1,:)
          ar2 = a(2,:)
          print633,'a = cshift(b,(/1,2/),2)=',ar1,ar2
    cshift Laplace Example:
            ! Jacobi Iteration for a 5-star discretization of 
            !        2D Laplace's equation:
          u = 0
          u(1,:)=2
          u(64,:)=2
          u(:,1)=2
          u(:,64)=1
          inmask = .FALSE.
          inmask(2:63,2:63) = .TRUE.
          diffav = 1
          iter=0
          do while (diffav.gt.tol.and.iter.lt.niter)
             iter=iter+1
             du = 0
             where(inmask)
                du = 0.25*(cshift(u,1,1)+cshift(u,-1,1)+cshift(u,1,2)
         &          +cshift(u,-1,2)) - u
                u = u + du
             end where
             du = du*du
             diffav = sqrt(sum(du)/(62*62))
          end do 
            ! which is the main program fragment of laplace.fcm.
    cf90er:print66,'u = laplace-shift(u)=',u(1:64:16,1:64:16)
    cf90er:     &   ,' array section like "u(1:64:16,1:64:16)".'
          print*,'CAUTION: array sections not allowed in print'
          us = u(1:64:9,1:64:9)
          us=transpose(us)
          print66,'u = laplace-shift(u)= ; iter=',iter,'; av-diff ='
         &       ,diffav,us
    66    format(1x,a36,i5,a11,e10.3/(8f8.4))
    c --------------------End Off Shift Function eoshift:
          a = eoshift(b,-1,0,1) ! a = 0 0 0 note default boundary value is 0.
                                !     1 3 5
          ar1 = a(1,:)
          ar2 = a(2,:)
          print633,'a = eoshift(b,-1,0,1)=',ar1,ar2
          s2=(/-1,0/)
          b2=(/7,8/)
          a = eoshift(b,s2,b2,2) ! => a = 7 1 3
                                 !        2 4 6
          ar1 = a(1,:)
          ar2 = a(2,:)
          print633,'a = eoshift(b,(/-1,0/),(/7,8/),2)=',ar1,ar2
          a = eoshift(b,2,0,2) ! => a = 5 0 0
                               ! =>     6 0 0
          ar1 = a(1,:)
          ar2 = a(2,:)
          print623,'a = eoshift(b,2,2)=',ar1,ar2
    c --------------------Spread Function spread:
          cs = spread(as,1,3)
             ! contents of cs:
             !        2 3 4 5
             !        2 3 4 5
             !        2 3 4 5
          cst = transpose(cs)
          print64,'as =',as
    64    format(1x,a32,4i3)
          print643,'cs = spread(as,1,3)=',cst
    643   format(1x,a36/(4i3))
    c --------------------
          cs = spread(at,2,4)
             ! contents of cs:
             !        2 2 2 2
             !        3 3 3 3
             !        4 4 4 4
          cst = transpose(cs)
          print63,'at =',at
          print643,'cs = spread(at,2,4)=',cst
    c ---------------------------------------------------------------------------
    ! i.e., b=spread(a,d,c)  =>
    ! a(n_1,n_2,...,n_(d-1),n_d,...,n_r) -> b(n_1,n_2,...,n_(d-1),c,n_d,...,n_r)
    ! where r is the rank of source array a and n_i is the size of dimension i;
    ! noting that a new dimension of size c is added before dimension d.
    c ---------------------------------------------------------------------------
    C Random Number Generator: NOT random_number, BUT drand48 with srand48
    c random_number executes, but gives nonrandom arrays: use drand48() instead
          print*,'F90 random_number still does work not on borg, Fall 1998'
          seedval = 123456
          call srand48(seedval)
          do i = 1, m
             do j = 1, n
                uniran(i,j) = drand48()
               enddo
          enddo
          truniran = transpose(uniran)
          write(6,65) truniran
    65    format(' function drand48() uniform random array:'/(4f14.10))
          stop
          end
    
    %%%%%%%%%%% end f90test.f %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
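The index arithmetic behind the `cshift' and `eoshift' examples in `f90test.f' can be sketched in C for a single row. This is a minimal illustration, not a replacement for the Fortran intrinsics. The names `cshift_row' and `eoshift_row' are hypothetical, and the indices are 0-based as usual in C.

```c
/* Circular shift of one row: dst[j] = src[(j + shift) mod n], where the
   modulus is forced into 0..n-1 so that negative shifts wrap around. */
void cshift_row(const int *src, int *dst, int n, int shift)
{
    int j;
    for (j = 0; j < n; ++j) {
        int k = (j + shift) % n;
        if (k < 0)
            k += n;        /* C's % can be negative for negative operands */
        dst[j] = src[k];
    }
}

/* End-off shift of one row: elements shifted past either end are lost
   and the vacated positions are filled with the boundary value. */
void eoshift_row(const int *src, int *dst, int n, int shift, int boundary)
{
    int j;
    for (j = 0; j < n; ++j) {
        int k = j + shift;
        dst[j] = (k >= 0 && k < n) ? src[k] : boundary;
    }
}
```

Applied to the first row (1, 3, 5) of `b', a circular shift of 1 gives (3, 5, 1) and a shift of -1 gives (5, 1, 3), as in the `cshift' examples. An end-off shift of -1 with boundary value 7 gives (7, 1, 3), as in the second `eoshift' example.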
    

    %%%%%%%%%%% begin f90test.output %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
    
     Note: constructors like "(/1,2/)" allowed in fc9.5
     b(2,3)
      1  3  5
      2  4  6
                             isum=sum(b)=  21
                isum = sum("b(:,1:3:2)")=  14
                isum = sum("b(:,1:3:2)")=  14
     CAUTION: "dim=", etc., markers= NOT allowed in intrinsics
                       s2 = sum(b,2)=  9 12
                       s3 = sum(b,1)=  3  7 11
     CAUTION:  "mask=" marker= STILL not allowed either.
             s3 = sum(b,1,"b.gt.3") =  0  4 11
             s3 = sum(b,1,"b.gt.3") =  0  4 11
               s2 = sum(b,2,b.gt.3) =  5 10
      CAUTION:  If "sum(array[dim[,mask]])", CANT use zero (0) for [dim] for whole array when there is a mask.
                        imax = maxval(b)=   6
                    s3 = maxval(b,1)=  2  4  6
                    s2 = maxval(b,2)=  5  6
                        imin = minval(b)=   1
                   s2 = product(b,2)= 15 48
       idot = dot_product(b(1,:),b(2,:))=  44
      CAUTION:  Array syntax not allowed in actual arguments.
             c=matmul(b(:,1:2),b(:,2:3))=
     15 23
     22 34
                       ct = transpose(b)=
      1  2
      3  4
      5  6
                       a = cshift(a,1,2)=
      3  5  1
      4  6  2
                      a = cshift(b,-1,2)=
      5  1  3
      6  2  4
                 a = cshift(b,(/1,2/),2)=
      3  5  1
      6  2  4
     CAUTION: array sections not allowed in print
            u = laplace-shift(u)= ; iter= 4730; av-diff = 0.499E-05
      2.0000  2.0000  2.0000  2.0000  2.0000  2.0000  2.0000  1.0000
      2.0000  1.9762  1.9479  1.9090  1.8491  1.7440  1.5208  1.0000
      2.0000  1.9573  1.9068  1.8387  1.7387  1.5836  1.3402  1.0000
      2.0000  1.9469  1.8844  1.8014  1.6836  1.5141  1.2817  1.0000
      2.0000  1.9469  1.8844  1.8014  1.6836  1.5141  1.2817  1.0000
      2.0000  1.9573  1.9068  1.8387  1.7387  1.5836  1.3402  1.0000
      2.0000  1.9762  1.9479  1.9090  1.8491  1.7440  1.5208  1.0000
      2.0000  2.0000  2.0000  2.0000  2.0000  2.0000  2.0000  1.0000
                   a = eoshift(b,-1,0,1)=
      0  0  0
      1  3  5
       a = eoshift(b,(/-1,0/),(/7,8/),2)=
      7  1  3
      2  4  6
                      a = eoshift(b,2,2)=
      5  0
      0  6
      0  0
                                 as =  2  3  4  5
                     cs = spread(as,1,3)=
      2  3  4  5
      2  3  4  5
      2  3  4  5
                                 at =  2  3  4
                     cs = spread(at,2,4)=
      2  2  2  2
      3  3  3  3
      4  4  4  4
     F90 random_number still does work not on borg, Fall 1998
     function drand48() uniform random array:
      0.7469419493  0.5427079092  0.0258878556  0.5654528109
      0.2614139434  0.9823978669  0.8074451619  0.0462442146
      0.9162790193  0.1535050702  0.7112050429  0.3834904004
      0.6550487669  0.3811652680  0.6453757818  0.5827837354
      0.6270208712  0.2289106442  0.7934562939  0.0215042220
      0.1299298527  0.7670702197  0.4952551953  0.8667496918
    
    %%%%%%%%%%% end f90test.output %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
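For comparison with the array-syntax Laplace example in `f90test.f', the same Jacobi iteration can be sketched in C with explicit loops over the interior points. This is an untested-on-borg sketch under stated assumptions. The name `jacobi_laplace' and the arrays `u' and `unew' are hypothetical. The stopping test compares the mean-square change with `tol*tol' instead of taking a square root, which is the same test as the Fortran `diffav.gt.tol' but avoids needing the math library.

```c
#define NG 64                      /* grid size, as in u(64,64) */

static double u[NG][NG];           /* current iterate */
static double unew[NG][NG];        /* next iterate, interior points only */

/* Jacobi sweeps for the 5-star discretization of Laplace's equation.
   Stops when the mean-square change per interior point is at most
   tol*tol or after niter sweeps.  Returns the number of sweeps done;
   *msq receives the last mean-square change. */
int jacobi_laplace(double tol, int niter, double *msq)
{
    const double npts = (double)(NG - 2) * (NG - 2);
    double change = 1.0;
    int i, j, iter = 0;

    /* Boundary values in the same order as f90test.f: rows 1 and 64 and
       column 1 are set to 2, then column 64 is set to 1. */
    for (i = 0; i < NG; ++i)
        for (j = 0; j < NG; ++j)
            u[i][j] = 0.0;
    for (j = 0; j < NG; ++j) { u[0][j] = 2.0; u[NG - 1][j] = 2.0; }
    for (i = 0; i < NG; ++i) u[i][0] = 2.0;
    for (i = 0; i < NG; ++i) u[i][NG - 1] = 1.0;

    while (change > tol * tol && iter < niter) {
        double sumsq = 0.0;
        ++iter;
        /* New values are computed from the old iterate only (Jacobi). */
        for (i = 1; i < NG - 1; ++i)
            for (j = 1; j < NG - 1; ++j) {
                double du = 0.25 * (u[i + 1][j] + u[i - 1][j]
                                  + u[i][j + 1] + u[i][j - 1]) - u[i][j];
                unew[i][j] = u[i][j] + du;
                sumsq += du * du;
            }
        for (i = 1; i < NG - 1; ++i)
            for (j = 1; j < NG - 1; ++j)
                u[i][j] = unew[i][j];
        change = sumsq / npts;
    }
    *msq = change;
    return iter;
}
```

With the values used in `f90test.f', the call would be `jacobi_laplace(0.5e-5, 5000, &msq)'. The interior loop bounds play the role of the `inmask' array, so the wrap-around of `cshift' at the edges never enters the result.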
    


    f90 and cc Compiler Optimization Directives/Pragmas

    Compiler directives can be used to force or to stop optimization for the `f90', `cc' and `cpp (C Preprocessor)' compilers. These statements are placed in the Fortran or C source just before the loop or other entity they are to affect. Some come as single statements, while others come in pairs, with one beginning directive and one ending directive. The compiler directives have different formats, i.e., for `f90' they have the form:

    while for `cc' they are called pragmas and have the form:

    where the old Convex marker `_CNX' is now optional. If several compatible directives are used on the same code fragment like a loop, then they may be combined, for example in the case of two directives, as

    Caution:  for `f90', the leading `C' of `C$DIR' must be in column 1 and a blank must be in column 6. The `#pragma' statement keyword must be in lower case for `cc', but the directives themselves as well as `C$DIR' can be in upper or lower case. Also, it is not wise to force optimization where inappropriate and risk synchronization errors.

    For HP Documentation on C Pragmas (use for Fortran Compiler Directives too, since there is very little HP Documentation for them) see

    {May not be a stable link, so you may have to try searching the following HP documents} in

    of

    For more General List of HP Documentation see

    %%%%%%%%%%% Begin Sample GETTIMEOFDAY C Code Fragments %%%%%%%%%%%%%%%%%%%%%%%%
    
    
    /* revised, simplified use of gettimeofday                                   */
    /* Compile/Link: cc  +O3 +L -lm -o [exec] [source].c >& [source].LIST &      */
    /* Execute:   lsrun  [source] >& s[source].output &                          */
    /* Ignore EXEC MSG:   "Exit 11   lsrun start >& start.output"                */
    #include 
    #include 
    #include 
    
    /* ... general Define Global Parameters deleted ... */
    /* Define Global Timer Parameters */
    #define NTime 20
    
    /* Main program */ 
    main()
    {
    /* ... general Local Variables  deleted ... */
    
    /* Time variables */
        long int tp[2], tzp[2];
        int gtod;
        long int tsecs[NTime], tmicrosecs[NTime];
        long int ttot[NTime], ttotmoh[NTime];
        float ts1, tt1, tu1, tu2, ts2, tt2, tu3, ts3, tt3;
        double ttotf[NTime];
    
    /* begin main code */
    /* start timer counter */
        kt = 0;
    /* gettimeofday = Microsecond Wall Timer C function;                         */
    /* WallTime = UserTime + SystemTime, Undecomposed;                           */
    /* gettimeofday returns gtod = 0 if successful;                              */
    /* tp[0] in secs since 1/1/70;                                               */
    /* tp[1]in added microseconds;                                               */
    /* tzp gives the timezone;                                                   */
        gtod=gettimeofday(tp,tzp);
        tsecs[0] = tp[0];
        tmicrosecs[0] = tp[1];
    /* get immediate second time for overhead measurement                        */
        ++kt;
        gtod=gettimeofday(tp,tzp);
        tsecs[1] = tp[0];
        tmicrosecs[1] = tp[1];
    /* start first loop to be timed                                              */
        for (i = 1; i <= n; ++i) {
    	for (j = 1; j <= 3; ++j) {
    	  /* loop work deleted */
    	}
        }
        ++kt;
        gtod=gettimeofday(tp,tzp);
        tsecs[kt] = tp[0];
        tmicrosecs[kt] = tp[1];
    /* ...  more code deleted ...                                                */
    /* ...  more code deleted ...                                                */
    /* start loop `kt' to be timed                                               */
        for (i = 1; i <= n; ++i) {
    	for (j = 1; j <= n; ++j) {
    	  /* loop work deleted */
    	}
        }
        ++kt;
        gtod=gettimeofday(tp,tzp);
        tsecs[kt] = tp[0];
        tmicrosecs[kt] = tp[1];
    /* ...  more code deleted ...                                                */
    /* ...  more code deleted ...                                                */
    /* get final code timing not counting output                                 */
        ++kt;
        gtod=gettimeofday(tp,tzp);
        tsecs[kt] = tp[0];
        tmicrosecs[kt] = tp[1];
    /* Total Elapsed Time Including Clock Overhead*/
        ttot[kt] = (tsecs[kt]-tsecs[1])*1000000+(tmicrosecs[kt]-tmicrosecs[1]);
    /* Total Elapsed Time Minus Clock Overhead (seconds and microseconds)        */
        ttotmoh[kt] = ttot[kt] - ((tsecs[1] - tsecs[0])*1000000
                                  + (tmicrosecs[1] - tmicrosecs[0]));
        printf("\nIntermediate Raw Timing Output:\n");
        printf("\ntmicrosecs[(0,1,kt)]=(%12d,%12d,%12d), in microseconds\n",
        tmicrosecs[0],tmicrosecs[1],tmicrosecs[kt]);
        printf("\ntsecs[(0,1,kt)]=(%12d,%12d,%12d), in seconds\n",
        tsecs[0],tsecs[1],tsecs[kt]);
        printf("\n(ttot[kt],ttotmoh[kt])=(%12d,%12d), in microseconds\n",
        ttot[kt],ttotmoh[kt]);
    /* test for bad clock with integer overflow                                  */
        if (ttot[kt] < 0) 
           {printf("\n  Error:Negative Times:Bad Clock:Rerun Job\n");}
    /* get floating point value for total elapsed timing minus clock overhead    */
        ttotf[kt] = ttotmoh[kt]/1.e6;
        printf("\n Borg Starter Problem Output\n");
        printf("\n  Timing Output:\n");
        printf("\n   final total time=%12.4e, in seconds\n",ttotf[kt]);
    /* get individual loop timings                                               */
        for (i=2; i
    %%%%%%%%%%% End Sample GETTIMEOFDAY C Code Fragments %%%%%%%%%%%%%%%%%%%%%%%%%%
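The elapsed-time arithmetic in the fragments above can be isolated in two small helpers. This is a sketch that assumes the standard `struct timeval' interface of `gettimeofday' from `<sys/time.h>' rather than the `long int' arrays used above. The names `elapsed_usec' and `timer_overhead_usec' are hypothetical.

```c
#include <stddef.h>
#include <sys/time.h>

/* Microseconds elapsed from reading t0 to reading t1.  Both the seconds
   and the microseconds fields are used, so the result stays correct when
   the seconds counter rolls over between the two readings.  With a 32-bit
   long the result overflows after about 2147 seconds. */
long elapsed_usec(const struct timeval *t0, const struct timeval *t1)
{
    return (long)(t1->tv_sec - t0->tv_sec) * 1000000L
         + (long)(t1->tv_usec - t0->tv_usec);
}

/* Rough estimate of the cost of one gettimeofday call, taken from two
   back-to-back readings as in the overhead measurement above. */
long timer_overhead_usec(void)
{
    struct timeval a, b;
    gettimeofday(&a, NULL);
    gettimeofday(&b, NULL);
    return elapsed_usec(&a, &b);
}
```

To time a loop, read the clock before and after it with `gettimeofday(&t, NULL)', pass the two readings to `elapsed_usec', and subtract `timer_overhead_usec()' if the clock overhead matters. As noted in the fragments, this is wall time, not user time.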
    



    Table of HP9000 Timers.

    Notes:

    1. There are several other timers, but they are not appropriate for scientific computing.
    2. Most timers need the `+U77' option in the `f90' or `f77' commands.
    3. For actual use, consult the particular timer man page `man [timer]'.
    4. Fortran libraries are currently in the path `/usr/convex/fc9.5/lib' and are used with the `-l[library]' options of `f90', and are denoted by `f90:[library]' in brief in the table. `libU77.a' is the built-in convex fortran utility library.
    5. C libraries are currently in the path `/usr/convex/cc6.5/lib', but there is not much there compared to the several libraries in the `HP f90' fortran path.
    6. Ideally, a timer should give usertime in intervals as small as microseconds.
    7. `gettimeofday', called from a C routine, would be a rough approximation; see the above code fragments showing how this built-in function, declared in the `' header, can be used for a timer.



    The best way to learn these commands is to use them in an actual computer session.

    Good luck.     

    FBH     


    Please report to Professor Hanson any problems or inaccuracies:

    Web Source: http://www.math.uic.edu/~hanson/mcs572/borgguide.html


    Fall 1998