Engineering Tools Portal (ETP)
User Guide

1. Introduction

1.1. Document Scope and Assumptions

This document provides an overview of, and introduction to, the Microsoft HPC Engineering Tools Portal (ETP) located at the ARL DSRC, together with a description of the computing environment on the Portal. The intent of this guide is to enable the average user to log in to the system and run Portal applications. Currently, MATLAB is the only published application.

The Engineering Tools Portal consists of a Microsoft HPC 2008 R2 server and supporting portal technologies. These technologies securely deliver HPC-aware applications to a user's Windows desktop without installing any software on the desktop or using any network transport protocol other than HTTPS.

To receive the most benefit from the information provided here, you should be proficient in the following areas:

  • Use of the Microsoft Windows operating system.
  • Remote usage of computer systems via network or modem access.

System Requirements

Client computers must run Internet Explorer version 6.0 or later and a version of Remote Desktop Connection (RDC) that supports at least Remote Desktop Protocol (RDP) 6.1. Operating systems that meet the RDC requirement include:

  • Windows 7,
  • Windows Vista with Service Pack 1,
  • Windows XP with Service Pack 3,
  • Windows Server 2008,
  • Windows Server 2008 R2.

Additionally, the Remote Desktop Services ActiveX Client control must be enabled. The ActiveX control is included with Remote Desktop Connection (RDC) 6.1 and higher.

1.2. Policies to Review

The HPC Portal (ETP) is for UNCLASSIFIED USE ONLY; this includes but is not limited to data processing, transmission of data, and storage of data in any form. DO NOT access the HPC Portal (ETP) from a CLASSIFIED system.

All policies are discussed on the ARL DSRC webpage. All users running at the ARL DSRC are expected to know and understand the policies discussed there. If you have any questions about ARL DSRC's policies, please contact the CCAC.

1.3. Requirements for an Account

  • Have a valid DoD Common Access Card (CAC).
  • Have a need to run MATLAB in support of US DoD research and engineering.
  • Agree to use the Portal only for UNCLASSIFIED data processing in support of US DoD research and engineering.

1.4. Obtaining an ETP Account

To obtain an account on the Microsoft HPC Portal, please send a CAC-signed email to dreportal@arl.army.mil with the following information:

  • Your DoD Agency or Company Name.
    • Contractors, please indicate your sponsoring DoD agency.
  • Daytime phone number.
  • A brief description of the DoD projects the portal will support.
  • If you are an existing HPCMP user, please provide your HPC user account name.
  • Agree to and include the following statement in your email:

    As an Engineering Tools Portal user I, <your name>, understand and agree to use the Portal only in support of UNCLASSIFIED US DoD research and engineering. I understand that all data processing on the portal will be UNCLASSIFIED; this includes, but is not limited to data processing, transmission of data, and storage of data in any form. I will not access the Portal from a CLASSIFIED system.

1.5. Requesting Assistance

To request assistance please send email to dreportal@arl.army.mil.

2. System Configuration

2.1. System Summary

The Portal is built on an IBM x1350 Blade Cluster. The system consists of two IBM BladeCenter H chassis with 14 HS22 blade servers each. Four blade servers are used as remote desktop session hosts, where published applications such as MATLAB run; 2 are used as part of the Portal infrastructure; and 22 blade servers are configured as compute nodes, yielding 264 compute cores to support user jobs.

2.2. Processors

Each compute node has two 6-core Intel Xeon X5670 2.93 GHz processors for a total of 12 cores per node.
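
The blade and core counts quoted in sections 2.1 and 2.2 can be cross-checked with a few lines of MATLAB. This is only an arithmetic sketch; the variable names are illustrative and are not part of the Portal.

```matlab
% Blade allocation from section 2.1
chassis          = 2;
bladesPerChassis = 14;
totalBlades      = chassis * bladesPerChassis;        % 28 blades in all

sessionHosts   = 4;    % remote desktop session hosts
infrastructure = 2;    % Portal infrastructure servers
computeNodes   = 22;   % nodes available to user jobs
assert(sessionHosts + infrastructure + computeNodes == totalBlades);

% Core count from section 2.2: two 6-core processors per node
coresPerNode = 2 * 6;
totalCores   = computeNodes * coresPerNode;

fprintf('%d compute nodes x %d cores = %d compute cores\n', ...
        computeNodes, coresPerNode, totalCores);
```

The result, 264 compute cores, matches the figure given in section 2.1.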

2.3. Memory

ETP uses both shared and distributed memory models. Memory is shared among all the cores on a node, but is not shared among the nodes across the cluster.

Each session host contains 24 GBytes of main memory. All memory and cores on the session host servers are shared among all users who are logged in. Therefore, users are expected to send large computational jobs to the cluster and limit the jobs run on the session hosts.

Each compute node contains 24 GBytes of user-accessible shared memory.
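
When sizing tasks, a rough per-core memory budget can be helpful. The sketch below derives it from the figures above, using 22 compute nodes (section 2.1) and 12 cores per node (section 2.2). It assumes one task per core, which is an assumption rather than a documented Portal setting, and it ignores memory used by the operating system, so the real budget per task is smaller.

```matlab
% Figures taken from sections 2.1 to 2.3
computeNodes   = 22;
coresPerNode   = 12;
memPerNodeGB   = 24;

% Assumption: one task per core; operating system overhead is ignored
memPerCoreGB   = memPerNodeGB / coresPerNode;    % upper bound per task
aggregateMemGB = computeNodes * memPerNodeGB;    % across all compute nodes

fprintf('Upper bound: %g GBytes per core, %d GBytes across the cluster\n', ...
        memPerCoreGB, aggregateMemGB);
```

A task that needs noticeably more than about 2 GBytes may therefore have to run with fewer tasks per node.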

2.4. Operating System

The operating system on the Portal is Microsoft Windows Server 2008 R2. The compute nodes are part of a Microsoft HPC 2008 R2 server.

2.5. File Systems

The ETP system provides approximately 36 TBytes of fault-tolerant storage. This storage is not backed up, so users should keep their own copies of important files elsewhere.

3. Accessing the System

3.1. Logging into the ETP Website

To access the portal web site, follow the steps below.

  1. In a browser, navigate to https://etp.arl.hpc.mil
  2. Select your "DOD EMAIL" certificate (see Figure 1) and click "OK".
  3. Type in your CAC PIN if asked (see Figure 2) and click "OK". ActivClient caches your PIN for a certain period of time and, therefore, may not ask for it each time.

Figure 1: Select the DoD Email certificate to logon to the Portal.

Figure 2: ActivClient requests your CAC pin.

Once you are authenticated, the Portal website will be displayed. Figure 3 shows the ETP web site. In addition to the MATLAB icon, each portal user will have access to a Portal Explorer window and the Portal HPC Job Manager.

Figure 3: Portal web site for MATLAB users.

3.2. File Transfers

Data is transferred to and from the Portal by copying and pasting between the Portal Explorer window and a local desktop Explorer window. Please note that the "drag-and-drop" mechanism does not work.

  1. Single-click the Explorer icon in the Portal website.
  2. A Portal Explorer window will appear on your desktop (see Figure 4). If this is the first application you have run, you will be asked to authenticate.
  3. Open an Explorer window on your local desktop.
  4. Highlight and copy the files or directories that you want to copy from the source Explorer window.
  5. Move to the destination Explorer window and paste the files into the desired location.

Figure 4: Portal Explorer Window.

4. Job Scheduling

4.1. HPC Job Manager

The HPC Job Manager serves multiple functions; however, for MATLAB users it only serves as a way to check cluster status and job status outside of MATLAB. It is provided for convenience only. Follow the procedure below to run the HPC Job Manager.

  1. Single-click the HPC Job Manager icon in the Portal website.
  2. The HPC Job Manager will appear on your desktop (see Figure 5). If this is the first application you have run, you will be asked to authenticate.
  3. In the Job Manager, you will be able to view the status of your jobs. The job states are: Configuring, Active, Finished, Failed, and Completed. Click on any of these under "My Jobs" to check the status of your job. You can do the same for all cluster jobs under "All Jobs".

Figure 5: HPC Job Manager.

5. Software Resources

5.1. Application Software

MATLAB is the only application offered on the Portal at this time.

5.2. Running Applications

Applications are run by a single-click of the application icon in the web browser. Please note that double-clicking will run the application twice. The Portal requires authentication to establish a connection to an application server over which the applications are presented. Once this connection has been established, all subsequent applications use the same connection and do not have to be re-authenticated. If all applications are closed, then this connection is severed and authentication will be required when the next application is run. As long as there is at least one application running, for example Explorer, subsequent applications will not require authentication.

To run an application, follow the steps below:

  1. Click the application icon once.
  2. If no other application is running,
    1. You must enter your CAC PIN when requested.
    2. A logon screen will appear as shown in Figure 6. No action is required by the user. It will disappear, and the application UI will appear.
  3. The application UI appears on the desktop. Figure 7 shows the MATLAB UI.

Figure 6: Logon Screen.

Figure 7: MATLAB User Interface.

6. MATLAB

6.1. Running MATLAB jobs on the Windows Cluster

Running jobs on the cluster requires knowledge of the MATLAB Parallel Computing Toolbox (PCT). This user guide does not attempt to teach PCT. Information on the PCT can be found at http://www.mathworks.com/products/parallel-computing/.

One way to take advantage of the cluster is to create a task parallel job, i.e., create a number of independent tasks and submit them to the cluster. A simple example of how to do this is given in section 6.2. More information on this can be found at the MathWorks web site.

Another way to take advantage of the cluster is by running data parallel jobs. This requires parallelizing your MATLAB code using PCT. This approach will not be discussed here. It is well covered on the MathWorks web site.

The following are some guidelines for running MATLAB on the Portal and jobs on the Windows cluster:

  • Run only 1 instance of the MATLAB GUI from the Portal web site. Each instance requires a separate license and licenses are limited.
  • Computationally intensive MATLAB jobs need to be pushed to the HPC Server using the MATLAB Parallel Computing Toolbox; section 6.2 gives a simple example of how to do this. The nodes running the MATLAB clients are shared, and using significant resources on them will affect other users.
  • When a job is submitted to the cluster (e.g., using the PCT submit(job) command), it is recommended that you do not wait for job completion within your PCT script. This will allow you to continue using the MATLAB GUI and will prevent problems if your RDP session ends prior to job completion. It is also more convenient if you wish to log off the Portal before the job completes.
  • When returning to the Portal to post-process a previously submitted job, the job object can be obtained in different ways. Some convenience functions are available for getting the job object. These are described in section 6.3.
  • If a job hangs for some reason, it is good practice to destroy the job. The MATLAB destroy(job) command can be used to do this, where "job" is the job object. The convenience functions allow you to get the job object or destroy groups of jobs.
  • It is not recommended that jobs be canceled from the HPC Job Manager, because this leaves MATLAB in a state where it thinks the job is still running. Use the MATLAB cancel(job) command instead.
  • It is good practice to destroy jobs that are no longer needed. The convenience functions in section 6.3 can help with this.

6.2. Sending a Task or Tasks to the HPC Server

Below is a simple PCT script that sends a task or group of tasks to the HPC Server.

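sched = findResource();                 % get the ETP HPC Server scheduler
job = createJob(sched);                 % create an empty job on that scheduler
createTask(job, @foo, n, {inputargs});  % add task(s) that call foo with n outputs
submit(job);                            % send the job to the cluster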

When the job completes, the output arguments of the function "foo" can be retrieved with the following command:

out = job.getAllOutputArguments;

The functions findResource, createJob, createTask, submit, and getAllOutputArguments are part of the Parallel Computing Toolbox. Briefly, findResource returns the ETP HPC Server scheduler and createJob(sched) returns a job object that tasks can be assigned to and submitted to the scheduler. The function createTask creates the tasks and associates them with the job object. In the above example, the four arguments to createTask are:

  1. job: the job object created with createJob(sched).
  2. @foo: foo is the user defined function associated with the tasks.
  3. n: the number of output arguments returned by function foo.
  4. inputargs: a cell array of arguments to function foo. If it is a cell array of cell arrays, then a separate task is created for each cell array. The tasks will be run in parallel.

The function foo must be on the MATLAB path used by the cluster. If it is not, add the directory that contains foo to the job object's path dependencies with:

set(job,'PathDependencies',{'Y:\Profiles\<your name>\<path_to_foo>'});

Finally, submit(job) sends the job to the cluster.
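
As an illustration of the cell-array-of-cell-arrays form described in item 4, the sketch below submits three independent tasks in one job. It substitutes the built-in function rand for the user function foo so that no path dependency is needed. The matrix sizes are arbitrary, and the script assumes that findResource() returns the ETP scheduler as described above.

```matlab
sched = findResource();     % ETP HPC Server scheduler, as described above
job = createJob(sched);

% Three inner cell arrays create three tasks:
% rand(2,2), rand(3,3) and rand(4,4), each returning one output
createTask(job, @rand, 1, {{2,2}, {3,3}, {4,4}});

submit(job);                % returns immediately; the job runs on the cluster

% Later, once the job has finished:
out = job.getAllOutputArguments;   % cell array with one row per task
```

Each row of out holds the output of one task, so out{2} would be the 3-by-3 matrix in this example.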

Once the job object is created, you can check the status of the job at any time by typing the name of the job object ("job" in the example) at the MATLAB prompt and pressing Return. You can also monitor your jobs using the HPC Job Manager published on the Portal.

Some additional PCT functions that are helpful are:

  • cancel(job): Cancels the job associated with job object "job" on the HPC Cluster
  • destroy(job): Destroys the job associated with job object "job"

6.3. Job Management Convenience Functions

The following convenience functions are provided on the Portal to help manage jobs.

  • getJob
  • getJobs
  • destroyJobs

getJob

getJob(id) gets the job object for a job with ID = id.

Syntax

job = getJob( id )

Arguments

id is an integer representing the job ID. The ID is provided when a job is first submitted or it can be obtained from getJobs or various MATLAB functions such as findJob.

Return Value

A job object associated with the job with ID id.

Description

getJob( id ) returns the job object associated with the job with ID id. It can be used for many purposes, e.g., to get all output arguments of a completed job using the MATLAB function job.getAllOutputArguments or to delete the job with the MATLAB function destroy( job ).
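
A possible workflow that combines getJob with the guideline of not waiting for job completion is sketched below. The use of rand as the task function and the variable names are illustrative only. The job ID has to be noted by the user between sessions, because MATLAB workspace variables do not survive a restart of MATLAB.

```matlab
% Session 1: submit the job and return to the prompt without waiting
sched = findResource();
job = createJob(sched);
createTask(job, @rand, 1, {3,3});
submit(job);
id = job.ID                 % no semicolon: display the ID so it can be noted

% Session 2 (for example, after logging off and back on):
% set id to the value noted in session 1, then retrieve the job
job = getJob(id);           % Portal convenience function
if strcmp(job.State, 'finished')
    out = job.getAllOutputArguments;   % collect the results
    destroy(job);                      % the job is no longer needed
end
```

Checking the State property before collecting results avoids reading outputs from a job that is still queued or running.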

getJobs

getJobs( type ) gets all the job objects of a given type.

Syntax

jobs = getJobs( 'type' )
jobs = getJobs

Arguments

type is one of: pending, queued, running, or finished. If no argument is supplied, all job objects are returned.

Return Value

An array of job objects of a given type or all job objects if no type is given.

Description

getJobs returns an array of job objects of a given type or all job objects if no type is given. It can be used to locate a particular job. Each job object in the array can be used for many purposes, e.g., to get output arguments or to destroy the job.

destroyJobs

destroyJobs( type ) destroys all the jobs of a given type. Use with CAUTION. A destroyed job cannot be recovered.

Syntax

destroyJobs( 'type' )

Arguments

type is one of: pending, queued, running, or finished.

Return Value

None.

Description

destroyJobs is a convenience function that can destroy all jobs of a given type and must be used cautiously. For example, destroyJobs('finished') will destroy all jobs that have finished.
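
Because destroyed jobs cannot be recovered, it may be safer to save the outputs of finished jobs before removing them. The following sketch shows one way to do that with getJobs. The file name finished_results.mat is hypothetical, and the loop assumes that getJobs returns an array that can be indexed with parentheses.

```matlab
jobs = getJobs('finished');            % Portal convenience function
results = cell(1, numel(jobs));

for k = 1:numel(jobs)
    % Save the outputs of each finished job before it is destroyed
    results{k} = getAllOutputArguments(jobs(k));
end
save('finished_results.mat', 'results');   % hypothetical file name

% Destroy only the jobs whose outputs were just saved
for k = 1:numel(jobs)
    destroy(jobs(k));
end
```

Destroying the collected jobs one by one, rather than calling destroyJobs('finished') afterwards, avoids deleting a job that finished after the list was taken and whose results were therefore not saved.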

7. Links to Vendor Documentation

MATLAB Home: http://www.mathworks.com/help/

Parallel Computing Toolbox: http://www.mathworks.com/products/parallel-computing