**Note:** This document is still under development and is being written/edited by RCSG and Mathworks staff.

## Introduction

There are several ways to submit MATLAB jobs to a cluster. This document covers the ways to run MATLAB compute jobs on the Shared Research Compute clusters, including using the Parallel Computing Toolbox (PCT) and the MATLAB Distributed Computing Engine (MDCE) to submit many independent tasks or a single task that has parallel components. Examples are included.

## Definitions

**Task Parallel** - Multiple independent iterations within your workflow.

**Data Parallel** - Single program working on a problem across multiple processors.

...

### Orchestrate Parallel Toolbox jobs from the cluster login nodes

Set up MATLAB Parallel Computing Toolbox from a cluster login node on DAVINCI

### Have GPU accelerated MATLAB code?

Running MATLAB on a GPGPU in DAVINCI

### Need to go even faster?

MATLAB is a big Java virtual machine, and although Java has made strides in recent years, it is still slower than running native code. Enter the MATLAB Compiler. This allows you to compile MATLAB scripts into binaries that can run natively in the cluster environment. We can direct you to resources for compiling frequently run MATLAB scripts into binaries and running them on the clusters.

## Definitions

**Task Parallel Application** - The same application that runs independently on several nodes, possibly with different input parameters. There is no communication, shared data, or synchronization points between the nodes.

**Data Parallel Application** - The same application that runs on several labs simultaneously, with communication, shared data, or synchronization points between the labs.

**Lab** - A MATLAB worker in a multicore (Data Parallel) job. One lab is assigned to one worker (core). Thus, a job with eight labs has eight processor cores allocated to it and will have eight workers each working together as peers.

**MDCS** - MATLAB Distributed Computing Server. This is a component of MATLAB that allows our clusters to run MATLAB jobs that exceed the size of a single compute node (multinode parallel jobs).

...

Calls to matlabpool should not be embedded in the MATLAB code, but rather called at the MATLAB command prompt.

MDCS also allows jobs to run even if there are not enough toolbox licenses available for a particular toolbox, so long as the university owns at least one license for that toolbox.

**Matlab Task** - An independent Matlab calculation.

**Matlab Job** - Submission from within the Matlab GUI that can contain one or more tasks.

**Matlab Worker** - Analogous to the number of processor cores assigned to a job. If a job needs 8 processor cores, then it must have 8 Matlab workers.

**Job** - Job submitted via the PBS job scheduler (also called PBS Job).

## Interactive Jobs

TBA take from old FAQ

## Running Jobs with PBS *qsub*

TBA. Suitable for single-processor jobs that do not encounter any toolbox license issues. Take from old FAQ.

## Using MDCE for Task Parallel and Data Parallel Jobs

Using MDCE allows you to submit multiple independent tasks with a single job submission (Task Parallel) or to submit a single task that is a multiprocessor (and possibly multinode) job. To run this type of job, you must first configure Matlab for this type of job submission by following these steps:

**Info:** These steps need to be performed only once. Subsequent runs of Matlab need not repeat these steps.

### Configuring Matlab

1. In your home directory create the MdcsDataLocation/ClusterName subdirectory.

```
mkdir -p ~/MdcsDataLocation/ClusterName
```

where *ClusterName* will be one of *sugar*, *stic*, *davinci*.

2. Load the Matlab 2011a environment:

```
module load matlab/2011a
```

3. Run Matlab on the login node:

```
matlab
```

4. In Matlab, add the scripts folder to your Matlab path so that Matlab will be able to find the scripts necessary to submit and schedule jobs.

- Click on File and then Set Path
- Click the Add Folder button
- Specify the following folder: /opt/apps/matlab/2011a-scripts

**Note: Error saving pathdef.m.** If Matlab reports that it is unable to save *pathdef.m* in your current folder, then follow the prompts to select your home folder before saving the file.

5. Import the cluster configuration for the system you are using:

- Click on Parallel and then Manage Configurations
- Click on File and then Import
- Navigate to /opt/apps/matlab/2011a-scripts and select the configuration file for the system you are using, such as *sugar.mat*, *davinci.mat*, or *stic.mat*
- Select the imported configuration and click on Start Validation
- All four stages should pass: Find Resources, Distributed Job, Parallel Job, Matlabpool

**Note: Validation will fail on a busy cluster.** If the cluster is busy such that a job submission must wait before it will run, then the validation steps will fail.

If all validation stages succeed, then you are ready to submit jobs with MDCE.
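Step 1 of the configuration can also be scripted from the login node shell. The following is a minimal sketch, not part of the provided cluster scripts: the function name `setup_mdcs_dir` is hypothetical, and the only names it accepts are the three clusters listed in step 1.

```shell
# Hypothetical helper (not shipped with the cluster scripts):
# create the MDCS data directory for one cluster.
setup_mdcs_dir() {
    cluster="$1"
    case "$cluster" in
        sugar|stic|davinci) ;;   # the cluster names listed in step 1
        *)
            echo "unknown cluster: $cluster (expected sugar, stic, or davinci)" >&2
            return 1
            ;;
    esac
    # mkdir -p is safe to repeat: it succeeds if the directory already exists
    mkdir -p "$HOME/MdcsDataLocation/$cluster" &&
        echo "$HOME/MdcsDataLocation/$cluster"
}

# Example use on a login node, followed by steps 2 and 3:
#   setup_mdcs_dir davinci
#   module load matlab/2011a
#   matlab
```

Rejecting unknown names avoids creating a data directory that no imported configuration will ever point to.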

### Submitting Task Parallel Jobs

The following is an example of a Task Parallel job. The task-parallel example code, *frontDemo*, calculates the risk and return based on historical data from a collection of stock prices. The core of the code, *calcFrontier*, minimizes the equations for a set of returns. To parallelize the code, the *for* loop is converted into a *parfor* loop, with each iteration of the loop becoming its own independent task. View the m code here.

To submit the job, copy submitParJobToCluster.m into your working directory, make the necessary modifications for your job environment, and then run the code from within Matlab. This will submit the job. The code can be downloaded from here. An explanation of the code follows:

```
function job = submitParJobToCluster(sz)
% Submit frontDemo as a batch job with a pool of sz workers (default 3).
if nargin==0, sz = 3; end
% Set the walltime to 5 minutes
ClusterInfo.setWallTime('00:05:00'); % change this to the actual walltime that you need.
ClusterInfo.setEmailAddress('YourEmailAddressHere') % include your email address here.
job = batch(@frontDemo,2,{},'Matlabpool',sz,'CaptureDiary',true); % this submits the frontDemo.m job.
% @frontDemo is the function to submit.
% 2 is the number of output arguments
% {} is an empty cell array of input arguments
% sz is the number of pool workers (batch uses one more worker to run the function itself)
% CaptureDiary is set to true
job.wait % the Matlab GUI will pause here until the job finishes
try
    error(job.Task.ErrorMessage) % rethrows the task error; does nothing if the message is empty
    out = job.getAllOutputArguments(); % get all output arguments from the completed job
    r = out{1}; v = out{2};
    plot(r,v) % plot the results
    job.diary
catch ME
    error(ME.message)
end
if nargout==0
    job.destroy
    clear job
end
```

When you run this code within Matlab, the *frontDemo* code will be submitted to the PBS job scheduler. Use the *showq* command from a cluster terminal window to look for your job in the job queue.

**Tip:** The above is only an example used to illustrate how to submit a job and retrieve the results from within the same Matlab session. In most cases using ...

**Info:** The maximum number of workers per job submission is constrained by the queue policy on each cluster, with one worker per processor core. For example, Sugar will not accept more than 8 workers per submission.

**Tip:** For more information on the batch() command and all of its input arguments and how to use the diary, please see Matlab's online help or the Mathworks website.

...

Be sure to destroy your job after the job has finished.

### Submitting Data Parallel Jobs

The data-parallel example code calculates pi as the area under a curve. The non-parallel version, *calcPiSerial*, calculates it with a *for* loop, looping through discrete points. The parallel version, *calcPiSpmd*, uses the *spmd* construct to evaluate a portion of the curve on each MATLAB instance. Each MATLAB instance uses its *labindex* (i.e. rank) to determine which portion of the curve to calculate. The partial results are then globally summed and broadcast back out. The code uses higher-level routines rather than lower-level MPI calls. Once the summation has been calculated, it is indexed into and communicated back to the local client MATLAB to calculate the total area. The example code for calcPiSerial and calcPiSpmd can be downloaded here.

To submit the job, copy submitSpmdJobToCluster.m into your working directory, make the necessary modifications for your job environment, and then run the code from within Matlab. This will submit the job. The code can be downloaded from here. An explanation of the code follows:

```
function job = submitSpmdJobToCluster(sz)
% Submit calcPiSpmd as a batch job with a pool of sz workers (default 3).
if nargin==0, sz = 3; end
% Set the walltime to 5 minutes
ClusterInfo.setWallTime('00:05:00'); % change this to the actual walltime that you need.
ClusterInfo.setEmailAddress('YourEmailAddressHere') % include your email address here.
job = batch(@calcPiSpmd,1,{sz},'Matlabpool',sz); % this will submit the calcPiSpmd function
% @calcPiSpmd is the function to submit.
% 1 is the number of output arguments
% {sz} is a cell array of input arguments
% sz is the number of pool workers (batch uses one more worker to run the function itself)
job.wait % the Matlab GUI will pause here until the job finishes
try
    error(job.Task.ErrorMessage) % rethrows the task error; does nothing if the message is empty
    out = job.getAllOutputArguments(); % get all output arguments from the completed job
    p = out{1}
catch ME
    error(ME.message)
end
if nargout==0
    job.destroy
    clear job
end
```

When you run this code within Matlab, the *calcPiSpmd* code will be submitted to the PBS job scheduler. Use the *showq* command from a cluster terminal window to look for your job in the job queue.

**Tip:** The above is only an example used to illustrate how to submit a job and retrieve the results from within the same Matlab session. In most cases using ...

**Info:** The maximum number of workers per job submission is constrained by the queue policy on each cluster, with one worker per processor core. For example, Sugar will not accept more than 8 workers per submission.

**Tip:** For more information on the batch() command and all of its input arguments and how to use the diary, please see Matlab's online help or the Mathworks website.

### Job Dependencies

### Configuring Cluster Parameters with *ClusterInfo*

...

### Destroying a Job

When you submit a job with *batch*, you will notice that each submission is labeled *Job1*, *Job2*, and so forth. Temporary directories associated with each job can be found in ~/MdcsDataLocation while the jobs are running. When *job.destroy* is called, these temporary directories are deleted. The above examples call *job.destroy*. If you close your Matlab session before executing *job.destroy*, which is likely unless you are using the *job.wait* example, you will need to manually clean up the temporary directories in ~/MdcsDataLocation.
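The manual cleanup can be done from a cluster terminal. The sketch below is one cautious way to do it, under stated assumptions: the function names `list_stale_jobs` and `clean_stale_jobs` are hypothetical, and the leftover entries are assumed to sit directly under the per-cluster data directory and to keep the *JobN* naming described above. By default it only prints what it would delete; run it only when none of your jobs are queued or running (check with *showq* first).

```shell
# List leftover job entries (Job1, Job2, ...) directly under a data directory.
# Assumption: leftovers keep the JobN naming used by batch submissions.
list_stale_jobs() {
    data_dir="$1"
    [ -d "$data_dir" ] || return 0
    find "$data_dir" -maxdepth 1 -name 'Job[0-9]*' | sort
}

# Dry run by default; pass --force as the second argument to actually delete.
clean_stale_jobs() {
    data_dir="$1"
    mode="${2:-}"
    list_stale_jobs "$data_dir" | while IFS= read -r entry; do
        if [ "$mode" = "--force" ]; then
            rm -rf -- "$entry"
            echo "removed $entry"
        else
            echo "would remove $entry"
        fi
    done
}

# Example:
#   clean_stale_jobs ~/MdcsDataLocation/sugar           # preview only
#   clean_stale_jobs ~/MdcsDataLocation/sugar --force   # delete
```

Keeping the dry run as the default makes it harder to delete the working files of a job that is still in the queue.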

## Running Locally on a Desktop

In order to run either the Task Parallel or the Data Parallel code example locally on your desktop, you must first start a MATLAB pool:

```
matlabpool open local 8
```

where 8 is the number of MATLAB processes to attach to the job.

**Info:** The number of MATLAB processes should not be more than nc-1, where nc is the number of cores on the local machine.
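The nc-1 guideline can be worked out from a terminal before opening the pool. This is a small sketch only: the function name `max_local_workers` is hypothetical, and the floor of one worker on a single-core machine is an added assumption, not something the guideline states.

```shell
# Print the suggested maximum number of local workers: nc - 1.
# Assumption: never suggest fewer than 1 worker, even on a single-core machine.
max_local_workers() {
    nc="$1"
    if [ "$nc" -gt 1 ]; then
        echo $((nc - 1))
    else
        echo 1
    fi
}

# Example: feed in the number of online cores reported by the system
# (getconf _NPROCESSORS_ONLN is available on Linux and macOS):
#   max_local_workers "$(getconf _NPROCESSORS_ONLN)"
```

On an eight-core desktop this suggests seven workers, which leaves one core free for the client MATLAB session.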

After running the code, close the MATLAB Pool:

```
matlabpool close
```

...

**PCT** - Parallel Computing Toolbox.

**MATLAB Task** - One segment of a job to be evaluated by a worker.

**MATLAB Job** - The complete large-scale operation to perform in MATLAB, composed of a set of tasks.

**MATLAB Worker** - The MATLAB session that performs the task computations. If a job needs eight processor cores, then it must have eight workers.

**Job** - Job submitted via the SLURM job scheduler (also called SLURM Job).