Introduction
There are several ways to submit Matlab jobs to a cluster. This document covers the ways to run Matlab compute jobs on the Shared Research Compute clusters, including how to use the Parallel Computing Toolbox (PCT) and the Matlab Distributed Compute Engine (MDCE) to submit many independent tasks or a single task that has parallel components. Examples are included.
Definitions
Task Parallel - Multiple independent iterations within your workflow.
Data Parallel - Single program working on a problem across multiple processors.
MDCE - Matlab Distributed Compute Engine. This is a component of Matlab that allows our clusters to run Matlab jobs that exceed the size of a single compute node (multinode parallel jobs). It also allows jobs to run even if there are not enough licenses available for a particular toolbox, so long as the university owns at least one license for that toolbox.
Matlab Task - An independent Matlab calculation.
Matlab Job - Submission from within the Matlab GUI that can contain one or more tasks.
Matlab Worker - Analogous to a processor core assigned to a job. If a job needs 8 processor cores, then it must have 8 Matlab workers.
Job - Job submitted via the PBS job scheduler (also called PBS Job).
Interactive Jobs
TBA take from old FAQ
Running Jobs with PBS qsub
TBA. Good for single-processor jobs that do not encounter any toolbox license issues. Take from old FAQ.
Using MDCE for Task Parallel and Data Parallel Jobs
Using MDCE allows you to submit multiple tasks with a single job submission (Task Parallel) or to submit a single task that is a multiprocessor (and possibly multinode) job. To run this type of job, you must first configure Matlab for this type of job submission by following these steps:
Configuring Matlab
1. In your home directory create the MdcsDataLocation/ClusterName subdirectory.
where ClusterName is one of sugar, stic, or davinci.
2. Load the Matlab 2011a environment:
3. Run Matlab on the login node:
4. In Matlab, add the scripts folder to your Matlab path so that Matlab can find the scripts necessary to submit and schedule jobs.
- Click on File and then Set Path
- Click the Add Folder button
- Specify the following folder:
/opt/apps/matlab/2011a-scripts
5. Import the cluster configuration for the system you are using (for example, sugar)
- Click on Parallel and then Manage Configurations
- Click on File and then Import
- Navigate to /opt/apps/matlab/2011a-scripts and select the configuration for the system you are using, such as sugar.mat, davinci.mat, stic.mat, and so forth.
- Select the configuration for the system you are using and click on Start Validation
- All four stages should pass: Find Resources, Distributed Job, Parallel Job, Matlabpool
If all validation stages succeed, then you are ready to submit jobs with MDCE.
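Steps 1 and 4 above can also be carried out from the Matlab command window instead of a terminal and the Set Path dialog. The following is a minimal sketch, assuming the cluster is sugar and that your home directory is available through the HOME environment variable; the variable names clusterName and dataDir are illustrative only. It does not import or validate the cluster configuration, which is still done through the Parallel menu as described in step 5.

```matlab
% Sketch of steps 1 and 4 from the Matlab command window.
% Assumption: clusterName is one of sugar, stic, davinci.
clusterName = 'sugar';

% Step 1: create ~/MdcsDataLocation/<ClusterName> if it does not exist yet.
dataDir = fullfile(getenv('HOME'), 'MdcsDataLocation', clusterName);
if ~exist(dataDir, 'dir')
    mkdir(dataDir);   % also creates MdcsDataLocation if it is missing
end

% Step 4: add the submission scripts folder to the Matlab path
% (for the current session only).
addpath('/opt/apps/matlab/2011a-scripts');
```

The addpath call affects only the current Matlab session; setting the path through File and then Set Path, as in step 4, lets you save it for later sessions.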
Submitting Task Parallel Jobs
The following is an example of a Task Parallel job. The task-parallel example code, frontDemo, calculates the risk and return based on historical data from a collection of stock prices. The core of the code, calcFrontier, minimizes the equations for a set of returns. In order to parallelize the code, the for loop is converted into a parfor loop with each iteration of the loop becoming its own independent task. View the m code here.
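The for-to-parfor conversion described above can be illustrated with a small stand-in loop. This is a sketch only and is not the frontDemo code: the variable names targetReturns and risk are hypothetical, and the square root is a placeholder for the real per-iteration work (such as a call to calcFrontier). The point it shows is that each iteration reads and writes only its own slice of the data, which is what allows the iterations to run as independent tasks.

```matlab
% Hypothetical stand-in for the frontDemo loop (not the actual example code).
targetReturns = linspace(0.01, 0.10, 20);   % illustrative inputs
risk = zeros(size(targetReturns));          % preallocate the output

% Serial form would be:  for k = 1:numel(targetReturns)
% Parallel form: iterations must not depend on one another.
parfor k = 1:numel(targetReturns)
    % Placeholder computation standing in for the real work per iteration.
    risk(k) = sqrt(targetReturns(k));
end
```

If no pool of workers is open, Matlab runs the parfor loop serially in the client session, so the same code can be checked on a desktop before it is submitted to a cluster.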
To submit the job, copy submitParJobToCluster.m into your working directory, make the necessary modifications for your job environment, and then run the code from within Matlab. The code can be downloaded from here. An explanation of the code follows:
When you run this code within Matlab, the frontDemo code will be submitted to the PBS job scheduler. Use the showq command from a cluster terminal window to look for your job in the job queue.
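As a rough illustration of what a submission from within Matlab can look like, the sketch below uses the Parallel Computing Toolbox batch function with the R2011a-era 'Configuration' and 'Matlabpool' options. It is not the contents of submitParJobToCluster.m, which may set additional site-specific parameters. It assumes that the sugar configuration has been imported and validated, and that frontDemo is a script in the current folder; the pool size of 7 is an arbitrary example.

```matlab
% Sketch only: not the actual submitParJobToCluster.m.
% Assumptions: the 'sugar' configuration is imported and validated,
% and frontDemo.m is a script in the current folder.
job = batch('frontDemo', ...
            'Configuration', 'sugar', ...  % imported cluster configuration
            'Matlabpool', 7);              % 7 pool workers plus 1 worker running the script

wait(job);      % block until the job has finished
load(job);      % copy the script's workspace variables into the client session
destroy(job);   % remove the job and its data once the results are loaded
```

While the job is queued or running, its state is also visible from a cluster terminal window through showq.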
Submitting Data Parallel Jobs
The data-parallel example code approximates pi by calculating the area under a curve. The non-parallel version, calcPiSerial, calculates with a for loop, looping through discrete points. The parallel version, calcPiSpmd, uses the spmd construct to evaluate a portion of the curve on each MATLAB instance. Each MATLAB instance uses its labindex (i.e. rank) to determine which portion of the curve to calculate. The partial results are then globally summed and broadcast back out. The code uses higher-level routines rather than lower-level MPI calls. Once the summation has been calculated, it is indexed into and communicated back to the local client MATLAB to calculate the total area. The example code for calcPiSerial and calcPiSpmd can be downloaded here.
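The spmd pattern described above can be sketched as follows. This is not the downloadable calcPiSpmd code. It assumes that the curve is 4/(1+x^2) on the interval from 0 to 1 (whose area is pi), that the midpoint rule is used, and that a MATLAB pool is already open; the names n, idx, localArea, totalArea and piApprox are illustrative.

```matlab
% Sketch of a data-parallel pi calculation (not the actual calcPiSpmd).
% Assumed curve: 4/(1+x^2) on [0,1], integrated with the midpoint rule.
n = 1e6;                            % number of intervals (illustrative)

spmd
    % Each lab uses its labindex to pick its own subset of the intervals.
    idx = labindex:numlabs:n;
    x = (idx - 0.5) / n;            % midpoints of this lab's intervals
    localArea = sum(4 ./ (1 + x.^2)) / n;

    % Global sum across all labs; every lab receives the total.
    totalArea = gplus(localArea);
end

% On the client, totalArea is a Composite; index into it to fetch one copy.
piApprox = totalArea{1};
```

Here gplus plays the role of the higher-level routine that replaces an explicit MPI reduction and broadcast, and the final indexing step corresponds to communicating the result back to the local client MATLAB.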
To submit the job, copy submitSpmdJobToCluster.m into your working directory, make the necessary modifications for your job environment, and then run the code from within Matlab. The code can be downloaded from here. An explanation of the code follows:
When you run this code within Matlab, the calcPiSpmd code will be submitted to the PBS job scheduler. Use the showq command from a cluster terminal window to look for your job in the job queue.
Job Dependencies
Configuring Cluster Parameters with ClusterInfo
Destroying a Job
Running Locally on a Desktop
To run either the Task Parallel code example or the Data Parallel code example locally on your desktop, you must first start up a MATLAB Pool, as such:
where 8 is the number of MATLAB processes to attach to the job.
After running the code, close the MATLAB Pool:
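Put together, a local desktop session might look like the sketch below. It assumes the R2011a-era matlabpool command with the default local configuration (later MATLAB releases replaced matlabpool with parpool), and it assumes that frontDemo is on the MATLAB path.

```matlab
% Sketch of a local run, assuming R2011a-era matlabpool syntax.
matlabpool open 8      % start 8 local MATLAB workers

frontDemo              % run the example code (assumed to be on the path)

matlabpool close       % shut the workers down when finished
```

The number of workers that can be started locally depends on the MATLAB release and on the cores available on the desktop, so a smaller value than 8 may be needed.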