<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta http-equiv="Content-Style-Type" content="text/css" />
<meta name="generator" content="pandoc" />
<meta name="author" content="August 15, 2018" />
<title>Parallel MATLAB workshop: Savio-specific notes</title>
<style type="text/css">code{white-space: pre;}</style>
</head>
<body>
<div id="header">
<h1 class="title">Parallel MATLAB workshop: Savio-specific notes</h1>
<h2 class="author">August 15, 2018</h2>
<h3 class="date">Chris Paciorek</h3>
</div>
<h1 id="outline">Outline</h1>
<p>Berkeley Research IT helps researchers through its Berkeley Research Computing (BRC) and Research Data Management (RDM) programs.</p>
<p>This document contains information on:</p>
<ul>
<li>Savio campus cluster
<ul>
<li>Getting access to the system - faculty computing allowance, condo</li>
<li>Savio computing nodes</li>
<li>Cluster job submission/scheduling</li>
<li>MATLAB usage on Savio</li>
</ul></li>
<li>Disk storage at Berkeley
<ul>
<li>Where can I put my stuff?</li>
<li>Data transfer</li>
</ul></li>
<li>How to get help</li>
</ul>
<h1 id="system-capabilities-and-hardware">System capabilities and hardware</h1>
<p>Berkeley Research Computing runs the campus cluster, Savio.</p>
<ul>
<li>Savio is a >380-node, >8000-core, >169000-GPU-core Linux cluster rated at >350 peak teraFLOPS.
<ul>
<li>about 174 compute nodes provided by UC Berkeley for general access</li>
<li>about 211 compute nodes contributed by researchers in the Condo program</li>
</ul></li>
</ul>
<h1 id="getting-access-to-the-system---fca-and-condo">Getting access to the system - FCA and condo</h1>
<ul>
<li>All regular Berkeley faculty can request 300,000 service units (roughly core-hours) per year through the <a href="http://research-it.berkeley.edu/services/high-performance-computing/faculty-computing-allowance">Faculty Computing Allowance (FCA)</a></li>
<li>Researchers can also purchase nodes for their own priority access and gain access to the shared Savio infrastructure and to the ability to <em>burst</em> to additional nodes through the <a href="http://research-it.berkeley.edu/services/high-performance-computing/condo-cluster-program">condo cluster program</a></li>
<li>Instructors can request an <a href="http://research-it.berkeley.edu/programs/berkeley-research-computing/instructional-computing-allowance">Instructional Computing Allowance (ICA)</a>.</li>
</ul>
<p>Faculty/principal investigators can allow researchers working with them to get user accounts with access to the FCA or condo resources available to the faculty member.</p>
<h1 id="savio-computing-nodes">Savio computing nodes</h1>
<p><a href="https://research-it.berkeley.edu/services/high-performance-computing">Savio</a> provides access to the following types of computational resources:</p>
<ul>
<li>full 20-, 24-, or 28-core nodes (scheduled per node)</li>
<li>'htc' nodes (scheduled per core)</li>
<li>GPU nodes (scheduled per GPU)</li>
<li>big-memory nodes (scheduled per node)</li>
<li>Jupyter notebooks</li>
<li>a visualization/remote desktop node</li>
</ul>
<p>Let's take a look at the hardware specifications of the computing nodes on the cluster <a href="https://research-it.berkeley.edu/services/high-performance-computing/user-guide/savio-user-guide">(see the <em>Hardware Configuration</em> section of the Savio user guide)</a>.</p>
<p>The nodes are divided into several pools, called partitions. These partitions have different restrictions and costs associated with them <a href="https://research-it.berkeley.edu/services/high-performance-computing/user-guide/savio-user-guide">(see the <em>Scheduler Configuration</em> section of the Savio user guide)</a>. Any job you submit must be submitted to a partition to which you have access.</p>
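<p>You can check which accounts and partitions you can submit to from a Savio login node; these are standard SLURM commands:</p>
<pre><code>sacctmgr -p show associations user=$USER  # accounts and partitions you can use
sinfo                                     # list partitions and node availability</code></pre>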
<h1 id="submitting-jobs-accounts-and-partitions">Submitting jobs: accounts and partitions</h1>
<p>All computations are done by submitting jobs to SLURM, the scheduling software that manages work on the cluster.</p>
<ul>
<li>interactive jobs (via srun)</li>
<li>interactive jobs (with faster visualization capabilities via <a href="http://research-it.berkeley.edu/services/high-performance-computing/using-brc-visualization-node-realvnc">Savio viz node</a>)</li>
<li>batch/background jobs (via sbatch)</li>
</ul>
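<p>For example, a minimal interactive job on a single savio2 node might look like the following (the account name is a placeholder; substitute your own):</p>
<pre><code>srun -A fc_paciorek -p savio2 -N 1 -t 00:30:00 --pty bash -i
# once the shell starts on the compute node:
module load matlab
matlab</code></pre>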
<p>Here's an example job script for a batch job. You'll need to modify the various <code>#SBATCH</code> flags for your own work.</p>
<pre><code>#!/bin/bash
# Job name:
#SBATCH --job-name=test
#
# Account:
#SBATCH --account=fc_paciorek
#
# Partition:
#SBATCH --partition=savio2
#
# Number of tasks
#SBATCH --ntasks=1
#
# Number of cores per task
#SBATCH --cpus-per-task=24
#
# Wall clock limit (30 minutes here):
#SBATCH --time=00:30:00
#
## Command(s) to run (-nodisplay keeps MATLAB from opening a GUI in batch mode):
module load matlab
matlab -nodisplay < run.m > output.txt</code></pre>
<p>Note that you'll be charged for a full node (except in the savio2_htc and savio2_gpu partitions), so ideally your MATLAB code should make use of all 24 cores of a Savio2 node in this case.</p>
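<p>To submit and monitor the job, assuming the script above is saved as <code>job.sh</code> (the filename is just for illustration):</p>
<pre><code>sbatch job.sh    # submit the batch job; SLURM prints the assigned job ID
squeue -u $USER  # check the status of your queued and running jobs</code></pre>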
<h1 id="using-matlab-on-savio">Using MATLAB on Savio</h1>
<p>You need to show that you have a MATLAB license in order to use MATLAB on Savio.</p>
<p>For more details on showing that you have a license and on using MATLAB on Savio, see <a href="http://research-it.berkeley.edu/services/high-performance-computing/using-matlab-savio">here</a>.</p>
<p>One key difference from using MATLAB on your laptop is that you need to make sure that the number of cores you request from SLURM in your job script aligns with the number of cores that MATLAB will use.</p>
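<p>As a minimal sketch of that alignment (the script name <code>mycode</code> is hypothetical), you can size MATLAB's parallel pool from the environment variable SLURM sets based on <code>--cpus-per-task</code>, rather than hard-coding a worker count:</p>
<pre><code># SLURM_CPUS_PER_TASK is set by the scheduler from the --cpus-per-task flag
module load matlab
matlab -nodisplay -r "parpool('local', str2num(getenv('SLURM_CPUS_PER_TASK'))); mycode; exit"</code></pre>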
<h1 id="using-matlab-dcs-on-savio">Using MATLAB DCS on Savio</h1>
<p>MATLAB DCS allows one to use computational resources across multiple nodes. You can use up to 32 workers (based on our current DCS license) and one or more cores per worker. (Note that the 32 workers are shared across all Savio users at any given time.)</p>
<p>Details are <a href="http://research-it.berkeley.edu/services/high-performance-computing/using-matlab-savio/running-matlab-jobs-across-multiple-savio">here</a>.</p>
<p>Key items to remember:</p>
<ul>
<li>run <code>configCluster</code> in MATLAB once in your account</li>
<li>request as many MATLAB licenses as MATLAB workers you will use, e.g.,
<ul>
<li><code>#SBATCH --licenses=mdcs:28</code></li>
</ul></li>
<li>in each MATLAB DCS cluster job, do:
<ul>
<li><code>module load matlab</code></li>
<li><code>export MDCE_OVERRIDE_EXTERNAL_HOSTNAME=$(/bin/hostname -f)</code></li>
</ul></li>
<li>use the "savio" MATLAB parallel cluster profile</li>
</ul>
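<p>Putting these items together, here is a minimal sketch of a DCS job script; the account, node count, and worker/license count are placeholders to adjust for your own allocation, and <code>run_dcs.m</code> is a hypothetical script that opens a pool with the "savio" profile:</p>
<pre><code>#!/bin/bash
#SBATCH --job-name=dcs-test
#SBATCH --account=fc_paciorek   # placeholder: your own account
#SBATCH --partition=savio2
#SBATCH --nodes=2               # DCS jobs can span multiple nodes
#SBATCH --time=00:30:00
#SBATCH --licenses=mdcs:28      # one mdcs license per MATLAB worker

module load matlab
export MDCE_OVERRIDE_EXTERNAL_HOSTNAME=$(/bin/hostname -f)
matlab -nodisplay < run_dcs.m > output_dcs.txt</code></pre>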
<p>Please see the <a href="matlab_sbatch_template.sh">example sbatch submission script</a> and <a href="example_parpool.m">example MATLAB parallel script</a>.</p>
<h1 id="disk-space-options-on-savio-and-on-campus-broadly">Disk space options on Savio and on campus broadly</h1>
<p>Here are some options for moderate-to-large disk storage:</p>
<ul>
<li>Savio project storage: 200 GB, backed up</li>
<li>Savio scratch: 1.5 PB shared across all users, not backed up, subject to removal</li>
<li>Savio condo (purchase) storage: roughly $6000 per 42 TB</li>
<li>Berkeley Box: unlimited, 15 GB file size limit</li>
<li>bDrive (Berkeley Google drive): unlimited</li>
</ul>
<p>More details on Savio storage are <a href="https://research-it.berkeley.edu/services/high-performance-computing/user-guide/savio-user-guide">here, in the <em>Storage and Backup</em> section</a>.</p>
<h1 id="data-transfer-for-large-data">Data transfer for large data</h1>
<p>Some options include:</p>
<ul>
<li><a href="https://research-it.berkeley.edu/services/high-performance-computing/using-globus-connect-savio">Globus</a> (to/from Savio, laptop, XSEDE)</li>
<li><a href="https://research-it.berkeley.edu/services/research-data-management-service/take-advantage-unlimited-bdrive-storage-using-rclone">rclone</a> (to/from bDrive and Berkeley Box)</li>
</ul>
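<p>As a sketch, once you've set up an rclone remote for bDrive (here named <code>bdrive</code>, a name you choose when running <code>rclone config</code>), transfers look like:</p>
<pre><code>rclone copy ~/results bdrive:results  # copy a local directory up to bDrive
rclone ls bdrive:results              # list the files that were transferred</code></pre>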
<h1 id="how-to-get-additional-help">How to get additional help</h1>
<ul>
<li>For technical issues and questions about using Savio:
<ul>
<li>brc-hpc-help@berkeley.edu</li>
</ul></li>
<li>For questions about computing resources in general, including cloud computing:
<ul>
<li>brc@berkeley.edu or research-it-consulting@berkeley.edu</li>
</ul></li>
<li>For questions about data management (including HIPAA-protected data):
<ul>
<li>researchdata@berkeley.edu or research-it-consulting@berkeley.edu</li>
</ul></li>
<li>Office hours for any of the above topics:
<ul>
<li>Tues. 10-12, Wed. 1:30-3, Thur. 9:30-11:30 in AIS (Dwinelle 117)</li>
</ul></li>
</ul>
<p>Don't hesitate to contact us, even with basic questions; we're friendly.</p>
<h1 id="upcoming-events">Upcoming events</h1>
<ul>
<li>Savio intro workshop, September 17</li>
<li>Other trainings planned for the fall</li>
</ul>
</body>
</html>