forked from LLNL/magpie
NEWS
magpie 1.57
-----------
Use --time instead of SBATCH_TIMELIMIT in sbatch scripts
Fix corner case in MAGPIE_TIMELIMIT_MINUTES calculation if job < 1 hour
When resource manager is slurm, calculate time left in job dynamically instead of using fixed MAGPIE_STARTUP_TIME
In Hadoop terasort, check for old directories before removing them
Add various error checks for HDFS over Lustre/NetworkFS, including varying node counts and mounting with different Hadoop versions.
Fix up setup/cleanup/teardown code to fix various corner cases
Clean up setup code into a new script, reducing number of calls to srun, etc.
Fix msub-torque submission scripts to be able to use Tachyon.
Various README/documentation updates/upgrades.
Various other minor fixes
magpie 1.56
-----------
Various minor fixes.
magpie 1.55
-----------
Add basic acl checks for Hadoop.
Set default HDFS umask 077.
Add shared secret to Spark.
magpie 1.54
-----------
Add basic support for Tachyon (ALPHA).
Support HADOOP_MODE of "launch" for convenience.
Support MAGPIE_ENVIRONMENT_VARIABLE_SCRIPT_SHELL option.
Output additional environment variables into MAGPIE_ENVIRONMENT_VARIABLE_SCRIPT.
Update primary Spark to 1.3.0.
magpie 1.53
-----------
Update various path defaults in scripts and submission-scripts.
Remove MAGPIE_USERNAME, just use USER instead.
Use HOME instead of /home/${USER} in various locations.
Documentation updates.
magpie 1.52
-----------
Support MAGPIE_USERNAME environment variable.
Default most paths to use MAGPIE_USERNAME instead of generic 'username'.
No longer require users to set MAGPIE_TIMELIMIT_MINUTES when using Moab.
Make default SPARK_LOCAL_SCRATCH_DIR a Lustre path.
In Spark w/o HDFS submission script, default to setting up network
based scratch space.
magpie 1.51
-----------
Support Storm 0.9.3.
Update primary Hadoop support to 2.6.0.
Update primary Pig to 0.14.0.
Update primary Spark to 1.2.0. Add appropriate patches for support.
Update primary Hbase to 0.98.9.
Add new magpie-apache-download-and-setup.sh convenience script.
Various re-org of script files and script directories.
magpie 1.50
-----------
Support SPARK_DEPLOY_SPREADOUT option.
magpie 1.49
-----------
Support Moab scheduler w/ Torque resource manager through msubtorque submission type.
- See new submission scripts in script-msub-torque
Fix scripts/magpie-gather-config-files-and-logs-script.sh to gather all Spark work stderr/stdout.
magpie 1.48
-----------
Fix build for correct release.
magpie 1.47
-----------
Support HADOOP_SLAVE_CORE_COUNT and SPARK_SLAVE_CORE_COUNT environment
variables.
Support 'hdfsonly' option to HADOOP_MODE for clarity.
Support convenience environment variables HADOOP_NAMENODE & HADOOP_NAMENODE_PORT.
Add additional environment variables into MAGPIE_ENVIRONMENT_VARIABLE_SCRIPT.
Support HDFS federation (experimental)
Support Spark configuration of spark storage memory fraction.
Support Spark configuration of spark shuffle memory fraction.
Default akkathreads = core count in Spark setup.
magpie 1.46
-----------
Support Spark wordcount test.
Various default submission file template cleanup.
- Remove Hadoop job info from Hbase & Spark + HDFS files
- Default MAGPIE_JOB_TYPE is now 'script'
msub-slurm scripts require MAGPIE_TIMELIMIT_MINUTES now, instead of SBATCH_TIMELIMIT.
Remove required configuration of SLURM_JOB_NAME in msub-slurm scripts.
Remove 'intellustre' and 'magpienetworkfs' config options from templates by default.
magpie 1.45
-----------
Support MAGPIE_ENVIRONMENT_VARIABLE_SCRIPT option.
Support HADOOP_NAMENODE_DAEMON_HEAP_MAX option.
magpie 1.44
-----------
Support 'upgradehdfs' HADOOP_MODE to update HDFS as you move to newer
versions.
Support Storm 0.9.2.
magpie 1.43
-----------
Default Spark worker directory is now SPARK_LOCAL_DIR/work.
Make default Spark akka.threads processor count / 2.
Update Magpie to support Spark 1.0.0.
magpie 1.42
-----------
Configure/output info on Spark job application dashboard.
Support configuration of number of slices for SparkPi test.
magpie 1.41
-----------
Fix bug in spark local scratch dir calculation.
magpie 1.40
-----------
Make default storm worker heap 1024.
Make default storm slots 50% of cores.
magpie 1.39
-----------
Support Storm in Magpie (Beta Support)
magpie 1.38
-----------
Rename 'testzookeeper' to 'zookeeperruok'
magpie 1.37
-----------
Support Zookeeper run mode and 'testzookeeper' sanity check job.
Support 'testall' job mode for basic testing/sanity checks.
Various documentation updates.
magpie 1.36
-----------
Re-architect Zookeeper code to be start/stop-able on the master node.
magpie 1.35
-----------
Support SPARK_LOCAL_SCRATCH_DIR_TYPE to indicate if
SPARK_LOCAL_SCRATCH_DIR points to a network drive or local drive.
magpie 1.34
-----------
Support Spark in Magpie
magpie 1.33
-----------
Create new convenience script scripts/magpie-output-config-files-script.sh.
Various minor bug fixes / cleanup fixes.
magpie 1.32
-----------
Support HADOOP_HDFSOVERLUSTRE_REMOVE_LOCKS and
HADOOP_HDFSOVERNETWORKFS_REMOVE_LOCKS, which clean up lock files
in HDFS if necessary.
Various documentation and code organization cleanup.
magpie 1.31
-----------
Support PIG_MODE variable for consistency with Hadoop/Hbase.
Minor bug fixes.
magpie 1.30
-----------
Support sequential vs random Hbase write/read tests.
Support threaded vs mapreduce Hbase write/read tests.
Fix several corner cases.
magpie 1.29
-----------
Support binary bin/magpie-zookeeper.sh, to emulate start/stop scripts
in Hadoop/Hbase.
Support 'interactive' mode in MAGPIE_JOB_TYPE.
Decrease default hbase.server.thread.wakefrequency to 500ms per online
suggestions.
Configure hbase.regionserver.handler.count appropriately depending on
node count.
Support HBASE_PERFORMANCEEVAL_MODE, to allow sequential or random
read/write performance evaluation.
Adjust tasks per node default if Hbase is running.
Minor cleanup of help output.
magpie 1.28
-----------
Support Hbase in Magpie
magpie 1.27
-----------
Fix corner cases with IntelLustre shuffling and UDA shuffling.
Make default slowstart 0.05, matching the Hadoop default.
Require MAGPIE_SHUTDOWN_TIME >= 10 minutes if MAGPIE_POST_JOB_RUN is set
Add check for MAGPIE_STARTUP_TIME and MAGPIE_SHUTDOWN_TIME to be minimum 5 minutes.
Rename scripts/hadoop-rebalance-hdfs-over-lustre-if-increasing-nodes-script.sh to
scripts/hadoop-rebalance-hdfs-over-lustre-or-hdfs-over-networkfs-if-increasing-nodes-script.sh.
Rename scripts/hadoop-hdfs-over-lustre-nodes-decomission-script.sh to
scripts/hadoop-hdfs-over-lustre-or-hdfs-over-networkfs-nodes-decomission-script.sh.
Add hdfs over networkfs support to scripts/hadoop-hdfs-over-lustre-or-hdfs-over-networkfs-nodes-decomission-script.sh.
magpie 1.26
-----------
Default intellustre stripesize now 128M.
magpie 1.25
-----------
Change default memory from 90% to 80% of system memory.
Determine default container size based on size of heap instead of fixed value.
Calculate default io.sort.mb based on mapheapsize instead of fixed value.
Default reducer heap size is now 2X the map heap size.
magpie 1.24
-----------
Turn off speculative execution by default.
Make parallel copies in Hadoop configurable.
Make default tasks per node 1.5X number of cores.
magpie 1.23
-----------
Support new 'hdfsovernetworkfs' filesystem option.
Re-architect internally to be scheduler/resource manager agnostic
- Will allow extensibility in the future.
magpie 1.22
-----------
Support UDA.
Change default memory from 80% to 90% of system memory.
Change default container to be extra 128M instead of extra 256M.
magpie 1.21
-----------
Fix Hadoop 1.2.1 issue w/ hosts-include usage.
magpie 1.20
-----------
Lower MAGPIE_SHUTDOWN_TIME default to 15 minutes.
Update default Hadoop version to 2.2.0.
magpie 1.19
-----------
Support changed configuration settings in yarn-site.xml for Hadoop 2.2.0.
Launch and run job history server when Hadoop setup is enabled.
Various code cleanup.
magpie 1.18
-----------
Add support for running Pig scripts.
magpie 1.17
-----------
Increase default MAGPIE_STARTUP_TIME & MAGPIE_SHUTDOWN_TIME.
Adjust stop timeout in hadoop stop-xxx.sh scripts.
magpie 1.16
-----------
Support new HADOOP_SETUP option, to run Hadoop or not.
Require new local dir variable MAGPIE_LOCAL_DIR.
Support new MAGPIE_JOB_TYPE and MAGPIE_SCRIPT_PATH options
Support new MAGPIE_STARTUP_TIME & MAGPIE_SHUTDOWN_TIME environment variables.
Add new example script hadoop-put-into-hdfs-over-lustre.sh.
Add convenience script zookeeper-ruok-script.sh.
Add convenience script hadoop-hdfs-fsck-cleanup-corrupted-blocks-script.sh.
Do not require ZOOKEEPER_SETUP to be set.
Enable autopurging in zookeeper by default.
Add documentation on using Moab.
Require interactive/setup job length to be a minimum of 30 minutes.
Before shutting down HDFS, execute dfsadmin -saveNamespace to prevent potential corruption on shutdown.
Do not start job until after namenode has exited safe mode.
Do not execute job if pre-run script fails with a non-zero exit code.
Add various additional templates for easier setup.
Add msub files for those who use Slurm via Moab job submissions.
Various re-org and cleanup.
magpie 1.15
-----------
Increase file and process limit default calculation.
Support HADOOP_SETUP_TYPE HDFS1 or HDFS2.
magpie 1.14
-----------
Add hadoop-env.sh output to magpie-example-pre-job-script.
Fix corner case in HADOOP_ENVIRONMENT_EXTRA_PATH setting
Create new HADOOP_MASTER_NODE environment variable.
Add support for Zookeeper.
Update comments/documentation.
magpie 1.13
-----------
Split magpie-run into numerous run files.
Exit on input errors and don't run rest of job.
Rename some scripts/examples as needed.
Rename HADOOP_REMOTE_CMD to MAGPIE_REMOTE_CMD
Support MAGPIE_REMOTE_CMD_OPTS environment variable.
magpie 1.12
-----------
Clarify text and rename many files/environment files.
Add sbatch --output to make sbatch file.
magpie 1.11
-----------
Support magpie network fs scheme.
magpie 1.10
-----------
Support Intel Lustre scheme in IDH.
magpie 1.9
----------
Minor corner case fixes.
magpie 1.8
----------
General code cleanup/code-reorg and documentation fixes.
magpie 1.7
----------
Support HADOOP_RAWNETWORKFS_BLOCKSIZE option.
magpie 1.6
----------
Support inputting multiple paths for HDFS and local store fs.
Code re-org, move common exports to new file hadoop-common-exports.
Rename HADOOP_BUILD_HOME to HADOOP_HOME for legacy purposes.
magpie 1.5
----------
Increase HDFS bandwidthPerSec to 4G, which is ~QDR ipoib speed.
magpie 1.4
----------
Make default HDFS over Lustre replication 3
Document trade-offs of replication.
Increase HDFS bandwidthPerSec.
Fix hadoop-hdfs-over-lustre-nodes-decomission-script.sh path bug.
Fix hadoop-create-files-script.sh to work w/ Hadoop 1.0 & 2.0.
Add convenience script hadoop-remove-all-files-script.sh.
magpie 1.3
----------
Remove slurm job name from hdfs over lustre created path. Use path
given directly by user.
magpie 1.2
----------
Move hadoop-gather into generic post run script
hadoop-gather-config-files-and-logs-script.sh.
Fix environment variable export in post scripts
General documentation updates.
magpie 1.1
----------
Add ability to rebalance HDFS data on HDFS over Lustre if user changes
node count.
Add convenience scripts
hadoop-create-files-script.sh,
hadoop-list-files-script.sh,
hadoop-rebalance-hdfs-over-lustre-if-increasing-nodes-script.sh, and
hadoop-hdfs-over-lustre-nodes-decomission-script.sh.
Export new environment variable HADOOP_SLAVE_COUNT.
Various code cleanup, file organization, and documentation updates
magpie 1.0
----------
Initial release