sara-nl/energy_efficiency_job_level

This repository focuses on benchmarking jobs and classifying them into relevant classes. The basic idea is to feed in the traces collected from Prometheus after a job finishes and label the job as memory intensive, half memory intensive, CPU intensive, or GPU intensive.
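
As an illustration of the idea only (not the classifier actually used in these notebooks), a job's averaged utilisation signals could be mapped to one of the four classes with simple thresholds; the feature names and threshold values below are made up for the example.

    # Illustrative sketch only: thresholds and feature names are assumptions,
    # not the classifier implemented in this repository.
    def label_job(avg_cpu_util, avg_mem_util, avg_gpu_util):
        """Map per-job average utilisation (0-1) to one of the four classes."""
        if avg_gpu_util > 0.5:
            return "GPU intensive"
        if avg_mem_util > 0.7:
            return "memory intensive"
        if avg_mem_util > 0.4:
            return "half memory intensive"
        return "CPU intensive"

    print(label_job(avg_cpu_util=0.9, avg_mem_util=0.2, avg_gpu_util=0.0))  # CPU intensive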

Notebooks

Exploratory Data Analysis (EDA) and Machine Learning (ML)

  • The EDA and ML notebooks use the data included in this repository, so they can be run immediately.

Deep Learning Approach

  • The deep learning repository extends this work by assuming that job traces remain almost constant.

  • Under this assumption, we achieved 65% accuracy.

  • An alternative approach treats each job trace as a matrix with dimensions:

    timesteps × number of signals

    This method may yield improved results. One challenge with this idea is that jobs do not all have the same length, so each data matrix can have a different number of rows. In the initial version, we took the longest job and zero-padded the rows of all other jobs so that every matrix has the same length (see the sketch below).
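
A minimal sketch of that zero-padding step, assuming each job trace is already a NumPy array of shape (timesteps, number of signals); the function name is ours, not taken from the notebooks.

    import numpy as np

    def pad_traces(traces):
        """Zero-pad a list of (timesteps, n_signals) arrays to the longest job."""
        max_len = max(t.shape[0] for t in traces)
        n_signals = traces[0].shape[1]
        padded = np.zeros((len(traces), max_len, n_signals), dtype=traces[0].dtype)
        for i, t in enumerate(traces):
            padded[i, : t.shape[0], :] = t  # copy the real rows, leave the rest zero
        return padded

    # Example: two jobs with 5 and 3 timesteps, 4 signals each
    jobs = [np.random.rand(5, 4), np.random.rand(3, 4)]
    batch = pad_traces(jobs)
    print(batch.shape)  # (2, 5, 4)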

Cleaning Benchmark Job Data

  • The cleaning_benchmark_job_data notebook processes JSON files typically generated by the HPC team.
  • It queries SLURM to retrieve the exact node and runtime of each job.
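
A sketch of how such a SLURM lookup could be done with sacct; the fields shown are standard sacct columns, but the exact query and parsing in the notebook may differ.

    import subprocess

    def job_node_and_runtime(job_id):
        """Query SLURM accounting for the node list and runtime of a job."""
        out = subprocess.run(
            ["sacct", "-j", str(job_id),
             "--format=JobID,NodeList,Start,End,Elapsed",
             "--noheader", "--parsable2"],
            capture_output=True, text=True, check=True,
        ).stdout
        # The first line describes the job allocation; later lines are job steps.
        _, nodelist, start, end, elapsed = out.splitlines()[0].split("|")
        return {"nodes": nodelist, "start": start, "end": end, "elapsed": elapsed}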

Prometheus Benchmark Job Extraction

  • The prom_benchmark_job_extraction notebook assumes that traces for all nodes have already been gathered and stored in Parquet format.
  • This notebook identifies the node and time for each job and extracts the corresponding signals or traces.
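
A sketch of that extraction step, assuming the stored Parquet traces carry a node column and a timestamp column; the real column names and file layout may differ.

    import pandas as pd

    def extract_job_trace(parquet_path, node, start, end):
        """Slice the per-node Prometheus traces for one job's node and time window."""
        df = pd.read_parquet(parquet_path)
        mask = (
            (df["node"] == node)
            & (df["timestamp"] >= pd.Timestamp(start))
            & (df["timestamp"] <= pd.Timestamp(end))
        )
        return df.loc[mask].sort_values("timestamp")

    # trace = extract_job_trace("traces.parquet", "node-001",
    #                           "2023-05-01 10:00", "2023-05-01 12:00")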
