Skip to content

Commit

Permalink
Merge pull request #55 from HealthyPear/benchmarking-add_missing_note…
Browse files Browse the repository at this point in the history
…books

Add benchmark notebooks for medium and late stages
  • Loading branch information
HealthyPear authored Jun 15, 2020
2 parents 3b8088f + 7547b1a commit 0704409
Show file tree
Hide file tree
Showing 52 changed files with 3,001 additions and 5 deletions.
4 changes: 3 additions & 1 deletion .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -81,7 +81,9 @@ distribute-*.tar.gz
*.h5
*.hdf5
*.fits
*.fits.gz
irf.fits.gz
table_best_cutoff.fits
*.pkl.gz

*.pstats
target
Expand Down
292 changes: 292 additions & 0 deletions benchmarks/DL2/benchmarks_DL2_energy-estimation.ipynb
Original file line number Diff line number Diff line change
@@ -0,0 +1,292 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Energy estimation"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**WARNING**\n",
"\n",
"This is still a work-in-progress, it will evolve with the pipeline comparisons and converge with ctaplot+cta-benchmarks.\n",
"\n",
"Part of this notebook is performed by `protopipe.scripts.model_diagnostics`"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Author(s):**\n",
" \n",
"- Dr. Michele Peresano (CEA-Saclay/IRFU/DAp/LEPCHE), 2020\n",
"\n",
"based on previous work by J. Lefacheur.\n",
"\n",
"**Description:**\n",
"\n",
"This notebook contains benchmarks for the _protopipe_ pipeline regarding energy estimation.\n",
"Additional information is provided by protopipe.scripts.model_diagnostics, which will eventually merge here.\n",
"\n",
"Note that:\n",
" - a more general set of benchmarks is being defined in cta-benchmarks/ctaplot,\n",
" - follow [this](https://www.overleaf.com/16933164ghbhvjtchknf) document by adding new benchmarks or proposing new ones.\n",
"\n",
"**Requirements:**\n",
"\n",
"To run this notebook you will need a set of trained data produced on the grid with protopipe.\n",
"The MC production to be used and the appropriate set of files to use for this notebook can be found [here](https://forge.in2p3.fr/projects/step-by-step-reference-mars-analysis/wiki#The-MC-sample ).\n",
"\n",
"The data format required to run the notebook is the current one used by _protopipe_ .\n",
"Later on it will be the same as in _ctapipe_ + _pyirf_.\n",
"\n",
"**Development and testing:** \n",
"\n",
"For the moment this notebook is optimized to work only on files produced from LSTCam + NectarCam telescope configurations. \n",
"As with any other part of _protopipe_ and being part of the official repository, this notebook can be further developed by any interested contributor. \n",
"The execution of this notebook is not currently automatic, it must be done locally by the user - preferably _before_ pushing a pull-request. \n",
"**IMPORTANT:** Please, if you wish to contribute to this notebook, before pushing anything to your branch (better even before opening the PR) clear all the output and remove any local directory paths that you used for testing (leave empty strings).\n",
"\n",
"**TODO:** \n",
"* update everything...\n",
"* merge model diagnostics products\n",
"* add remaining benchmarks from CTA-MARS comparison\n",
"* same for EventDisplay"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Imports"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%matplotlib inline\n",
"import matplotlib.pyplot as plt\n",
"import matplotlib.colors as colors\n",
"from matplotlib.colors import LogNorm, PowerNorm\n",
"count = 0\n",
"cmap = dict()\n",
"for key in colors.cnames:\n",
" if 'dark' in key:\n",
" #if key in key:\n",
" cmap[count] = key\n",
" count = count + 1\n",
"#cmap = {'black': 0, 'red': 1, 'blue': 2, 'green': 3}\n",
"cmap = {0: 'black', 1: 'red', 2: 'blue', 3: 'green'}\n",
"import os\n",
"import numpy as np\n",
"import pandas as pd"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Functions"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def plot_profile(ax, data, xcol, ycol, n_xbin, x_range, logx=False, **kwargs):\n",
" color = kwargs.get('color', 'red')\n",
" label = kwargs.get('label', '')\n",
" fill = kwargs.get('fill', False)\n",
" alpha = kwargs.get('alpha', 1)\n",
" xlabel = kwargs.get('xlabel', '')\n",
" ylabel = kwargs.get('ylabel', '')\n",
" xlim = kwargs.get('xlim', None)\n",
" ms = kwargs.get('ms', 8)\n",
" \n",
" if logx is False:\n",
" bin_edges = np.linspace(x_range[0], x_range[-1], n_xbin, True)\n",
" bin_center = 0.5 * (bin_edges[1:] + bin_edges[:-1])\n",
" bin_width = bin_edges[1:] - bin_edges[:-1]\n",
" else:\n",
" bin_edges = np.logspace(np.log10(x_range[0]), np.log10(x_range[-1]), n_xbin, True)\n",
" bin_center = np.sqrt(bin_edges[1:] * bin_edges[:-1])\n",
" bin_width = bin_edges[1:] - bin_edges[:-1]\n",
" \n",
" y = []\n",
" yerr = []\n",
" for idx in range(len(bin_center)):\n",
" counts = data[ (data[xcol] > bin_edges[idx]) & (data[xcol] <= bin_edges[idx+1]) ][ycol]\n",
" y.append(counts.mean())\n",
" yerr.append(counts.std() / np.sqrt(len(counts)))\n",
" \n",
" ax.errorbar(x=bin_center, y=y, xerr=bin_width / 2., yerr=yerr, label=label, fmt='o', color=color, ms=ms)\n",
" ax.set_xlabel(xlabel)\n",
" ax.set_ylabel(ylabel)\n",
" if logx is True:\n",
" ax.set_xscale('log')\n",
" ax.legend(loc='upper right', framealpha=1, fontsize='medium')\n",
" #ax.grid(which='both')\n",
" return ax"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Load"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"First we check if a _plots_ folder exists already. \n",
"If not, we create it."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"Path(\"./plots_energy_estimation\").mkdir(parents=True, exist_ok=True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Setup for data loading"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"mode = 'tail'\n",
"\n",
"parent = '' # your analysis parent folder (better use an absolute path)\n",
"config = 'LaPalma_fullArray_zen20_az0' # analysis descriptor to be used as suffix in the plots\n",
"\n",
"# Load data\n",
"data_dir = f'{parent}/data/DL1/for_energy_estimation'\n",
"myfile = 'dl1_{}_gamma_merged.h5'.format(mode)\n",
"\n",
"data_image = pd.read_hdf(os.path.join(data_dir,myfile), key='feature_events_LSTCam')\n",
"print('#Images={}'.format(len(data_image)))\n",
"data_image['log10_charge'] = np.log10(data_image['sum_signal_cam'])"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"data_image.columns"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Benchmarks"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Charge profile"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"tel_ids = [1, 2, 3, 4] # WARNING! These are only the LSTs!\n",
"n_feature = len(tel_ids)\n",
"nrows = int(n_feature / 2) if n_feature % 2 == 0 else int((n_feature + 1) / 2)\n",
"\n",
"emin = 0.03\n",
"emax = 10\n",
"nbin = 4\n",
"energy_range = np.logspace(np.log10(emin), np.log10(emax), nbin + 1, True)\n",
"\n",
"#plt.figure(figsize=(10,10))\n",
"#fig, axes = plt.subplots(nrows=nrows, ncols=2, figsize=(10, 10))\n",
"#axes = axes.flatten()\n",
"\n",
"plt.figure(figsize=(5,5))\n",
"ax = plt.gca()\n",
"for jdx in range(0, len(energy_range) - 1):\n",
" \n",
" data_sel = data_image[data_image['N_LST'] >= 2]\n",
" data_sel = data_sel[(data_sel['mc_energy'] >= energy_range[jdx]) & \n",
" (data_sel['mc_energy'] < energy_range[jdx + 1])]\n",
" \n",
" xbins = 10 + 1\n",
" xrange = [10, 2000]\n",
" opt = {'xlabel': 'Impact parameter [m]', 'ylabel': 'Charge [p.e.]', 'color': cmap[jdx],\n",
" 'label': 'E [{:.2f},{:.2f}] TeV'.format(energy_range[jdx], energy_range[jdx+1]),\n",
" 'ms': 6}\n",
" plot_profile(ax, data=data_sel,\n",
" xcol='impact_dist', ycol='sum_signal_cam',\n",
" n_xbin=xbins, x_range=xrange, logx=True, **opt)\n",
" #ax.grid(which='both')\n",
" ax.set_yscale('log')\n",
" ax.set_yscale('log')\n",
" ax.set_ylim([100, 2. * 100000.])\n",
" ax.set_xlim([10, 2000])\n",
"\n",
"ax.grid(which='both')\n",
"plt.tight_layout()\n",
"\n",
"fig.savefig(f\"./plots_energy_estimation/charge_profile_{config}.png\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.7"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
Loading

0 comments on commit 0704409

Please sign in to comment.