Committing first version of the XMCTSGuard python package
# XMCTS data cleaning

This repository contains a project to resolve faults in the data acquisition of the XMCTS diagnostic of W7-X.

## Getting started

First of all, you should create a Python environment in which to install all of the needed dependencies. The recommended way is to use venv; one can create a venv by executing the following command:

`python -m venv path\to\venv\env-name`

Once the virtual environment has been created, one can build the package by moving into the project's main folder and executing the following command:

`python -m build`

Following this, a folder called `dist` is created and one can then install the package by running:

`python -m pip install dist/xmctsguard-vnum.tar.gz`

where `vnum` is the current version of the package, starting from 0.1.0.

Once this operation is completed, all of the required libraries should have been installed and the package can be used. For instance, the GUI can be launched by executing the command `XMCTSGuard-gui`.

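The sequence above can be collected into a single shell session on a Unix-style system (the paths below are placeholders; the venv example in this README uses a Windows-style path):

```shell
# Create and activate a virtual environment (placeholder location).
python3 -m venv /tmp/xmcts-env
. /tmp/xmcts-env/bin/activate

# From the project root, build and install the package:
#   python -m pip install build
#   python -m build
#   python -m pip install dist/xmctsguard-vnum.tar.gz
```

The build and install steps are shown as comments because they require the project source and a network connection.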
## Structure

The project is structured in the following way:

```
XMCTSGuard/
├── src/
│   └── XMCTSGuard/
│       ├── __init__.py        # Makes this a package
│       ├── main.py            # Entry point to launch the GUI
│       │
│       ├── gui/               # All things visual
│       │   ├── __init__.py
│       │   ├── main_window.py # The main window class
│       │   ├── widgets.py     # Custom buttons, sliders, etc.
│       │   └── helpers.py     # GUI-only helper functions
│       │
│       ├── engine/            # The "brain" (neural network)
│       │   ├── __init__.py
│       │   ├── model.py       # The NN class
│       │   ├── trainer.py     # Training logic
│       │   ├── database.py    # Data loading
│       │   └── callbacks.py   # Training monitors
│       │
│       └── analysis/          # The bridge
│           ├── __init__.py
│           └── processors.py  # Functions that use the NN to analyze data
│
├── data/                      # Local storage for datasets (git-ignored)
├── tests/                     # Test files
├── pyproject.toml             # Build config
└── README.md
```

## Usage

The project leverages a neural network (NN) based on an AutoEncoder (AE) architecture, contained in engine/model.py, to give an ansatz of what the correct brightness "profile" should look like. Subsequently, the correlation between the ansatz and the measured profile is computed; this gives a metric for how far the measured profile lies from the usual distribution of profiles in W7-X. Moreover, the ansatz is also used to compute the distance of each diode from the reconstructed brightness. Via a threshold on the residuals between these two curves, it is possible to highlight the outliers in the measured profile and correct their values, knowing the gain ratio between the old pulse and the new pulse.

Once the analysis is completed, one can save (cache) the "new" data using the same format as the qxtdataaccess GUI, so that it is usable for other purposes.

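The pipeline described above (correlation score, residual thresholding, gain-ratio correction) can be sketched as follows; the profile values, the threshold, and the gain-ratio bookkeeping are invented for the example and are not the package's actual logic:

```python
import math

def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    sa = math.sqrt(sum((x - ma) ** 2 for x in a))
    sb = math.sqrt(sum((y - mb) ** 2 for y in b))
    return cov / (sa * sb)

# Hypothetical measured profile with one faulty diode (index 3), and the
# AE ansatz (reconstruction) for the same pulse.
measured = [1.0, 2.0, 3.1, 9.5, 2.9, 2.0, 1.1]
ansatz   = [1.0, 2.0, 3.0, 3.5, 3.0, 2.0, 1.0]

# 1) Correlation metric: how "usual" is this profile?
score = pearson(measured, ansatz)

# 2) Residual threshold: flag diodes far from the reconstruction.
threshold = 1.0
residuals = [abs(m - a) for m, a in zip(measured, ansatz)]
outliers = [i for i, r in enumerate(residuals) if r > threshold]

# 3) Correction: rescale each flagged diode by a known gain ratio
#    (the ratio used here is purely illustrative).
gain_ratio = {3: ansatz[3] / measured[3]}
corrected = [m * gain_ratio.get(i, 1.0) for i, m in enumerate(measured)]
```

Here diode 3 is flagged as an outlier and rescaled back onto the ansatz; in the real package the gain ratio comes from comparing the old and new pulses.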
## Neural Network

The NN used in this project is developed using the `lightning` Python library ([lightning docs](https://lightning.ai/docs/overview/getting-started)), which is based on `pytorch`.

All of the necessary code for the NN is contained in the src/XMCTSGuard/engine folder. model.py contains the network structure, trainer.py contains the functions used for training the neural network, and database.py holds the data loader and dataset classes that read the files and shape them into an NN-compliant form.

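As a conceptual picture only (the project's actual model is a trained `lightning` module in engine/model.py), an autoencoder squeezes a profile through a bottleneck and reconstructs it:

```python
import random

random.seed(0)

N_DIODES, LATENT = 8, 2  # bottleneck much smaller than the input

# Randomly initialised linear layers; a real AE is trained so that the
# reconstruction matches "usual" profiles.
W_enc = [[random.gauss(0, 0.1) for _ in range(N_DIODES)] for _ in range(LATENT)]
W_dec = [[random.gauss(0, 0.1) for _ in range(LATENT)] for _ in range(N_DIODES)]

def matvec(W, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

profile = [1.0] * N_DIODES
latent = matvec(W_enc, profile)   # encoder: compress to the bottleneck
ansatz = matvec(W_dec, latent)    # decoder: reconstruct the profile
```

The reconstruction is what the Usage section calls the ansatz; its per-diode residuals against the measurement drive the outlier detection.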
### Training

The training of the NN is done by running the script train.py in the engine folder. A config dictionary holds the various parameters that can be tweaked for each run. Before training, a database over which to run the training procedure should be created; this can be done by running the function `consolidate_pulses` contained in the src/engine/pulse_dataset.py script. The training procedure can then be run by calling the `train_autoencoder` function in src/engine/train.py. Refer to this function's documentation for more information.

#### Attention

It may be that the training routine, if run locally on IPP's PCs, tries to select the available GPUs even when it is not possible to use them. If this is the case, before running the training procedure one should run the following command:

`export CUDA_VISIBLE_DEVICES=''`

in order to hide said GPUs from the training routine.

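The same effect can be obtained from inside Python, as long as the variable is set before `torch` or `lightning` is imported:

```python
import os

# Hide all CUDA devices from any CUDA-aware library imported after this
# point, so that training falls back to the CPU.
os.environ["CUDA_VISIBLE_DEVICES"] = ""
```
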
### Jupyter Notebooks

Jupyter notebooks are a useful tool for exploring the code and seeing hands-on examples of how the various steps work together. However, they can be messy inside a git repository; to avoid embedding a great amount of useless data and plots in the version control, the use of the nbstripout package is strongly recommended. A possible setup can be found [here](https://pypi.org/project/nbstripout/) under the 'Using as a Git filter' section.

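Per the nbstripout documentation linked above, the Git-filter setup amounts to two commands run inside the repository:

```shell
python -m pip install nbstripout   # install the tool into the active environment
nbstripout --install               # register it as a git filter for this repo
```

With the filter installed, notebook outputs are stripped automatically on commit.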
### Notice

For any problems, doubts, or errors with the code, contact luca.orlandi@igi.cnr.it
