# Algorithm Infrastructure ## Overview An algorithm in the uLib infrastructure is a class for containing a functional that can be dynamically loaded into memory as a plug-in. It derives from the base `Object` class (`Core/Object.h`) and therefore can contain properties that define the serialization of operating parameters or the implementation of widgets for interactive parameter manipulation. The algorithm class is designed to be inserted into an `AlgorithmTask`, a class for managing the execution of scheduled operations. A task contains `Run` and `Stop` methods to start and stop execution. A task can be configured to work in two modes: - **Cyclic mode**: the algorithm is executed periodically with a configurable cycle time. - **Asynchronous mode**: the task waits for a trigger before each execution. Triggers can come from the uLib signal-slot system (`Object::connect`) or from a condition variable as defined in the monitor pattern (`Core/Monitor.h`). The algorithm is defined as a template class on two types `T_enc` and `T_dec`. The encoder is a type for data input or another algorithm that is chained with this one and outputs data in a compatible format. The decoder is the type of data output or a downstream algorithm compatible with it. ## Class Hierarchy ``` Object (Core/Object.h) | +-- Algorithm (Core/Algorithm.h) | | | +-- VoxImageFilter (Math/VoxImageFilter.h) | | | +-- VoxFilterAlgorithmLinear (Math/VoxImageFilterLinear.hpp) | +-- VoxFilterAlgorithmMedian (Math/VoxImageFilterMedian.hpp) | +-- VoxFilterAlgorithmAbtrim (Math/VoxImageFilterABTrim.hpp) | +-- VoxFilterAlgorithmSPR (Math/VoxImageFilterABTrim.hpp) | +-- VoxFilterAlgorithmThreshold (Math/VoxImageFilterThreshold.hpp) | +-- VoxFilterAlgorithmBilateral (Math/VoxImageFilterBilateral.hpp) | +-- VoxFilterAlgorithmBilateralTrim(Math/VoxImageFilterBilateral.hpp) | +-- VoxFilterAlgorithm2ndStat (Math/VoxImageFilter2ndStat.hpp) | +-- VoxFilterAlgorithmCustom (Math/VoxImageFilterCustom.hpp) | +-- Thread (Core/Threads.h) | +-- AlgorithmTask (Core/Algorithm.h) ``` ## Algorithm (`Core/Algorithm.h`) ### Template Parameters ```cpp template class Algorithm : public Object; ``` - **`T_enc`** (Encoder): the input data type. Can be a raw data type or a pointer to a data structure. When chaining algorithms, the upstream algorithm's `T_dec` must be compatible with this algorithm's `T_enc`. - **`T_dec`** (Decoder): the output data type. Produced by `Process()` and consumed by the next algorithm in the chain. ### Core Interface | Method | Description | |--------|-------------| | `virtual T_dec Process(const T_enc& input) = 0` | Pure virtual. Implement the algorithm logic here. | | `T_dec operator()(const T_enc& input)` | Calls `Process()`. Enables functional syntax: `result = alg(data)`. | ### Algorithm Chaining Algorithms can be linked in processing pipelines via encoder/decoder pointers: ```cpp Algorithm* upstream; // SetEncoder() / GetEncoder() Algorithm* downstream; // SetDecoder() / GetDecoder() ``` This allows building chains like: ``` [RawData] --> AlgorithmA --> AlgorithmB --> [Result] encoder decoder ``` ### Signals | Signal | Emitted when | |--------|-------------| | `Started()` | The algorithm begins processing (caller responsibility). | | `Finished()` | The algorithm completes processing (caller responsibility). | ### Device Preference (CUDA) Algorithms report their preferred execution device via `GetPreferredDevice()`: | Method | Description | |--------|-------------| | `virtual MemoryDevice GetPreferredDevice() const` | Returns `RAM` or `VRAM`. Subclasses override. | | `void SetPreferredDevice(MemoryDevice dev)` | Manually set the device preference. | | `bool IsGPU() const` | Shorthand for `GetPreferredDevice() == VRAM`. | GPU-based algorithms are responsible for calling `cudaDeviceSynchronize()` inside their `Process()` implementation before returning, so that results are available to the caller or downstream algorithm. ### Example: Defining a Custom Algorithm ```cpp class MyFilter : public Algorithm*, VoxImage*> { public: const char* GetClassName() const override { return "MyFilter"; } VoxImage* Process(VoxImage* const& image) override { // ... filter the image in-place ... return image; } }; ``` ## AlgorithmTask (`Core/Algorithm.h`) `AlgorithmTask` manages the execution of an `Algorithm` within a scheduled, threaded context. It inherits from `Thread` (`Core/Threads.h`) and uses `Mutex` (`Core/Monitor.h`) for synchronization. ### Template Parameters ```cpp template class AlgorithmTask : public Thread; ``` Must match the `Algorithm` it manages. ### Configuration | Method | Description | |--------|-------------| | `void SetAlgorithm(AlgorithmType* alg)` | Set the algorithm to execute. | | `void SetMode(Mode mode)` | `Cyclic` or `Async`. | | `void SetCycleTime(int ms)` | Period for cyclic mode (milliseconds). | ### Execution Modes #### Cyclic Mode The algorithm's `Process()` is called periodically. The cycle waits on a `condition_variable_any` with timeout, so `Stop()` can interrupt immediately without waiting for the full cycle. ```cpp AlgorithmTask task; task.SetAlgorithm(&myAlgorithm); task.SetMode(AlgorithmTask::Cyclic); task.SetCycleTime(100); // every 100ms task.Run(inputData); // ... later ... task.Stop(); ``` #### Asynchronous Mode The task thread blocks on a condition variable until `Notify()` is called. Each notification triggers exactly one `Process()` invocation. ```cpp task.SetMode(AlgorithmTask::Async); task.Run(inputData); // Trigger manually: task.Notify(); // Or connect to a signal: task.ConnectTrigger(sender, &SenderClass::DataReady); // Now each emission of DataReady() triggers one Process() call. ``` ### Lifecycle | Method | Description | |--------|-------------| | `void Run(const T_enc& input)` | Starts the background thread with the given input. | | `void Stop()` | Requests stop and joins the thread. | | `bool IsRunning()` | Inherited from `Thread`. | ### Signals | Signal | Emitted when | |--------|-------------| | `Stopped()` | The task thread has completed (after last `Process()` and before thread exit). | ### Signal-Slot Triggering `ConnectTrigger()` connects any uLib `Object` signal to the task's `Notify()` method: ```cpp task.ConnectTrigger(detector, &Detector::EventReady); ``` This uses the uLib signal system (`Core/Signal.h`), not Qt signals. The connection is type-safe and works with the `Object::connect` infrastructure. ## VoxImageFilter (`Math/VoxImageFilter.h`) `VoxImageFilter` specializes `Algorithm` for kernel-based volumetric image filtering. It uses CRTP (Curiously Recurring Template Pattern) so that concrete filters provide their `Evaluate()` method without virtual dispatch overhead in the inner loop. ### Template Parameters ```cpp template class VoxImageFilter : public Abstract::VoxImageFilter, public Algorithm*, VoxImage*>; ``` - **`VoxelT`**: the voxel data type (must satisfy `Interface::Voxel` — requires `.Value` and `.Count` fields). - **`CrtpImplT`**: the concrete filter subclass. Must implement: ```cpp float Evaluate(const VoxImage& buffer, int index); ``` ### How It Works 1. `Process(image)` creates a read-only buffer copy of the input image. 2. For each voxel in parallel (OpenMP), it calls `CrtpImplT::Evaluate(buffer, index)`. 3. `Evaluate()` reads from the buffer using the kernel offsets and writes the result. 4. The filtered image is returned (in-place modification). ``` Process(image) | +-- buffer = copy of image (read-only snapshot) | +-- #pragma omp parallel for | for each voxel i: | image[i].Value = CrtpImplT::Evaluate(buffer, i) | +-- return image ``` ### Kernel System The `Kernel` class stores convolution weights and precomputed index offsets: | Method | Description | |--------|-------------| | `SetKernelNumericXZY(values)` | Set kernel weights from a flat vector (XZY order). | | `SetKernelSpherical(shape)` | Set weights via a radial function `f(distance^2)`. | | `SetKernelWeightFunction(shape)` | Set weights via a 3D position function `f(Vector3f)`. | ### CUDA Support Concrete filters can override `Process()` with a CUDA implementation: ```cpp #if defined(USE_CUDA) && defined(__CUDACC__) VoxImage* Process(VoxImage* const& image) override { if (this->GetPreferredDevice() == MemoryDevice::VRAM) { // Launch CUDA kernel, synchronize, return } else { return BaseClass::Process(image); // CPU fallback } } #endif ``` The base class `GetPreferredDevice()` automatically returns `VRAM` when the image or kernel data resides on the GPU, enabling transparent device dispatch. Filters with CUDA implementations: `VoxFilterAlgorithmLinear`, `VoxFilterAlgorithmAbtrim`, `VoxFilterAlgorithmSPR`. ### Concrete Filters | Filter | File | Description | |--------|------|-------------| | `VoxFilterAlgorithmLinear` | `VoxImageFilterLinear.hpp` | Weighted linear convolution (FIR filter). CUDA-enabled. | | `VoxFilterAlgorithmMedian` | `VoxImageFilterMedian.hpp` | Median filter with kernel-weighted sorting. | | `VoxFilterAlgorithmAbtrim` | `VoxImageFilterABTrim.hpp` | Alpha-beta trimmed mean filter. CUDA-enabled. | | `VoxFilterAlgorithmSPR` | `VoxImageFilterABTrim.hpp` | Robespierre filter: trimmed mean applied only to outlier voxels. CUDA-enabled. | | `VoxFilterAlgorithmThreshold` | `VoxImageFilterThreshold.hpp` | Binary threshold filter. | | `VoxFilterAlgorithmBilateral` | `VoxImageFilterBilateral.hpp` | Edge-preserving bilateral filter (intensity-weighted Gaussian). | | `VoxFilterAlgorithmBilateralTrim` | `VoxImageFilterBilateral.hpp` | Bilateral filter with alpha-beta trimming. | | `VoxFilterAlgorithm2ndStat` | `VoxImageFilter2ndStat.hpp` | Local variance (second-order statistic). | | `VoxFilterAlgorithmCustom` | `VoxImageFilterCustom.hpp` | User-supplied evaluation function via function pointer. | ### Example: Using a Filter with AlgorithmTask ```cpp // Create filter and configure kernel VoxFilterAlgorithmLinear filter(Vector3i(3, 3, 3)); std::vector weights(27, 1.0f); // uniform 3x3x3 filter.SetKernelNumericXZY(weights); // Direct use filter.SetImage(&image); filter.Run(); // Or via Algorithm interface VoxImage* result = filter.Process(&image); // Or scheduled in a task AlgorithmTask*, VoxImage*> task; task.SetAlgorithm(&filter); task.SetMode(AlgorithmTask*, VoxImage*>::Cyclic); task.SetCycleTime(500); task.Run(&image); ``` ## Structural Benefits ### 1. Uniform Processing Interface Every algorithm — from a simple threshold to a GPU-accelerated convolution — exposes the same `Process(input) -> output` interface. Client code does not need to know the concrete type: ```cpp Algorithm*, VoxImage*>* alg = &anyFilter; alg->Process(&image); ``` ### 2. Pipeline Composition The encoder/decoder chaining allows building data processing pipelines where each stage transforms data and passes it to the next. Type safety is enforced at compile time through template parameters. ### 3. Scheduled and Event-Driven Execution `AlgorithmTask` decouples the algorithm from its execution schedule. The same algorithm can be: - Called directly (`Process()`) - Run periodically (Cyclic mode for monitoring/acquisition) - Triggered by events (Async mode for reactive processing) ### 4. Transparent CPU/GPU Dispatch The `MemoryDevice` preference and `GetPreferredDevice()` virtual allow the same algorithm interface to dispatch to CPU or GPU implementations. The `DataAllocator` transparently manages RAM/VRAM transfers, and concrete filters override `Process()` with CUDA kernels when data is on the GPU. ### 5. Integration with the Object System Since `Algorithm` inherits from `Object`, algorithms gain: - **Properties**: serializable parameters via the `Property` system, enabling persistent configuration and GUI widget generation. - **Signals**: `Started`/`Finished` notifications for connecting to monitoring or logging. - **Serialization**: save/load algorithm configuration via Boost archives. - **Instance naming**: `SetInstanceName()` for runtime identification in contexts. ### 6. CRTP Performance for Inner Loops `VoxImageFilter` uses CRTP to dispatch to `Evaluate()` without virtual function overhead. The per-voxel evaluation runs at full speed inside OpenMP parallel loops, while the outer `Process()` method remains virtual for polymorphic use through the Algorithm interface. ## Dependencies ``` Core/Object.h — base class, properties, signals, serialization Core/Signal.h — signal-slot connection infrastructure Core/Monitor.h — Mutex, condition variables, ULIB_MUTEX_LOCK Core/Threads.h — Thread base class for AlgorithmTask Core/DataAllocator.h — MemoryDevice enum, RAM/VRAM data management Math/VoxImage.h — volumetric image container Math/VoxImageFilter.h — kernel-based filter framework ```