algorithm chain for ram-vram

This commit is contained in:
AndreaRigoni
2026-03-28 08:22:14 +00:00
parent ec2027e980
commit 876b8f4592
8 changed files with 883 additions and 172 deletions

View File

@@ -1,7 +1,338 @@
# Aggoritm definition
# Algorithm Infrastructure
## Overview
An algorithm in the uLib infrastructure is a class for containing a functional that can be dynamically loaded into memory as a plug-in.
It derives from the base Object class and therefore can contain properties that define the serialization of operating parameters or the implementation of widgets for interactive parameter manipulation.
The algorithm class is designed to be inserted into a Task, i.e., a class for managing the execution of scheduled operations. A task contains a Run and Stop method to start and stop execution. Furthermore, a task can be configured to work in cyclic or asynchronous mode: in cyclic mode it will be possible to define a cycle time, while in asynchronous mode a task can be hooked to a signal-slot of the Object structure or to a condition variable defined in the monitor pattern (Monitor.h).
It derives from the base `Object` class (`Core/Object.h`) and therefore can contain properties that define the serialization of operating parameters or the implementation of widgets for interactive parameter manipulation.
The algorithm class is designed to be inserted into an `AlgorithmTask`, a class for managing the execution of scheduled operations. A task contains `Run` and `Stop` methods to start and stop execution. A task can be configured to work in two modes:
- **Cyclic mode**: the algorithm is executed periodically with a configurable cycle time.
- **Asynchronous mode**: the task waits for a trigger before each execution. Triggers can come from the uLib signal-slot system (`Object::connect`) or from a condition variable as defined in the monitor pattern (`Core/Monitor.h`).
The algorithm is defined as a template class on two types `T_enc` and `T_dec`. The encoder is a type for data input or another algorithm that is chained with this one and outputs data in a compatible format. The decoder is the type of data output or a downstream algorithm compatible with it.
## Class Hierarchy
```
Object (Core/Object.h)
|
+-- Algorithm<T_enc, T_dec> (Core/Algorithm.h)
| |
| +-- VoxImageFilter<VoxelT, CrtpImplT> (Math/VoxImageFilter.h)
| |
| +-- VoxFilterAlgorithmLinear (Math/VoxImageFilterLinear.hpp)
| +-- VoxFilterAlgorithmMedian (Math/VoxImageFilterMedian.hpp)
| +-- VoxFilterAlgorithmAbtrim (Math/VoxImageFilterABTrim.hpp)
| +-- VoxFilterAlgorithmSPR (Math/VoxImageFilterABTrim.hpp)
| +-- VoxFilterAlgorithmThreshold (Math/VoxImageFilterThreshold.hpp)
| +-- VoxFilterAlgorithmBilateral (Math/VoxImageFilterBilateral.hpp)
| +-- VoxFilterAlgorithmBilateralTrim(Math/VoxImageFilterBilateral.hpp)
| +-- VoxFilterAlgorithm2ndStat (Math/VoxImageFilter2ndStat.hpp)
| +-- VoxFilterAlgorithmCustom (Math/VoxImageFilterCustom.hpp)
|
+-- Thread (Core/Threads.h)
|
+-- AlgorithmTask<T_enc, T_dec> (Core/Algorithm.h)
```
## Algorithm (`Core/Algorithm.h`)
### Template Parameters
```cpp
template <typename T_enc, typename T_dec>
class Algorithm : public Object;
```
- **`T_enc`** (Encoder): the input data type. Can be a raw data type or a pointer to a data structure. When chaining algorithms, the upstream algorithm's `T_dec` must be compatible with this algorithm's `T_enc`.
- **`T_dec`** (Decoder): the output data type. Produced by `Process()` and consumed by the next algorithm in the chain.
### Core Interface
| Method | Description |
|--------|-------------|
| `virtual T_dec Process(const T_enc& input) = 0` | Pure virtual. Implement the algorithm logic here. |
| `T_dec operator()(const T_enc& input)` | Calls `Process()`. Enables functional syntax: `result = alg(data)`. |
### Algorithm Chaining
Algorithms can be linked in processing pipelines via encoder/decoder pointers:
```cpp
Algorithm* upstream; // SetEncoder() / GetEncoder()
Algorithm* downstream; // SetDecoder() / GetDecoder()
```
This allows building chains like:
```
[RawData] --> AlgorithmA --> AlgorithmB --> [Result]
encoder decoder
```
### Signals
| Signal | Emitted when |
|--------|-------------|
| `Started()` | The algorithm begins processing (caller responsibility). |
| `Finished()` | The algorithm completes processing (caller responsibility). |
### Device Preference (CUDA)
Algorithms report their preferred execution device via `GetPreferredDevice()`:
| Method | Description |
|--------|-------------|
| `virtual MemoryDevice GetPreferredDevice() const` | Returns `RAM` or `VRAM`. Subclasses override. |
| `void SetPreferredDevice(MemoryDevice dev)` | Manually set the device preference. |
| `bool IsGPU() const` | Shorthand for `GetPreferredDevice() == VRAM`. |
GPU-based algorithms are responsible for calling `cudaDeviceSynchronize()` inside their `Process()` implementation before returning, so that results are available to the caller or downstream algorithm.
### Example: Defining a Custom Algorithm
```cpp
class MyFilter : public Algorithm<VoxImage<Voxel>*, VoxImage<Voxel>*> {
public:
const char* GetClassName() const override { return "MyFilter"; }
VoxImage<Voxel>* Process(VoxImage<Voxel>* const& image) override {
// ... filter the image in-place ...
return image;
}
};
```
## AlgorithmTask (`Core/Algorithm.h`)
`AlgorithmTask` manages the execution of an `Algorithm` within a scheduled, threaded context. It inherits from `Thread` (`Core/Threads.h`) and uses `Mutex` (`Core/Monitor.h`) for synchronization.
### Template Parameters
```cpp
template <typename T_enc, typename T_dec>
class AlgorithmTask : public Thread;
```
Must match the `Algorithm<T_enc, T_dec>` it manages.
### Configuration
| Method | Description |
|--------|-------------|
| `void SetAlgorithm(AlgorithmType* alg)` | Set the algorithm to execute. |
| `void SetMode(Mode mode)` | `Cyclic` or `Async`. |
| `void SetCycleTime(int ms)` | Period for cyclic mode (milliseconds). |
### Execution Modes
#### Cyclic Mode
The algorithm's `Process()` is called periodically. The cycle waits on a `condition_variable_any` with timeout, so `Stop()` can interrupt immediately without waiting for the full cycle.
```cpp
AlgorithmTask<int, int> task;
task.SetAlgorithm(&myAlgorithm);
task.SetMode(AlgorithmTask<int, int>::Cyclic);
task.SetCycleTime(100); // every 100ms
task.Run(inputData);
// ... later ...
task.Stop();
```
#### Asynchronous Mode
The task thread blocks on a condition variable until `Notify()` is called. Each notification triggers exactly one `Process()` invocation.
```cpp
task.SetMode(AlgorithmTask<int, int>::Async);
task.Run(inputData);
// Trigger manually:
task.Notify();
// Or connect to a signal:
task.ConnectTrigger(sender, &SenderClass::DataReady);
// Now each emission of DataReady() triggers one Process() call.
```
### Lifecycle
| Method | Description |
|--------|-------------|
| `void Run(const T_enc& input)` | Starts the background thread with the given input. |
| `void Stop()` | Requests stop and joins the thread. |
| `bool IsRunning()` | Inherited from `Thread`. |
### Signals
| Signal | Emitted when |
|--------|-------------|
| `Stopped()` | The task thread has completed (after last `Process()` and before thread exit). |
### Signal-Slot Triggering
`ConnectTrigger()` connects any uLib `Object` signal to the task's `Notify()` method:
```cpp
task.ConnectTrigger(detector, &Detector::EventReady);
```
This uses the uLib signal system (`Core/Signal.h`), not Qt signals. The connection is type-safe and works with the `Object::connect` infrastructure.
## VoxImageFilter (`Math/VoxImageFilter.h`)
`VoxImageFilter` specializes `Algorithm` for kernel-based volumetric image filtering. It uses CRTP (Curiously Recurring Template Pattern) so that concrete filters provide their `Evaluate()` method without virtual dispatch overhead in the inner loop.
### Template Parameters
```cpp
template <typename VoxelT, typename CrtpImplT>
class VoxImageFilter : public Abstract::VoxImageFilter,
public Algorithm<VoxImage<VoxelT>*, VoxImage<VoxelT>*>;
```
- **`VoxelT`**: the voxel data type (must satisfy `Interface::Voxel` — requires `.Value` and `.Count` fields).
- **`CrtpImplT`**: the concrete filter subclass. Must implement:
```cpp
float Evaluate(const VoxImage<VoxelT>& buffer, int index);
```
### How It Works
1. `Process(image)` creates a read-only buffer copy of the input image.
2. For each voxel in parallel (OpenMP), it calls `CrtpImplT::Evaluate(buffer, index)`.
3. `Evaluate()` reads from the buffer using the kernel offsets and writes the result.
4. The filtered image is returned (in-place modification).
```
Process(image)
|
+-- buffer = copy of image (read-only snapshot)
|
+-- #pragma omp parallel for
| for each voxel i:
| image[i].Value = CrtpImplT::Evaluate(buffer, i)
|
+-- return image
```
### Kernel System
The `Kernel<VoxelT>` class stores convolution weights and precomputed index offsets:
| Method | Description |
|--------|-------------|
| `SetKernelNumericXZY(values)` | Set kernel weights from a flat vector (XZY order). |
| `SetKernelSpherical(shape)` | Set weights via a radial function `f(distance^2)`. |
| `SetKernelWeightFunction(shape)` | Set weights via a 3D position function `f(Vector3f)`. |
### CUDA Support
Concrete filters can override `Process()` with a CUDA implementation:
```cpp
#if defined(USE_CUDA) && defined(__CUDACC__)
VoxImage<VoxelT>* Process(VoxImage<VoxelT>* const& image) override {
if (this->GetPreferredDevice() == MemoryDevice::VRAM) {
// Launch CUDA kernel, synchronize, return
} else {
return BaseClass::Process(image); // CPU fallback
}
}
#endif
```
The base class `GetPreferredDevice()` automatically returns `VRAM` when the image or kernel data resides on the GPU, enabling transparent device dispatch.
Filters with CUDA implementations: `VoxFilterAlgorithmLinear`, `VoxFilterAlgorithmAbtrim`, `VoxFilterAlgorithmSPR`.
### Concrete Filters
| Filter | File | Description |
|--------|------|-------------|
| `VoxFilterAlgorithmLinear` | `VoxImageFilterLinear.hpp` | Weighted linear convolution (FIR filter). CUDA-enabled. |
| `VoxFilterAlgorithmMedian` | `VoxImageFilterMedian.hpp` | Median filter with kernel-weighted sorting. |
| `VoxFilterAlgorithmAbtrim` | `VoxImageFilterABTrim.hpp` | Alpha-beta trimmed mean filter. CUDA-enabled. |
| `VoxFilterAlgorithmSPR` | `VoxImageFilterABTrim.hpp` | Robespierre filter: trimmed mean applied only to outlier voxels. CUDA-enabled. |
| `VoxFilterAlgorithmThreshold` | `VoxImageFilterThreshold.hpp` | Binary threshold filter. |
| `VoxFilterAlgorithmBilateral` | `VoxImageFilterBilateral.hpp` | Edge-preserving bilateral filter (intensity-weighted Gaussian). |
| `VoxFilterAlgorithmBilateralTrim` | `VoxImageFilterBilateral.hpp` | Bilateral filter with alpha-beta trimming. |
| `VoxFilterAlgorithm2ndStat` | `VoxImageFilter2ndStat.hpp` | Local variance (second-order statistic). |
| `VoxFilterAlgorithmCustom` | `VoxImageFilterCustom.hpp` | User-supplied evaluation function via function pointer. |
### Example: Using a Filter with AlgorithmTask
```cpp
// Create filter and configure kernel
VoxFilterAlgorithmLinear<Voxel> filter(Vector3i(3, 3, 3));
std::vector<float> weights(27, 1.0f); // uniform 3x3x3
filter.SetKernelNumericXZY(weights);
// Direct use
filter.SetImage(&image);
filter.Run();
// Or via Algorithm interface
VoxImage<Voxel>* result = filter.Process(&image);
// Or scheduled in a task
AlgorithmTask<VoxImage<Voxel>*, VoxImage<Voxel>*> task;
task.SetAlgorithm(&filter);
task.SetMode(AlgorithmTask<VoxImage<Voxel>*, VoxImage<Voxel>*>::Cyclic);
task.SetCycleTime(500);
task.Run(&image);
```
## Structural Benefits
### 1. Uniform Processing Interface
Every algorithm — from a simple threshold to a GPU-accelerated convolution — exposes the same `Process(input) -> output` interface. Client code does not need to know the concrete type:
```cpp
Algorithm<VoxImage<Voxel>*, VoxImage<Voxel>*>* alg = &anyFilter;
alg->Process(&image);
```
### 2. Pipeline Composition
The encoder/decoder chaining allows building data processing pipelines where each stage transforms data and passes it to the next. Type safety is enforced at compile time through template parameters.
### 3. Scheduled and Event-Driven Execution
`AlgorithmTask` decouples the algorithm from its execution schedule. The same algorithm can be:
- Called directly (`Process()`)
- Run periodically (Cyclic mode for monitoring/acquisition)
- Triggered by events (Async mode for reactive processing)
### 4. Transparent CPU/GPU Dispatch
The `MemoryDevice` preference and `GetPreferredDevice()` virtual allow the same algorithm interface to dispatch to CPU or GPU implementations. The `DataAllocator` transparently manages RAM/VRAM transfers, and concrete filters override `Process()` with CUDA kernels when data is on the GPU.
### 5. Integration with the Object System
Since `Algorithm` inherits from `Object`, algorithms gain:
- **Properties**: serializable parameters via the `Property<T>` system, enabling persistent configuration and GUI widget generation.
- **Signals**: `Started`/`Finished` notifications for connecting to monitoring or logging.
- **Serialization**: save/load algorithm configuration via Boost archives.
- **Instance naming**: `SetInstanceName()` for runtime identification in contexts.
### 6. CRTP Performance for Inner Loops
`VoxImageFilter` uses CRTP to dispatch to `Evaluate()` without virtual function overhead. The per-voxel evaluation runs at full speed inside OpenMP parallel loops, while the outer `Process()` method remains virtual for polymorphic use through the Algorithm interface.
## Dependencies
```
Core/Object.h — base class, properties, signals, serialization
Core/Signal.h — signal-slot connection infrastructure
Core/Monitor.h — Mutex, condition variables, ULIB_MUTEX_LOCK
Core/Threads.h — Thread base class for AlgorithmTask
Core/DataAllocator.h — MemoryDevice enum, RAM/VRAM data management
Math/VoxImage.h — volumetric image container
Math/VoxImageFilter.h — kernel-based filter framework
```
The algorithm in particular is defined as a template class on two types T_enc, T_dec. The encoder is a type for data input or another algorithm that is chained with this one that outputs data in the format compatible with input. The decoder is the type of data output or an algorithm compatible with it.

View File

@@ -30,14 +30,9 @@
#include <chrono>
#include <condition_variable>
#ifdef USE_CUDA
#include <cuda_runtime.h>
#endif
#include "Core/Object.h"
#include "Core/Monitor.h"
#include "Core/Threads.h"
#include "Core/Property.h"
#include "Core/DataAllocator.h"
namespace uLib {
@@ -51,6 +46,10 @@ namespace uLib {
* dynamically loaded as a plug-in. It derives from Object and supports
* properties for serialization and interactive parameter widgets.
*
* Algorithms are responsible for their own GPU synchronization: if Process()
* launches CUDA kernels, it must call cudaDeviceSynchronize() before returning
* so that the result is available to the caller or downstream algorithm.
*
* @tparam T_enc Encoder type: the input data type, or a chained algorithm
* whose output is compatible with this algorithm's input.
* @tparam T_dec Decoder type: the output data type, or a chained algorithm
@@ -72,37 +71,41 @@ public:
virtual const char* GetClassName() const override { return "Algorithm"; }
// Processing ///////////////////////////////////////////////////////////////
/**
* @brief Process input data and produce output.
* Override this in subclasses to implement the algorithm logic.
* GPU-based implementations must synchronize before returning.
*/
virtual T_dec Process(const T_enc& input) = 0;
/**
* @brief Operator form of Process for functional chaining.
*/
/** @brief Operator form of Process for functional chaining. */
T_dec operator()(const T_enc& input) { return Process(input); }
// Chaining /////////////////////////////////////////////////////////////////
void SetEncoder(Algorithm* enc) { m_Encoder = enc; }
Algorithm* GetEncoder() const { return m_Encoder; }
void SetDecoder(Algorithm* dec) { m_Decoder = dec; }
Algorithm* GetDecoder() const { return m_Decoder; }
// Device preference ////////////////////////////////////////////////////////
/**
* @brief Returns the preferred memory device for this algorithm.
* CUDA-capable algorithms should override to return VRAM when their
* data resides on the GPU. AlgorithmTask uses this to synchronize
* appropriately.
* data resides on the GPU.
*/
virtual MemoryDevice GetPreferredDevice() const { return m_PreferredDevice; }
void SetPreferredDevice(MemoryDevice dev) { m_PreferredDevice = dev; }
/**
* @brief Returns true if this algorithm prefers GPU execution.
*/
/** @brief Returns true if this algorithm prefers GPU execution. */
bool IsGPU() const { return GetPreferredDevice() == MemoryDevice::VRAM; }
// Signals //////////////////////////////////////////////////////////////////
signals:
virtual void Started() { ULIB_SIGNAL_EMIT(Algorithm::Started); }
virtual void Finished() { ULIB_SIGNAL_EMIT(Algorithm::Finished); }
@@ -121,9 +124,13 @@ protected:
/**
* @brief AlgorithmTask manages the execution of an Algorithm within a
* scheduled context. Uses uLib::Thread for execution and uLib::Mutex for
* synchronization. Supports cyclic mode (with configurable period) and
* asynchronous mode (triggered by Object signal-slot or condition variable
* from Monitor.h).
* synchronization.
*
* Two execution modes:
* - Cyclic: executes Process() periodically with configurable cycle time.
* - Async: waits for Notify() or a connected signal before each execution.
*
* GPU synchronization is the algorithm's responsibility (see Algorithm::Process).
*/
template <typename T_enc, typename T_dec>
class AlgorithmTask : public Thread {
@@ -145,6 +152,8 @@ public:
virtual const char* GetClassName() const override { return "AlgorithmTask"; }
// Configuration ////////////////////////////////////////////////////////////
void SetAlgorithm(AlgorithmType* alg) { m_Algorithm = alg; }
AlgorithmType* GetAlgorithm() const { return m_Algorithm; }
@@ -154,6 +163,8 @@ public:
void SetCycleTime(int milliseconds) { m_CycleTime_ms = milliseconds; }
int GetCycleTime() const { return m_CycleTime_ms; }
// Lifecycle ////////////////////////////////////////////////////////////////
/**
* @brief Start the task execution in a separate thread (via Thread::Start).
* In Cyclic mode, the algorithm is executed periodically.
@@ -167,19 +178,17 @@ public:
Start();
}
/**
* @brief Stop the task execution and join the thread.
*/
/** @brief Stop the task execution and join the thread. */
void Stop() {
m_StopRequested.store(true);
{
ULIB_MUTEX_LOCK(m_WaitMutex, -1) {
m_Condition.notify_all();
}
}
if (IsJoinable()) Join();
}
// Async triggering /////////////////////////////////////////////////////////
/**
* @brief Notify the task to execute one iteration (Async mode).
* Can be called from a signal-slot connection or externally.
@@ -200,36 +209,25 @@ public:
return Object::connect(sender, sigf, [this]() { Notify(); });
}
// Signals //////////////////////////////////////////////////////////////////
signals:
virtual void Stopped() { ULIB_SIGNAL_EMIT(AlgorithmTask::Stopped); }
protected:
/**
* @brief Thread entry point — dispatches to cyclic or async loop.
*/
/** @brief Thread entry point — dispatches to cyclic or async loop. */
void Run() override {
if (m_Mode == Cyclic) {
if (m_Mode == Cyclic)
RunCyclic();
} else {
else
RunAsync();
}
Stopped();
}
private:
void ExecuteAlgorithm() {
if (!m_Algorithm) return;
m_Algorithm->Process(m_Input);
#ifdef USE_CUDA
if (m_Algorithm->IsGPU()) {
cudaDeviceSynchronize();
}
#endif
}
void RunCyclic() {
while (!m_StopRequested.load()) {
ExecuteAlgorithm();
if (m_Algorithm) m_Algorithm->Process(m_Input);
std::unique_lock<std::timed_mutex> lock(m_WaitMutex.GetNative());
m_Condition.wait_for(lock,
std::chrono::milliseconds(m_CycleTime_ms),
@@ -245,7 +243,7 @@ private:
});
if (m_StopRequested.load()) break;
m_Triggered.store(false);
ExecuteAlgorithm();
if (m_Algorithm) m_Algorithm->Process(m_Input);
}
}

View File

@@ -34,6 +34,9 @@
namespace uLib {
////////////////////////////////////////////////////////////////////////////////
// Kernel shape interface (static check for operator()(float) and operator()(Vector3f))
namespace Interface {
struct VoxImageFilterShape {
template <class Self> void check_structural() {
@@ -43,30 +46,47 @@ struct VoxImageFilterShape {
};
} // namespace Interface
////////////////////////////////////////////////////////////////////////////////
// Forward declaration
template <typename VoxelT> class Kernel;
////////////////////////////////////////////////////////////////////////////////
// Abstract interface (type-erased, used by python bindings)
namespace Abstract {
class VoxImageFilter {
public:
virtual void Run() = 0;
virtual void SetImage(Abstract::VoxImage *image) = 0;
protected:
virtual ~VoxImageFilter() {}
};
} // namespace Abstract
template <typename VoxelT, typename AlgorithmT>
////////////////////////////////////////////////////////////////////////////////
// VoxImageFilter — kernel-based voxel filter using CRTP + Algorithm
//
// Template parameters:
// VoxelT — voxel data type (must satisfy Interface::Voxel)
// CrtpImplT — concrete filter subclass (CRTP), must provide:
// float Evaluate(const VoxImage<VoxelT>& buffer, int index)
//
// Inherits Algorithm<VoxImage<VoxelT>*, VoxImage<VoxelT>*> so that filters
// can be used with AlgorithmTask for scheduled/async execution, and chained
// via encoder/decoder.
template <typename VoxelT, typename CrtpImplT>
class VoxImageFilter : public Abstract::VoxImageFilter,
public Algorithm<VoxImage<VoxelT>*, VoxImage<VoxelT>*> {
public:
virtual const char * GetClassName() const { return "VoxImageFilter"; }
virtual const char* GetClassName() const { return "VoxImageFilter"; }
VoxImageFilter(const Vector3i &size);
// Algorithm interface ////////////////////////////////////////////////////////
/**
* @brief Process implements Algorithm::Process.
* Applies the filter in-place on the input image and returns it.
@@ -79,9 +99,9 @@ public:
*/
void Run();
/**
* @brief Returns VRAM if image or kernel data is on GPU, RAM otherwise.
*/
// Device awareness ///////////////////////////////////////////////////////////
/** @brief Returns VRAM if image or kernel data is on GPU, RAM otherwise. */
MemoryDevice GetPreferredDevice() const override {
if (m_Image && m_Image->Data().GetDevice() == MemoryDevice::VRAM)
return MemoryDevice::VRAM;
@@ -90,38 +110,31 @@ public:
return MemoryDevice::RAM;
}
// Kernel setup ///////////////////////////////////////////////////////////////
void SetKernelNumericXZY(const std::vector<float> &numeric);
void SetKernelSpherical(float (*shape)(float));
template <class ShapeT> void SetKernelSpherical(ShapeT shape);
void SetKernelWeightFunction(float (*shape)(const Vector3f &));
template <class ShapeT> void SetKernelWeightFunction(ShapeT shape);
inline const Kernel<VoxelT> &GetKernelData() const {
return this->m_KernelData;
}
inline Kernel<VoxelT> &GetKernelData() { return this->m_KernelData; }
// Accessors //////////////////////////////////////////////////////////////////
inline VoxImage<VoxelT> *GetImage() const { return this->m_Image; }
const Kernel<VoxelT> &GetKernelData() const { return m_KernelData; }
Kernel<VoxelT> &GetKernelData() { return m_KernelData; }
VoxImage<VoxelT> *GetImage() const { return m_Image; }
void SetImage(Abstract::VoxImage *image);
protected:
float Convolve(const VoxImage<VoxelT> &buffer, int index); // remove //
void SetKernelOffset();
float Distance2(const Vector3i &v);
// protected members for algorithm access //
Kernel<VoxelT> m_KernelData;
VoxImage<VoxelT> *m_Image;
private:
AlgorithmT *t_Algoritm;
CrtpImplT *m_CrtpImpl;
};
} // namespace uLib

View File

@@ -33,7 +33,9 @@
namespace uLib {
// KERNEL //////////////////////////////////////////////////////////////////////
////////////////////////////////////////////////////////////////////////////////
//// KERNEL ////////////////////////////////////////////////////////////////////
////////////////////////////////////////////////////////////////////////////////
template <typename T> class Kernel : public StructuredData {
typedef StructuredData BaseClass;
@@ -41,13 +43,12 @@ template <typename T> class Kernel : public StructuredData {
public:
Kernel(const Vector3i &size);
inline T &operator[](const Vector3i &id) { return m_Data[Map(id)]; }
inline T &operator[](const int &id) { return m_Data[id]; }
inline int GetCenterData() const;
T &operator[](const Vector3i &id) { return m_Data[Map(id)]; }
T &operator[](const int &id) { return m_Data[id]; }
int GetCenterData() const;
inline DataAllocator<T> &Data() { return this->m_Data; }
inline const DataAllocator<T> &ConstData() const { return this->m_Data; }
DataAllocator<T> &Data() { return m_Data; }
const DataAllocator<T> &ConstData() const { return m_Data; }
void PrintSelf(std::ostream &o) const;
@@ -60,12 +61,14 @@ Kernel<T>::Kernel(const Vector3i &size) : BaseClass(size), m_Data(size.prod()) {
Interface::IsA<T, Interface::Voxel>();
}
template <typename T> inline int Kernel<T>::GetCenterData() const {
template <typename T>
int Kernel<T>::GetCenterData() const {
static int center = Map(this->GetDims() / 2);
return center;
}
template <typename T> void Kernel<T>::PrintSelf(std::ostream &o) const {
template <typename T>
void Kernel<T>::PrintSelf(std::ostream &o) const {
o << " Filter Kernel Dump [XZ_Y]: \n";
Vector3i index;
o << "\n Value: \n\n"
@@ -96,33 +99,42 @@ template <typename T> void Kernel<T>::PrintSelf(std::ostream &o) const {
}
}
////////////////////////////////////////////////////////////////////////////////
//// VOXIMAGEFILTER IMPLEMENTATION /////////////////////////////////////////////
////////////////////////////////////////////////////////////////////////////////
#define _TPL_ template <typename VoxelT, typename AlgorithmT>
#define _TPLT_ VoxelT, AlgorithmT
template <typename VoxelT, typename CrtpImplT>
VoxImageFilter<VoxelT, CrtpImplT>::VoxImageFilter(const Vector3i &size)
: m_KernelData(size)
, m_Image(nullptr)
, m_CrtpImpl(static_cast<CrtpImplT *>(this))
{}
_TPL_
VoxImageFilter<_TPLT_>::VoxImageFilter(const Vector3i &size)
: m_KernelData(size), t_Algoritm(static_cast<AlgorithmT *>(this)) {}
_TPL_
VoxImage<VoxelT>* VoxImageFilter<_TPLT_>::Process(VoxImage<VoxelT>* const& image) {
template <typename VoxelT, typename CrtpImplT>
VoxImage<VoxelT>* VoxImageFilter<VoxelT, CrtpImplT>::Process(
VoxImage<VoxelT>* const& image) {
if (m_Image != image) SetImage(image);
VoxImage<VoxelT> buffer = *m_Image;
#pragma omp parallel for
for (int i = 0; i < m_Image->Data().size(); ++i)
m_Image->operator[](i).Value = this->t_Algoritm->Evaluate(buffer, i);
m_Image->operator[](i).Value = m_CrtpImpl->Evaluate(buffer, i);
#pragma omp barrier
return m_Image;
}
_TPL_
void VoxImageFilter<_TPLT_>::Run() {
template <typename VoxelT, typename CrtpImplT>
void VoxImageFilter<VoxelT, CrtpImplT>::Run() {
Process(m_Image);
}
_TPL_
void VoxImageFilter<_TPLT_>::SetKernelOffset() {
template <typename VoxelT, typename CrtpImplT>
void VoxImageFilter<VoxelT, CrtpImplT>::SetImage(Abstract::VoxImage *image) {
m_Image = reinterpret_cast<VoxImage<VoxelT> *>(image);
SetKernelOffset();
}
template <typename VoxelT, typename CrtpImplT>
void VoxImageFilter<VoxelT, CrtpImplT>::SetKernelOffset() {
Vector3i id(0, 0, 0);
for (int z = 0; z < m_KernelData.GetDims()(2); ++z) {
for (int x = 0; x < m_KernelData.GetDims()(0); ++x) {
@@ -134,10 +146,10 @@ void VoxImageFilter<_TPLT_>::SetKernelOffset() {
}
}
_TPL_
float VoxImageFilter<_TPLT_>::Distance2(const Vector3i &v) {
template <typename VoxelT, typename CrtpImplT>
float VoxImageFilter<VoxelT, CrtpImplT>::Distance2(const Vector3i &v) {
Vector3i tmp = v;
const Vector3i &dim = this->m_KernelData.GetDims();
const Vector3i &dim = m_KernelData.GetDims();
Vector3i center = dim / 2;
tmp = tmp - center;
center = center.cwiseProduct(center);
@@ -147,12 +159,9 @@ float VoxImageFilter<_TPLT_>::Distance2(const Vector3i &v) {
0.25 * (3 - (dim(0) % 2) - (dim(1) % 2) - (dim(2) % 2)));
}
_TPL_
void VoxImageFilter<_TPLT_>::SetKernelNumericXZY(
template <typename VoxelT, typename CrtpImplT>
void VoxImageFilter<VoxelT, CrtpImplT>::SetKernelNumericXZY(
const std::vector<float> &numeric) {
// set data order //
StructuredData::Order order = m_KernelData.GetDataOrder();
// m_KernelData.SetDataOrder(StructuredData::XZY);
Vector3i id;
int index = 0;
for (int y = 0; y < m_KernelData.GetDims()(1); ++y) {
@@ -163,38 +172,39 @@ void VoxImageFilter<_TPLT_>::SetKernelNumericXZY(
}
}
}
// m_KernelData.SetDataOrder(order);
}
_TPL_
void VoxImageFilter<_TPLT_>::SetKernelSpherical(float (*shape)(float)) {
template <typename VoxelT, typename CrtpImplT>
void VoxImageFilter<VoxelT, CrtpImplT>::SetKernelSpherical(
float (*shape)(float)) {
Vector3i id;
for (int y = 0; y < m_KernelData.GetDims()(1); ++y) {
for (int z = 0; z < m_KernelData.GetDims()(2); ++z) {
for (int x = 0; x < m_KernelData.GetDims()(0); ++x) {
id << x, y, z;
m_KernelData[id].Value = shape(this->Distance2(id));
m_KernelData[id].Value = shape(Distance2(id));
}
}
}
}
_TPL_ template <class ShapeT>
void VoxImageFilter<_TPLT_>::SetKernelSpherical(ShapeT shape) {
template <typename VoxelT, typename CrtpImplT>
template <class ShapeT>
void VoxImageFilter<VoxelT, CrtpImplT>::SetKernelSpherical(ShapeT shape) {
Interface::IsA<ShapeT, Interface::VoxImageFilterShape>();
Vector3i id;
for (int y = 0; y < m_KernelData.GetDims()(1); ++y) {
for (int z = 0; z < m_KernelData.GetDims()(2); ++z) {
for (int x = 0; x < m_KernelData.GetDims()(0); ++x) {
id << x, y, z;
m_KernelData[id].Value = shape(this->Distance2(id));
m_KernelData[id].Value = shape(Distance2(id));
}
}
}
}
_TPL_
void VoxImageFilter<_TPLT_>::SetKernelWeightFunction(
template <typename VoxelT, typename CrtpImplT>
void VoxImageFilter<VoxelT, CrtpImplT>::SetKernelWeightFunction(
float (*shape)(const Vector3f &)) {
const Vector3i &dim = m_KernelData.GetDims();
Vector3i id;
@@ -202,20 +212,19 @@ void VoxImageFilter<_TPLT_>::SetKernelWeightFunction(
for (int y = 0; y < dim(1); ++y) {
for (int z = 0; z < dim(2); ++z) {
for (int x = 0; x < dim(0); ++x) {
// get voxels centroid coords from kernel center //
id << x, y, z;
pt << id(0) - dim(0) / 2 + 0.5 * !(dim(0) % 2),
id(1) - dim(1) / 2 + 0.5 * !(dim(1) % 2),
id(2) - dim(2) / 2 + 0.5 * !(dim(2) % 2);
// compute function using given shape //
m_KernelData[id].Value = shape(pt);
}
}
}
}
_TPL_ template <class ShapeT>
void VoxImageFilter<_TPLT_>::SetKernelWeightFunction(ShapeT shape) {
template <typename VoxelT, typename CrtpImplT>
template <class ShapeT>
void VoxImageFilter<VoxelT, CrtpImplT>::SetKernelWeightFunction(ShapeT shape) {
Interface::IsA<ShapeT, Interface::VoxImageFilterShape>();
const Vector3i &dim = m_KernelData.GetDims();
Vector3i id;
@@ -223,45 +232,16 @@ void VoxImageFilter<_TPLT_>::SetKernelWeightFunction(ShapeT shape) {
for (int y = 0; y < dim(1); ++y) {
for (int z = 0; z < dim(2); ++z) {
for (int x = 0; x < dim(0); ++x) {
// get voxels centroid coords from kernel center //
id << x, y, z;
pt << id(0) - dim(0) / 2 + 0.5 * !(dim(0) % 2),
id(1) - dim(1) / 2 + 0.5 * !(dim(1) % 2),
id(2) - dim(2) / 2 + 0.5 * !(dim(2) % 2);
// compute function using given shape //
m_KernelData[id].Value = shape(pt);
}
}
}
}
_TPL_
void VoxImageFilter<_TPLT_>::SetImage(Abstract::VoxImage *image) {
this->m_Image = reinterpret_cast<VoxImage<VoxelT> *>(image);
this->SetKernelOffset();
}
_TPL_
float VoxImageFilter<_TPLT_>::Convolve(const VoxImage<VoxelT> &buffer,
int index) {
const DataAllocator<VoxelT> &vbuf = buffer.ConstData();
const DataAllocator<VoxelT> &vker = m_KernelData.ConstData();
int vox_size = vbuf.size();
int ker_size = vker.size();
int pos;
float conv = 0, ksum = 0;
for (int ik = 0; ik < ker_size; ++ik) {
pos = index + vker[ik].Count - vker[m_KernelData.GetCenterData()].Count;
pos = (pos + vox_size) % vox_size;
conv += vbuf[pos].Value * vker[ik].Value;
ksum += vker[ik].Value;
}
return conv / ksum;
}
#undef _TPLT_
#undef _TPL_
} // namespace uLib
#endif // VOXIMAGEFILTER_HPP

View File

@@ -30,8 +30,6 @@
#include "VoxImageFilter.h"
#include <Math/Dense.h>
#define likely(expr) __builtin_expect(!!(expr), 1)
////////////////////////////////////////////////////////////////////////////////
///// VOXIMAGE FILTER CUSTOM /////////////////////////////////////////////////
////////////////////////////////////////////////////////////////////////////////
@@ -50,7 +48,7 @@ public:
: BaseClass(size), m_CustomEvaluate(NULL) {}
float Evaluate(const VoxImage<VoxelT> &buffer, int index) {
if (likely(m_CustomEvaluate)) {
if (m_CustomEvaluate) {
const DataAllocator<VoxelT> &vbuf = buffer.ConstData();
const DataAllocator<VoxelT> &vker = this->m_KernelData.ConstData();
int vox_size = vbuf.size();

View File

@@ -23,8 +23,6 @@
//////////////////////////////////////////////////////////////////////////////*/
#ifndef VOXIMAGEFILTERTHRESHOLD_HPP
#define VOXIMAGEFILTERTHRESHOLD_HPP
@@ -39,40 +37,24 @@
namespace uLib {
template <typename VoxelT>
class VoxFilterAlgorithmThreshold :
public VoxImageFilter<VoxelT, VoxFilterAlgorithmThreshold<VoxelT> > {
class VoxFilterAlgorithmThreshold
: public VoxImageFilter<VoxelT, VoxFilterAlgorithmThreshold<VoxelT>> {
typedef VoxImageFilter<VoxelT, VoxFilterAlgorithmThreshold<VoxelT> > BaseClass;
// ULIB_OBJECT_PARAMETERS(BaseClass) {
// float threshold;
// };
typedef VoxImageFilter<VoxelT, VoxFilterAlgorithmThreshold<VoxelT>> BaseClass;
float m_threshold;
public:
VoxFilterAlgorithmThreshold(const Vector3i &size) : BaseClass(size)
{
// init_parameters();
m_threshold = 0;
VoxFilterAlgorithmThreshold(const Vector3i &size)
: BaseClass(size), m_threshold(0) {}
void SetThreshold(float th) { m_threshold = th; }
float Evaluate(const VoxImage<VoxelT> &buffer, int index) {
return static_cast<float>(buffer.ConstData().at(index).Value >= m_threshold);
}
inline void SetThreshold(float th) { m_threshold = th; }
float Evaluate(const VoxImage<VoxelT> &buffer, int index)
{
return static_cast<float>(buffer.ConstData().at(index).Value >=
// parameters().threshold);
m_threshold );
}
};
//template <typename VoxelT>
//inline void VoxFilterAlgorithmThreshold<VoxelT>::init_parameters()
//{
// parameters().threshold = 0;
//}
}
} // namespace uLib
#endif // VOXIMAGEFILTERTHRESHOLD_HPP

View File

@@ -0,0 +1,408 @@
/*//////////////////////////////////////////////////////////////////////////////
// CMT Cosmic Muon Tomography project //////////////////////////////////////////
////////////////////////////////////////////////////////////////////////////////
Copyright (c) 2014, Universita' degli Studi di Padova, INFN sez. di Padova
All rights reserved
Authors: Andrea Rigoni Garola < andrea.rigoni@pd.infn.it >
------------------------------------------------------------------
This library is free software; you can redistribute it and/or
modify it under the terms of the GNU Lesser General Public
License as published by the Free Software Foundation; either
version 3.0 of the License, or (at your option) any later version.
This library is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
Lesser General Public License for more details.
You should have received a copy of the GNU Lesser General Public
License along with this library.
//////////////////////////////////////////////////////////////////////////////*/
#include "testing-prototype.h"
#include "Core/Algorithm.h"
#include "Math/VoxImage.h"
#include "Math/VoxImageFilter.h"
#include <iostream>
#include <thread>
#include <chrono>
using namespace uLib;
struct TestVoxel {
Scalarf Value;
unsigned int Count;
};
int main() {
BEGIN_TESTING(AlgorithmCudaChain);
////////////////////////////////////////////////////////////////////////////
// TEST 1: Single filter — GetPreferredDevice reflects data location
////////////////////////////////////////////////////////////////////////////
{
std::cout << "\n--- Test 1: GetPreferredDevice reflects data location ---\n";
VoxImage<TestVoxel> image(Vector3i(10, 10, 10));
image[Vector3i(5, 5, 5)].Value = 1;
VoxFilterAlgorithmLinear<TestVoxel> filter(Vector3i(3, 3, 3));
std::vector<float> weights(27, 1.0f);
filter.SetImage(&image);
filter.SetKernelNumericXZY(weights);
// Before VRAM move: should prefer RAM
TEST1(filter.GetPreferredDevice() == MemoryDevice::RAM);
TEST1(!filter.IsGPU());
std::cout << " RAM mode: PreferredDevice=RAM, IsGPU=false OK\n";
// Move image data to VRAM
image.Data().MoveToVRAM();
// After VRAM move: should prefer VRAM
TEST1(filter.GetPreferredDevice() == MemoryDevice::VRAM);
TEST1(filter.IsGPU());
std::cout << " VRAM mode: PreferredDevice=VRAM, IsGPU=true OK\n";
// Move back to RAM
image.Data().MoveToRAM();
TEST1(filter.GetPreferredDevice() == MemoryDevice::RAM);
std::cout << " Back to RAM: PreferredDevice=RAM OK\n";
}
////////////////////////////////////////////////////////////////////////////
// TEST 2: Kernel data on VRAM also triggers GPU preference
////////////////////////////////////////////////////////////////////////////
{
std::cout << "\n--- Test 2: Kernel on VRAM triggers GPU preference ---\n";
VoxImage<TestVoxel> image(Vector3i(8, 8, 8));
VoxFilterAlgorithmLinear<TestVoxel> filter(Vector3i(3, 3, 3));
std::vector<float> weights(27, 1.0f);
filter.SetImage(&image);
filter.SetKernelNumericXZY(weights);
TEST1(filter.GetPreferredDevice() == MemoryDevice::RAM);
// Only kernel on VRAM
filter.GetKernelData().Data().MoveToVRAM();
TEST1(filter.GetPreferredDevice() == MemoryDevice::VRAM);
std::cout << " Kernel on VRAM: PreferredDevice=VRAM OK\n";
filter.GetKernelData().Data().MoveToRAM();
TEST1(filter.GetPreferredDevice() == MemoryDevice::RAM);
}
////////////////////////////////////////////////////////////////////////////
// TEST 3: Algorithm interface — Process through base pointer
////////////////////////////////////////////////////////////////////////////
{
std::cout << "\n--- Test 3: Process through Algorithm base pointer ---\n";
VoxImage<TestVoxel> image(Vector3i(10, 10, 10));
image[Vector3i(5, 5, 5)].Value = 10;
VoxFilterAlgorithmLinear<TestVoxel> filter(Vector3i(3, 3, 3));
std::vector<float> weights(27, 1.0f);
filter.SetImage(&image);
filter.SetKernelNumericXZY(weights);
// Use through Algorithm base class pointer
Algorithm<VoxImage<TestVoxel>*, VoxImage<TestVoxel>*>* alg = &filter;
VoxImage<TestVoxel>* result = alg->Process(&image);
TEST1(result == &image);
std::cout << " Process through base pointer returned correct image OK\n";
// Verify filter actually ran (center voxel should be averaged)
// With uniform 3x3x3 kernel and single non-zero voxel at center,
// the center value should be 10/27 ≈ 0.37
TEST1(image[Vector3i(5, 5, 5)].Value < 10.0f);
std::cout << " Filter modified voxel values OK\n";
}
////////////////////////////////////////////////////////////////////////////
// TEST 4: Encoder/decoder chain — two filters linked
////////////////////////////////////////////////////////////////////////////
{
std::cout << "\n--- Test 4: Encoder/decoder chain ---\n";
VoxImage<TestVoxel> image(Vector3i(10, 10, 10));
image[Vector3i(5, 5, 5)].Value = 100;
// First filter: linear smoothing
VoxFilterAlgorithmLinear<TestVoxel> filter1(Vector3i(3, 3, 3));
std::vector<float> weights1(27, 1.0f);
filter1.SetImage(&image);
filter1.SetKernelNumericXZY(weights1);
// Second filter: threshold
VoxFilterAlgorithmThreshold<TestVoxel> filter2(Vector3i(1, 1, 1));
filter2.SetThreshold(0.5f);
filter2.SetImage(&image);
// 1x1x1 kernel with value 1
std::vector<float> weights2(1, 1.0f);
filter2.SetKernelNumericXZY(weights2);
// Chain: filter1 → filter2
filter1.SetDecoder(&filter2);
filter2.SetEncoder(&filter1);
TEST1(filter1.GetDecoder() == &filter2);
TEST1(filter2.GetEncoder() == &filter1);
std::cout << " Chain linked: filter1 -> filter2 OK\n";
// Execute chain manually (encoder first, then decoder)
filter1.Process(&image);
float smoothed_center = image[Vector3i(5, 5, 5)].Value;
std::cout << " After linear: center = " << smoothed_center << "\n";
filter2.Process(&image);
float thresholded_center = image[Vector3i(5, 5, 5)].Value;
std::cout << " After threshold: center = " << thresholded_center << "\n";
// After threshold, values should be 0 or 1
TEST1(thresholded_center == 0.0f || thresholded_center == 1.0f);
std::cout << " Chain execution produced valid results OK\n";
}
////////////////////////////////////////////////////////////////////////////
// TEST 5: CUDA chain — VRAM data through chained filters
////////////////////////////////////////////////////////////////////////////
{
std::cout << "\n--- Test 5: VRAM data through chained filters ---\n";
VoxImage<TestVoxel> image(Vector3i(10, 10, 10));
image[Vector3i(5, 5, 5)].Value = 50;
VoxFilterAlgorithmLinear<TestVoxel> filter1(Vector3i(3, 3, 3));
std::vector<float> weights1(27, 1.0f);
filter1.SetImage(&image);
filter1.SetKernelNumericXZY(weights1);
VoxFilterAlgorithmAbtrim<TestVoxel> filter2(Vector3i(3, 3, 3));
std::vector<float> weights2(27, 1.0f);
filter2.SetImage(&image);
filter2.SetKernelNumericXZY(weights2);
filter2.SetABTrim(1, 1);
// Chain
filter1.SetDecoder(&filter2);
filter2.SetEncoder(&filter1);
// Move data to VRAM
image.Data().MoveToVRAM();
filter1.GetKernelData().Data().MoveToVRAM();
filter2.GetKernelData().Data().MoveToVRAM();
// Both filters should report VRAM preference
TEST1(filter1.GetPreferredDevice() == MemoryDevice::VRAM);
TEST1(filter2.GetPreferredDevice() == MemoryDevice::VRAM);
TEST1(filter1.IsGPU());
TEST1(filter2.IsGPU());
std::cout << " Both filters detect VRAM preference OK\n";
// Verify the chain's device consistency
auto* encoder = filter2.GetEncoder();
TEST1(encoder != nullptr);
TEST1(encoder->IsGPU());
std::cout << " Encoder in chain also reports GPU OK\n";
#ifdef USE_CUDA
// With CUDA: filters execute on GPU via Process()
image.Data().MoveToRAM(); // reset for clean test
image[Vector3i(5, 5, 5)].Value = 50;
image.Data().MoveToVRAM();
filter1.Process(&image);
TEST1(image.Data().GetDevice() == MemoryDevice::VRAM);
std::cout << " CUDA: data stays in VRAM after filter1 OK\n";
filter2.Process(&image);
TEST1(image.Data().GetDevice() == MemoryDevice::VRAM);
std::cout << " CUDA: data stays in VRAM after filter2 OK\n";
#else
// Without CUDA: verify Process still works via CPU fallback
image.Data().MoveToRAM();
image[Vector3i(5, 5, 5)].Value = 50;
filter1.GetKernelData().Data().MoveToRAM();
filter2.GetKernelData().Data().MoveToRAM();
filter1.Process(&image);
filter2.Process(&image);
std::cout << " No CUDA: CPU fallback executed correctly OK\n";
#endif
}
////////////////////////////////////////////////////////////////////////////
// TEST 6: AlgorithmTask with VRAM-aware filter
////////////////////////////////////////////////////////////////////////////
{
std::cout << "\n--- Test 6: AlgorithmTask with VRAM-aware filter ---\n";
VoxImage<TestVoxel> image(Vector3i(8, 8, 8));
image[Vector3i(4, 4, 4)].Value = 20;
VoxFilterAlgorithmLinear<TestVoxel> filter(Vector3i(3, 3, 3));
std::vector<float> weights(27, 1.0f);
filter.SetImage(&image);
filter.SetKernelNumericXZY(weights);
// Set up task
AlgorithmTask<VoxImage<TestVoxel>*, VoxImage<TestVoxel>*> task;
task.SetAlgorithm(&filter);
task.SetMode(AlgorithmTask<VoxImage<TestVoxel>*, VoxImage<TestVoxel>*>::Cyclic);
task.SetCycleTime(50);
// Run task for a few cycles
task.Run(&image);
std::this_thread::sleep_for(std::chrono::milliseconds(200));
task.Stop();
// After cyclic execution, the filter should have smoothed values
TEST1(image[Vector3i(4, 4, 4)].Value < 20.0f);
std::cout << " Task cyclic execution modified image OK\n";
std::cout << " Center value after smoothing: "
<< image[Vector3i(4, 4, 4)].Value << "\n";
}
////////////////////////////////////////////////////////////////////////////
// TEST 7: AlgorithmTask async with chained filters
////////////////////////////////////////////////////////////////////////////
{
std::cout << "\n--- Test 7: AlgorithmTask async with filter ---\n";
VoxImage<TestVoxel> image(Vector3i(8, 8, 8));
image[Vector3i(4, 4, 4)].Value = 30;
VoxFilterAlgorithmLinear<TestVoxel> filter(Vector3i(3, 3, 3));
std::vector<float> weights(27, 1.0f);
filter.SetImage(&image);
filter.SetKernelNumericXZY(weights);
AlgorithmTask<VoxImage<TestVoxel>*, VoxImage<TestVoxel>*> task;
task.SetAlgorithm(&filter);
task.SetMode(AlgorithmTask<VoxImage<TestVoxel>*, VoxImage<TestVoxel>*>::Async);
float before = image[Vector3i(4, 4, 4)].Value;
task.Run(&image);
// Trigger one execution
task.Notify();
std::this_thread::sleep_for(std::chrono::milliseconds(100));
task.Stop();
float after = image[Vector3i(4, 4, 4)].Value;
TEST1(after < before);
std::cout << " Async trigger: value " << before << " -> " << after << " OK\n";
}
////////////////////////////////////////////////////////////////////////////
// TEST 8: Device preference propagation in chain
////////////////////////////////////////////////////////////////////////////
{
std::cout << "\n--- Test 8: Device preference propagation check ---\n";
VoxImage<TestVoxel> image(Vector3i(8, 8, 8));
image[Vector3i(4, 4, 4)].Value = 10;
VoxFilterAlgorithmLinear<TestVoxel> filterA(Vector3i(3, 3, 3));
VoxFilterAlgorithmAbtrim<TestVoxel> filterB(Vector3i(3, 3, 3));
VoxFilterAlgorithmThreshold<TestVoxel> filterC(Vector3i(1, 1, 1));
std::vector<float> w27(27, 1.0f);
std::vector<float> w1(1, 1.0f);
filterA.SetImage(&image);
filterA.SetKernelNumericXZY(w27);
filterB.SetImage(&image);
filterB.SetKernelNumericXZY(w27);
filterB.SetABTrim(1, 1);
filterC.SetImage(&image);
filterC.SetKernelNumericXZY(w1);
filterC.SetThreshold(0.1f);
// Chain: A → B → C
filterA.SetDecoder(&filterB);
filterB.SetEncoder(&filterA);
filterB.SetDecoder(&filterC);
filterC.SetEncoder(&filterB);
// All on RAM
TEST1(!filterA.IsGPU());
TEST1(!filterB.IsGPU());
TEST1(!filterC.IsGPU());
std::cout << " All filters on RAM OK\n";
// Move image to VRAM — filters A and B should detect it
image.Data().MoveToVRAM();
TEST1(filterA.IsGPU());
TEST1(filterB.IsGPU());
// filterC with 1x1x1 kernel doesn't have CUDA override, but still detects VRAM
TEST1(filterC.IsGPU());
std::cout << " Image on VRAM: all filters report GPU OK\n";
// Can walk the chain and check device consistency
auto* step = static_cast<Algorithm<VoxImage<TestVoxel>*, VoxImage<TestVoxel>*>*>(&filterA);
bool all_gpu = true;
while (step) {
if (!step->IsGPU()) all_gpu = false;
step = static_cast<Algorithm<VoxImage<TestVoxel>*, VoxImage<TestVoxel>*>*>(step->GetDecoder());
}
TEST1(all_gpu);
std::cout << " Chain walk: all steps report GPU OK\n";
image.Data().MoveToRAM();
}
////////////////////////////////////////////////////////////////////////////
// TEST 9: Process through chain with Algorithm interface
////////////////////////////////////////////////////////////////////////////
{
std::cout << "\n--- Test 9: Sequential chain processing via Algorithm interface ---\n";
VoxImage<TestVoxel> image(Vector3i(10, 10, 10));
// Set a pattern: single bright voxel
image[Vector3i(5, 5, 5)].Value = 100;
VoxFilterAlgorithmLinear<TestVoxel> filterA(Vector3i(3, 3, 3));
std::vector<float> w(27, 1.0f);
filterA.SetImage(&image);
filterA.SetKernelNumericXZY(w);
VoxFilterAlgorithmLinear<TestVoxel> filterB(Vector3i(3, 3, 3));
filterB.SetImage(&image);
filterB.SetKernelNumericXZY(w);
// Chain
filterA.SetDecoder(&filterB);
filterB.SetEncoder(&filterA);
// Process chain through base pointer
using AlgType = Algorithm<VoxImage<TestVoxel>*, VoxImage<TestVoxel>*>;
AlgType* chain = &filterA;
// Walk and process
AlgType* current = chain;
while (current) {
current->Process(&image);
current = static_cast<AlgType*>(current->GetDecoder());
}
// After two rounds of smoothing, the peak should be smaller than original
float final_val = image[Vector3i(5, 5, 5)].Value;
TEST1(final_val < 100.0f);
std::cout << " Two-stage smoothing: peak = " << final_val << " OK\n";
}
END_TESTING;
}

View File

@@ -16,6 +16,7 @@ set(TESTS
QuadMeshTest
BitCodeTest
UnitsTest
AlgorithmCudaChainTest
)
set(LIBRARIES
@@ -28,6 +29,6 @@ set(LIBRARIES
uLib_add_tests(Math)
if(USE_CUDA)
set_source_files_properties(VoxImageTest.cpp VoxImageCopyTest.cpp VoxImageFilterTest.cpp VoxRaytracerTest.cpp VoxRaytracerTestExtended.cpp PROPERTIES LANGUAGE CUDA)
set_source_files_properties(VoxImageTest.cpp VoxImageCopyTest.cpp VoxImageFilterTest.cpp VoxRaytracerTest.cpp VoxRaytracerTestExtended.cpp AlgorithmCudaChainTest.cpp PROPERTIES LANGUAGE CUDA)
set_source_files_properties(VoxRaytracerTest.cpp VoxRaytracerTestExtended.cpp PROPERTIES CXX_STANDARD 17 CUDA_STANDARD 17)
endif()