# Usage and Installation Guide

## Requirements

### Compiling with CUDA Support

The library supports running VoxImage filtering operations directly on CUDA cores via transparent RAM/VRAM memory transfers.

By default, the `CMakeLists.txt` build system sets `USE_CUDA=ON` and attempts to locate `nvcc` and the NVIDIA CUDA Toolkit. If the toolkit is missing, CMake configuration fails unless you explicitly configure the project with `-DUSE_CUDA=OFF`.

### 1. Installing CUDA Environment via Micromamba

If you are developing inside an isolated Conda/Micromamba environment (e.g., `mutom`), you can install the CUDA compilers directly into that environment rather than relying on global system dependencies:

```bash
# Add the conda-forge channel if not already available
micromamba config append channels conda-forge

# Install nvcc and the necessary CUDA toolkit components
micromamba install cuda-nvcc
```

Verify your installation:

```bash
nvcc --version
```

### 2. Building the Project

Configure and compile the project using the standard CMake flow:

```bash
mkdir -p build && cd build

# Configure CMake
# (Optional) Explicitly toggle CUDA: cmake -DUSE_CUDA=ON ..
cmake ..

# Compile the project and tests
make -j $(nproc)
```

### 3. Validating CUDA Support

To verify that the CUDA kernels launch correctly and allocate device memory through `DataAllocator`, run the mathematical unit tests:

```bash
# From the build directory
./src/Math/testing/VoxImageFilterTest

# Output should show:
# "Data correctly stayed in VRAM after CUDA execution!"
```

## How It Works Under The Hood

The `DataAllocator` container wraps memory allocations so that the same data can live in CPU RAM or GPU VRAM. Standard host-side iteration pulls data back to the host through implicit `MoveToRAM()` calls. Filters compiled under `#ifdef USE_CUDA` call `.MoveToVRAM()` explicitly, so their data is allocated on the device. Fallback to host compute iterations happens automatically.

Chaining CUDA-enabled filters keeps the data in VRAM across consecutive operations, avoiding costly host copies between iterations.
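
The lazy-transfer idea described above can be illustrated with a small sketch. This is not the library's `DataAllocator`: `SketchAllocator`, `DevicePtr()`, `ScaleFilter`, and `RunChainDemo` are hypothetical names introduced here, and "VRAM" is simulated with a second host buffer so the example runs without a GPU. Only the `MoveToRAM()` / `MoveToVRAM()` names come from the text above; their real signatures may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for DataAllocator; NOT the library's real API.
// "VRAM" is simulated with a second host buffer (assumption for illustration).
class SketchAllocator {
public:
    enum class Location { RAM, VRAM };

    explicit SketchAllocator(std::size_t n) : ram_(n, 0.0f) {}

    // Copy to the simulated device only if the data is not already there.
    void MoveToVRAM() {
        if (loc_ == Location::VRAM) return;
        vram_ = ram_;
        loc_ = Location::VRAM;
        ++transfers_;
    }

    // Copy back to the host only if the data currently lives on the device.
    void MoveToRAM() {
        if (loc_ == Location::RAM) return;
        ram_ = vram_;
        loc_ = Location::RAM;
        ++transfers_;
    }

    // Host-side element access implicitly pulls the data back to RAM.
    float& operator[](std::size_t i) {
        MoveToRAM();
        return ram_[i];
    }

    // Pointer a device-side filter would operate on (hypothetical helper).
    float* DevicePtr() {
        MoveToVRAM();
        return vram_.data();
    }

    std::size_t size() const { return ram_.size(); }
    Location location() const { return loc_; }
    int transfers() const { return transfers_; }

private:
    std::vector<float> ram_;
    std::vector<float> vram_;
    Location loc_ = Location::RAM;
    int transfers_ = 0;
};

// Stand-in for a CUDA filter: works on the "device" buffer and leaves data there.
inline void ScaleFilter(SketchAllocator& data, float factor) {
    float* d = data.DevicePtr();
    for (std::size_t i = 0; i < data.size(); ++i) d[i] *= factor;
}

// Chains two filters, then reads one element on the host.
// Returns the total number of simulated RAM/VRAM copies.
inline int RunChainDemo() {
    SketchAllocator img(4);
    for (std::size_t i = 0; i < img.size(); ++i) img[i] = 1.0f;  // data starts in RAM
    ScaleFilter(img, 2.0f);  // first filter: one RAM -> VRAM copy
    ScaleFilter(img, 3.0f);  // second filter: already in VRAM, no copy
    assert(img.location() == SketchAllocator::Location::VRAM);
    float v = img[0];        // host read: one VRAM -> RAM copy
    assert(v == 6.0f);
    return img.transfers();
}
```

Calling `RunChainDemo()` from a `main` function performs two copies in total, one before the first filter and one for the final host read. The second filter reuses the data already on the simulated device, which is the saving that chaining is meant to provide.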