5 Commits

Author SHA1 Message Date
Dantali0n
5f740f5fbf AVX2 instead of AVX__2
I swear I had fixed this already mmh

Co-authored-by: Bram Veenboer <bram.veenboer@gmail.com>
2025-10-28 20:54:09 +01:00
lukken
03723c2d3b 32: Update flags for Intel compiler 2025-10-28 15:52:02 +01:00
lukken
027457f560 32: Set mavx and mavx2 based on CMake checks 2025-10-22 19:42:48 +02:00
Wiebe van Breukelen
5f00c5d304 Add README.md (#38)
* Add README.md

* Improve README description of TrigDx library

* Apply suggestion from @mickveldhuis

Co-authored-by: Mick Veldhuis <mickveldhuis@hotmail.nl>

* Apply suggestion from @mickveldhuis

Co-authored-by: Mick Veldhuis <mickveldhuis@hotmail.nl>

* Apply suggestion from @mickveldhuis

Co-authored-by: Mick Veldhuis <mickveldhuis@hotmail.nl>

---------

Co-authored-by: Mick Veldhuis <mickveldhuis@hotmail.nl>
2025-10-22 16:55:37 +02:00
Wiebe van Breukelen
f85e67e669 Fix compiler warnings (#37) 2025-10-22 16:48:26 +02:00
3 changed files with 82 additions and 0 deletions

README.md (new file, 54 additions)

@@ -0,0 +1,54 @@
# TrigDx
High-performance C++ library offering multiple implementations of transcendental trigonometric functions (e.g., sin, cos, tan and their variants), designed for numerical, signal-processing, and real-time systems where trading a small loss of accuracy for significantly higher throughput on modern CPUs (scalar and SIMD) and NVIDIA GPUs is acceptable.
## Why TrigDx?
Many applications use the standard library implementations, which prioritise correctness but are not always optimal for throughput on vectorized or GPU hardware. TrigDx gives you multiple implementations so you can:
- Replace `std::sin` / `std::cos` calls with faster approximations when a small, bounded reduction in accuracy is acceptable.
- Use SIMD/vectorized implementations and compact lookup tables for high-throughput lookups.
- Run massively parallel kernels that take advantage of a GPU's _Special Function Units_ (SFUs).
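The accuracy-for-throughput trade-off behind the lookup-table approach can be illustrated with a minimal sketch. `SineTable` and `max_abs_error` are hypothetical names used only for illustration and are not part of the TrigDx API. The sketch assumes a power-of-two table size, so that the index can be wrapped with a bit mask, and it uses nearest-sample lookup without interpolation.

```cpp
#include <algorithm>
#include <array>
#include <cmath>
#include <cstddef>

// Hypothetical helper, not part of TrigDx: a sine lookup table with a
// power-of-two number of samples covering one full period [0, 2*pi).
template <std::size_t NR_SAMPLES>
class SineTable {
    static_assert(NR_SAMPLES > 0 && (NR_SAMPLES & (NR_SAMPLES - 1)) == 0,
                  "NR_SAMPLES must be a power of two");

public:
    SineTable() {
        // Fill the table once with exact values from the standard library.
        for (std::size_t i = 0; i < NR_SAMPLES; ++i) {
            table_[i] = std::sin(kTwoPi * static_cast<float>(i) / NR_SAMPLES);
        }
    }

    // Nearest-sample lookup. The mask wraps the index into the table, which
    // also handles negative arguments because NR_SAMPLES is a power of two.
    float sin(float x) const {
        const float scaled = x * (NR_SAMPLES / kTwoPi);
        const auto index = static_cast<std::size_t>(std::llround(scaled));
        return table_[index & kMask];
    }

private:
    static constexpr float kTwoPi = 6.283185307179586f;
    static constexpr std::size_t kMask = NR_SAMPLES - 1;
    std::array<float, NR_SAMPLES> table_{};
};

// Largest absolute deviation from std::sin over evenly spaced points in
// [-10, 10). The range is an arbitrary choice for this illustration.
template <std::size_t NR_SAMPLES>
float max_abs_error(const SineTable<NR_SAMPLES>& table, std::size_t nr_points) {
    float worst = 0.0f;
    for (std::size_t i = 0; i < nr_points; ++i) {
        const float x = -10.0f + 20.0f * static_cast<float>(i) / nr_points;
        worst = std::max(worst, std::fabs(table.sin(x) - std::sin(x)));
    }
    return worst;
}
```

With nearest-sample lookup the phase error is at most half a table step, so the worst-case error shrinks roughly in proportion to the table size. Calling `max_abs_error(SineTable<4096>{}, 10000)` is one way to check whether a given size meets an application's accuracy budget before swapping out `std::sin`.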
## Requirements
- A C++ compiler with at least C++17 support (GCC, Clang)
- CMake 3.15+
- Optional: NVIDIA CUDA Toolkit 11+ to build GPU kernels
- Optional: GoogleTest (for unit tests) and GoogleBenchmark (for microbenchmarks)
## Building
```bash
git clone https://github.com/astron-rd/TrigDx.git
cd TrigDx
mkdir build && cd build
# CPU-only:
cmake -DCMAKE_BUILD_TYPE=Release -DTRIGDX_USE_XSIMD=ON ..
cmake --build . -j
# Enable CUDA (if available):
cmake -DCMAKE_BUILD_TYPE=Release -DTRIGDX_USE_GPU=ON ..
cmake --build . -j
# Run tests:
ctest --output-on-failure -j
```
Common CMake options:
- `TRIGDX_USE_GPU=ON/OFF` — build GPU support.
- `TRIGDX_BUILD_TESTS=ON/OFF` — build tests.
- `TRIGDX_BUILD_BENCHMARKS=ON/OFF` — build benchmarks.
- `TRIGDX_BUILD_PYTHON` — build Python interface.
## Contributing
- Fork → create a feature branch → open a PR.
- Include unit tests for correctness-sensitive changes and benchmark results for performance changes.
- Follow project style (clang-format) and run tests locally before submitting.
## Reporting issues
When opening an issue for incorrect results or performance regressions, please include:
- Platform and CPU/GPU model.
- Compiler and version with exact compile flags.
- Small reproducer (input data and the TrigDx implementation used).
## License
See the LICENSE file in the repository for licensing details.


@@ -2,6 +2,24 @@ include(FetchContent)
include(FindAVX)
add_library(trigdx reference.cpp lookup.cpp)
if(HAVE_AVX2)
  target_compile_definitions(trigdx PUBLIC HAVE_AVX2)
  if(CMAKE_CXX_COMPILER_ID STREQUAL "Intel" OR CMAKE_CXX_COMPILER_ID STREQUAL
                                               "IntelLLVM")
    target_compile_options(trigdx PUBLIC -xCORE-AVX2)
  else()
    target_compile_options(trigdx PUBLIC -mavx2)
  endif()
elseif(HAVE_AVX)
  target_compile_definitions(trigdx PUBLIC HAVE_AVX)
  if(CMAKE_CXX_COMPILER_ID STREQUAL "Intel" OR CMAKE_CXX_COMPILER_ID STREQUAL
                                               "IntelLLVM")
    target_compile_options(trigdx PUBLIC -xAVX)
  else()
    target_compile_options(trigdx PUBLIC -mavx)
  endif()
endif()
target_include_directories(trigdx PUBLIC ${PROJECT_SOURCE_DIR}/include)
if(HAVE_AVX)


@@ -6,6 +6,16 @@
#include "trigdx/lookup_avx.hpp"
#if defined(HAVE_AVX) && !defined(__AVX__)
static_assert(HAVE_AVX == 0, "__AVX__ should be defined when HAVE_AVX is "
                             "defined");
#endif
#if defined(HAVE_AVX2) && !defined(__AVX2__)
static_assert(HAVE_AVX2 == 0, "__AVX2__ should be defined when HAVE_AVX2 is "
                              "defined");
#endif
template <std::size_t NR_SAMPLES> struct LookupAVXBackend<NR_SAMPLES>::Impl {
  std::vector<float> lookup;
  static constexpr std::size_t MASK = NR_SAMPLES - 1;