Highly Efficient FFT for Exascale: HeFFTe v2.4
Enumerations | |
enum class | heffte::direction { heffte::direction::forward , heffte::direction::backward } |
Indicates the direction of the FFT (internal use only). | |
Functions | |
template<typename index > | |
std::ostream & | heffte::operator<< (std::ostream &os, box3d< index > const box) |
Debugging info, writes out the box to a stream. | |
int | heffte::get_area (std::array< int, 2 > const &dims) |
Get the surface area of a processor grid. | |
std::array< int, 2 > | heffte::make_procgrid (int const num_procs) |
Factorize the MPI ranks into a 2D grid. | |
template<typename index > | |
std::array< int, 3 > | heffte::make_procgrid2d (box3d< index > const world, int direction_1d, std::array< int, 2 > const candidate_grid) |
Factorize the MPI ranks into a 2D grid with specific constraints. | |
template<class T , class U = T> | |
T | heffte::c11_exchange (T &obj, U &&new_value) |
C++11 stand-in for std::exchange; replace with the C++14 std::exchange later. | |
template<typename some_class > | |
int | heffte::get_last_active (std::array< std::unique_ptr< some_class >, 4 > const &shaper) |
Return the index of the last active (non-null) unique_ptr. | |
template<typename some_class > | |
int | heffte::count_active (std::array< std::unique_ptr< some_class >, 4 > const &shaper) |
Return the number of active (non-null) unique_ptr. | |
template<typename some_class > | |
size_t | heffte::get_max_box_size (std::array< some_class, 3 > const &executors) |
Returns the max of the box_size() for each of the executors. | |
template<typename some_class > | |
size_t | heffte::get_max_box_size_r2c (std::array< some_class, 3 > const &executors) |
Returns the max of the box_size() for each of the executors. | |
template<typename some_class > | |
size_t | heffte::get_max_work_size (std::array< some_class, 3 > const &executors) |
Returns the max of the workspace_size() for each of the executors. | |
template<typename some_class_r2c , typename some_class > | |
size_t | heffte::get_max_work_size (some_class_r2c const &executors_r2c, std::array< some_class, 2 > const &executors) |
Returns the max of the workspace_size() for each of the executors. | |
Simple helper templates.
Get the surface area of a processor grid.
For a three-dimensional grid of size dims[0] by dims[1] by 1, returns the surface area. Useful for optimizing the average communication cost.
Factorize the MPI ranks into a 2D grid.
Considers all possible factorizations of the total number of processors and selects the one with the lowest area, which heuristically reduces the number of ranks that need to communicate in each of the all-to-all operations.
num_procs | is the total number of processors to factorize |
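The selection rule described above can be sketched as follows. This is an illustrative re-implementation, not the library source: the names sketch_area and sketch_procgrid are hypothetical, and the area is assumed to be the full geometric surface of a dims[0] by dims[1] by 1 box, 2 * (a*b + a + b). The library may scale or simplify that expression differently.

```cpp
#include <array>
#include <cassert>

// Hypothetical stand-in for the area heuristic (assumption): the geometric
// surface area of an a x b x 1 box, 2 * (a*b + a + b).
int sketch_area(std::array<int, 2> const &dims) {
    return 2 * (dims[0] * dims[1] + dims[0] + dims[1]);
}

// Hypothetical sketch of the factorization: try every factor pair of
// num_procs and keep the pair with the smallest area.
std::array<int, 2> sketch_procgrid(int num_procs) {
    assert(num_procs > 0);
    std::array<int, 2> best = {1, num_procs};
    int best_area = sketch_area(best);
    // The area is symmetric in the two factors, so it is enough to test
    // factors i with i * i <= num_procs (written as a division to avoid overflow).
    for (int i = 2; i <= num_procs / i; ++i) {
        if (num_procs % i != 0) continue; // not a factor
        std::array<int, 2> candidate = {i, num_procs / i};
        int area = sketch_area(candidate);
        if (area < best_area) {
            best = candidate;
            best_area = area;
        }
    }
    return best;
}
```

Because the product of the two factors is fixed at num_procs, minimizing this area is the same as minimizing their sum, so the sketch picks the most nearly square grid: under these assumptions 12 ranks give a 3 by 4 grid, and a prime count such as 7 falls back to 1 by 7.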
Factorize the MPI ranks into a 2D grid with specific constraints.
The constraints satisfied by the grid will be as follows:
int heffte::get_last_active (std::array< std::unique_ptr< some_class >, 4 > const &shaper)
Return the index of the last active (non-null) unique_ptr.
The method returns -1 if all shapers are null.
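The documented behavior can be illustrated with a minimal sketch. The functions sketch_last_active and sketch_count_active are hypothetical names written for this example; they mirror the documented signatures of get_last_active and count_active but are not copied from the library.

```cpp
#include <array>
#include <memory>

// Hypothetical sketch: scan from the back and return the index of the
// first non-null pointer found, or -1 if every entry is null.
template<typename some_class>
int sketch_last_active(std::array<std::unique_ptr<some_class>, 4> const &shaper) {
    for (int i = 3; i >= 0; --i) {
        if (shaper[i]) return i;
    }
    return -1;
}

// Hypothetical sketch: count the non-null entries.
template<typename some_class>
int sketch_count_active(std::array<std::unique_ptr<some_class>, 4> const &shaper) {
    int num = 0;
    for (auto const &s : shaper) {
        if (s) ++num;
    }
    return num;
}
```

With entries 0 and 2 set and the rest null, the sketch reports 2 as the last active index and 2 as the active count. The two values differ in general, since null entries may sit between active ones.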