OpenCL kernel template for reductions resulting in a vector. Example: Computing the row norms of a matrix concurrently.
|
| vector_reduction (unsigned int vectorization, unsigned int m, unsigned int k, unsigned int num_groups) |
| The user constructor.
|
|
std::string | csv_representation () const |
| Returns the CSV representation of the operation.
|
|
unsigned int | m () const |
|
unsigned int | k () const |
|
unsigned int | num_groups () const |
|
void | configure_range_enqueue_arguments (vcl_size_t kernel_id, statements_type const &statements, viennacl::ocl::kernel &kernel, unsigned int &n_arg) const |
| Configures the range and enqueues the arguments associated with the profile.
|
|
void | kernel_arguments (statements_type const &, std::string &arguments_string) const |
|
| profile_base (unsigned int vectorization, vcl_size_t local_size_1, vcl_size_t local_size_2, vcl_size_t num_kernels) |
| The constructor.
|
|
virtual | ~profile_base () |
| The destructor.
|
|
unsigned int | vector_size () const |
| Get the vector size of the kernel.
|
|
bool | is_slow (viennacl::ocl::device const &dev) const |
| Returns whether the profile is likely to be slow on a particular device.
|
|
bool | is_invalid (viennacl::ocl::device const &dev, vcl_size_t scalartype_size) const |
| Returns whether the profile leads to undefined behavior on a particular device.
|
|
vcl_size_t | num_kernels () const |
| Returns the number of kernels needed by this operation.
|
|
virtual void | operator() (utils::kernel_generation_stream &stream, vcl_size_t device_offset, statements_type const &statements) const |
| Generates the code associated with this profile onto the provided stream. Redirects to the virtual core() method.
|
|
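To make "a reduction resulting in a vector" concrete, the following is a minimal host-side sketch of the row-norm example in plain C++. It is not ViennaCL code and not the OpenCL kernel this template generates. The function name `row_norms`, the row-major storage layout, and the choice of the Euclidean norm are all assumptions made for illustration.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of ViennaCL): computes the Euclidean norm of
// every row of a dense matrix stored in row-major order (assumed layout).
// Each row is reduced to one scalar, so the overall result is a vector.
std::vector<double> row_norms(const std::vector<double>& a,
                              std::size_t rows, std::size_t cols)
{
  std::vector<double> result(rows, 0.0);
  for (std::size_t i = 0; i < rows; ++i)
  {
    // Reduction over row i: accumulate the sum of squares.
    double sum = 0.0;
    for (std::size_t j = 0; j < cols; ++j)
    {
      const double v = a[i * cols + j];
      sum += v * v;
    }
    result[i] = std::sqrt(sum);
  }
  return result;
}
```

Each iteration of the outer loop is independent of the others, which is why such per-row reductions can be computed concurrently on a device.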