List of gpuarray Ops implemented¶
Normally you should not call these Ops directly: Theano automatically transforms CPU Ops into their GPU equivalents. This list simply documents what is implemented on the GPU.
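As a quick, hedged illustration of that automatic transformation (assuming a working libgpuarray install and a device flag such as THEANO_FLAGS="device=cuda0"), compiling an ordinary CPU graph and printing it should reveal GPU Ops such as GpuElemwise and HostFromGpu; the exact Op names depend on your Theano version and configuration:

import theano
import theano.tensor as T

# Plain CPU-level graph; no GPU Op is mentioned explicitly.
x = T.matrix("x")
y = T.exp(x).sum()

# With a gpuarray device enabled, the optimizer rewrites the graph
# to use the GPU Ops listed on this page.
f = theano.function([x], y)

# Inspect the optimized graph; GPU Ops (e.g. GpuElemwise, HostFromGpu)
# should appear here in place of their CPU counterparts.
theano.printing.debugprint(f)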
Basic Op¶
-
class
theano.gpuarray.basic_ops.
CGpuKernelBase
(func_files: Union[str, List[str]], func_name: Optional[str] = None)[source]¶ Class to combine GpuKernelBase and ExternalCOp.
It adds a new section type 'kernels' where you can define kernels with the '#kernel' tag.
-
c_code_cache_version_apply
(node)[source]¶ Return a tuple of integers indicating the version of this Op.
An empty tuple indicates an ‘unversioned’ Op that will not be cached between processes.
The cache mechanism may erase cached modules that have been superseded by newer versions. See ModuleCache for details.
See also
c_code_cache_version
Notes
This function overrides c_code_cache_version unless it explicitly calls c_code_cache_version. The default implementation simply calls c_code_cache_version and ignores the node argument.
-
-
class
theano.gpuarray.basic_ops.
GpuAlloc
(context_name, memset_0=False)[source]¶ Allocate initialized memory on the GPU.
Parameters: - context_name (str) – The name of the context in which to allocate the memory.
- memset_0 (bool) – Purely an optimization hint. If True, the value to fill with is always 0, so the C code calls memset, which is faster.
-
c_code
(node, name, inp, out, sub)[source]¶ Return the C implementation of an Op.
Returns C code that does the computation associated to this Op, given names for the inputs and outputs.
Parameters: - node (Apply instance) – The node for which we are compiling the current c_code. The same Op may be used in more than one node.
- name (str) – A name that is automatically assigned and guaranteed to be unique.
- inputs (list of strings) – There is a string for each input of the function, and the string is the name of a C variable pointing to that input. The type of the variable depends on the declared type of the input. There is a corresponding python variable that can be accessed by prepending “py_” to the name in the list.
- outputs (list of strings) – Each string is the name of a C variable where the Op should store its output. The type depends on the declared type of the output. There is a corresponding python variable that can be accessed by prepending “py_” to the name in the list. In some cases the outputs will be preallocated and the value of the variable may be pre-filled. The value for an unallocated output is type-dependent.
- sub (dict of strings) – Extra symbols defined in CLinker sub symbols (such as ‘fail’). WRITEME
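To make the naming convention above concrete, here is a hedged, minimal sketch of a c_code implementation for a hypothetical CPU tensor Op that simply copies its input. GPU Ops such as GpuAlloc generate very different code through the libgpuarray C API; this only illustrates how the inputs, outputs and sub['fail'] names are meant to be used:

def c_code(self, node, name, inputs, outputs, sub):
    # 'inputs' and 'outputs' hold the C variable names chosen by the compiler.
    (x,) = inputs
    (z,) = outputs
    fail = sub["fail"]  # C snippet that raises the pending Python exception
    return """
    Py_XDECREF(%(z)s);
    %(z)s = (PyArrayObject *)PyArray_FromArray(%(x)s, NULL, NPY_ARRAY_ENSURECOPY);
    if (%(z)s == NULL) {
        %(fail)s
    }
    """ % locals()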
-
c_code_cache_version
()[source]¶ Return a tuple of integers indicating the version of this Op.
An empty tuple indicates an ‘unversioned’ Op that will not be cached between processes.
The cache mechanism may erase cached modules that have been superseded by newer versions. See ModuleCache for details.
See also
c_code_cache_version_apply
-
c_headers
(**kwargs)[source]¶ Return a list of header files required by code returned by this class.
These strings will be prefixed with #include and inserted at the beginning of the C source code. Strings in this list that start with neither < nor " will be enclosed in double-quotes.
Examples
def c_headers(self, **kwargs):
    return ['<iostream>', '<math.h>', '/full/path/to/header.h']
-
do_constant_folding
(fgraph, node)[source]¶ Determine whether or not constant folding should be performed for the given node.
This allows each Op to determine if it wants to be constant folded when all its inputs are constant. This allows it to choose where it puts its memory/speed trade-off. Also, it could make things faster as constants can’t be used for in-place operations (see *IncSubtensor).
Parameters: node (Apply) – The node for which the constant folding determination is made. Returns: res Return type: bool
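For example, an allocation-like Op may decline constant folding so that a large constant output is not baked into the compiled graph. The override below is only a hedged sketch of that idea, not the exact policy GpuAlloc implements:

def do_constant_folding(self, fgraph, node):
    # Folding would materialize the full allocated array as a constant,
    # trading memory for little or no speed gain, so we opt out.
    return False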
-
get_params
(node)[source]¶ Try to detect params from the op if Op.params_type is set to a ParamsType.
-
make_node
(value, *shape)[source]¶ Construct an Apply node that represents the application of this operation to the given inputs.
This must be implemented by sub-classes.
Returns: node – The constructed Apply node. Return type: Apply
-
perform
(node, inputs, outs, params)[source]¶ Calculate the function on the inputs and put the variables in the output storage.
Parameters: - node (Apply) – The symbolic Apply node that represents this computation.
- inputs (Sequence) – Immutable sequence of non-symbolic/numeric inputs. These are the values of each Variable in node.inputs.
- output_storage (list of list) – List of mutable single-element lists (do not change the length of these lists). Each sub-list corresponds to the value of one Variable in node.outputs. The primary purpose of this method is to set the values of these sub-lists.
- params (tuple) – A tuple containing the values of each entry in __props__.
Notes
The output_storage list might contain data. If an element of output_storage is not None, it has to be of the right type; for instance, for a TensorVariable, it has to be a NumPy ndarray with the right number of dimensions and the correct dtype. Its shape and stride pattern can be arbitrary. It is not guaranteed that such pre-set values were produced by a previous call to this Op.perform; they could have been allocated by another Op's perform method. An Op is free to reuse output_storage as it sees fit, or to discard it and allocate new memory.
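A minimal, hedged sketch of how a Python perform method honours this storage contract (the doubling computation is purely illustrative and has nothing to do with GpuAlloc itself):

def perform(self, node, inputs, output_storage, params=None):
    (x,) = inputs                # numeric value of each Variable in node.inputs
    (out,) = output_storage      # one single-element list per output Variable
    # Any pre-set value in out[0] may be reused or discarded; the only
    # requirement is that out[0] holds the result when we return.
    out[0] = x * 2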
-
class
theano.gpuarray.basic_ops.
GpuAllocEmpty
(dtype, context_name)[source]¶ Allocate uninitialized memory on the GPU.
-
c_code
(node, name, inp, out, sub)[source]¶ Return the C implementation of an Op.
Returns C code that does the computation associated to this Op, given names for the inputs and outputs.
Parameters: - node (Apply instance) – The node for which we are compiling the current c_code. The same Op may be used in more than one node.
- name (str) – A name that is automatically assigned and guaranteed to be unique.
- inputs (list of strings) – There is a string for each input of the function, and the string is the name of a C variable pointing to that input. The type of the variable depends on the declared type of the input. There is a corresponding python variable that can be accessed by prepending “py_” to the name in the list.
- outputs (list of strings) – Each string is the name of a C variable where the Op should store its output. The type depends on the declared type of the output. There is a corresponding python variable that can be accessed by prepending “py_” to the name in the list. In some cases the outputs will be preallocated and the value of the variable may be pre-filled. The value for an unallocated output is type-dependent.
- sub (dict of strings) – Extra symbols defined in CLinker sub symbols (such as ‘fail’). WRITEME
-
c_code_cache_version
()[source]¶ Return a tuple of integers indicating the version of this Op.
An empty tuple indicates an ‘unversioned’ Op that will not be cached between processes.
The cache mechanism may erase cached modules that have been superseded by newer versions. See ModuleCache for details.
See also
c_code_cache_version_apply
-
c_header_dirs
(**kwargs)[source]¶ Return a list of header search paths required by code returned by this class.
Provides search paths for headers, in addition to those in any relevant environment variables.
Note: for Unix compilers, these are the things that get -I prefixed in the compiler command line arguments.
Examples
def c_header_dirs(self, **kwargs):
    return ['/usr/local/include', '/opt/weirdpath/src/include']
-
c_headers
(**kwargs)[source]¶ Return a list of header files required by code returned by this class.
These strings will be prefixed with #include and inserted at the beginning of the C source code. Strings in this list that start with neither < nor " will be enclosed in double-quotes.
Examples
def c_headers(self, **kwargs):
    return ['<iostream>', '<math.h>', '/full/path/to/header.h']
-
do_constant_folding
(fgraph, node)[source]¶ Determine whether or not constant folding should be performed for the given node.
This allows each Op to determine if it wants to be constant folded when all its inputs are constant. This allows it to choose where it puts its memory/speed trade-off. Also, it could make things faster as constants can’t be used for in-place operations (see *IncSubtensor).
Parameters: node (Apply) – The node for which the constant folding determination is made. Returns: res Return type: bool
-
get_params
(node)[source]¶ Try to detect params from the op if Op.params_type is set to a ParamsType.
-
grad
(*args)[source]¶ Construct a graph for the gradient with respect to each input variable.
Each returned Variable represents the gradient with respect to that input computed based on the symbolic gradients with respect to each output. If the output is not differentiable with respect to an input, then this method should return an instance of type NullType for that input.
Parameters: - inputs (list of Variable) – The input variables.
- output_grads (list of Variable) – The gradients of the output variables.
Returns: grads – The gradients with respect to each Variable in inputs.
Return type: list of Variable
-
make_node
(*shape)[source]¶ Construct an Apply node that represents the application of this operation to the given inputs.
This must be implemented by sub-classes.
Returns: node – The constructed Apply node. Return type: Apply
-
perform
(node, inputs, out_, params)[source]¶ Calculate the function on the inputs and put the variables in the output storage.
Parameters: - node (Apply) – The symbolic Apply node that represents this computation.
- inputs (Sequence) – Immutable sequence of non-symbolic/numeric inputs. These are the values of each Variable in node.inputs.
- output_storage (list of list) – List of mutable single-element lists (do not change the length of these lists). Each sub-list corresponds to the value of one Variable in node.outputs. The primary purpose of this method is to set the values of these sub-lists.
- params (tuple) – A tuple containing the values of each entry in __props__.
Notes
The output_storage list might contain data. If an element of output_storage is not None, it has to be of the right type; for instance, for a TensorVariable, it has to be a NumPy ndarray with the right number of dimensions and the correct dtype. Its shape and stride pattern can be arbitrary. It is not guaranteed that such pre-set values were produced by a previous call to this Op.perform; they could have been allocated by another Op's perform method. An Op is free to reuse output_storage as it sees fit, or to discard it and allocate new memory.
-
-
class
theano.gpuarray.basic_ops.
GpuContiguous
[source]¶ Return a C contiguous version of the input.
This may either pass the object as-is (if already C contiguous) or make a copy.
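In user code this Op is normally reached through the pre-built gpu_contiguous instance exported by this module, as mentioned in the GpuCorr3dMM notes further down. A hedged sketch (x is assumed to already be a GPU variable of a suitable type):

from theano.gpuarray.basic_ops import gpu_contiguous

# Returns x unchanged if it is already C contiguous, otherwise inserts a copy.
x_contig = gpu_contiguous(x)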
-
grad
(inputs, dout)[source]¶ Construct a graph for the gradient with respect to each input variable.
Each returned Variable represents the gradient with respect to that input computed based on the symbolic gradients with respect to each output. If the output is not differentiable with respect to an input, then this method should return an instance of type NullType for that input.
Parameters: - inputs (list of Variable) – The input variables.
- output_grads (list of Variable) – The gradients of the output variables.
Returns: grads – The gradients with respect to each Variable in inputs.
Return type: list of Variable
-
make_node
(input)[source]¶ Construct an Apply node that represents the application of this operation to the given inputs.
This must be implemented by sub-classes.
Returns: node – The constructed Apply node. Return type: Apply
-
perform
(node, inp, out_)[source]¶ Calculate the function on the inputs and put the variables in the output storage.
Parameters: - node (Apply) – The symbolic Apply node that represents this computation.
- inputs (Sequence) – Immutable sequence of non-symbolic/numeric inputs. These are the values of each Variable in node.inputs.
- output_storage (list of list) – List of mutable single-element lists (do not change the length of these lists). Each sub-list corresponds to the value of one Variable in node.outputs. The primary purpose of this method is to set the values of these sub-lists.
- params (tuple) – A tuple containing the values of each entry in __props__.
Notes
The output_storage list might contain data. If an element of output_storage is not None, it has to be of the right type; for instance, for a TensorVariable, it has to be a NumPy ndarray with the right number of dimensions and the correct dtype. Its shape and stride pattern can be arbitrary. It is not guaranteed that such pre-set values were produced by a previous call to this Op.perform; they could have been allocated by another Op's perform method. An Op is free to reuse output_storage as it sees fit, or to discard it and allocate new memory.
-
-
class
theano.gpuarray.basic_ops.
GpuEye
(dtype=None, context_name=None)[source]¶ Eye for GPU.
-
c_code
(node, name, inp, out, sub)[source]¶ Return the C implementation of an Op.
Returns C code that does the computation associated to this Op, given names for the inputs and outputs.
Parameters: - node (Apply instance) – The node for which we are compiling the current c_code. The same Op may be used in more than one node.
- name (str) – A name that is automatically assigned and guaranteed to be unique.
- inputs (list of strings) – There is a string for each input of the function, and the string is the name of a C variable pointing to that input. The type of the variable depends on the declared type of the input. There is a corresponding python variable that can be accessed by prepending “py_” to the name in the list.
- outputs (list of strings) – Each string is the name of a C variable where the Op should store its output. The type depends on the declared type of the output. There is a corresponding python variable that can be accessed by prepending “py_” to the name in the list. In some cases the outputs will be preallocated and the value of the variable may be pre-filled. The value for an unallocated output is type-dependent.
- sub (dict of strings) – Extra symbols defined in CLinker sub symbols (such as ‘fail’). WRITEME
-
c_code_cache_version
()[source]¶ Return a tuple of integers indicating the version of this Op.
An empty tuple indicates an ‘unversioned’ Op that will not be cached between processes.
The cache mechanism may erase cached modules that have been superseded by newer versions. See ModuleCache for details.
See also
c_code_cache_version_apply
-
get_params
(node)[source]¶ Try to detect params from the op if Op.params_type is set to a ParamsType.
-
gpu_kernels
(node, name)[source]¶ This is the method to override. This should return an iterable of Kernel objects that describe the kernels this op will need.
-
grad
(inp, grads)[source]¶ Construct a graph for the gradient with respect to each input variable.
Each returned Variable represents the gradient with respect to that input computed based on the symbolic gradients with respect to each output. If the output is not differentiable with respect to an input, then this method should return an instance of type NullType for that input.
Parameters: - inputs (list of Variable) – The input variables.
- output_grads (list of Variable) – The gradients of the output variables.
Returns: grads – The gradients with respect to each Variable in inputs.
Return type: list of Variable
-
-
class
theano.gpuarray.basic_ops.
GpuFromHost
(context_name)[source]¶ Transfer data to GPU.
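A hedged sketch of applying the transfer explicitly; None is taken here to select the default GPU context, and in practice the graph optimizer inserts these transfers for you:

import theano.tensor as T
from theano.gpuarray.basic_ops import GpuFromHost

x = T.fmatrix("x")              # host (CPU) variable
x_gpu = GpuFromHost(None)(x)    # symbolic transfer to the default GPU context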
-
R_op
(inputs, eval_points)[source]¶ Construct a graph for the R-operator.
This method is primarily used by Rop.
Suppose the Op outputs [ f_1(inputs), ..., f_n(inputs) ].
Parameters: - inputs (a Variable or list of Variables) –
- eval_points – A Variable or list of Variables with the same length as inputs. Each element of eval_points specifies the value of the corresponding input at the point where the R-operator is to be evaluated.
Returns: rval, where rval[i] should be Rop(f=f_i(inputs), wrt=inputs, eval_points=eval_points).
Return type: list of n elements
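For a transfer-like Op the R-operator is simply the Op applied to the evaluation point, because the transfer is linear in its input. The following is only a hedged sketch of such an implementation, not the exact code of GpuFromHost:

def R_op(self, inputs, eval_points):
    # No evaluation point was supplied for the input: nothing to propagate.
    if eval_points[0] is None:
        return [None]
    # The transfer is linear, so the R-operator is the transfer of eval_points.
    return [self(eval_points[0])]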
-
c_code
(node, name, inputs, outputs, sub)[source]¶ Return the C implementation of an Op.
Returns C code that does the computation associated to this Op, given names for the inputs and outputs.
Parameters: - node (Apply instance) – The node for which we are compiling the current c_code. The same Op may be used in more than one node.
- name (str) – A name that is automatically assigned and guaranteed to be unique.
- inputs (list of strings) – There is a string for each input of the function, and the string is the name of a C variable pointing to that input. The type of the variable depends on the declared type of the input. There is a corresponding python variable that can be accessed by prepending “py_” to the name in the list.
- outputs (list of strings) – Each string is the name of a C variable where the Op should store its output. The type depends on the declared type of the output. There is a corresponding python variable that can be accessed by prepending “py_” to the name in the list. In some cases the outputs will be preallocated and the value of the variable may be pre-filled. The value for an unallocated output is type-dependent.
- sub (dict of strings) – Extra symbols defined in CLinker sub symbols (such as ‘fail’). WRITEME
-
c_code_cache_version
()[source]¶ Return a tuple of integers indicating the version of this Op.
An empty tuple indicates an ‘unversioned’ Op that will not be cached between processes.
The cache mechanism may erase cached modules that have been superseded by newer versions. See ModuleCache for details.
See also
c_code_cache_version_apply
-
c_header_dirs
(**kwargs)[source]¶ Return a list of header search paths required by code returned by this class.
Provides search paths for headers, in addition to those in any relevant environment variables.
Note: for Unix compilers, these are the things that get -I prefixed in the compiler command line arguments.
Examples
def c_header_dirs(self, **kwargs):
    return ['/usr/local/include', '/opt/weirdpath/src/include']
-
c_headers
(**kwargs)[source]¶ Return a list of header files required by code returned by this class.
These strings will be prefixed with #include and inserted at the beginning of the C source code. Strings in this list that start with neither < nor " will be enclosed in double-quotes.
Examples
def c_headers(self, **kwargs):
    return ['<iostream>', '<math.h>', '/full/path/to/header.h']
-
get_params
(node)[source]¶ Try to detect params from the op if Op.params_type is set to a ParamsType.
-
grad
(inputs, grads)[source]¶ Construct a graph for the gradient with respect to each input variable.
Each returned Variable represents the gradient with respect to that input computed based on the symbolic gradients with respect to each output. If the output is not differentiable with respect to an input, then this method should return an instance of type NullType for that input.
Parameters: - inputs (list of Variable) – The input variables.
- output_grads (list of Variable) – The gradients of the output variables.
Returns: grads – The gradients with respect to each Variable in inputs.
Return type: list of Variable
-
make_node
(x)[source]¶ Construct an Apply node that represents the application of this operation to the given inputs.
This must be implemented by sub-classes.
Returns: node – The constructed Apply node. Return type: Apply
-
perform
(node, inp, out, ctx)[source]¶ Calculate the function on the inputs and put the variables in the output storage.
Parameters: - node (Apply) – The symbolic Apply node that represents this computation.
- inputs (Sequence) – Immutable sequence of non-symbolic/numeric inputs. These are the values of each Variable in node.inputs.
- output_storage (list of list) – List of mutable single-element lists (do not change the length of these lists). Each sub-list corresponds to the value of one Variable in node.outputs. The primary purpose of this method is to set the values of these sub-lists.
- params (tuple) – A tuple containing the values of each entry in __props__.
Notes
The output_storage list might contain data. If an element of output_storage is not None, it has to be of the right type; for instance, for a TensorVariable, it has to be a NumPy ndarray with the right number of dimensions and the correct dtype. Its shape and stride pattern can be arbitrary. It is not guaranteed that such pre-set values were produced by a previous call to this Op.perform; they could have been allocated by another Op's perform method. An Op is free to reuse output_storage as it sees fit, or to discard it and allocate new memory.
-
-
class
theano.gpuarray.basic_ops.
GpuJoin
(view=-1)[source]¶ Join for GPU.
-
c_code
(node, name, inputs, out_, sub)[source]¶ Return the C implementation of an Op.
Returns C code that does the computation associated to this Op, given names for the inputs and outputs.
Parameters: - node (Apply instance) – The node for which we are compiling the current c_code. The same Op may be used in more than one node.
- name (str) – A name that is automatically assigned and guaranteed to be unique.
- inputs (list of strings) – There is a string for each input of the function, and the string is the name of a C variable pointing to that input. The type of the variable depends on the declared type of the input. There is a corresponding python variable that can be accessed by prepending “py_” to the name in the list.
- outputs (list of strings) – Each string is the name of a C variable where the Op should store its output. The type depends on the declared type of the output. There is a corresponding python variable that can be accessed by prepending “py_” to the name in the list. In some cases the outputs will be preallocated and the value of the variable may be pre-filled. The value for an unallocated output is type-dependent.
- sub (dict of strings) – Extra symbols defined in CLinker sub symbols (such as ‘fail’). WRITEME
-
c_code_cache_version
()[source]¶ Return a tuple of integers indicating the version of this Op.
An empty tuple indicates an ‘unversioned’ Op that will not be cached between processes.
The cache mechanism may erase cached modules that have been superseded by newer versions. See ModuleCache for details.
See also
c_code_cache_version_apply
-
c_headers
(**kwargs)[source]¶ Return a list of header files required by code returned by this class.
These strings will be prefixed with #include and inserted at the beginning of the C source code. Strings in this list that start with neither < nor " will be enclosed in double-quotes.
Examples
def c_headers(self, **kwargs):
    return ['<iostream>', '<math.h>', '/full/path/to/header.h']
-
c_support_code
(**kwargs)[source]¶ Return utility code for use by a Variable or Op.
This is included at global scope prior to the rest of the code for this class.
Question: How many times will this support code be emitted for a graph with many instances of the same type?
Return type: str
-
get_params
(node)[source]¶ Try to detect params from the op if Op.params_type is set to a ParamsType.
-
make_node
(axis, *tensors)[source]¶ Parameters: - axis (an Int or integer-valued Variable) –
- tensors – A variable number (but not zero) of tensors to concatenate along the specified axis. These tensors must have the same shape along all dimensions other than this axis.
Returns: It has the same ndim as the input tensors, and the most inclusive dtype.
Return type: A symbolic Variable
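In practice you rarely build this node by hand; writing an ordinary concatenation lets the optimizer substitute GpuJoin when a GPU device is active. A hedged sketch (a and b are assumed to be matrices of the same dtype):

import theano.tensor as T

a = T.fmatrix("a")
b = T.fmatrix("b")
# Concatenate along axis 0; on a gpuarray device the compiled graph
# should contain GpuJoin instead of the CPU Join Op.
c = T.concatenate([a, b], axis=0)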
-
perform
(node, axis_and_tensors, out_, ctx)[source]¶ Calculate the function on the inputs and put the variables in the output storage.
Parameters: - node (Apply) – The symbolic Apply node that represents this computation.
- inputs (Sequence) – Immutable sequence of non-symbolic/numeric inputs. These are the values of each Variable in node.inputs.
- output_storage (list of list) – List of mutable single-element lists (do not change the length of these lists). Each sub-list corresponds to the value of one Variable in node.outputs. The primary purpose of this method is to set the values of these sub-lists.
- params (tuple) – A tuple containing the values of each entry in __props__.
Notes
The output_storage list might contain data. If an element of output_storage is not None, it has to be of the right type; for instance, for a TensorVariable, it has to be a NumPy ndarray with the right number of dimensions and the correct dtype. Its shape and stride pattern can be arbitrary. It is not guaranteed that such pre-set values were produced by a previous call to this Op.perform; they could have been allocated by another Op's perform method. An Op is free to reuse output_storage as it sees fit, or to discard it and allocate new memory.
-
-
class
theano.gpuarray.basic_ops.
GpuKernelBase
[source]¶ Base class for operations that need to compile kernels.
It is not mandatory to use this class, but it helps with a lot of the small things that you have to pay attention to.
-
class
theano.gpuarray.basic_ops.
GpuKernelBaseExternalCOp
(func_files: Union[str, List[str]], func_name: Optional[str] = None)[source]¶
-
class
theano.gpuarray.basic_ops.
GpuReshape
(ndim, name=None)[source]¶ Reshape for GPU variables.
-
c_code
(node, name, inputs, outputs, sub)[source]¶ Return the C implementation of an Op.
Returns C code that does the computation associated to this Op, given names for the inputs and outputs.
Parameters: - node (Apply instance) – The node for which we are compiling the current c_code. The same Op may be used in more than one node.
- name (str) – A name that is automatically assigned and guaranteed to be unique.
- inputs (list of strings) – There is a string for each input of the function, and the string is the name of a C variable pointing to that input. The type of the variable depends on the declared type of the input. There is a corresponding python variable that can be accessed by prepending “py_” to the name in the list.
- outputs (list of strings) – Each string is the name of a C variable where the Op should store its output. The type depends on the declared type of the output. There is a corresponding python variable that can be accessed by prepending “py_” to the name in the list. In some cases the outputs will be preallocated and the value of the variable may be pre-filled. The value for an unallocated output is type-dependent.
- sub (dict of strings) – Extra symbols defined in CLinker sub symbols (such as ‘fail’). WRITEME
-
c_code_cache_version
()[source]¶ Return a tuple of integers indicating the version of this Op.
An empty tuple indicates an ‘unversioned’ Op that will not be cached between processes.
The cache mechanism may erase cached modules that have been superseded by newer versions. See ModuleCache for details.
See also
c_code_cache_version_apply
-
make_node
(x, shp)[source]¶ Construct an Apply node that represents the application of this operation to the given inputs.
This must be implemented by sub-classes.
Returns: node – The constructed Apply node. Return type: Apply
-
perform
(node, inp, out_, params)[source]¶ Calculate the function on the inputs and put the variables in the output storage.
Parameters: - node (Apply) – The symbolic Apply node that represents this computation.
- inputs (Sequence) – Immutable sequence of non-symbolic/numeric inputs. These are the values of each Variable in node.inputs.
- output_storage (list of list) – List of mutable single-element lists (do not change the length of these lists). Each sub-list corresponds to the value of one Variable in node.outputs. The primary purpose of this method is to set the values of these sub-lists.
- params (tuple) – A tuple containing the values of each entry in __props__.
Notes
The output_storage list might contain data. If an element of output_storage is not None, it has to be of the right type; for instance, for a TensorVariable, it has to be a NumPy ndarray with the right number of dimensions and the correct dtype. Its shape and stride pattern can be arbitrary. It is not guaranteed that such pre-set values were produced by a previous call to this Op.perform; they could have been allocated by another Op's perform method. An Op is free to reuse output_storage as it sees fit, or to discard it and allocate new memory.
-
-
class
theano.gpuarray.basic_ops.
GpuSplit
(len_splits)[source]¶ Split for GPU.
-
c_code
(node, name, inputs, outputs, sub)[source]¶ Return the C implementation of an Op.
Returns C code that does the computation associated to this Op, given names for the inputs and outputs.
Parameters: - node (Apply instance) – The node for which we are compiling the current c_code. The same Op may be used in more than one node.
- name (str) – A name that is automatically assigned and guaranteed to be unique.
- inputs (list of strings) – There is a string for each input of the function, and the string is the name of a C variable pointing to that input. The type of the variable depends on the declared type of the input. There is a corresponding python variable that can be accessed by prepending “py_” to the name in the list.
- outputs (list of strings) – Each string is the name of a C variable where the Op should store its output. The type depends on the declared type of the output. There is a corresponding python variable that can be accessed by prepending “py_” to the name in the list. In some cases the outputs will be preallocated and the value of the variable may be pre-filled. The value for an unallocated output is type-dependent.
- sub (dict of strings) – Extra symbols defined in CLinker sub symbols (such as ‘fail’). WRITEME
-
c_code_cache_version
()[source]¶ Return a tuple of integers indicating the version of this Op.
An empty tuple indicates an ‘unversioned’ Op that will not be cached between processes.
The cache mechanism may erase cached modules that have been superseded by newer versions. See ModuleCache for details.
See also
c_code_cache_version_apply
-
c_header_dirs
(**kwargs)[source]¶ Return a list of header search paths required by code returned by this class.
Provides search paths for headers, in addition to those in any relevant environment variables.
Note: for Unix compilers, these are the things that get -I prefixed in the compiler command line arguments.
Examples
def c_header_dirs(self, **kwargs):
    return ['/usr/local/include', '/opt/weirdpath/src/include']
-
c_headers
(**kwargs)[source]¶ Return a list of header files required by code returned by this class.
These strings will be prefixed with #include and inserted at the beginning of the C source code. Strings in this list that start with neither < nor " will be enclosed in double-quotes.
Examples
def c_headers(self, **kwargs):
    return ['<iostream>', '<math.h>', '/full/path/to/header.h']
-
-
class
theano.gpuarray.basic_ops.
GpuToGpu
(context_name)[source]¶ Transfer data between GPUs.
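A hedged sketch of a cross-device transfer, assuming two contexts have been mapped with something like THEANO_FLAGS="contexts=dev0->cuda0;dev1->cuda1"; the context names dev0 and dev1 and the variable x are illustrative:

from theano.gpuarray.basic_ops import GpuToGpu, as_gpuarray_variable

x_dev0 = as_gpuarray_variable(x, "dev0")   # x: an existing tensor variable (assumed)
x_dev1 = GpuToGpu("dev1")(x_dev0)          # symbolic copy onto the second GPU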
-
R_op
(inputs, eval_points)[source]¶ Construct a graph for the R-operator.
This method is primarily used by Rop.
Suppose the Op outputs [ f_1(inputs), ..., f_n(inputs) ].
Parameters: - inputs (a Variable or list of Variables) –
- eval_points – A Variable or list of Variables with the same length as inputs. Each element of eval_points specifies the value of the corresponding input at the point where the R-operator is to be evaluated.
Returns: rval, where rval[i] should be Rop(f=f_i(inputs), wrt=inputs, eval_points=eval_points).
Return type: list of n elements
-
c_code
(node, name, inputs, outputs, sub)[source]¶ Return the C implementation of an Op.
Returns C code that does the computation associated to this Op, given names for the inputs and outputs.
Parameters: - node (Apply instance) – The node for which we are compiling the current c_code. The same Op may be used in more than one node.
- name (str) – A name that is automatically assigned and guaranteed to be unique.
- inputs (list of strings) – There is a string for each input of the function, and the string is the name of a C variable pointing to that input. The type of the variable depends on the declared type of the input. There is a corresponding python variable that can be accessed by prepending “py_” to the name in the list.
- outputs (list of strings) – Each string is the name of a C variable where the Op should store its output. The type depends on the declared type of the output. There is a corresponding python variable that can be accessed by prepending “py_” to the name in the list. In some cases the outputs will be preallocated and the value of the variable may be pre-filled. The value for an unallocated output is type-dependent.
- sub (dict of strings) – Extra symbols defined in CLinker sub symbols (such as ‘fail’). WRITEME
-
c_code_cache_version
()[source]¶ Return a tuple of integers indicating the version of this Op.
An empty tuple indicates an ‘unversioned’ Op that will not be cached between processes.
The cache mechanism may erase cached modules that have been superseded by newer versions. See ModuleCache for details.
See also
c_code_cache_version_apply
-
get_params
(node)[source]¶ Try to detect params from the op if Op.params_type is set to a ParamsType.
-
grad
(inputs, grads)[source]¶ Construct a graph for the gradient with respect to each input variable.
Each returned Variable represents the gradient with respect to that input computed based on the symbolic gradients with respect to each output. If the output is not differentiable with respect to an input, then this method should return an instance of type NullType for that input.
Parameters: - inputs (list of Variable) – The input variables.
- output_grads (list of Variable) – The gradients of the output variables.
Returns: grads – The gradients with respect to each Variable in inputs.
Return type: list of Variable
-
make_node
(x)[source]¶ Construct an Apply node that represents the application of this operation to the given inputs.
This must be implemented by sub-classes.
Returns: node – The constructed Apply node. Return type: Apply
-
perform
(node, inp, out, ctx)[source]¶ Calculate the function on the inputs and put the variables in the output storage.
Parameters: - node (Apply) – The symbolic Apply node that represents this computation.
- inputs (Sequence) – Immutable sequence of non-symbolic/numeric inputs. These are the values of each Variable in node.inputs.
- output_storage (list of list) – List of mutable single-element lists (do not change the length of these lists). Each sub-list corresponds to the value of one Variable in node.outputs. The primary purpose of this method is to set the values of these sub-lists.
- params (tuple) – A tuple containing the values of each entry in __props__.
Notes
The output_storage list might contain data. If an element of output_storage is not None, it has to be of the right type; for instance, for a TensorVariable, it has to be a NumPy ndarray with the right number of dimensions and the correct dtype. Its shape and stride pattern can be arbitrary. It is not guaranteed that such pre-set values were produced by a previous call to this Op.perform; they could have been allocated by another Op's perform method. An Op is free to reuse output_storage as it sees fit, or to discard it and allocate new memory.
-
-
class
theano.gpuarray.basic_ops.
GpuTri
(dtype=None, context_name=None)[source]¶ Tri for GPU.
-
c_code
(node, name, inp, out, sub)[source]¶ Return the C implementation of an Op.
Returns C code that does the computation associated to this Op, given names for the inputs and outputs.
Parameters: - node (Apply instance) – The node for which we are compiling the current c_code. The same Op may be used in more than one node.
- name (str) – A name that is automatically assigned and guaranteed to be unique.
- inputs (list of strings) – There is a string for each input of the function, and the string is the name of a C variable pointing to that input. The type of the variable depends on the declared type of the input. There is a corresponding python variable that can be accessed by prepending “py_” to the name in the list.
- outputs (list of strings) – Each string is the name of a C variable where the Op should store its output. The type depends on the declared type of the output. There is a corresponding python variable that can be accessed by prepending “py_” to the name in the list. In some cases the outputs will be preallocated and the value of the variable may be pre-filled. The value for an unallocated output is type-dependent.
- sub (dict of strings) – Extra symbols defined in CLinker sub symbols (such as ‘fail’). WRITEME
-
c_code_cache_version
()[source]¶ Return a tuple of integers indicating the version of this Op.
An empty tuple indicates an ‘unversioned’ Op that will not be cached between processes.
The cache mechanism may erase cached modules that have been superseded by newer versions. See ModuleCache for details.
See also
c_code_cache_version_apply
-
get_params
(node)[source]¶ Try to detect params from the op if Op.params_type is set to a ParamsType.
-
gpu_kernels
(node, name)[source]¶ This is the method to override. This should return an iterable of Kernel objects that describe the kernels this op will need.
-
grad
(inp, grads)[source]¶ Construct a graph for the gradient with respect to each input variable.
Each returned Variable represents the gradient with respect to that input computed based on the symbolic gradients with respect to each output. If the output is not differentiable with respect to an input, then this method should return an instance of type NullType for that input.
Parameters: - inputs (list of Variable) – The input variables.
- output_grads (list of Variable) – The gradients of the output variables.
Returns: grads – The gradients with respect to each Variable in inputs.
Return type: list of Variable
-
-
class
theano.gpuarray.basic_ops.
HostFromGpu
[source]¶ Transfer data to CPU.
-
R_op
(inputs, eval_points)[source]¶ Construct a graph for the R-operator.
This method is primarily used by Rop.
Suppose the Op outputs [ f_1(inputs), ..., f_n(inputs) ].
Parameters: - inputs (a Variable or list of Variables) –
- eval_points – A Variable or list of Variables with the same length as inputs. Each element of eval_points specifies the value of the corresponding input at the point where the R-operator is to be evaluated.
Returns: rval, where rval[i] should be Rop(f=f_i(inputs), wrt=inputs, eval_points=eval_points).
Return type: list of n elements
-
c_code
(node, name, inputs, outputs, sub)[source]¶ Return the C implementation of an Op.
Returns C code that does the computation associated to this Op, given names for the inputs and outputs.
Parameters: - node (Apply instance) – The node for which we are compiling the current c_code. The same Op may be used in more than one node.
- name (str) – A name that is automatically assigned and guaranteed to be unique.
- inputs (list of strings) – There is a string for each input of the function, and the string is the name of a C variable pointing to that input. The type of the variable depends on the declared type of the input. There is a corresponding python variable that can be accessed by prepending “py_” to the name in the list.
- outputs (list of strings) – Each string is the name of a C variable where the Op should store its output. The type depends on the declared type of the output. There is a corresponding python variable that can be accessed by prepending “py_” to the name in the list. In some cases the outputs will be preallocated and the value of the variable may be pre-filled. The value for an unallocated output is type-dependent.
- sub (dict of strings) – Extra symbols defined in CLinker sub symbols (such as ‘fail’). WRITEME
-
c_code_cache_version
()[source]¶ Return a tuple of integers indicating the version of this Op.
An empty tuple indicates an ‘unversioned’ Op that will not be cached between processes.
The cache mechanism may erase cached modules that have been superseded by newer versions. See ModuleCache for details.
See also
c_code_cache_version_apply
-
grad
(inputs, grads)[source]¶ Construct a graph for the gradient with respect to each input variable.
Each returned Variable represents the gradient with respect to that input computed based on the symbolic gradients with respect to each output. If the output is not differentiable with respect to an input, then this method should return an instance of type NullType for that input.
Parameters: - inputs (list of Variable) – The input variables.
- output_grads (list of Variable) – The gradients of the output variables.
Returns: grads – The gradients with respect to each Variable in inputs.
Return type: list of Variable
-
make_node
(x)[source]¶ Construct an Apply node that represents the application of this operation to the given inputs.
This must be implemented by sub-classes.
Returns: node – The constructed Apply node. Return type: Apply
-
perform
(node, inp, out)[source]¶ Calculate the function on the inputs and put the variables in the output storage.
Parameters: - node (Apply) – The symbolic Apply node that represents this computation.
- inputs (Sequence) – Immutable sequence of non-symbolic/numeric inputs. These are the values of each Variable in node.inputs.
- output_storage (list of list) – List of mutable single-element lists (do not change the length of these lists). Each sub-list corresponds to the value of one Variable in node.outputs. The primary purpose of this method is to set the values of these sub-lists.
- params (tuple) – A tuple containing the values of each entry in __props__.
Notes
The output_storage list might contain data. If an element of output_storage is not None, it has to be of the right type; for instance, for a TensorVariable, it has to be a NumPy ndarray with the right number of dimensions and the correct dtype. Its shape and stride pattern can be arbitrary. It is not guaranteed that such pre-set values were produced by a previous call to this Op.perform; they could have been allocated by another Op's perform method. An Op is free to reuse output_storage as it sees fit, or to discard it and allocate new memory.
-
-
class
theano.gpuarray.basic_ops.
Kernel
(code, params, name, flags, codevar=None, objvar=None, fname=None, sname=None)[source]¶ This class groups together all the attributes of a gpu kernel.
params should contain the data type for each argument. Buffer arguments should use the GpuArray class as the data type and scalars should use their equivalent NumPy dtype. For ga_size and ga_ssize, use gpuarray.SIZE and gpuarray.SSIZE.
If the ctypes flag is set to True, then each entry of params should be a C string representing the typecode to use.
flags can contain the following keys, whose values are booleans:
- have_double – the kernel uses double-typed variables somewhere
- have_small – the kernel uses variables whose type takes less than 4 bytes somewhere
- have_complex – the kernel uses complex values somewhere
- have_half – the kernel uses half-floats somewhere
- ctypes – the params list consists of C typecodes
It can also have the key cflags, which is a string of C flag values such as "GA_USE_DOUBLE|GA_USE_SMALL".
Parameters: - code (str) – The source code of the kernel.
- params (list) – list of parameter types.
- name (str) – the name of the kernel function in the source.
- flags (dict) – dictionary of flags
- codevar (str) – the name of the variable for the code object. (defaults to kcode_ + name)
- objvar (str) – the name of the variable for the kernel object. (defaults to k_ + name)
- fname (str) – the name of the function wrapper. (defaults to name + _call)
- sname (str) – the name of the scheduled call function. (defaults to name + _scall)
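A hedged sketch of how an Op deriving from GpuKernelBase might describe a kernel with this class. The kernel source, its name k_scale, the pygpu import and the CLUDA macros (KERNEL, GLOBAL_MEM, LID_0, LDIM_0) are assumptions of this sketch; real Ops usually derive the dtype-dependent flags from their inputs:

from pygpu import gpuarray
from theano.gpuarray.basic_ops import Kernel

def gpu_kernels(self, node, name):
    # One float32 buffer argument plus its length, as described above:
    # GpuArray for the buffer, gpuarray.SIZE for the ga_size argument.
    code = """
    KERNEL void k_scale(GLOBAL_MEM float *x, ga_size n) {
        for (ga_size i = LID_0; i < n; i += LDIM_0) {
            x[i] *= 2.0f;
        }
    }
    """
    return [Kernel(code=code,
                   params=[gpuarray.GpuArray, gpuarray.SIZE],
                   name="k_scale",
                   flags={"have_double": False, "have_half": False})]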
-
theano.gpuarray.basic_ops.
as_gpuarray_variable
(x, context_name)[source]¶ This will attempt to convert x into a variable on the GPU.
It can take either a value or another variable. If x is already suitable, it will be returned as-is.
Parameters: - x – Object to convert
- context_name (str or None) – target context name for the result
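A hedged usage sketch; None is taken here to mean the default context, and the NumPy array is purely illustrative:

import numpy as np
from theano.gpuarray.basic_ops import as_gpuarray_variable

data = np.zeros((3, 4), dtype="float32")
# Wrap a host value (or an existing variable) as a variable on the GPU.
v = as_gpuarray_variable(data, None)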
Blas Op¶
-
class
theano.gpuarray.blas.
BaseGpuCorr3dMM
(border_mode='valid', subsample=(1, 1, 1), filter_dilation=(1, 1, 1), num_groups=1)[source]¶ Base class for GpuCorr3dMM, GpuCorr3dMM_gradWeights and GpuCorr3dMM_gradInputs. Cannot be used directly.
Parameters: - border_mode ({'valid', 'full', 'half'}) – Additionally, the padding size could be directly specified by an integer or a pair of integers
- subsample – Perform subsampling of the output (default: (1, 1, 1)).
- filter_dilation – Perform subsampling of the input, also known as dilation (default: (1, 1, 1)).
- num_groups – Divides the image, kernel and output tensors into num_groups separate groups, each of which carries out convolutions separately (default: 1).
-
c_code_cache_version
()[source]¶ Return a tuple of integers indicating the version of this Op.
An empty tuple indicates an ‘unversioned’ Op that will not be cached between processes.
The cache mechanism may erase cached modules that have been superseded by newer versions. See ModuleCache for details.
See also
c_code_cache_version_apply
-
c_code_helper
(bottom, weights, top, direction, sub, height=None, width=None, depth=None)[source]¶ This generates the C code for GpuCorr3dMM (direction=”forward”), GpuCorr3dMM_gradWeights (direction=”backprop weights”), and GpuCorr3dMM_gradInputs (direction=”backprop inputs”). Depending on the direction, one of bottom, weights, top will receive the output, while the other two serve as inputs.
Parameters: - bottom – Variable name of the input images in the forward pass, or the gradient of the input images in backprop wrt. inputs
- weights – Variable name of the filters in the forward pass, or the gradient of the filters in backprop wrt. weights
- top – Variable name of the output images / feature maps in the forward pass, or the gradient of the outputs in the backprop passes
- direction ({'forward', 'backprop weights', 'backprop inputs'}) – “forward” to correlate bottom with weights and store results in top, “backprop weights” to do a valid convolution of bottom with top (swapping the first two dimensions) and store results in weights, and “backprop inputs” to do a full convolution of top with weights (swapping the first two dimensions) and store results in bottom.
- sub – Dictionary of substitutions usable to help generate the C code.
- height – Required if self.subsample[0] != 1, a variable giving the height of the filters for direction=”backprop weights” or the height of the input images for direction=”backprop inputs”. Required if self.border_mode == ‘half’, a variable giving the height of the filters for direction=”backprop weights”. Not required otherwise, but if a value is given this will be checked.
- width – Required if self.subsample[1] != 1, a variable giving the width of the filters for direction=”backprop weights” or the width of the input images for direction=”backprop inputs”. Required if self.border_mode == ‘half’, a variable giving the width of the filters for direction=”backprop weights”. Not required otherwise, but if a value is given this will be checked.
- depth – Required if self.subsample[2] != 1, a variable giving the depth of the filters for direction=”backprop weights” or the depth of the input images for direction=”backprop inputs”. Required if self.border_mode == ‘half’, a variable giving the depth of the filters for direction=”backprop weights”. Not required otherwise, but if a value is given this will be checked.
-
c_header_dirs
(**kwargs)[source]¶ Return a list of header search paths required by code returned by this class.
Provides search paths for headers, in addition to those in any relevant environment variables.
Note: for Unix compilers, these are the things that get -I prefixed in the compiler command line arguments.
Examples
def c_header_dirs(self, **kwargs):
    return ['/usr/local/include', '/opt/weirdpath/src/include']
-
c_headers
(**kwargs)[source]¶ Return a list of header files required by code returned by this class.
These strings will be prefixed with #include and inserted at the beginning of the C source code. Strings in this list that start with neither < nor " will be enclosed in double-quotes.
Examples
def c_headers(self, **kwargs):
    return ['<iostream>', '<math.h>', '/full/path/to/header.h']
-
class
theano.gpuarray.blas.
BaseGpuCorrMM
(border_mode='valid', subsample=(1, 1), filter_dilation=(1, 1), num_groups=1, unshared=False)[source]¶ Base class for GpuCorrMM, GpuCorrMM_gradWeights and GpuCorrMM_gradInputs. Cannot be used directly.
Parameters: - border_mode ({'valid', 'full', 'half'}) – Additionally, the padding size could be directly specified by an integer, a pair of integers, or two pairs of integers.
- subsample – Perform subsampling of the output (default: (1, 1)).
- filter_dilation – Perform subsampling of the input, also known as dilation (default: (1, 1)).
- num_groups – Divides the image, kernel and output tensors into num_groups separate groups, each of which carries out convolutions separately (default: 1).
- unshared – Perform unshared correlation (default: False)
-
c_code_cache_version
()[source]¶ Return a tuple of integers indicating the version of this Op.
An empty tuple indicates an ‘unversioned’ Op that will not be cached between processes.
The cache mechanism may erase cached modules that have been superseded by newer versions. See ModuleCache for details.
See also
c_code_cache_version_apply
-
c_code_helper
(bottom, weights, top, direction, sub, height=None, width=None)[source]¶ This generates the C code for GpuCorrMM (direction=”forward”), GpuCorrMM_gradWeights (direction=”backprop weights”), and GpuCorrMM_gradInputs (direction=”backprop inputs”). Depending on the direction, one of bottom, weights, top will receive the output, while the other two serve as inputs.
Parameters: - bottom – Variable name of the input images in the forward pass, or the gradient of the input images in backprop wrt. inputs
- weights – Variable name of the filters in the forward pass, or the gradient of the filters in backprop wrt. weights
- top – Variable name of the output images / feature maps in the forward pass, or the gradient of the outputs in the backprop passes
- direction ({'forward', 'backprop weights', 'backprop inputs'}) – “forward” to correlate bottom with weights and store results in top, “backprop weights” to do a valid convolution of bottom with top (swapping the first two dimensions) and store results in weights, and “backprop inputs” to do a full convolution of top with weights (swapping the first two dimensions) and store results in bottom.
- sub – Dictionary of substitutions usable to help generate the C code.
- height – Required if self.subsample[0] != 1, a variable giving the height of the filters for direction=”backprop weights” or the height of the input images for direction=”backprop inputs”. Required if self.border_mode == ‘half’, a variable giving the height of the filters for direction=”backprop weights”. Not required otherwise, but if a value is given this will be checked.
- width – Required if self.subsample[1] != 1, a variable giving the width of the filters for direction=”backprop weights” or the width of the input images for direction=”backprop inputs”. Required if self.border_mode == ‘half’, a variable giving the width of the filters for direction=”backprop weights”. Not required otherwise, but if a value is given this will be checked.
-
c_header_dirs
(**kwargs)[source]¶ Return a list of header search paths required by code returned by this class.
Provides search paths for headers, in addition to those in any relevant environment variables.
Note: for Unix compilers, these are the things that get -I prefixed in the compiler command line arguments.
Examples
def c_header_dirs(self, **kwargs):
    return ['/usr/local/include', '/opt/weirdpath/src/include']
-
c_headers
(**kwargs)[source]¶ Return a list of header files required by code returned by this class.
These strings will be prefixed with #include and inserted at the beginning of the C source code. Strings in this list that start with neither < nor " will be enclosed in double-quotes.
Examples
def c_headers(self, **kwargs):
    return ['<iostream>', '<math.h>', '/full/path/to/header.h']
-
class
theano.gpuarray.blas.
BlasOp
[source]¶
-
c_header_dirs
(**kwargs)[source]¶ Return a list of header search paths required by code returned by this class.
Provides search paths for headers, in addition to those in any relevant environment variables.
Note: for Unix compilers, these are the things that get -I prefixed in the compiler command line arguments.
Examples
def c_header_dirs(self, **kwargs):
    return ['/usr/local/include', '/opt/weirdpath/src/include']
-
c_headers
(**kwargs)[source]¶ Return a list of header files required by code returned by this class.
These strings will be prefixed with #include and inserted at the beginning of the C source code. Strings in this list that start with neither < nor " will be enclosed in double-quotes.
Examples
def c_headers(self, **kwargs):
    return ['<iostream>', '<math.h>', '/full/path/to/header.h']
-
-
class
theano.gpuarray.blas.
GpuCorr3dMM
(border_mode='valid', subsample=(1, 1, 1), filter_dilation=(1, 1, 1), num_groups=1)[source]¶ GPU correlation implementation using Matrix Multiplication.
Parameters: - border_mode – The width of a border of implicit zeros to pad the input with. Must be a tuple with 3 elements giving the width of the padding on each side, or a single integer to pad the same on all sides, or a string shortcut setting the padding at runtime: 'valid' for (0, 0, 0) (valid convolution, no padding), 'full' for (kernel_rows - 1, kernel_columns - 1, kernel_depth - 1) (full convolution), 'half' for (kernel_rows // 2, kernel_columns // 2, kernel_depth // 2) (same convolution for odd-sized kernels). Note that the three widths are each applied twice, once per side (left and right, top and bottom, front and back).
- subsample – The subsample operation applied to each output image. Should be a tuple with 3 elements. (sv, sh, sl) is equivalent to GpuCorrMM(...)(...)[:, :, ::sv, ::sh, ::sl], but faster. Set to (1, 1, 1) to disable subsampling.
- filter_dilation – The filter dilation operation applied to each input image. Should be a tuple with 3 elements. Set to (1, 1, 1) to disable filter dilation.
- num_groups – The number of distinct groups the image and kernel must be divided into. Should be an int. Set to 1 to disable grouped convolution.
Notes
Currently, the Op requires the inputs, filters and outputs to be C-contiguous. Use gpu_contiguous on these arguments if needed.
You can either enable the Theano flag optimizer_including=conv_gemm to automatically replace all convolution operations with GpuCorr3dMM or one of its gradients, or you can use it as a replacement for conv2d, called as GpuCorr3dMM(subsample=...)(image, filters). The latter is currently faster, but note that it computes a correlation – if you need to compute a convolution, flip the filters as filters[:, :, ::-1, ::-1, ::-1].
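Following the note above, a hedged sketch of calling the Op directly; img and kern are assumed to be existing 5D GPU tensor variables, and gpu_contiguous ensures the required C-contiguity:

from theano.gpuarray.blas import GpuCorr3dMM
from theano.gpuarray.basic_ops import gpu_contiguous

corr = GpuCorr3dMM(border_mode="valid", subsample=(1, 1, 1))
# The Op computes a correlation; flip the filters on the last three axes
# if a true convolution is needed, as described in the Notes above.
out = corr(gpu_contiguous(img), gpu_contiguous(kern))
-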
c_code
(node, nodename, inp, out_, sub)[source]¶ Return the C implementation of an Op.
Returns C code that does the computation associated to this Op, given names for the inputs and outputs.
Parameters: - node (Apply instance) – The node for which we are compiling the current c_code. The same Op may be used in more than one node.
- name (str) – A name that is automatically assigned and guaranteed to be unique.
- inputs (list of strings) – There is a string for each input of the function, and the string is the name of a C variable pointing to that input. The type of the variable depends on the declared type of the input. There is a corresponding python variable that can be accessed by prepending “py_” to the name in the list.
- outputs (list of strings) – Each string is the name of a C variable where the Op should store its output. The type depends on the declared type of the output. There is a corresponding python variable that can be accessed by prepending “py_” to the name in the list. In some cases the outputs will be preallocated and the value of the variable may be pre-filled. The value for an unallocated output is type-dependent.
- sub (dict of strings) – Extra symbols defined in CLinker sub symbols (such as ‘fail’). WRITEME
-
grad
(inp, grads)[source]¶ Construct a graph for the gradient with respect to each input variable.
Each returned Variable represents the gradient with respect to that input computed based on the symbolic gradients with respect to each output. If the output is not differentiable with respect to an input, then this method should return an instance of type NullType for that input.
Parameters: - inputs (list of Variable) – The input variables.
- output_grads (list of Variable) – The gradients of the output variables.
Returns: grads – The gradients with respect to each Variable in inputs.
Return type: list of Variable
-
class
theano.gpuarray.blas.
GpuCorr3dMM_gradInputs
(border_mode='valid', subsample=(1, 1, 1), filter_dilation=(1, 1, 1), num_groups=1)[source]¶ Gradient wrt. inputs for GpuCorr3dMM.
Notes
You will not want to use this directly, but rely on Theano’s automatic differentiation or graph optimization to use it as needed.
-
c_code
(node, nodename, inp, out_, sub)[source]¶ Return the C implementation of an Op.
Returns C code that does the computation associated to this Op, given names for the inputs and outputs.
Parameters: - node (Apply instance) – The node for which we are compiling the current c_code. The same Op may be used in more than one node.
- name (str) – A name that is automatically assigned and guaranteed to be unique.
- inputs (list of strings) – There is a string for each input of the function, and the string is the name of a C variable pointing to that input. The type of the variable depends on the declared type of the input. There is a corresponding python variable that can be accessed by prepending “py_” to the name in the list.
- outputs (list of strings) – Each string is the name of a C variable where the Op should store its output. The type depends on the declared type of the output. There is a corresponding python variable that can be accessed by prepending “py_” to the name in the list. In some cases the outputs will be preallocated and the value of the variable may be pre-filled. The value for an unallocated output is type-dependent.
- sub (dict of strings) – Extra symbols defined in CLinker sub symbols (such as ‘fail’). WRITEME
-
grad
(inp, grads)[source]¶ Construct a graph for the gradient with respect to each input variable.
Each returned Variable represents the gradient with respect to that input computed based on the symbolic gradients with respect to each output. If the output is not differentiable with respect to an input, then this method should return an instance of type NullType for that input.
Parameters: - inputs (list of Variable) – The input variables.
- output_grads (list of Variable) – The gradients of the output variables.
Returns: grads – The gradients with respect to each Variable in inputs.
Return type: list of Variable
-
-
class
theano.gpuarray.blas.
GpuCorr3dMM_gradWeights
(border_mode='valid', subsample=(1, 1, 1), filter_dilation=(1, 1, 1), num_groups=1)[source]¶ Gradient wrt. filters for GpuCorr3dMM.
Notes
You will not want to use this directly, but rely on Theano’s automatic differentiation or graph optimization to use it as needed.
-
c_code
(node, nodename, inp, out_, sub)[source]¶ Return the C implementation of an Op.
Returns C code that does the computation associated to this Op, given names for the inputs and outputs.
Parameters: - node (Apply instance) – The node for which we are compiling the current c_code. The same Op may be used in more than one node.
- name (str) – A name that is automatically assigned and guaranteed to be unique.
- inputs (list of strings) – There is a string for each input of the function, and the string is the name of a C variable pointing to that input. The type of the variable depends on the declared type of the input. There is a corresponding python variable that can be accessed by prepending “py_” to the name in the list.
- outputs (list of strings) – Each string is the name of a C variable where the Op should store its output. The type depends on the declared type of the output. There is a corresponding python variable that can be accessed by prepending “py_” to the name in the list. In some cases the outputs will be preallocated and the value of the variable may be pre-filled. The value for an unallocated output is type-dependent.
- sub (dict of strings) – Extra symbols defined in CLinker sub symbols (such as ‘fail’). WRITEME
-
grad
(inp, grads)[source]¶ Construct a graph for the gradient with respect to each input variable.
Each returned Variable represents the gradient with respect to that input computed based on the symbolic gradients with respect to each output. If the output is not differentiable with respect to an input, then this method should return an instance of type NullType for that input.
Parameters: - inputs (list of Variable) – The input variables.
- output_grads (list of Variable) – The gradients of the output variables.
Returns: grads – The gradients with respect to each Variable in inputs.
Return type: list of Variable
-
-
class
theano.gpuarray.blas.
GpuCorrMM
(border_mode='valid', subsample=(1, 1), filter_dilation=(1, 1), num_groups=1, unshared=False)[source]¶ GPU correlation implementation using Matrix Multiplication.
Parameters: - border_mode – The width of a border of implicit zeros to pad the input with. Must be a tuple with 2 elements giving the numbers of rows and columns to pad on each side, or a single integer to pad the same on all sides, or a string shortcut setting the padding at runtime: 'valid' for (0, 0) (valid convolution, no padding), 'full' for (kernel_rows - 1, kernel_columns - 1) (full convolution), 'half' for (kernel_rows // 2, kernel_columns // 2) (same convolution for odd-sized kernels). If it is a tuple containing 2 pairs of integers, then these specify the padding to be applied on each side ((left, right), (top, bottom)). Otherwise, each width is applied twice, once per side (left and right, top and bottom). A sketch of how the string shortcuts map to pad widths follows this parameter list.
- subsample – The subsample operation applied to each output image. Should be a tuple with 2 elements. (sv, sh) is equivalent to GpuCorrMM(…)(…)[:,:,::sv, ::sh], but faster. Set to (1, 1) to disable subsampling.
- filter_dilation – The filter dilation operation applied to each input image. Should be a tuple with 2 elements. Set to (1, 1) to disable filter dilation.
- num_groups – The number of distinct groups the image and kernel must be divided into. Should be an int. Set to 1 to disable grouped convolution.
- unshared – Perform unshared correlation (default: False).
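As referenced in the border_mode description above, the following pure-Python helper (hypothetical, not part of Theano or its API) sketches how the documented string shortcuts map to explicit pad widths for a given 2D kernel shape.
def border_mode_to_padding(border_mode, kernel_shape):
    # Hypothetical helper: maps the documented string shortcuts to explicit
    # (pad_rows, pad_columns) widths for a kernel of shape
    # (kernel_rows, kernel_columns).
    kernel_rows, kernel_columns = kernel_shape
    if border_mode == 'valid':
        return (0, 0)                                   # no padding
    if border_mode == 'full':
        return (kernel_rows - 1, kernel_columns - 1)    # full convolution
    if border_mode == 'half':
        return (kernel_rows // 2, kernel_columns // 2)  # "same" for odd-sized kernels
    return border_mode  # already an explicit int or tuple of widths

assert border_mode_to_padding('half', (3, 5)) == (1, 2)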
Notes
Currently, the Op requires the inputs, filters and outputs to be C-contiguous. Use gpu_contiguous on these arguments if needed. You can either enable the Theano flag optimizer_including=conv_gemm to automatically replace all convolution operations with GpuCorrMM or one of its gradients, or you can use it as a replacement for conv2d, called as GpuCorrMM(subsample=…)(image, filters). The latter is currently faster, but note that it computes a correlation – if you need to compute a convolution, flip the filters as filters[:,:,::-1,::-1].
-
c_code
(node, nodename, inp, out_, sub)[source]¶ Return the C implementation of an Op.
Returns C code that does the computation associated to this Op, given names for the inputs and outputs.
Parameters: - node (Apply instance) – The node for which we are compiling the current c_code. The same Op may be used in more than one node.
- name (str) – A name that is automatically assigned and guaranteed to be unique.
- inputs (list of strings) – There is a string for each input of the function, and the string is the name of a C variable pointing to that input. The type of the variable depends on the declared type of the input. There is a corresponding python variable that can be accessed by prepending “py_” to the name in the list.
- outputs (list of strings) – Each string is the name of a C variable where the Op should store its output. The type depends on the declared type of the output. There is a corresponding python variable that can be accessed by prepending “py_” to the name in the list. In some cases the outputs will be preallocated and the value of the variable may be pre-filled. The value for an unallocated output is type-dependent.
- sub (dict of strings) – Extra symbols defined in CLinker sub symbols (such as ‘fail’). WRITEME
-
grad
(inp, grads)[source]¶ Construct a graph for the gradient with respect to each input variable.
Each returned Variable represents the gradient with respect to that input computed based on the symbolic gradients with respect to each output. If the output is not differentiable with respect to an input, then this method should return an instance of type NullType for that input.
Parameters: - inputs (list of Variable) – The input variables.
- output_grads (list of Variable) – The gradients of the output variables.
Returns: grads – The gradients with respect to each Variable in inputs.
Return type: list of Variable
-
class
theano.gpuarray.blas.
GpuCorrMM_gradInputs
(border_mode='valid', subsample=(1, 1), filter_dilation=(1, 1), num_groups=1, unshared=False)[source]¶ Gradient wrt. inputs for GpuCorrMM.
Notes
You will not want to use this directly, but rely on Theano’s automatic differentiation or graph optimization to use it as needed.
-
c_code
(node, nodename, inp, out_, sub)[source]¶ Return the C implementation of an Op.
Returns C code that does the computation associated to this Op, given names for the inputs and outputs.
Parameters: - node (Apply instance) – The node for which we are compiling the current c_code. The same Op may be used in more than one node.
- name (str) – A name that is automatically assigned and guaranteed to be unique.
- inputs (list of strings) – There is a string for each input of the function, and the string is the name of a C variable pointing to that input. The type of the variable depends on the declared type of the input. There is a corresponding python variable that can be accessed by prepending “py_” to the name in the list.
- outputs (list of strings) – Each string is the name of a C variable where the Op should store its output. The type depends on the declared type of the output. There is a corresponding python variable that can be accessed by prepending “py_” to the name in the list. In some cases the outputs will be preallocated and the value of the variable may be pre-filled. The value for an unallocated output is type-dependent.
- sub (dict of strings) – Extra symbols defined in CLinker sub symbols (such as ‘fail’). WRITEME
-
grad
(inp, grads)[source]¶ Construct a graph for the gradient with respect to each input variable.
Each returned Variable represents the gradient with respect to that input computed based on the symbolic gradients with respect to each output. If the output is not differentiable with respect to an input, then this method should return an instance of type NullType for that input.
Parameters: - inputs (list of Variable) – The input variables.
- output_grads (list of Variable) – The gradients of the output variables.
Returns: grads – The gradients with respect to each Variable in inputs.
Return type: list of Variable
-
-
class
theano.gpuarray.blas.
GpuCorrMM_gradWeights
(border_mode='valid', subsample=(1, 1), filter_dilation=(1, 1), num_groups=1, unshared=False)[source]¶ Gradient wrt. filters for GpuCorrMM.
Notes
You will not want to use this directly, but rely on Theano’s automatic differentiation or graph optimization to use it as needed.
-
c_code
(node, nodename, inp, out_, sub)[source]¶ Return the C implementation of an Op.
Returns C code that does the computation associated to this Op, given names for the inputs and outputs.
Parameters: - node (Apply instance) – The node for which we are compiling the current c_code. The same Op may be used in more than one node.
- name (str) – A name that is automatically assigned and guaranteed to be unique.
- inputs (list of strings) – There is a string for each input of the function, and the string is the name of a C variable pointing to that input. The type of the variable depends on the declared type of the input. There is a corresponding python variable that can be accessed by prepending “py_” to the name in the list.
- outputs (list of strings) – Each string is the name of a C variable where the Op should store its output. The type depends on the declared type of the output. There is a corresponding python variable that can be accessed by prepending “py_” to the name in the list. In some cases the outputs will be preallocated and the value of the variable may be pre-filled. The value for an unallocated output is type-dependent.
- sub (dict of strings) – Extra symbols defined in CLinker sub symbols (such as ‘fail’). WRITEME
-
grad
(inp, grads)[source]¶ Construct a graph for the gradient with respect to each input variable.
Each returned Variable represents the gradient with respect to that input computed based on the symbolic gradients with respect to each output. If the output is not differentiable with respect to an input, then this method should return an instance of type NullType for that input.
Parameters: - inputs (list of Variable) – The input variables.
- output_grads (list of Variable) – The gradients of the output variables.
Returns: grads – The gradients with respect to each Variable in inputs.
Return type: list of Variable
-
-
class
theano.gpuarray.blas.
GpuDot22
[source]¶ Dot22 on the GPU.
-
c_code
(node, name, inputs, outputs, sub)[source]¶ Return the C implementation of an Op.
Returns C code that does the computation associated to this Op, given names for the inputs and outputs.
Parameters: - node (Apply instance) – The node for which we are compiling the current c_code. The same Op may be used in more than one node.
- name (str) – A name that is automatically assigned and guaranteed to be unique.
- inputs (list of strings) – There is a string for each input of the function, and the string is the name of a C variable pointing to that input. The type of the variable depends on the declared type of the input. There is a corresponding python variable that can be accessed by prepending “py_” to the name in the list.
- outputs (list of strings) – Each string is the name of a C variable where the Op should store its output. The type depends on the declared type of the output. There is a corresponding python variable that can be accessed by prepending “py_” to the name in the list. In some cases the outputs will be preallocated and the value of the variable may be pre-filled. The value for an unallocated output is type-dependent.
- sub (dict of strings) – Extra symbols defined in CLinker sub symbols (such as ‘fail’). WRITEME
-
c_code_cache_version
()[source]¶ Return a tuple of integers indicating the version of this Op.
An empty tuple indicates an ‘unversioned’ Op that will not be cached between processes.
The cache mechanism may erase cached modules that have been superseded by newer versions. See ModuleCache for details.
See also
c_code_cache_version_apply
-
make_node
(x, y)[source]¶ Construct an Apply node that represents the application of this operation to the given inputs.
This must be implemented by sub-classes.
Returns: node – The constructed Apply node. Return type: Apply
-
perform
(node, inputs, outputs)[source]¶ Calculate the function on the inputs and put the variables in the output storage.
Parameters: - node (Apply) – The symbolic Apply node that represents this computation.
- inputs (Sequence) – Immutable sequence of non-symbolic/numeric inputs. These are the values of each Variable in node.inputs.
- output_storage (list of list) – List of mutable single-element lists (do not change the length of these lists). Each sub-list corresponds to the value of each Variable in node.outputs. The primary purpose of this method is to set the values of these sub-lists.
- params (tuple) – A tuple containing the values of each entry in __props__.
Notes
The output_storage list might contain data. If an element of output_storage is not None, it has to be of the right type, for instance, for a TensorVariable, it has to be a NumPy ndarray with the right number of dimensions and the correct dtype. Its shape and stride pattern can be arbitrary. It is not guaranteed that such pre-set values were produced by a previous call to this Op.perform; they could’ve been allocated by another Op’s perform method. An Op is free to reuse output_storage as it sees fit, or to discard it and allocate new memory.
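As a point of reference (an assumption based on the op name and the make_node(x, y) signature above, not text from this entry), Dot22 computes the matrix-matrix product of two 2D arrays. A NumPy sketch:
import numpy as np

def dot22_reference(x, y):
    # Matrix-matrix product of two 2D arrays, which is what this op
    # is understood to compute on the GPU.
    return np.dot(x, y)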
-
-
class
theano.gpuarray.blas.
GpuGemm
(inplace=False)[source]¶ Gemm on the GPU.
-
c_code
(node, name, inp, out, sub)[source]¶ Return the C implementation of an Op.
Returns C code that does the computation associated to this Op, given names for the inputs and outputs.
Parameters: - node (Apply instance) – The node for which we are compiling the current c_code. The same Op may be used in more than one node.
- name (str) – A name that is automatically assigned and guaranteed to be unique.
- inputs (list of strings) – There is a string for each input of the function, and the string is the name of a C variable pointing to that input. The type of the variable depends on the declared type of the input. There is a corresponding python variable that can be accessed by prepending “py_” to the name in the list.
- outputs (list of strings) – Each string is the name of a C variable where the Op should store its output. The type depends on the declared type of the output. There is a corresponding python variable that can be accessed by prepending “py_” to the name in the list. In some cases the outputs will be preallocated and the value of the variable may be pre-filled. The value for an unallocated output is type-dependent.
- sub (dict of strings) – Extra symbols defined in CLinker sub symbols (such as ‘fail’). WRITEME
-
c_code_cache_version
()[source]¶ Return a tuple of integers indicating the version of this Op.
An empty tuple indicates an ‘unversioned’ Op that will not be cached between processes.
The cache mechanism may erase cached modules that have been superseded by newer versions. See ModuleCache for details.
See also
c_code_cache_version_apply
-
make_node
(C, alpha, A, B, beta)[source]¶ Construct an Apply node that represents the application of this operation to the given inputs.
This must be implemented by sub-classes.
Returns: node – The constructed Apply node. Return type: Apply
-
perform
(node, inputs, outputs, params)[source]¶ Calculate the function on the inputs and put the variables in the output storage.
Parameters: - node (Apply) – The symbolic Apply node that represents this computation.
- inputs (Sequence) – Immutable sequence of non-symbolic/numeric inputs. These are the values of each Variable in node.inputs.
- output_storage (list of list) – List of mutable single-element lists (do not change the length of these lists). Each sub-list corresponds to the value of each Variable in node.outputs. The primary purpose of this method is to set the values of these sub-lists.
- params (tuple) – A tuple containing the values of each entry in __props__.
Notes
The output_storage list might contain data. If an element of output_storage is not None, it has to be of the right type, for instance, for a TensorVariable, it has to be a NumPy ndarray with the right number of dimensions and the correct dtype. Its shape and stride pattern can be arbitrary. It is not guaranteed that such pre-set values were produced by a previous call to this Op.perform; they could’ve been allocated by another Op’s perform method. An Op is free to reuse output_storage as it sees fit, or to discard it and allocate new memory.
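A NumPy reference sketch of the BLAS GEMM update this op performs on the GPU; the argument order follows make_node(C, alpha, A, B, beta) above, and the helper name is illustrative only.
import numpy as np

def gemm_reference(C, alpha, A, B, beta):
    # With inplace=True the op may overwrite C's storage; this sketch
    # always returns a fresh array.
    return beta * C + alpha * np.dot(A, B)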
-
-
class
theano.gpuarray.blas.
GpuGemmBatch
(inplace=False)[source]¶ -
c_code
(node, name, inp, out, sub)[source]¶ Return the C implementation of an Op.
Returns C code that does the computation associated to this Op, given names for the inputs and outputs.
Parameters: - node (Apply instance) – The node for which we are compiling the current c_code. The same Op may be used in more than one node.
- name (str) – A name that is automatically assigned and guaranteed to be unique.
- inputs (list of strings) – There is a string for each input of the function, and the string is the name of a C variable pointing to that input. The type of the variable depends on the declared type of the input. There is a corresponding python variable that can be accessed by prepending “py_” to the name in the list.
- outputs (list of strings) – Each string is the name of a C variable where the Op should store its output. The type depends on the declared type of the output. There is a corresponding python variable that can be accessed by prepending “py_” to the name in the list. In some cases the outputs will be preallocated and the value of the variable may be pre-filled. The value for an unallocated output is type-dependent.
- sub (dict of strings) – Extra symbols defined in CLinker sub symbols (such as ‘fail’). WRITEME
-
c_code_cache_version
()[source]¶ Return a tuple of integers indicating the version of this Op.
An empty tuple indicates an ‘unversioned’ Op that will not be cached between processes.
The cache mechanism may erase cached modules that have been superseded by newer versions. See ModuleCache for details.
See also
c_code_cache_version_apply
-
c_headers
(**kwargs)[source]¶ Return a list of header files required by code returned by this class.
These strings will be prefixed with #include and inserted at the beginning of the C source code. Strings in this list that start neither with < nor " will be enclosed in double-quotes.
Examples
def c_headers(self, **kwargs):
    return ['<iostream>', '<math.h>', '/full/path/to/header.h']
-
-
class
theano.gpuarray.blas.
GpuGemv
(inplace=False)[source]¶ Gemv on the GPU.
-
c_code
(node, name, inp, out, sub)[source]¶ Return the C implementation of an Op.
Returns C code that does the computation associated to this Op, given names for the inputs and outputs.
Parameters: - node (Apply instance) – The node for which we are compiling the current c_code. The same Op may be used in more than one node.
- name (str) – A name that is automatically assigned and guaranteed to be unique.
- inputs (list of strings) – There is a string for each input of the function, and the string is the name of a C variable pointing to that input. The type of the variable depends on the declared type of the input. There is a corresponding python variable that can be accessed by prepending “py_” to the name in the list.
- outputs (list of strings) – Each string is the name of a C variable where the Op should store its output. The type depends on the declared type of the output. There is a corresponding python variable that can be accessed by prepending “py_” to the name in the list. In some cases the outputs will be preallocated and the value of the variable may be pre-filled. The value for an unallocated output is type-dependent.
- sub (dict of strings) – Extra symbols defined in CLinker sub symbols (such as ‘fail’). WRITEME
-
c_code_cache_version
()[source]¶ Return a tuple of integers indicating the version of this Op.
An empty tuple indicates an ‘unversioned’ Op that will not be cached between processes.
The cache mechanism may erase cached modules that have been superseded by newer versions. See ModuleCache for details.
See also
c_code_cache_version_apply
-
make_node
(y, alpha, A, x, beta)[source]¶ Construct an Apply node that represents the application of this operation to the given inputs.
This must be implemented by sub-classes.
Returns: node – The constructed Apply node. Return type: Apply
-
perform
(node, inputs, out_storage, params)[source]¶ Calculate the function on the inputs and put the variables in the output storage.
Parameters: - node (Apply) – The symbolic Apply node that represents this computation.
- inputs (Sequence) – Immutable sequence of non-symbolic/numeric inputs. These are the values of each Variable in node.inputs.
- output_storage (list of list) – List of mutable single-element lists (do not change the length of these lists). Each sub-list corresponds to the value of each Variable in node.outputs. The primary purpose of this method is to set the values of these sub-lists.
- params (tuple) – A tuple containing the values of each entry in __props__.
Notes
The output_storage list might contain data. If an element of output_storage is not None, it has to be of the right type, for instance, for a TensorVariable, it has to be a NumPy ndarray with the right number of dimensions and the correct dtype. Its shape and stride pattern can be arbitrary. It is not guaranteed that such pre-set values were produced by a previous call to this Op.perform; they could’ve been allocated by another Op’s perform method. An Op is free to reuse output_storage as it sees fit, or to discard it and allocate new memory.
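A NumPy reference sketch of the BLAS GEMV update this op performs on the GPU; the argument order follows make_node(y, alpha, A, x, beta) above, and the helper name is illustrative only.
import numpy as np

def gemv_reference(y, alpha, A, x, beta):
    # Matrix-vector update; the real op may work in place when inplace=True.
    return beta * y + alpha * np.dot(A, x)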
-
-
class
theano.gpuarray.blas.
GpuGer
(inplace=False)[source]¶ Ger on the GPU.
-
c_code
(node, name, inp, out, sub)[source]¶ Return the C implementation of an Op.
Returns C code that does the computation associated to this Op, given names for the inputs and outputs.
Parameters: - node (Apply instance) – The node for which we are compiling the current c_code. The same Op may be used in more than one node.
- name (str) – A name that is automatically assigned and guaranteed to be unique.
- inputs (list of strings) – There is a string for each input of the function, and the string is the name of a C variable pointing to that input. The type of the variable depends on the declared type of the input. There is a corresponding python variable that can be accessed by prepending “py_” to the name in the list.
- outputs (list of strings) – Each string is the name of a C variable where the Op should store its output. The type depends on the declared type of the output. There is a corresponding python variable that can be accessed by prepending “py_” to the name in the list. In some cases the outputs will be preallocated and the value of the variable may be pre-filled. The value for an unallocated output is type-dependent.
- sub (dict of strings) – Extra symbols defined in CLinker sub symbols (such as ‘fail’). WRITEME
-
c_code_cache_version
()[source]¶ Return a tuple of integers indicating the version of this Op.
An empty tuple indicates an ‘unversioned’ Op that will not be cached between processes.
The cache mechanism may erase cached modules that have been superseded by newer versions. See ModuleCache for details.
See also
c_code_cache_version_apply
-
make_node
(A, alpha, x, y)[source]¶ Construct an Apply node that represents the application of this operation to the given inputs.
This must be implemented by sub-classes.
Returns: node – The constructed Apply node. Return type: Apply
-
perform
(node, inp, out, params)[source]¶ Calculate the function on the inputs and put the variables in the output storage.
Parameters: - node (Apply) – The symbolic Apply node that represents this computation.
- inputs (Sequence) – Immutable sequence of non-symbolic/numeric inputs. These are the values of each Variable in node.inputs.
- output_storage (list of list) – List of mutable single-element lists (do not change the length of these lists). Each sub-list corresponds to the value of each Variable in node.outputs. The primary purpose of this method is to set the values of these sub-lists.
- params (tuple) – A tuple containing the values of each entry in __props__.
Notes
The output_storage list might contain data. If an element of output_storage is not None, it has to be of the right type, for instance, for a TensorVariable, it has to be a NumPy ndarray with the right number of dimensions and the correct dtype. Its shape and stride pattern can be arbitrary. It is not guaranteed that such pre-set values were produced by a previous call to this Op.perform; they could’ve been allocated by another Op’s perform method. An Op is free to reuse output_storage as it sees fit, or to discard it and allocate new memory.
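A NumPy reference sketch of the BLAS GER (rank-1) update this op performs on the GPU; the argument order follows make_node(A, alpha, x, y) above, and the helper name is illustrative only.
import numpy as np

def ger_reference(A, alpha, x, y):
    # Rank-1 update of A; the real op may work in place when inplace=True.
    return A + alpha * np.outer(x, y)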
-
Elemwise Op¶
-
class
theano.gpuarray.elemwise.
GpuCAReduceCPY
(scalar_op, axis=None, dtype=None, acc_dtype=None)[source]¶ CAReduce that reuses the Python code from gpuarray.
-
c_code
(node, name, inp, out, sub)[source]¶ Return the C implementation of an Op.
Returns C code that does the computation associated to this Op, given names for the inputs and outputs.
Parameters: - node (Apply instance) – The node for which we are compiling the current c_code. The same Op may be used in more than one node.
- name (str) – A name that is automatically assigned and guaranteed to be unique.
- inputs (list of strings) – There is a string for each input of the function, and the string is the name of a C variable pointing to that input. The type of the variable depends on the declared type of the input. There is a corresponding python variable that can be accessed by prepending “py_” to the name in the list.
- outputs (list of strings) – Each string is the name of a C variable where the Op should store its output. The type depends on the declared type of the output. There is a corresponding python variable that can be accessed by prepending “py_” to the name in the list. In some cases the outputs will be preallocated and the value of the variable may be pre-filled. The value for an unallocated output is type-dependent.
- sub (dict of strings) – Extra symbols defined in CLinker sub symbols (such as ‘fail’). WRITEME
-
c_code_cache_version_apply
(node)[source]¶ Return a tuple of integers indicating the version of this Op.
An empty tuple indicates an ‘unversioned’ Op that will not be cached between processes.
The cache mechanism may erase cached modules that have been superseded by newer versions. See ModuleCache for details.
See also
Notes
This function overrides c_code_cache_version unless it explicitly calls c_code_cache_version. The default implementation simply calls c_code_cache_version and ignores the node argument.
-
get_params
(node)[source]¶ Try to detect params from the op if Op.params_type is set to a ParamsType.
-
gpu_kernels
(node, name)[source]¶ This is the method to override. This should return an iterable of Kernel objects that describe the kernels this op will need.
-
make_node
(input)[source]¶ Construct an Apply node that represents the application of this operation to the given inputs.
This must be implemented by sub-classes.
Returns: node – The constructed Apply node. Return type: Apply
-
perform
(node, inp, out, ctx)[source]¶ Calculate the function on the inputs and put the variables in the output storage.
Parameters: - node (Apply) – The symbolic Apply node that represents this computation.
- inputs (Sequence) – Immutable sequence of non-symbolic/numeric inputs. These are the values of each Variable in node.inputs.
- output_storage (list of list) – List of mutable single-element lists (do not change the length of these lists). Each sub-list corresponds to the value of each Variable in node.outputs. The primary purpose of this method is to set the values of these sub-lists.
- params (tuple) – A tuple containing the values of each entry in __props__.
Notes
The output_storage list might contain data. If an element of output_storage is not None, it has to be of the right type, for instance, for a TensorVariable, it has to be a NumPy ndarray with the right number of dimensions and the correct dtype. Its shape and stride pattern can be arbitrary. It is not guaranteed that such pre-set values were produced by a previous call to this Op.perform; they could’ve been allocated by another Op’s perform method. An Op is free to reuse output_storage as it sees fit, or to discard it and allocate new memory.
-
prepare_node
(node, storage_map, compute_map, impl)[source]¶ Make any special modifications that the Op needs before doing Op.make_thunk.
This can modify the node inplace and should return nothing.
It can be called multiple times with different impl values. It is the Op’s responsibility not to re-prepare the node when it isn’t appropriate to do so.
-
-
class
theano.gpuarray.elemwise.
GpuCAReduceCuda
(scalar_op, axis=None, reduce_mask=None, dtype=None, acc_dtype=None, pre_scalar_op=None)[source]¶ GpuCAReduceCuda is a Reduction along some dimensions by a scalar op.
Parameters: - reduce_mask – The dimensions along which to reduce. The reduce_mask is a tuple of booleans (actually integers 0 or 1) that specifies, for each input dimension, whether to reduce it (1) or not (0).
- pre_scalar_op – If present, must be a scalar op with only 1 input. We will execute it on the input value before reduction.
Examples
When scalar_op is a theano.scalar.basic.Add instance:
- reduce_mask == (1,) sums a vector to a scalar
- reduce_mask == (1,0) computes the sum of each column in a matrix
- reduce_mask == (0,1) computes the sum of each row in a matrix
- reduce_mask == (1,1,1) computes the sum of all elements in a 3-tensor.
Notes
Any reduce_mask of all zeros is a sort of ‘copy’, and may be removed during graph optimization.
This Op is a work in progress.
This op was recently upgraded from just GpuSum to a general CAReduce. Not many code cases are supported yet for scalar_op being anything other than scalar.Add instances.
Important note: if you implement new cases for this op, be sure to benchmark them and make sure that they actually result in a speedup. GPUs are not especially well-suited to reduction operations so it is quite possible that the GPU might be slower for some cases.
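The reduce_mask examples above can be checked against a small NumPy reference sketch (an illustration only, not Theano code): a reduce_mask marks which axes are reduced, so with an Add scalar_op the result matches a NumPy sum over those axes.
import numpy as np

def careduce_reference(x, reduce_mask):
    # Axes flagged with 1 in reduce_mask are summed away (Add scalar_op case).
    axes = tuple(i for i, m in enumerate(reduce_mask) if m)
    return x.sum(axis=axes)

m = np.arange(6, dtype='float32').reshape(2, 3)
assert np.allclose(careduce_reference(m, (1, 0)), m.sum(axis=0))  # sum of each column
assert np.allclose(careduce_reference(m, (0, 1)), m.sum(axis=1))  # sum of each row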
-
c_code
(node, name, inp, out, sub)[source]¶ Return the C implementation of an Op.
Returns C code that does the computation associated to this Op, given names for the inputs and outputs.
Parameters: - node (Apply instance) – The node for which we are compiling the current c_code. The same Op may be used in more than one node.
- name (str) – A name that is automatically assigned and guaranteed to be unique.
- inputs (list of strings) – There is a string for each input of the function, and the string is the name of a C variable pointing to that input. The type of the variable depends on the declared type of the input. There is a corresponding python variable that can be accessed by prepending “py_” to the name in the list.
- outputs (list of strings) – Each string is the name of a C variable where the Op should store its output. The type depends on the declared type of the output. There is a corresponding python variable that can be accessed by prepending “py_” to the name in the list. In some cases the outputs will be preallocated and the value of the variable may be pre-filled. The value for an unallocated output is type-dependent.
- sub (dict of strings) – Extra symbols defined in CLinker sub symbols (such as ‘fail’). WRITEME
-
c_code_cache_version_apply
(node)[source]¶ Return a tuple of integers indicating the version of this Op.
An empty tuple indicates an ‘unversioned’ Op that will not be cached between processes.
The cache mechanism may erase cached modules that have been superseded by newer versions. See ModuleCache for details.
See also
Notes
This function overrides c_code_cache_version unless it explicitly calls c_code_cache_version. The default implementation simply calls c_code_cache_version and ignores the node argument.
-
c_code_reduce_01X
(sio, node, name, x, z, fail, N)[source]¶ Parameters: N – The number of 1s in the pattern: N=1 -> 01, N=2 -> 011, N=3 -> 0111. Works for N=1, 2, 3.
-
c_headers
(**kwargs)[source]¶ Return a list of header files required by code returned by this class.
These strings will be prefixed with #include and inserted at the beginning of the C source code. Strings in this list that start neither with < nor " will be enclosed in double-quotes.
Examples
def c_headers(self, **kwargs):
    return ['<iostream>', '<math.h>', '/full/path/to/header.h']
-
c_support_code
(**kwargs)[source]¶ Return utility code for use by a Variable or Op.
This is included at global scope prior to the rest of the code for this class.
Question: How many times will this support code be emitted for a graph with many instances of the same type?
Returns: Return type: str
-
gpu_kernels
(node, nodename)[source]¶ This is the method to override. This should return an iterable of Kernel objects that describe the kernels this op will need.
-
class
theano.gpuarray.elemwise.
GpuDimShuffle
(input_broadcastable, new_order, inplace=True)[source]¶ DimShuffle on the GPU.
-
make_node
(input)[source]¶ Construct an Apply node that represents the application of this operation to the given inputs.
This must be implemented by sub-classes.
Returns: node – The constructed Apply node. Return type: Apply
-
perform
(node, inp, out, params)[source]¶ Calculate the function on the inputs and put the variables in the output storage.
Parameters: - node (Apply) – The symbolic Apply node that represents this computation.
- inputs (Sequence) – Immutable sequence of non-symbolic/numeric inputs. These are the values of each Variable in node.inputs.
- output_storage (list of list) – List of mutable single-element lists (do not change the length of these lists). Each sub-list corresponds to the value of each Variable in node.outputs. The primary purpose of this method is to set the values of these sub-lists.
- params (tuple) – A tuple containing the values of each entry in __props__.
Notes
The output_storage list might contain data. If an element of output_storage is not None, it has to be of the right type, for instance, for a TensorVariable, it has to be a NumPy ndarray with the right number of dimensions and the correct dtype. Its shape and stride pattern can be arbitrary. It is not guaranteed that such pre-set values were produced by a previous call to this Op.perform; they could’ve been allocated by another Op’s perform method. An Op is free to reuse output_storage as it sees fit, or to discard it and allocate new memory.
-
-
class
theano.gpuarray.elemwise.
GpuElemwise
(scalar_op, inplace_pattern=None, name=None, nfunc_spec=None, openmp=None)[source]¶ Elemwise on the GPU.
-
c_cleanup_code_struct
(node, name)[source]¶ Return an Apply-specific code string to be inserted in the struct cleanup code.
Parameters: - node (Apply) – The node in the graph being compiled
- name (str) – A unique name to distinguish variables from those of other nodes.
-
c_code
(node, name, inputs, outputs, sub)[source]¶ Return the C implementation of an Op.
Returns C code that does the computation associated to this Op, given names for the inputs and outputs.
Parameters: - node (Apply instance) – The node for which we are compiling the current c_code. The same Op may be used in more than one node.
- name (str) – A name that is automatically assigned and guaranteed to be unique.
- inputs (list of strings) – There is a string for each input of the function, and the string is the name of a C variable pointing to that input. The type of the variable depends on the declared type of the input. There is a corresponding python variable that can be accessed by prepending “py_” to the name in the list.
- outputs (list of strings) – Each string is the name of a C variable where the Op should store its output. The type depends on the declared type of the output. There is a corresponding python variable that can be accessed by prepending “py_” to the name in the list. In some cases the outputs will be preallocated and the value of the variable may be pre-filled. The value for an unallocated output is type-dependent.
- sub (dict of strings) – Extra symbols defined in CLinker sub symbols (such as ‘fail’). WRITEME
-
c_code_cache_version
()[source]¶ Return a tuple of integers indicating the version of this Op.
An empty tuple indicates an ‘unversioned’ Op that will not be cached between processes.
The cache mechanism may erase cached modules that have been superseded by newer versions. See ModuleCache for details.
See also
c_code_cache_version_apply
-
c_init_code_struct
(node, name, sub)[source]¶ Return an Apply-specific code string to be inserted in the struct initialization code.
Parameters: - node (Apply) – The node in the graph being compiled.
- name (str) – A unique name to distinguish variables from those of other nodes.
- sub (dict of str) – A dictionary of values to substitute in the code. Most notably it contains a 'fail' entry that you should place in your code after setting a Python exception to indicate an error.
-
c_support_code_struct
(node, name)[source]¶ Return Apply-specific utility code for use by an Op that will be inserted at struct scope.
Parameters: - node (Apply) – The node in the graph being compiled
- name (str) – A unique name to distinguish your variables from those of other nodes.
-
get_params
(node)[source]¶ Try to detect params from the op if Op.params_type is set to a ParamsType.
-
-
class
theano.gpuarray.elemwise.
GpuErfcinv
(output_types_preference=None, name=None)[source]¶ Inverse complementary error function for GPU.
-
c_code
(node, name, inp, out, sub)[source]¶ Return the C implementation of an Op.
Returns C code that does the computation associated to this Op, given names for the inputs and outputs.
Parameters: - node (Apply instance) – The node for which we are compiling the current c_code. The same Op may be used in more than one node.
- name (str) – A name that is automatically assigned and guaranteed to be unique.
- inputs (list of strings) – There is a string for each input of the function, and the string is the name of a C variable pointing to that input. The type of the variable depends on the declared type of the input. There is a corresponding python variable that can be accessed by prepending “py_” to the name in the list.
- outputs (list of strings) – Each string is the name of a C variable where the Op should store its output. The type depends on the declared type of the output. There is a corresponding python variable that can be accessed by prepending “py_” to the name in the list. In some cases the outputs will be preallocated and the value of the variable may be pre-filled. The value for an unallocated output is type-dependent.
- sub (dict of strings) – Extra symbols defined in CLinker sub symbols (such as ‘fail’). WRITEME
-
c_headers
(**kwargs)[source]¶ Return a list of header files required by code returned by this class.
These strings will be prefixed with #include and inserted at the beginning of the C source code. Strings in this list that start neither with < nor " will be enclosed in double-quotes.
Examples
def c_headers(self, **kwargs):
    return ['<iostream>', '<math.h>', '/full/path/to/header.h']
-
-
class
theano.gpuarray.elemwise.
GpuErfinv
(output_types_preference=None, name=None)[source]¶ Inverse error function for GPU.
-
c_code
(node, name, inp, out, sub)[source]¶ Return the C implementation of an Op.
Returns C code that does the computation associated to this Op, given names for the inputs and outputs.
Parameters: - node (Apply instance) – The node for which we are compiling the current c_code. The same Op may be used in more than one node.
- name (str) – A name that is automatically assigned and guaranteed to be unique.
- inputs (list of strings) – There is a string for each input of the function, and the string is the name of a C variable pointing to that input. The type of the variable depends on the declared type of the input. There is a corresponding python variable that can be accessed by prepending “py_” to the name in the list.
- outputs (list of strings) – Each string is the name of a C variable where the Op should store its output. The type depends on the declared type of the output. There is a corresponding python variable that can be accessed by prepending “py_” to the name in the list. In some cases the outputs will be preallocated and the value of the variable may be pre-filled. The value for an unallocated output is type-dependent.
- sub (dict of strings) – Extra symbols defined in CLinker sub symbols (such as ‘fail’). WRITEME
-
c_headers
(**kwargs)[source]¶ Return a list of header files required by code returned by this class.
These strings will be prefixed with #include and inserted at the beginning of the C source code. Strings in this list that start neither with < nor " will be enclosed in double-quotes.
Examples
def c_headers(self, **kwargs):
    return ['<iostream>', '<math.h>', '/full/path/to/header.h']
-
Subtensor Op¶
-
class
theano.gpuarray.subtensor.
GpuAdvancedIncSubtensor
(inplace=False, set_instead_of_inc=False)[source]¶ Implement AdvancedIncSubtensor on the gpu.
-
class
theano.gpuarray.subtensor.
GpuAdvancedIncSubtensor1
(inplace=False, set_instead_of_inc=False)[source]¶ Implement AdvancedIncSubtensor1 on the gpu.
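A hedged NumPy reference sketch of the update this op performs along the first axis, following the make_node(x, y, ilist) signature below; the helper name is illustrative only.
import numpy as np

def adv_inc_subtensor1_reference(x, y, ilist, set_instead_of_inc=False):
    # Reference only: the real op can update x in place when inplace=True.
    out = np.array(x, copy=True)
    if set_instead_of_inc:
        out[ilist] = y                # set the selected rows
    else:
        np.add.at(out, ilist, y)      # increment; accumulates duplicate indices
    return out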
-
c_code
(node, name, inputs, outputs, sub)[source]¶ Return the C implementation of an Op.
Returns C code that does the computation associated to this Op, given names for the inputs and outputs.
Parameters: - node (Apply instance) – The node for which we are compiling the current c_code. The same Op may be used in more than one node.
- name (str) – A name that is automatically assigned and guaranteed to be unique.
- inputs (list of strings) – There is a string for each input of the function, and the string is the name of a C variable pointing to that input. The type of the variable depends on the declared type of the input. There is a corresponding python variable that can be accessed by prepending “py_” to the name in the list.
- outputs (list of strings) – Each string is the name of a C variable where the Op should store its output. The type depends on the declared type of the output. There is a corresponding python variable that can be accessed by prepending “py_” to the name in the list. In some cases the outputs will be preallocated and the value of the variable may be pre-filled. The value for an unallocated output is type-dependent.
- sub (dict of strings) – Extra symbols defined in CLinker sub symbols (such as ‘fail’). WRITEME
-
c_code_cache_version
()[source]¶ Return a tuple of integers indicating the version of this Op.
An empty tuple indicates an ‘unversioned’ Op that will not be cached between processes.
The cache mechanism may erase cached modules that have been superseded by newer versions. See ModuleCache for details.
See also
c_code_cache_version_apply
-
c_header_dirs
(**kwargs)[source]¶ Return a list of header search paths required by code returned by this class.
Provides search paths for headers, in addition to those in any relevant environment variables.
Note: for Unix compilers, these are the things that get -I prefixed in the compiler command line arguments.
Examples
def c_header_dirs(self, **kwargs):
    return ['/usr/local/include', '/opt/weirdpath/src/include']
-
c_headers
(**kwargs)[source]¶ Return a list of header files required by code returned by this class.
These strings will be prefixed with #include and inserted at the beginning of the C source code. Strings in this list that start neither with < nor " will be enclosed in double-quotes.
Examples
def c_headers(self, **kwargs):
    return ['<iostream>', '<math.h>', '/full/path/to/header.h']
-
c_init_code_struct
(node, name, sub)[source]¶ Return an Apply-specific code string to be inserted in the struct initialization code.
Parameters: - node (Apply) – The node in the graph being compiled.
- name (str) – A unique name to distinguish variables from those of other nodes.
- sub (dict of str) – A dictionary of values to substitute in the code. Most notably it contains a 'fail' entry that you should place in your code after setting a Python exception to indicate an error.
-
c_support_code_struct
(node, nodename)[source]¶ Return Apply-specific utility code for use by an Op that will be inserted at struct scope.
Parameters: - node (Apply) – The node in the graph being compiled
- name (str) – A unique name to distinguish your variables from those of other nodes.
-
get_params
(node)[source]¶ Try to detect params from the op if Op.params_type is set to a ParamsType.
-
make_node
(x, y, ilist)[source]¶ Construct an Apply node that represents the application of this operation to the given inputs.
This must be implemented by sub-classes.
Returns: node – The constructed Apply node. Return type: Apply
-
perform
(node, inp, out_, params=None)[source]¶ Calculate the function on the inputs and put the variables in the output storage.
Parameters: - node (Apply) – The symbolic Apply node that represents this computation.
- inputs (Sequence) – Immutable sequence of non-symbolic/numeric inputs. These are the values of each Variable in node.inputs.
- output_storage (list of list) – List of mutable single-element lists (do not change the length of these lists). Each sub-list corresponds to the value of each Variable in node.outputs. The primary purpose of this method is to set the values of these sub-lists.
- params (tuple) – A tuple containing the values of each entry in __props__.
Notes
The output_storage list might contain data. If an element of output_storage is not None, it has to be of the right type, for instance, for a TensorVariable, it has to be a NumPy ndarray with the right number of dimensions and the correct dtype. Its shape and stride pattern can be arbitrary. It is not guaranteed that such pre-set values were produced by a previous call to this Op.perform; they could’ve been allocated by another Op’s perform method. An Op is free to reuse output_storage as it sees fit, or to discard it and allocate new memory.
-
-
class
theano.gpuarray.subtensor.
GpuAdvancedIncSubtensor1_dev20
(inplace=False, set_instead_of_inc=False)[source]¶ Implement AdvancedIncSubtensor1 on the gpu with atomics.
-
c_code
(node, name, inputs, outputs, sub)[source]¶ Return the C implementation of an Op.
Returns C code that does the computation associated to this Op, given names for the inputs and outputs.
Parameters: - node (Apply instance) – The node for which we are compiling the current c_code. The same Op may be used in more than one node.
- name (str) – A name that is automatically assigned and guaranteed to be unique.
- inputs (list of strings) – There is a string for each input of the function, and the string is the name of a C variable pointing to that input. The type of the variable depends on the declared type of the input. There is a corresponding python variable that can be accessed by prepending “py_” to the name in the list.
- outputs (list of strings) – Each string is the name of a C variable where the Op should store its output. The type depends on the declared type of the output. There is a corresponding python variable that can be accessed by prepending “py_” to the name in the list. In some cases the outputs will be preallocated and the value of the variable may be pre-filled. The value for an unallocated output is type-dependent.
- sub (dict of strings) – Extra symbols defined in CLinker sub symbols (such as ‘fail’). WRITEME
-
c_code_cache_version
()[source]¶ Return a tuple of integers indicating the version of this Op.
An empty tuple indicates an ‘unversioned’ Op that will not be cached between processes.
The cache mechanism may erase cached modules that have been superseded by newer versions. See ModuleCache for details.
See also
c_code_cache_version_apply
-
c_header_dirs
(**kwargs)[source]¶ Return a list of header search paths required by code returned by this class.
Provides search paths for headers, in addition to those in any relevant environment variables.
Note: for Unix compilers, these are the things that get -I prefixed in the compiler command line arguments.
Examples
def c_header_dirs(self, **kwargs):
    return ['/usr/local/include', '/opt/weirdpath/src/include']
-
c_headers
(**kwargs)[source]¶ Return a list of header files required by code returned by this class.
These strings will be prefixed with #include and inserted at the beginning of the C source code. Strings in this list that start neither with < nor " will be enclosed in double-quotes.
Examples
def c_headers(self, **kwargs):
    return ['<iostream>', '<math.h>', '/full/path/to/header.h']
-
c_support_code_struct
(node, nodename)[source]¶ Return Apply-specific utility code for use by an Op that will be inserted at struct scope.
Parameters: - node (Apply) – The node in the graph being compiled
- name (str) – A unique name to distinguish your variables from those of other nodes.
-
get_params
(node)[source]¶ Try to detect params from the op if Op.params_type is set to a ParamsType.
-
gpu_kernels
(node, nodename)[source]¶ This is the method to override. This should return an iterable of Kernel objects that describe the kernels this op will need.
-
make_node
(x, y, ilist)[source]¶ It differs from GpuAdvancedIncSubtensor1 in that it makes sure the indexes are of type long.
-
perform
(node, inp, out, params)[source]¶ Calculate the function on the inputs and put the variables in the output storage.
Parameters: - node (Apply) – The symbolic Apply node that represents this computation.
- inputs (Sequence) – Immutable sequence of non-symbolic/numeric inputs. These are the values of each Variable in node.inputs.
- output_storage (list of list) – List of mutable single-element lists (do not change the length of these lists). Each sub-list corresponds to the value of each Variable in node.outputs. The primary purpose of this method is to set the values of these sub-lists.
- params (tuple) – A tuple containing the values of each entry in __props__.
Notes
The output_storage list might contain data. If an element of output_storage is not None, it has to be of the right type, for instance, for a TensorVariable, it has to be a NumPy ndarray with the right number of dimensions and the correct dtype. Its shape and stride pattern can be arbitrary. It is not guaranteed that such pre-set values were produced by a previous call to this Op.perform; they could’ve been allocated by another Op’s perform method. An Op is free to reuse output_storage as it sees fit, or to discard it and allocate new memory.
-
-
class
theano.gpuarray.subtensor.
GpuAdvancedSubtensor1
(sparse_grad=False)[source]¶ AdvancedSubtensor1 on the GPU.
-
c_code
(node, name, inputs, outputs, sub)[source]¶ Return the C implementation of an Op.
Returns C code that does the computation associated to this Op, given names for the inputs and outputs.
Parameters: - node (Apply instance) – The node for which we are compiling the current c_code. The same Op may be used in more than one node.
- name (str) – A name that is automatically assigned and guaranteed to be unique.
- inputs (list of strings) – There is a string for each input of the function, and the string is the name of a C variable pointing to that input. The type of the variable depends on the declared type of the input. There is a corresponding python variable that can be accessed by prepending “py_” to the name in the list.
- outputs (list of strings) – Each string is the name of a C variable where the Op should store its output. The type depends on the declared type of the output. There is a corresponding python variable that can be accessed by prepending “py_” to the name in the list. In some cases the outputs will be preallocated and the value of the variable may be pre-filled. The value for an unallocated output is type-dependent.
- sub (dict of strings) – Extra symbols defined in CLinker sub symbols (such as ‘fail’). WRITEME
-
c_code_cache_version
()[source]¶ Return a tuple of integers indicating the version of this Op.
An empty tuple indicates an ‘unversioned’ Op that will not be cached between processes.
The cache mechanism may erase cached modules that have been superseded by newer versions. See ModuleCache for details.
See also
c_code_cache_version_apply
-
c_support_code
(**kwargs)[source]¶ Return utility code for use by a Variable or Op.
This is included at global scope prior to the rest of the code for this class.
Question: How many times will this support code be emitted for a graph with many instances of the same type?
Returns: Return type: str
-
make_node
(x, ilist)[source]¶ Construct an Apply node that represents the application of this operation to the given inputs.
This must be implemented by sub-classes.
Returns: node – The constructed Apply node. Return type: Apply
-
perform
(node, inp, out_)[source]¶ Calculate the function on the inputs and put the variables in the output storage.
Parameters: - node (Apply) – The symbolic Apply node that represents this computation.
- inputs (Sequence) – Immutable sequence of non-symbolic/numeric inputs. These are the values of each Variable in node.inputs.
- output_storage (list of list) – List of mutable single-element lists (do not change the length of these lists). Each sub-list corresponds to the value of each Variable in node.outputs. The primary purpose of this method is to set the values of these sub-lists.
- params (tuple) – A tuple containing the values of each entry in __props__.
Notes
The output_storage list might contain data. If an element of output_storage is not None, it has to be of the right type, for instance, for a TensorVariable, it has to be a NumPy ndarray with the right number of dimensions and the correct dtype. Its shape and stride pattern can be arbitrary. It is not guaranteed that such pre-set values were produced by a previous call to this Op.perform; they could’ve been allocated by another Op’s perform method. An Op is free to reuse output_storage as it sees fit, or to discard it and allocate new memory.
-
-
class
theano.gpuarray.subtensor.
GpuAllocDiag
(offset=0, axis1=0, axis2=1)[source]¶ -
grad
(inputs, gout)[source]¶ Construct a graph for the gradient with respect to each input variable.
Each returned Variable represents the gradient with respect to that input computed based on the symbolic gradients with respect to each output. If the output is not differentiable with respect to an input, then this method should return an instance of type NullType for that input.
Parameters: - inputs (list of Variable) – The input variables.
- output_grads (list of Variable) – The gradients of the output variables.
Returns: grads – The gradients with respect to each Variable in inputs.
Return type: list of Variable
-
make_node
(diag)[source]¶ Construct an Apply node that represents the application of this operation to the given inputs.
This must be implemented by sub-classes.
Returns: node – The constructed Apply node. Return type: Apply
-
perform
(node, inputs, outputs)[source]¶ Calculate the function on the inputs and put the variables in the output storage.
Parameters: - node (Apply) – The symbolic Apply node that represents this computation.
- inputs (Sequence) – Immutable sequence of non-symbolic/numeric inputs. These are the values of each Variable in node.inputs.
- output_storage (list of list) – List of mutable single-element lists (do not change the length of these lists). Each sub-list corresponds to the value of one Variable in node.outputs. The primary purpose of this method is to set the values of these sub-lists.
- params (tuple) – A tuple containing the values of each entry in __props__.
Notes
The output_storage list might contain data. If an element of output_storage is not None, it must be of the right type; for instance, for a TensorVariable, it has to be a NumPy ndarray with the right number of dimensions and the correct dtype. Its shape and stride pattern can be arbitrary. It is not guaranteed that such pre-set values were produced by a previous call to this Op's perform; they could have been allocated by another Op's perform method. An Op is free to reuse output_storage as it sees fit, or to discard it and allocate new memory.
-
-
class
theano.gpuarray.subtensor.
GpuExtractDiag
(offset=0, axis1=0, axis2=1, view=False)[source]¶ -
grad
(inputs, gout)[source]¶ Construct a graph for the gradient with respect to each input variable.
Each returned Variable represents the gradient with respect to that input computed based on the symbolic gradients with respect to each output. If the output is not differentiable with respect to an input, then this method should return an instance of type NullType for that input.
Parameters: - inputs (list of Variable) – The input variables.
- output_grads (list of Variable) – The gradients of the output variables.
Returns: grads – The gradients with respect to each Variable in inputs.
Return type: list of Variable
-
make_node
(_x)[source]¶ Construct an Apply node that represents the application of this operation to the given inputs.
This must be implemented by sub-classes.
Returns: node – The constructed Apply node. Return type: Apply
-
perform
(node, inputs, outputs)[source]¶ Calculate the function on the inputs and put the variables in the output storage.
Parameters: - node (Apply) – The symbolic Apply node that represents this computation.
- inputs (Sequence) – Immutable sequence of non-symbolic/numeric inputs. These are the values of each Variable in node.inputs.
- output_storage (list of list) – List of mutable single-element lists (do not change the length of these lists). Each sub-list corresponds to the value of one Variable in node.outputs. The primary purpose of this method is to set the values of these sub-lists.
- params (tuple) – A tuple containing the values of each entry in __props__.
Notes
The output_storage list might contain data. If an element of output_storage is not None, it must be of the right type; for instance, for a TensorVariable, it has to be a NumPy ndarray with the right number of dimensions and the correct dtype. Its shape and stride pattern can be arbitrary. It is not guaranteed that such pre-set values were produced by a previous call to this Op's perform; they could have been allocated by another Op's perform method. An Op is free to reuse output_storage as it sees fit, or to discard it and allocate new memory.
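For reference, the user-level graphs that GpuAllocDiag and GpuExtractDiag accelerate are usually built with the diag helper rather than by instantiating these Ops directly. A minimal sketch, assuming theano.tensor.nlinalg.diag is available in your Theano build (the import path may differ across releases); it dispatches to an AllocDiag node for vectors and an ExtractDiag node for matrices, which the GPU optimizer may then replace with the Ops above.
import theano
import theano.tensor as tt
from theano.tensor import nlinalg  # path may differ across Theano releases

v = tt.vector('v')
m = tt.matrix('m')
# diag() on a vector builds an AllocDiag node (vector -> diagonal matrix);
# diag() on a matrix builds an ExtractDiag node (matrix -> its diagonal).
d_matrix = nlinalg.diag(v)
d_vector = nlinalg.diag(m)
f = theano.function([v, m], [d_matrix, d_vector])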
-
-
class
theano.gpuarray.subtensor.
GpuIncSubtensor
(idx_list, inplace=False, set_instead_of_inc=False, destroyhandler_tolerate_aliased=None)[source]¶ Implement IncSubtensor on the GPU.
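At the user level, this Op typically arises from inc_subtensor or set_subtensor graphs rather than from direct instantiation. A minimal sketch; the rewrite into the GPU Op is performed by the optimizer (see the Notes below) and only happens when a GPU target is configured.
import theano
import theano.tensor as tt

x = tt.matrix('x')
y = tt.vector('y')
# Build an IncSubtensor node: conceptually x[0] += y without modifying x.
z = tt.inc_subtensor(x[0], y)
# With a GPU target configured, the optimizer is expected to rewrite the
# IncSubtensor node into GpuIncSubtensor (possibly working in place).
f = theano.function([x, y], z)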
Notes
The optimization that makes this Op work in place is in tensor/opt; the same optimization handles both IncSubtensor and GpuIncSubtensor. This Op also has c_code: it inherits IncSubtensor's c_code, and helper methods such as do_type_checking(), copy_of_x(), etc. specialize that c_code for this Op.
-
add_to_zview
(nodename, x, fail)[source]¶ Return C code to add x to zview. The generated code should DECREF zview if the add fails.
-
c_code_cache_version
()[source]¶ Return a tuple of integers indicating the version of this Op.
An empty tuple indicates an ‘unversioned’ Op that will not be cached between processes.
The cache mechanism may erase cached modules that have been superseded by newer versions. See ModuleCache for details.
See also
c_code_cache_version_apply
-
c_headers
(**kwargs)[source]¶ Return a list of header files required by code returned by this class.
These strings will be prefixed with #include and inserted at the beginning of the C source code.
Strings in this list that start with neither < nor " will be enclosed in double-quotes.
Examples
def c_headers(self, **kwargs):
    return ['<iostream>', '<math.h>', '/full/path/to/header.h']
-
c_init_code_struct
(node, name, sub)[source]¶ Return an Apply-specific code string to be inserted in the struct initialization code.
Parameters: - node (Apply) – The node in the graph being compiled.
- name (str) – A unique name to distinguish variables from those of other nodes.
- sub (dict of str) – A dictionary of values to substitute in the code. Most notably it contains a 'fail' entry that you should place in your code after setting a Python exception to indicate an error.
-
c_support_code
(**kwargs)[source]¶ Return utility code for use by a Variable or Op.
This is included at global scope prior to the rest of the code for this class.
Question: How many times will this support code be emitted for a graph with many instances of the same type?
Returns: Return type: str
-
c_support_code_struct
(node, nodename)[source]¶ Return Apply-specific utility code for use by an Op that will be inserted at struct scope.
Parameters: - node (Apply) – The node in the graph being compiled.
- name (str) – A unique name to distinguish your variables from those of other nodes.
-
copy_into
(view, source)[source]¶ Parameters: - view (string) – C code expression for an array.
- source (string) – C code expression for an array.
Returns: C code expression that copies source into view; the expression evaluates to 0 on success.
Return type: str
-
copy_of_x
(x)[source]¶ Parameters: x – A string giving the name of a C variable pointing to an array. Returns: C code expression to make a copy of x. Return type: str
Notes
Base class uses PyArrayObject *, subclasses may override for different types of arrays.
-
do_type_checking
(node)[source]¶ Should raise NotImplementedError if c_code does not support the types involved in this node.
-
get_params
(node)[source]¶ Try to detect params from the op if Op.params_type is set to a ParamsType.
-
make_node
(x, y, *inputs)[source]¶ Parameters: - x – The tensor to increment.
- y – The value to increment by.
- inputs (TODO WRITEME) –
-
make_view_array
(x, view_ndim)[source]¶ //TODO
Parameters: - x – A string identifying an array to be viewed.
- view_ndim – A string specifying the number of dimensions to have in the view. This doesn’t need to actually set up the view with the right indexing; we’ll do that manually later.
-
perform
(node, inputs, out_, ctx)[source]¶ Calculate the function on the inputs and put the variables in the output storage.
Parameters: - node (Apply) – The symbolic Apply node that represents this computation.
- inputs (Sequence) – Immutable sequence of non-symbolic/numeric inputs. These are the values of each Variable in node.inputs.
- output_storage (list of list) – List of mutable single-element lists (do not change the length of these lists). Each sub-list corresponds to the value of one Variable in node.outputs. The primary purpose of this method is to set the values of these sub-lists.
- params (tuple) – A tuple containing the values of each entry in __props__.
Notes
The output_storage list might contain data. If an element of output_storage is not None, it must be of the right type; for instance, for a TensorVariable, it has to be a NumPy ndarray with the right number of dimensions and the correct dtype. Its shape and stride pattern can be arbitrary. It is not guaranteed that such pre-set values were produced by a previous call to this Op's perform; they could have been allocated by another Op's perform method. An Op is free to reuse output_storage as it sees fit, or to discard it and allocate new memory.
-
-
class
theano.gpuarray.subtensor.
GpuSubtensor
(idx_list)[source]¶ Subtensor on the GPU.
-
c_code
(node, name, inputs, outputs, sub)[source]¶ Return the C implementation of an Op.
Returns C code that does the computation associated to this Op, given names for the inputs and outputs.
Parameters: - node (Apply instance) – The node for which we are compiling the current c_code. The same Op may be used in more than one node.
- name (str) – A name that is automatically assigned and guaranteed to be unique.
- inputs (list of strings) – There is a string for each input of the function, and the string is the name of a C variable pointing to that input. The type of the variable depends on the declared type of the input. There is a corresponding python variable that can be accessed by prepending “py_” to the name in the list.
- outputs (list of strings) – Each string is the name of a C variable where the Op should store its output. The type depends on the declared type of the output. There is a corresponding python variable that can be accessed by prepending “py_” to the name in the list. In some cases the outputs will be preallocated and the value of the variable may be pre-filled. The value for an unallocated output is type-dependent.
- sub (dict of strings) – Extra symbols defined in CLinker sub symbols (such as ‘fail’). WRITEME
-
c_code_cache_version
()[source]¶ Return a tuple of integers indicating the version of this Op.
An empty tuple indicates an ‘unversioned’ Op that will not be cached between processes.
The cache mechanism may erase cached modules that have been superseded by newer versions. See ModuleCache for details.
See also
c_code_cache_version_apply
-
c_support_code
(**kwargs)[source]¶ Return utility code for use by a Variable or Op.
This is included at global scope prior to the rest of the code for this class.
Question: How many times will this support code be emitted for a graph with many instances of the same type?
Returns: Return type: str
-
make_node
(x, *inputs)[source]¶ Parameters: - x – The tensor to take a subtensor of.
- inputs – A list of theano Scalars.
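For illustration, ordinary indexing at the user level is what builds the Subtensor nodes that this Op replaces on a GPU target. A minimal sketch:
import theano
import theano.tensor as tt

x = tt.matrix('x')
i = tt.iscalar('i')      # a Theano scalar, as expected by make_node
# Basic slicing builds a Subtensor node; with a GPU target configured,
# the optimizer is expected to substitute GpuSubtensor.
y = x[i:, :2]
f = theano.function([x, i], y)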
-
perform
(node, inputs, out_)[source]¶ Calculate the function on the inputs and put the variables in the output storage.
Parameters: - node (Apply) – The symbolic Apply node that represents this computation.
- inputs (Sequence) – Immutable sequence of non-symbolic/numeric inputs. These are the values of each Variable in node.inputs.
- output_storage (list of list) – List of mutable single-element lists (do not change the length of these lists). Each sub-list corresponds to the value of one Variable in node.outputs. The primary purpose of this method is to set the values of these sub-lists.
- params (tuple) – A tuple containing the values of each entry in __props__.
Notes
The output_storage list might contain data. If an element of output_storage is not None, it must be of the right type; for instance, for a TensorVariable, it has to be a NumPy ndarray with the right number of dimensions and the correct dtype. Its shape and stride pattern can be arbitrary. It is not guaranteed that such pre-set values were produced by a previous call to this Op's perform; they could have been allocated by another Op's perform method. An Op is free to reuse output_storage as it sees fit, or to discard it and allocate new memory.
-
-
theano.gpuarray.subtensor.
check_and_convert_boolean_masks
(input, idx_list)[source]¶ This function checks whether the boolean mask arrays in the index have the right shape and converts them to index arrays by calling nonzero. For each boolean mask, we check that the mask has the same shape as the input; this is enforced by NumPy 1.13.0 and newer, but not by earlier versions. If the shapes do not match, this function raises an IndexError.
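The conversion mirrors plain NumPy semantics. A small, self-contained sketch of the mask-to-index-array equivalence (pure NumPy, independent of Theano):
import numpy as np

x = np.arange(6).reshape(2, 3)
mask = np.array([[True, False, True],
                 [False, True, False]])
# A boolean mask of the same shape as x is equivalent to indexing
# with the index arrays returned by mask.nonzero().
assert np.array_equal(x[mask], x[mask.nonzero()])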
Nnet Op¶
-
class
theano.gpuarray.nnet.
GpuCrossentropySoftmax1HotWithBiasDx
[source]¶ Implement CrossentropySoftmax1HotWithBiasDx on the GPU.
Gradient wrt x of the CrossentropySoftmax1Hot Op.
-
c_code
(node, nodename, inp, out, sub)[source]¶ Return the C implementation of an Op.
Returns C code that does the computation associated to this Op, given names for the inputs and outputs.
Parameters: - node (Apply instance) – The node for which we are compiling the current c_code. The same Op may be used in more than one node.
- name (str) – A name that is automatically assigned and guaranteed to be unique.
- inputs (list of strings) – There is a string for each input of the function, and the string is the name of a C variable pointing to that input. The type of the variable depends on the declared type of the input. There is a corresponding python variable that can be accessed by prepending “py_” to the name in the list.
- outputs (list of strings) – Each string is the name of a C variable where the Op should store its output. The type depends on the declared type of the output. There is a corresponding python variable that can be accessed by prepending “py_” to the name in the list. In some cases the outputs will be preallocated and the value of the variable may be pre-filled. The value for an unallocated output is type-dependent.
- sub (dict of strings) – Extra symbols defined in CLinker sub symbols (such as ‘fail’). WRITEME
-
c_code_cache_version
()[source]¶ Return a tuple of integers indicating the version of this Op.
An empty tuple indicates an ‘unversioned’ Op that will not be cached between processes.
The cache mechanism may erase cached modules that have been superseded by newer versions. See ModuleCache for details.
See also
c_code_cache_version_apply
-
c_headers
(**kwargs)[source]¶ Return a list of header files required by code returned by this class.
These strings will be prefixed with #include and inserted at the beginning of the C source code.
Strings in this list that start with neither < nor " will be enclosed in double-quotes.
Examples
def c_headers(self, **kwargs):
    return ['<iostream>', '<math.h>', '/full/path/to/header.h']
-
-
class
theano.gpuarray.nnet.
GpuCrossentropySoftmaxArgmax1HotWithBias
[source]¶ Implement CrossentropySoftmaxArgmax1HotWithBias on the GPU.
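For context, this Op usually appears after graph optimization of the common softmax plus cross-entropy pattern. A minimal sketch; the fusion into a CrossentropySoftmaxArgmax1HotWithBias node (and its GPU variant on a GPU target) is performed by the optimizer and is not guaranteed in every configuration.
import theano
import theano.tensor as tt

x = tt.matrix('x')       # pre-softmax activations
b = tt.vector('b')       # bias
y = tt.ivector('y')      # integer class labels
p = tt.nnet.softmax(x + b)
cost = tt.nnet.categorical_crossentropy(p, y).mean()
# The optimizer is expected to fuse this pattern into a
# CrossentropySoftmaxArgmax1HotWithBias node (GPU variant on a GPU target).
f = theano.function([x, b, y], cost)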
-
c_code
(node, nodename, inp, out, sub)[source]¶ Return the C implementation of an Op.
Returns C code that does the computation associated to this Op, given names for the inputs and outputs.
Parameters: - node (Apply instance) – The node for which we are compiling the current c_code. The same Op may be used in more than one node.
- name (str) – A name that is automatically assigned and guaranteed to be unique.
- inputs (list of strings) – There is a string for each input of the function, and the string is the name of a C variable pointing to that input. The type of the variable depends on the declared type of the input. There is a corresponding python variable that can be accessed by prepending “py_” to the name in the list.
- outputs (list of strings) – Each string is the name of a C variable where the Op should store its output. The type depends on the declared type of the output. There is a corresponding python variable that can be accessed by prepending “py_” to the name in the list. In some cases the outputs will be preallocated and the value of the variable may be pre-filled. The value for an unallocated output is type-dependent.
- sub (dict of strings) – Extra symbols defined in CLinker sub symbols (such as ‘fail’). WRITEME
-
c_code_cache_version
()[source]¶ Return a tuple of integers indicating the version of this Op.
An empty tuple indicates an ‘unversioned’ Op that will not be cached between processes.
The cache mechanism may erase cached modules that have been superseded by newer versions. See ModuleCache for details.
See also
c_code_cache_version_apply
-
c_header_dirs
(**kwargs)[source]¶ Return a list of header search paths required by code returned by this class.
Provides search paths for headers, in addition to those in any relevant environment variables.
Note: for Unix compilers, these are the paths that get prefixed with -I on the compiler command line.
Examples
def c_header_dirs(self, **kwargs):
    return ['/usr/local/include', '/opt/weirdpath/src/include']
-
c_headers
(**kwargs)[source]¶ Return a list of header files required by code returned by this class.
These strings will be prefixed with #include and inserted at the beginning of the C source code.
Strings in this list that start with neither < nor " will be enclosed in double-quotes.
Examples
def c_headers(self, **kwargs):
    return ['<iostream>', '<math.h>', '/full/path/to/header.h']
-
-
class
theano.gpuarray.nnet.
GpuSoftmax
[source]¶ Implement Softmax on the GPU.
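For reference, the user-level entry point is the ordinary softmax. A minimal sketch; the GPU substitution requires a configured GPU target.
import theano
import theano.tensor as tt

x = tt.matrix('x')
p = tt.nnet.softmax(x)   # row-wise softmax
# With a GPU target configured, the optimizer is expected to replace
# the Softmax node with GpuSoftmax.
f = theano.function([x], p)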
-
c_code
(node, nodename, inp, out, sub)[source]¶ Return the C implementation of an Op.
Returns C code that does the computation associated to this Op, given names for the inputs and outputs.
Parameters: - node (Apply instance) – The node for which we are compiling the current c_code. The same Op may be used in more than one node.
- name (str) – A name that is automatically assigned and guaranteed to be unique.
- inputs (list of strings) – There is a string for each input of the function, and the string is the name of a C variable pointing to that input. The type of the variable depends on the declared type of the input. There is a corresponding python variable that can be accessed by prepending “py_” to the name in the list.
- outputs (list of strings) – Each string is the name of a C variable where the Op should store its output. The type depends on the declared type of the output. There is a corresponding python variable that can be accessed by prepending “py_” to the name in the list. In some cases the outputs will be preallocated and the value of the variable may be pre-filled. The value for an unallocated output is type-dependent.
- sub (dict of strings) – Extra symbols defined in CLinker sub symbols (such as ‘fail’). WRITEME
-
c_code_cache_version
()[source]¶ Return a tuple of integers indicating the version of this Op.
An empty tuple indicates an ‘unversioned’ Op that will not be cached between processes.
The cache mechanism may erase cached modules that have been superseded by newer versions. See ModuleCache for details.
See also
c_code_cache_version_apply
-
c_headers
(**kwargs)[source]¶ Return a list of header files required by code returned by this class.
These strings will be prefixed with #include and inserted at the beginning of the C source code.
Strings in this list that start with neither < nor " will be enclosed in double-quotes.
Examples
def c_headers(self, **kwargs):
    return ['<iostream>', '<math.h>', '/full/path/to/header.h']
-
-
class
theano.gpuarray.nnet.
GpuSoftmaxWithBias
[source]¶ Implement SoftmaxWithBias on the GPU.
-
c_code
(node, nodename, inp, out, sub)[source]¶ Return the C implementation of an Op.
Returns C code that does the computation associated to this Op, given names for the inputs and outputs.
Parameters: - node (Apply instance) – The node for which we are compiling the current c_code. The same Op may be used in more than one node.
- name (str) – A name that is automatically assigned and guaranteed to be unique.
- inputs (list of strings) – There is a string for each input of the function, and the string is the name of a C variable pointing to that input. The type of the variable depends on the declared type of the input. There is a corresponding python variable that can be accessed by prepending “py_” to the name in the list.
- outputs (list of strings) – Each string is the name of a C variable where the Op should store its output. The type depends on the declared type of the output. There is a corresponding python variable that can be accessed by prepending “py_” to the name in the list. In some cases the outputs will be preallocated and the value of the variable may be pre-filled. The value for an unallocated output is type-dependent.
- sub (dict of strings) – Extra symbols defined in CLinker sub symbols (such as ‘fail’). WRITEME
-
c_code_cache_version
()[source]¶ Return a tuple of integers indicating the version of this Op.
An empty tuple indicates an ‘unversioned’ Op that will not be cached between processes.
The cache mechanism may erase cached modules that have been superseded by newer versions. See ModuleCache for details.
See also
c_code_cache_version_apply
-
c_headers
(**kwargs)[source]¶ Return a list of header files required by code returned by this class.
These strings will be prefixed with #include and inserted at the beginning of the C source code.
Strings in this list that start with neither < nor " will be enclosed in double-quotes.
Examples
def c_headers(self, **kwargs):
    return ['<iostream>', '<math.h>', '/full/path/to/header.h']
-
-
class
theano.gpuarray.neighbours.
GpuImages2Neibs
(mode='valid')[source]¶ Images2Neibs for the GPU.
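At the user level this Op backs images2neibs. A minimal sketch; neib_shape is the patch size, and with a GPU target configured the optimizer is expected to swap Images2Neibs for GpuImages2Neibs.
import theano
import theano.tensor as tt
from theano.tensor.nnet.neighbours import images2neibs

images = tt.tensor4('images')            # (batch, channels, rows, cols)
# Extract non-overlapping 2x2 patches as rows of a 2-D matrix.
neibs = images2neibs(images, neib_shape=(2, 2))
f = theano.function([images], neibs)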
-
c_code
(node, name, inp, out, sub)[source]¶ Return the C implementation of an Op.
Returns C code that does the computation associated to this Op, given names for the inputs and outputs.
Parameters: - node (Apply instance) – The node for which we are compiling the current c_code. The same Op may be used in more than one node.
- name (str) – A name that is automatically assigned and guaranteed to be unique.
- inputs (list of strings) – There is a string for each input of the function, and the string is the name of a C variable pointing to that input. The type of the variable depends on the declared type of the input. There is a corresponding python variable that can be accessed by prepending “py_” to the name in the list.
- outputs (list of strings) – Each string is the name of a C variable where the Op should store its output. The type depends on the declared type of the output. There is a corresponding python variable that can be accessed by prepending “py_” to the name in the list. In some cases the outputs will be preallocated and the value of the variable may be pre-filled. The value for an unallocated output is type-dependent.
- sub (dict of strings) – Extra symbols defined in CLinker sub symbols (such as ‘fail’). WRITEME
-
c_code_cache_version
()[source]¶ Return a tuple of integers indicating the version of this Op.
An empty tuple indicates an ‘unversioned’ Op that will not be cached between processes.
The cache mechanism may erase cached modules that have been superseded by newer versions. See ModuleCache for details.
See also
c_code_cache_version_apply
-
c_headers
(**kwargs)[source]¶ Return a list of header files required by code returned by this class.
These strings will be prefixed with #include and inserted at the beginning of the C source code.
Strings in this list that start with neither < nor " will be enclosed in double-quotes.
Examples
def c_headers(self, **kwargs):
    return ['<iostream>', '<math.h>', '/full/path/to/header.h']
-
c_support_code
(**kwargs)[source]¶ Return utility code for use by a Variable or Op.
This is included at global scope prior to the rest of the code for this class.
Question: How many times will this support code be emitted for a graph with many instances of the same type?
Returns: Return type: str
-
get_params
(node)[source]¶ Try to detect params from the op if Op.params_type is set to a ParamsType.
-