theano.gpuarray.dnn – cuDNN
cuDNN is an NVIDIA library with functionality used by deep neural networks. It provides optimized versions of some operations like the convolution. cuDNN is not currently installed with CUDA. You must download and install it yourself.
To install it, decompress the downloaded file and make the *.h and *.so* files available to the compilation environment. There are at least three possible ways of doing so:
 The easiest is to include them in your CUDA installation. Copy the *.h files to CUDA_ROOT/include and the *.so* files to CUDA_ROOT/lib64 (by default, CUDA_ROOT is /usr/local/cuda on Linux).
 Alternatively, on Linux, you can set the environment variables LD_LIBRARY_PATH, LIBRARY_PATH and CPATH to the directory extracted from the download. If needed, separate multiple directories with : as in the PATH environment variable. Example:
    export LD_LIBRARY_PATH=/home/user/path_to_CUDNN_folder/lib64:$LD_LIBRARY_PATH
    export CPATH=/home/user/path_to_CUDNN_folder/include:$CPATH
    export LIBRARY_PATH=/home/user/path_to_CUDNN_folder/lib64:$LIBRARY_PATH
 As a third way, also on Linux, you can copy the *.h files to /usr/include and the *.so* files to /lib64.
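The colon-separated convention used by these variables can be sketched in Python. This is only an illustration: `prepend_path` is a hypothetical helper (not part of Theano), the folder is the example location used above, and the code works on a copy of the environment rather than changing the running shell.

```python
import os


def prepend_path(new_dir, current):
    """Put new_dir in front of a colon-separated search path.

    Hypothetical helper for illustration; not part of Theano.
    """
    # Drop empty entries so an unset variable does not leave a trailing ':'.
    parts = [new_dir] + [p for p in current.split(":") if p]
    return ":".join(parts)


cudnn_root = "/home/user/path_to_CUDNN_folder"  # example location from the text
env = dict(os.environ)  # work on a copy of the current environment
env["LD_LIBRARY_PATH"] = prepend_path(cudnn_root + "/lib64",
                                      env.get("LD_LIBRARY_PATH", ""))
env["LIBRARY_PATH"] = prepend_path(cudnn_root + "/lib64",
                                   env.get("LIBRARY_PATH", ""))
env["CPATH"] = prepend_path(cudnn_root + "/include", env.get("CPATH", ""))
```

The resulting dictionary could then be passed as the `env` argument of `subprocess.run` when launching a script that imports Theano.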
By default, Theano detects whether it can use cuDNN and uses it if so. If not, Theano optimizations will not introduce cuDNN ops, so Theano will still work as long as the user did not introduce them manually.
To get an error if Theano cannot use cuDNN, use this Theano flag: optimizer_including=cudnn.
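One way to pass this flag is through the THEANO_FLAGS environment variable, which holds comma-separated name=value pairs. The sketch below assumes a hypothetical helper, `format_theano_flags`, that only builds the string; the variable has to be set before Theano is imported.

```python
import os


def format_theano_flags(flags):
    """Join flag names and values into a THEANO_FLAGS string.

    Hypothetical helper for illustration; not part of Theano.
    """
    return ",".join("{}={}".format(name, value) for name, value in flags.items())


flags = {"optimizer_including": "cudnn"}
# Set the variable before `import theano` so the flag is seen at import time.
os.environ["THEANO_FLAGS"] = format_theano_flags(flags)
```

The same string can be given on the command line, for example `THEANO_FLAGS=optimizer_including=cudnn python script.py`.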
Note
cuDNN v5.1 is supported in the Theano master version, which dropped cuDNN v3 support. Theano 0.8.0 and 0.8.1 support only cuDNN v3 and v4. Theano 0.8.2 will support only v4 and v5.
Note
Starting in cuDNN v3, multiple convolution implementations are offered and it is possible to use heuristics to automatically choose a convolution implementation well suited to the parameters of the convolution.
The Theano flag dnn__conv__algo_fwd specifies the cuDNN convolution implementation that Theano should use for forward convolutions. Possible values include:
 small (default): use a convolution implementation with small memory usage
 none: use a slower implementation with minimal memory usage
 large: use a sometimes faster implementation with large memory usage
 fft: use the Fast Fourier Transform implementation of convolution (very high memory usage)
 guess_once: the first time a convolution is executed, the implementation to use is chosen according to cuDNN’s heuristics and reused for every subsequent execution of the convolution.
 guess_on_shape_change: like guess_once, but a new convolution implementation is selected every time the shapes of the inputs and kernels don’t match the shapes from the last execution.
 time_once: the first time a convolution is executed, every convolution implementation offered by cuDNN is executed and timed. The fastest is reused for every subsequent execution of the convolution.
 time_on_shape_change: like time_once, but a new convolution implementation is selected every time the shapes of the inputs and kernels don’t match the shapes from the last execution.
The Theano flag dnn.conv.algo_bwd specifies the cuDNN convolution implementation that Theano should use for gradient convolutions. Possible values include:
 none (default): use the default non-deterministic convolution implementation
 deterministic: use a slower but deterministic implementation
 fft: use the Fast Fourier Transform implementation of convolution (very high memory usage)
 guess_once: the first time a convolution is executed, the implementation to use is chosen according to cuDNN’s heuristics and reused for every subsequent execution of the convolution.
 guess_on_shape_change: like guess_once, but a new convolution implementation is selected every time the shapes of the inputs and kernels don’t match the shapes from the last execution.
 time_once: the first time a convolution is executed, every convolution implementation offered by cuDNN is executed and timed. The fastest is reused for every subsequent execution of the convolution.
 time_on_shape_change: like time_once, but a new convolution implementation is selected every time the shapes of the inputs and kernels don’t match the shapes from the last execution.
guess_* and time_* flag values take into account the amount of available memory when selecting an implementation. This means that slower implementations might be selected if not enough memory is available for the faster implementations.
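The difference between the *_once and *_on_shape_change values can be illustrated with a toy model. This is not Theano's or cuDNN's code: `AlgoChooser` is a hypothetical class, and the "timings" are made-up cost functions standing in for real measurements, so that the behaviour is deterministic.

```python
class AlgoChooser:
    """Toy model of time_once / time_on_shape_change selection (hypothetical)."""

    def __init__(self, cost_fns, on_shape_change=False):
        self.cost_fns = cost_fns  # name -> function(shape) giving a simulated runtime
        self.on_shape_change = on_shape_change
        self.chosen = None
        self.last_shape = None
        self.selections = 0  # how many times a selection round was run

    def select(self, shape):
        first_call = self.chosen is None
        shape_changed = shape != self.last_shape
        if first_call or (self.on_shape_change and shape_changed):
            # "Time" every implementation and keep the fastest one.
            self.chosen = min(self.cost_fns,
                              key=lambda name: self.cost_fns[name](shape))
            self.selections += 1
        self.last_shape = shape
        return self.chosen


# Made-up costs: 'fft' has a high fixed cost but scales better with batch size.
cost_fns = {
    "small": lambda shape: 1.0 * shape[0],
    "fft": lambda shape: 50.0 + 0.1 * shape[0],
}
once = AlgoChooser(cost_fns)
per_shape = AlgoChooser(cost_fns, on_shape_change=True)
for shape in [(8, 3, 32, 32), (8, 3, 32, 32), (512, 3, 32, 32)]:
    once.select(shape)
    per_shape.select(shape)
```

In this sketch the `once` chooser keeps the implementation picked for the first shape, while `per_shape` runs a new selection when the batch size changes.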
Note
Normally you should not call GPU Ops directly, but the CPU interface currently does not allow all options supported by cuDNN ops. So it is possible that you will need to call them manually.
Note
The cuDNN documentation states that, with the default implementation, reproducibility is not guaranteed for the following two operations: cudnnConvolutionBackwardFilter and cudnnConvolutionBackwardData. These correspond to the gradient with respect to the weights and the gradient with respect to the input of the convolution. They are also sometimes used in the forward pass, when they give a speedup.
The Theano flag dnn.conv.algo_bwd can be used to force the use of a slower but deterministic convolution implementation.
Note
There is a problem we do not understand yet when cuDNN paths are used with symbolic links, so avoid symbolic links in those paths.
Note
cudnn.so* must be readable and executable by everybody. cudnn.h must be readable by everybody.
 Pooling:
 Softmax:
   You can manually use the op GpuDnnSoftmax to use its extra feature.
 Spatial Transformer:
cuDNN RNN Example
This is a code example of using the cuDNN RNN functionality. We present the code with some commentary in between to explain some peculiarities.
The terminology here assumes that you are familiar with RNN structure.
dtype = 'float32'
input_dim = 32
hidden_dim = 16
batch_size = 2
depth = 3
timesteps = 5
To clarify the rest of the code we define some variables to hold sizes.
from theano.tensor.type import tensor3
X = tensor3('X')
Y = tensor3('Y')
h0 = tensor3('h0')
We also define some Theano variables to work with. Here X is input, Y is output (as in expected output) and h0 is the initial state for the recurrent inputs.
from theano.gpuarray import dnn
rnnb = dnn.RNNBlock(dtype, hidden_dim, depth, 'gru')
This defines an RNNBlock. This is a departure from usual Theano operations in that it has the structure of a layer more than a separate operation. This is constrained by the underlying API.
import numpy as np
import theano
from theano.gpuarray.type import gpuarray_shared_constructor
psize = rnnb.get_param_size([batch_size, input_dim])
params_cudnn = gpuarray_shared_constructor(
    np.zeros((psize,), dtype=theano.config.floatX))
Here we allocate space for the trainable parameters of the RNN. The first function tells us how many elements we will need to store the parameters. This space is for all the parameters of all the layers inside the RNN and the layout is opaque.
layer = 0
# The left-hand side is missing in the source; layer_params is a placeholder name.
layer_params = rnnb.split_params(params_cudnn, layer,
                                 [batch_size, input_dim])
If you need to access the parameters individually, you can call split_params on your shared variable to get all the parameters for a single layer. The order and number of returned items depend on the type of RNN.
 rnn_relu, rnn_tanh: input, recurrent
 gru: input reset, input update, input newmem, recurrent reset, recurrent update, recurrent newmem
 lstm: input input gate, input forget gate, input newmem gate, input output gate, recurrent input gate, recurrent forget gate, recurrent newmem gate, recurrent output gate
All of these elements are composed of weights and a bias (a matrix and a vector).
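The ordering described above (all input-side elements first, then the recurrent ones) can be written down as a small lookup. This is a sketch for bookkeeping only: `GATES` and `param_names` are hypothetical names, and nothing here inspects what split_params actually returns.

```python
# Gate names per RNN type, in the order given in the text.
# For lstm, the second gate is the forget gate on both the input and recurrent side.
GATES = {
    "rnn_relu": [""],
    "rnn_tanh": [""],
    "gru": ["reset", "update", "newmem"],
    "lstm": ["input gate", "forget gate", "newmem gate", "output gate"],
}


def param_names(rnn_mode):
    """List the element names for one layer: input side first, then recurrent."""
    return [(side + " " + gate).strip()
            for side in ("input", "recurrent")
            for gate in GATES[rnn_mode]]


gru_names = param_names("gru")
lstm_names = param_names("lstm")
```

Each name stands for one weights-and-bias element, so a gru layer has six such elements and an lstm layer has eight.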
y, hy = rnnb.apply(params_cudnn, X, h0)
This is more akin to an op in Theano in that it will apply the RNN operation to a set of symbolic inputs and return symbolic outputs. y is the output, hy is the final state for the recurrent inputs.
After this, the gradient works as usual so you can treat the returned symbolic outputs as normal Theano symbolic variables.
List of Implemented Operations

class theano.gpuarray.dnn.CDataMaker(rtype)
This is the equally lame Op that accompanies MakerCDataType.

c_code(node, name, inputs, outputs, sub)
Return the C implementation of an Op.
Returns C code that does the computation associated to this Op, given names for the inputs and outputs.
Parameters:
 node (Apply instance) – The node for which we are compiling the current c_code. The same Op may be used in more than one node.
 name (str) – A name that is automatically assigned and guaranteed to be unique.
 inputs (list of strings) – There is a string for each input of the function, and the string is the name of a C variable pointing to that input. The type of the variable depends on the declared type of the input. There is a corresponding Python variable that can be accessed by prepending “py_” to the name in the list.
 outputs (list of strings) – Each string is the name of a C variable where the Op should store its output. The type depends on the declared type of the output. There is a corresponding Python variable that can be accessed by prepending “py_” to the name in the list. In some cases the outputs will be preallocated and the value of the variable may be pre-filled. The value for an unallocated output is type-dependent.
 sub (dict of strings) – Extra symbols defined in CLinker sub symbols (such as ‘fail’).

c_code_cache_version()
Return a tuple of integers indicating the version of this Op.
An empty tuple indicates an ‘unversioned’ Op that will not be cached between processes.
The cache mechanism may erase cached modules that have been superseded by newer versions. See ModuleCache for details.
See also: c_code_cache_version_apply

do_constant_folding(fgraph, node)
Determine whether or not constant folding should be performed for the given node.
This allows each Op to determine if it wants to be constant folded when all its inputs are constant. This allows it to choose where it puts its memory/speed tradeoff. Also, it could make things faster as constants can’t be used for inplace operations (see *IncSubtensor).
Parameters:
 node (Apply) – The node for which the constant folding determination is made.
Returns: res
Return type: bool


class theano.gpuarray.dnn.DnnBase(files=None, c_func=None)
Creates a handle for cuDNN and pulls in the cuDNN libraries and headers.

c_code_cache_version()
Return a tuple of integers indicating the version of this Op.
An empty tuple indicates an ‘unversioned’ Op that will not be cached between processes.
The cache mechanism may erase cached modules that have been superseded by newer versions. See ModuleCache for details.
See also: c_code_cache_version_apply

c_compile_args(**kwargs)
Return a list of recommended compile arguments for code returned by other methods in this class.
Compiler arguments related to headers, libraries and search paths should be provided via the functions c_headers, c_libraries, c_header_dirs, and c_lib_dirs.
Examples
    def c_compile_args(self, **kwargs):
        return ['-ffast-math']

c_header_dirs(**kwargs)
Return a list of header search paths required by code returned by this class.
Provides search paths for headers, in addition to those in any relevant environment variables.
Note: for Unix compilers, these are the things that get -I prefixed in the compiler command line arguments.
Examples
    def c_header_dirs(self, **kwargs):
        return ['/usr/local/include', '/opt/weirdpath/src/include']

c_headers(**kwargs)
Return a list of header files required by code returned by this class.
These strings will be prefixed with #include and inserted at the beginning of the C source code.
Strings in this list that start neither with < nor " will be enclosed in double-quotes.
Examples
    def c_headers(self, **kwargs):
        return ['<iostream>', '<math.h>', '/full/path/to/header.h']
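The quoting rule described for c_headers can be sketched as follows. `render_includes` is a hypothetical helper written for illustration; it mirrors the rule stated above and is not Theano's own code generator.

```python
def render_includes(headers):
    """Turn a c_headers()-style list into #include lines (illustrative only)."""
    lines = []
    for header in headers:
        # Entries that start with neither '<' nor '"' get wrapped in double quotes.
        if not (header.startswith("<") or header.startswith('"')):
            header = '"' + header + '"'
        lines.append("#include " + header)
    return "\n".join(lines)


source_prefix = render_includes(
    ['<iostream>', '<math.h>', '/full/path/to/header.h'])
```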

c_lib_dirs(**kwargs)
Return a list of library search paths required by code returned by this class.
Provides search paths for libraries, in addition to those in any relevant environment variables (e.g. LD_LIBRARY_PATH).
Note: for Unix compilers, these are the things that get -L prefixed in the compiler command line arguments.
Examples
    def c_lib_dirs(self, **kwargs):
        return ['/usr/local/lib', '/opt/weirdpath/build/libs']

c_libraries(**kwargs)
Return a list of libraries required by code returned by this class.
The compiler will search the directories specified by the environment variable LD_LIBRARY_PATH in addition to any returned by c_lib_dirs.
Note: for Unix compilers, these are the things that get -l prefixed in the compiler command line arguments.
Examples
    def c_libraries(self, **kwargs):
        return ['gsl', 'gslcblas', 'm', 'fftw3', 'g2c']
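How the lists returned by these methods typically end up on a Unix compiler command line can be sketched as below. `build_compiler_args` is a hypothetical helper that follows the usual -I / -L / -l convention; the real assembly of the command is done inside Theano and may differ in order and detail.

```python
def build_compiler_args(compile_args, header_dirs, lib_dirs, libraries):
    """Combine the four lists into one argument list (illustrative only)."""
    args = list(compile_args)
    args += ["-I" + d for d in header_dirs]    # header search paths
    args += ["-L" + d for d in lib_dirs]       # library search paths
    args += ["-l" + lib for lib in libraries]  # libraries to link against
    return args


args = build_compiler_args(
    compile_args=["-ffast-math"],
    header_dirs=["/usr/local/include"],
    lib_dirs=["/usr/local/lib"],
    libraries=["gsl", "m"],
)
```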


class theano.gpuarray.dnn.DnnVersion

c_code(node, name, inputs, outputs, sub)
Return the C implementation of an Op.
Returns C code that does the computation associated to this Op, given names for the inputs and outputs.
Parameters:
 node (Apply instance) – The node for which we are compiling the current c_code. The same Op may be used in more than one node.
 name (str) – A name that is automatically assigned and guaranteed to be unique.
 inputs (list of strings) – There is a string for each input of the function, and the string is the name of a C variable pointing to that input. The type of the variable depends on the declared type of the input. There is a corresponding Python variable that can be accessed by prepending “py_” to the name in the list.
 outputs (list of strings) – Each string is the name of a C variable where the Op should store its output. The type depends on the declared type of the output. There is a corresponding Python variable that can be accessed by prepending “py_” to the name in the list. In some cases the outputs will be preallocated and the value of the variable may be pre-filled. The value for an unallocated output is type-dependent.
 sub (dict of strings) – Extra symbols defined in CLinker sub symbols (such as ‘fail’).

c_code_cache_version()
Return a tuple of integers indicating the version of this Op.
An empty tuple indicates an ‘unversioned’ Op that will not be cached between processes.
The cache mechanism may erase cached modules that have been superseded by newer versions. See ModuleCache for details.
See also: c_code_cache_version_apply

c_compile_args(**kwargs)
Return a list of recommended compile arguments for code returned by other methods in this class.
Compiler arguments related to headers, libraries and search paths should be provided via the functions c_headers, c_libraries, c_header_dirs, and c_lib_dirs.
Examples
    def c_compile_args(self, **kwargs):
        return ['-ffast-math']

c_header_dirs(**kwargs)
Return a list of header search paths required by code returned by this class.
Provides search paths for headers, in addition to those in any relevant environment variables.
Note: for Unix compilers, these are the things that get -I prefixed in the compiler command line arguments.
Examples
    def c_header_dirs(self, **kwargs):
        return ['/usr/local/include', '/opt/weirdpath/src/include']

c_headers(**kwargs)
Return a list of header files required by code returned by this class.
These strings will be prefixed with #include and inserted at the beginning of the C source code.
Strings in this list that start neither with < nor " will be enclosed in double-quotes.
Examples
    def c_headers(self, **kwargs):
        return ['<iostream>', '<math.h>', '/full/path/to/header.h']

c_lib_dirs(**kwargs)
Return a list of library search paths required by code returned by this class.
Provides search paths for libraries, in addition to those in any relevant environment variables (e.g. LD_LIBRARY_PATH).
Note: for Unix compilers, these are the things that get -L prefixed in the compiler command line arguments.
Examples
    def c_lib_dirs(self, **kwargs):
        return ['/usr/local/lib', '/opt/weirdpath/build/libs']

c_libraries(**kwargs)
Return a list of libraries required by code returned by this class.
The compiler will search the directories specified by the environment variable LD_LIBRARY_PATH in addition to any returned by c_lib_dirs.
Note: for Unix compilers, these are the things that get -l prefixed in the compiler command line arguments.
Examples
    def c_libraries(self, **kwargs):
        return ['gsl', 'gslcblas', 'm', 'fftw3', 'g2c']

c_support_code(**kwargs)
Return utility code for use by a Variable or Op.
This is included at global scope prior to the rest of the code for this class.
Question: How many times will this support code be emitted for a graph with many instances of the same type?
Return type: str

do_constant_folding(fgraph, node)
Determine whether or not constant folding should be performed for the given node.
This allows each Op to determine if it wants to be constant folded when all its inputs are constant. This allows it to choose where it puts its memory/speed tradeoff. Also, it could make things faster as constants can’t be used for inplace operations (see *IncSubtensor).
Parameters:
 node (Apply) – The node for which the constant folding determination is made.
Returns: res
Return type: bool


class theano.gpuarray.dnn.GpuDnnBatchNorm(mode='per-activation', running_averages=False, inplace_running_mean=False, inplace_running_var=False, inplace_output=False)
Base Op for cuDNN Batch Normalization.
Parameters:
 mode ({'per-activation', 'spatial'}) – Whether to normalize per activation (in this mode, bias and scale tensor dimensions are 1xCxHxW) or share normalization factors across spatial dimensions (in this mode, bias and scale tensor dimensions are 1xCx1x1).
 epsilon – Epsilon value used in the batch normalization formula. Minimum allowed value is 1e-5 (imposed by cuDNN).
 running_average_factor (float) – Factor for updating the values of running_mean and running_var. If the factor is close to one, the running averages will update quickly; if the factor is close to zero, they will update slowly.
 running_mean (tensor or None) – Previous value of the running mean. If this is given, the new value running_mean * (1 - r_a_factor) + batch mean * r_a_factor will be returned as one of the outputs of this function. running_mean and running_var should either both be given or both be None.
 running_var (tensor or None) – Previous value of the running variance. If this is given, the new value running_var * (1 - r_a_factor) + (m / (m - 1)) * batch var * r_a_factor will be returned as one of the outputs of this function, where m is the product of lengths of the averaged-over dimensions. running_mean and running_var should either both be given or both be None.
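The two running-average updates can be checked on a small example. The sketch below uses plain Python on a single activation, with hypothetical helper names, and assumes that "batch var" is the biased variance (divided by m), which is what the m / (m - 1) correction factor suggests.

```python
def batch_stats(values):
    """Mean and biased variance over the averaged-over dimension."""
    m = len(values)
    mean = sum(values) / m
    var = sum((v - mean) ** 2 for v in values) / m  # assumption: biased estimate
    return mean, var, m


def update_running(running_mean, running_var, values, r_a_factor):
    """Apply the running_mean and running_var formulas from the text."""
    mean, var, m = batch_stats(values)
    new_mean = running_mean * (1 - r_a_factor) + mean * r_a_factor
    new_var = running_var * (1 - r_a_factor) + (m / (m - 1)) * var * r_a_factor
    return new_mean, new_var


# One activation observed over a batch of four examples.
new_mean, new_var = update_running(
    running_mean=0.0, running_var=1.0,
    values=[1.0, 2.0, 3.0, 4.0], r_a_factor=0.1)
```

With a factor of 0.1 the running values move one tenth of the way toward the batch statistics, which matches the "close to zero updates slowly" description.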

L_op(inputs, outputs, grads)
Construct a graph for the L-operator.
This method is primarily used by Lop and dispatches to Op.grad by default.
The L-operator computes a row vector times the Jacobian. The mathematical relationship is $v \frac{\partial f(x)}{\partial x}$. The L-operator is also supported for generic tensors (not only for vectors).
Parameters:
 inputs (list of Variable) –
 outputs (list of Variable) –
 output_grads (list of Variable) –
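The "row vector times the Jacobian" product can be made concrete with a tiny numeric example. This sketch is plain Python and does not use Theano; the function f and the helper names are made up for illustration.

```python
def jacobian_f(x):
    """Jacobian of f(x) = (x0 * x1, x0 + x1, x1 ** 2), one row per output."""
    x0, x1 = x
    return [[x1, x0],
            [1.0, 1.0],
            [0.0, 2.0 * x1]]


def l_op(v, jac):
    """Row vector v (one entry per output) times the Jacobian."""
    n_inputs = len(jac[0])
    return [sum(v[i] * jac[i][j] for i in range(len(v)))
            for j in range(n_inputs)]


x = (2.0, 3.0)
v = [1.0, 0.5, 2.0]  # plays the role of the output gradients
result = l_op(v, jacobian_f(x))  # one entry per input of f
```

The result has one entry per input, which is why the same product yields gradients with respect to the inputs when v holds the gradients of the outputs.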

make_node(x, scale, bias, epsilon=0.0001, running_average_factor=0.1, running_mean=None, running_var=None)
Construct an Apply node that represents the application of this operation to the given inputs.
This must be implemented by subclasses.
Returns: node – The constructed Apply node.
Return type: Apply

class theano.gpuarray.dnn.GpuDnnBatchNormInference(mode='per-activation', inplace=False)
Base Op for cuDNN Batch Normalization.
Parameters:
 mode ({'per-activation', 'spatial'}) – Whether to normalize per activation (in this mode, bias and scale tensor dimensions are 1xCxHxW) or share normalization factors across spatial dimensions (in this mode, bias and scale tensor dimensions are 1xCx1x1).
 epsilon – Epsilon value used in the batch normalization formula. Minimum allowed value is 1e-5 (imposed by cuDNN).

grad(inputs, grads)
Construct a graph for the gradient with respect to each input variable.
Each returned Variable represents the gradient with respect to that input computed based on the symbolic gradients with respect to each output. If the output is not differentiable with respect to an input, then this method should return an instance of type NullType for that input.
Parameters:
 inputs (list of Variable) – The input variables.
 output_grads (list of Variable) – The gradients of the output variables.
Returns: grads – The gradients with respect to each Variable in inputs.
Return type: list of Variable

class theano.gpuarray.dnn.GpuDnnConv(algo=None, inplace=False, num_groups=1)
The forward convolution.
Parameters:
 image –
 kernel –
 descr – The convolution descriptor.
 algo ({'small', 'none', 'large', 'fft', 'fft_tiling', 'winograd', 'guess_once', 'guess_on_shape_change', 'time_once', 'time_on_shape_change'}) – Default is the value of config.dnn__conv__algo_fwd.
 num_groups – Divides the image, kernel and output tensors into num_groups separate groups, each of which carries out convolutions separately.

static get_out_shape(ishape, kshape, border_mode, subsample, dilation)
This function computes the output shape for a convolution with the specified parameters. ishape and kshape can be symbolic or scalar.
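For intuition, the usual output-shape arithmetic for a convolution can be sketched in plain Python. This is not the implementation of get_out_shape: the helper names are hypothetical, only integer shapes are handled, and border_mode is limited to 'valid', 'full', 'half' or a single integer padding.

```python
def conv_out_size(i, k, border_mode, subsample=1, dilation=1):
    """Output length along one spatial axis (standard formula, illustrative)."""
    dk = (k - 1) * dilation + 1  # effective kernel size after dilation
    if border_mode == "valid":
        pad = 0
    elif border_mode == "full":
        pad = dk - 1
    elif border_mode == "half":
        pad = dk // 2
    elif isinstance(border_mode, int):
        pad = border_mode
    else:
        raise ValueError("unsupported border_mode: {!r}".format(border_mode))
    return (i + 2 * pad - dk) // subsample + 1


def conv_out_shape(ishape, kshape, border_mode, subsample, dilation):
    """Shape (batch, filters, spatial...) for image and kernel shapes."""
    spatial = tuple(
        conv_out_size(i, k, border_mode, s, d)
        for i, k, s, d in zip(ishape[2:], kshape[2:], subsample, dilation))
    return (ishape[0], kshape[0]) + spatial


out_valid = conv_out_shape((8, 3, 32, 32), (16, 3, 5, 5), "valid", (1, 1), (1, 1))
out_half = conv_out_shape((8, 3, 32, 32), (16, 3, 5, 5), "half", (1, 1), (1, 1))
```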

grad(inp, grads)
Construct a graph for the gradient with respect to each input variable.
Each returned Variable represents the gradient with respect to that input computed based on the symbolic gradients with respect to each output. If the output is not differentiable with respect to an input, then this method should return an instance of type NullType for that input.
Parameters:
 inputs (list of Variable) – The input variables.
 output_grads (list of Variable) – The gradients of the output variables.
Returns: grads – The gradients with respect to each Variable in inputs.
Return type: list of Variable

class theano.gpuarray.dnn.GpuDnnConvDesc(border_mode, subsample=(1, 1), dilation=(1, 1), conv_mode='conv', precision='float32', num_groups=1)
This Op builds a convolution descriptor for use in the other convolution operations.
See the doc of dnn_conv() for a description of the parameters.

c_code_cache_version()
Return a tuple of integers indicating the version of this Op.
An empty tuple indicates an ‘unversioned’ Op that will not be cached between processes.
The cache mechanism may erase cached modules that have been superseded by newer versions. See ModuleCache for details.
See also: c_code_cache_version_apply

c_compile_args(**kwargs)
Return a list of recommended compile arguments for code returned by other methods in this class.
Compiler arguments related to headers, libraries and search paths should be provided via the functions c_headers, c_libraries, c_header_dirs, and c_lib_dirs.
Examples
    def c_compile_args(self, **kwargs):
        return ['-ffast-math']

c_header_dirs(**kwargs)
Return a list of header search paths required by code returned by this class.
Provides search paths for headers, in addition to those in any relevant environment variables.
Note: for Unix compilers, these are the things that get -I prefixed in the compiler command line arguments.
Examples
    def c_header_dirs(self, **kwargs):
        return ['/usr/local/include', '/opt/weirdpath/src/include']

c_headers(**kwargs)
Return a list of header files required by code returned by this class.
These strings will be prefixed with #include and inserted at the beginning of the C source code.
Strings in this list that start neither with < nor " will be enclosed in double-quotes.
Examples
    def c_headers(self, **kwargs):
        return ['<iostream>', '<math.h>', '/full/path/to/header.h']

c_lib_dirs(**kwargs)
Return a list of library search paths required by code returned by this class.
Provides search paths for libraries, in addition to those in any relevant environment variables (e.g. LD_LIBRARY_PATH).
Note: for Unix compilers, these are the things that get -L prefixed in the compiler command line arguments.
Examples
    def c_lib_dirs(self, **kwargs):
        return ['/usr/local/lib', '/opt/weirdpath/build/libs']

c_libraries(**kwargs)
Return a list of libraries required by code returned by this class.
The compiler will search the directories specified by the environment variable LD_LIBRARY_PATH in addition to any returned by c_lib_dirs.
Note: for Unix compilers, these are the things that get -l prefixed in the compiler command line arguments.
Examples
    def c_libraries(self, **kwargs):
        return ['gsl', 'gslcblas', 'm', 'fftw3', 'g2c']

do_constant_folding(fgraph, node)
Determine whether or not constant folding should be performed for the given node.
This allows each Op to determine if it wants to be constant folded when all its inputs are constant. This allows it to choose where it puts its memory/speed tradeoff. Also, it could make things faster as constants can’t be used for inplace operations (see *IncSubtensor).
Parameters:
 node (Apply) – The node for which the constant folding determination is made.
Returns: res
Return type: bool


class theano.gpuarray.dnn.GpuDnnConvGradI(inplace=False, algo=None, num_groups=1)
The convolution gradient with respect to the inputs.
Parameters:
 image –
 kernel –
 descr – The convolution descriptor.
 algo ({'none', 'deterministic', 'fft', 'fft_tiling', 'winograd', 'guess_once', 'guess_on_shape_change', 'time_once', 'time_on_shape_change'}) – Default is the value of config.dnn__conv__algo_bwd_data.
 num_groups – Divides the image, kernel and output tensors into num_groups separate groups, each of which carries out convolutions separately.

grad(inp, grads)
Construct a graph for the gradient with respect to each input variable.
Each returned Variable represents the gradient with respect to that input computed based on the symbolic gradients with respect to each output. If the output is not differentiable with respect to an input, then this method should return an instance of type NullType for that input.
Parameters:
 inputs (list of Variable) – The input variables.
 output_grads (list of Variable) – The gradients of the output variables.
Returns: grads – The gradients with respect to each Variable in inputs.
Return type: list of Variable

class theano.gpuarray.dnn.GpuDnnConvGradW(inplace=False, algo=None, num_groups=1)
The convolution gradient with respect to the weights.
Parameters:
 image –
 kernel –
 descr – The convolution descriptor.
 algo ({'none', 'deterministic', 'fft', 'small', 'guess_once', 'guess_on_shape_change', 'time_once', 'time_on_shape_change'}) – Default is the value of config.dnn__conv__algo_bwd_filter.
 num_groups – Divides the image, kernel and output tensors into num_groups separate groups, each of which carries out convolutions separately.

grad(inp, grads)
Construct a graph for the gradient with respect to each input variable.
Each returned Variable represents the gradient with respect to that input computed based on the symbolic gradients with respect to each output. If the output is not differentiable with respect to an input, then this method should return an instance of type NullType for that input.
Parameters:
 inputs (list of Variable) – The input variables.
 output_grads (list of Variable) – The gradients of the output variables.
Returns: grads – The gradients with respect to each Variable in inputs.
Return type: list of Variable

class theano.gpuarray.dnn.GpuDnnDropoutOp(inplace=False)

make_node(inp, descriptor, state)
Construct an Apply node that represents the application of this operation to the given inputs.
This must be implemented by subclasses.
Returns: node – The constructed Apply node.
Return type: Apply

prepare_node(node, storage_map, compute_map, impl)
Make any special modifications that the Op needs before doing Op.make_thunk.
This can modify the node inplace and should return nothing.
It can be called multiple times with different impl. It is the Op’s responsibility not to re-prepare the node when it isn’t good to do so.


class theano.gpuarray.dnn.GpuDnnPool(mode='max')
Parameters:
 img (tensor) – The image 4d or 5d tensor.
 ws (tensor) – Window size.
 stride (tensor) – (dx, dy) or (dx, dy, dz).
 mode ({'max', 'average_inc_pad', 'average_exc_pad'}) – The old deprecated name ‘average’ corresponds to ‘average_inc_pad’.
 pad (tensor) – (padX, padY) or (padX, padY, padZ)
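The difference between the three pooling modes shows up only where a window overlaps the padding. The following one-dimensional sketch in plain Python illustrates this under the common interpretation that average_inc_pad counts padded positions in the divisor while average_exc_pad ignores them; `pool1d` is a hypothetical helper, not the cuDNN routine, and it assumes pad is smaller than ws.

```python
def pool1d(values, ws, stride, pad, mode):
    """1-D pooling over a list of floats (illustrative only)."""
    padded = [None] * pad + list(values) + [None] * pad  # None marks padding
    out = []
    for start in range(0, len(padded) - ws + 1, stride):
        window = padded[start:start + ws]
        real = [v for v in window if v is not None]
        if mode == "max":
            out.append(max(real))
        elif mode == "average_inc_pad":
            out.append(sum(real) / ws)         # padding counts as zeros
        elif mode == "average_exc_pad":
            out.append(sum(real) / len(real))  # padding is left out
        else:
            raise ValueError("unknown mode: {!r}".format(mode))
    return out


data = [2.0, 4.0, 6.0, 8.0]
pooled_max = pool1d(data, ws=2, stride=2, pad=1, mode="max")
pooled_inc = pool1d(data, ws=2, stride=2, pad=1, mode="average_inc_pad")
pooled_exc = pool1d(data, ws=2, stride=2, pad=1, mode="average_exc_pad")
```

Only the first and last windows touch the padding here, so those are the entries where the two averaging modes differ.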

L_op(inp, outputs, grads)
Construct a graph for the L-operator.
This method is primarily used by Lop and dispatches to Op.grad by default.
The L-operator computes a row vector times the Jacobian. The mathematical relationship is $v \frac{\partial f(x)}{\partial x}$. The L-operator is also supported for generic tensors (not only for vectors).
Parameters:
 inputs (list of Variable) –
 outputs (list of Variable) –
 output_grads (list of Variable) –

class theano.gpuarray.dnn.GpuDnnPoolBase(mode='max')
Abstract base class for GpuDnnPool and GpuDnnPoolGrad.

class
theano.gpuarray.dnn.
GpuDnnPoolDesc
(ws=(1, 1), stride=(1, 1), mode='max', pad=(0, 0))[source]¶ This Op builds a pooling descriptor for use in the other pooling operations.
ws, stride and pad must have the same length.
Parameters:  ws (tuple) – Window size.
 stride (tuple) – (dx, dy) or (dx, dy, dz).
 mode ({'max', 'average_inc_pad', 'average_exc_pad'}) – The old deprecated name ‘average’ corresponds to ‘average_inc_pad’.
 pad (tuple) – (padX, padY) or (padX, padY, padZ)
Notes
Not used anymore. Only needed to reload old pickled files.

c_code
(node, name, inputs, outputs, sub)[source]¶ Return the C implementation of an Op.
Returns C code that does the computation associated to this Op, given names for the inputs and outputs.
Parameters:  node (Apply instance) – The node for which we are compiling the current c_code. The same Op may be used in more than one node.
 name (str) – A name that is automatically assigned and guaranteed to be unique.
 inputs (list of strings) – There is a string for each input of the function, and the string is the name of a C variable pointing to that input. The type of the variable depends on the declared type of the input. There is a corresponding python variable that can be accessed by prepending “py_” to the name in the list.
 outputs (list of strings) – Each string is the name of a C variable where the Op should store its output. The type depends on the declared type of the output. There is a corresponding python variable that can be accessed by prepending “py_” to the name in the list. In some cases the outputs will be preallocated and the value of the variable may be prefilled. The value for an unallocated output is typedependent.
 sub (dict of strings) – Extra symbols defined in CLinker sub symbols (such as ‘fail’). WRITEME

c_code_cache_version
()[source]¶ Return a tuple of integers indicating the version of this Op.
An empty tuple indicates an ‘unversioned’ Op that will not be cached between processes.
The cache mechanism may erase cached modules that have been superceded by newer versions. See ModuleCache for details.
See also
c_code_cache_version_apply

c_header_dirs
(**kwargs)[source]¶ Return a list of header search paths required by code returned by this class.
Provides search paths for headers, in addition to those in any relevant environment variables.
Note: for Unix compilers, these are the things that get I prefixed in the compiler command line arguments.
Examples
 def c_header_dirs(self, **kwargs):
 return [‘/usr/local/include’, ‘/opt/weirdpath/src/include’]

c_headers
(**kwargs)[source]¶ Return a list of header files required by code returned by this class.
These strings will be prefixed with
#include
and inserted at the beginning of the C source code.Strings in this list that start neither with
<
nor"
will be enclosed in doublequotes.Examples
 def c_headers(self, **kwargs):
 return [‘<iostream>’, ‘<math.h>’, ‘/full/path/to/header.h’]

c_lib_dirs
(**kwargs)[source]¶ Return a list of library search paths required by code returned by this class.
Provides search paths for libraries, in addition to those in any relevant environment variables (e.g. LD_LIBRARY_PATH).
Note: for Unix compilers, these are the things that get -L prefixed in the compiler command line arguments.
Examples
 def c_lib_dirs(self, **kwargs):
     return ['/usr/local/lib', '/opt/weirdpath/build/libs']

c_libraries
(**kwargs)[source]¶ Return a list of libraries required by code returned by this class.
The compiler will search the directories specified by the environment variable LD_LIBRARY_PATH in addition to any returned by c_lib_dirs.
Note: for Unix compilers, these are the things that get -l prefixed in the compiler command line arguments.
Examples
 def c_libraries(self, **kwargs):
     return ['gsl', 'gslcblas', 'm', 'fftw3', 'g2c']

do_constant_folding
(fgraph, node)[source]¶ Determine whether or not constant folding should be performed for the given node.
This allows each Op to decide whether it wants to be constant folded when all its inputs are constant, and therefore where it sits on the memory/speed tradeoff. Folding can also make things faster, because constants can't be used for in-place operations (see *IncSubtensor).
Parameters: node (Apply) – The node for which the constant folding determination is made.
Returns: res
Return type: bool

class
theano.gpuarray.dnn.
GpuDnnPoolGrad
(mode='max')[source]¶ The pooling gradient.
Parameters:  inp – The input of the pooling.
 out – The output of the pooling in the forward.
 out_grad – Same size as out, but is the corresponding gradient information.
 ws (tensor variable) – Window size.
 stride (tensor variable) – (dx, dy) or (dx, dy, dz).
 mode ({'max', 'average_inc_pad', 'average_exc_pad'}) – The old deprecated name ‘average’ corresponds to ‘average_inc_pad’.
 pad (tensor) – (padX, padY) or (padX, padY, padZ)

class
theano.gpuarray.dnn.
GpuDnnRNNGradInputs
(rnn_mode, grad_h, grad_c)[source]¶

class
theano.gpuarray.dnn.
GpuDnnRNNOp
(rnn_mode, direction_mode)[source]¶ 
L_op
(inputs, outputs, output_grads)[source]¶ Construct a graph for the L-operator.
This method is primarily used by Lop and dispatches to Op.grad by default.
The L-operator computes a row vector times the Jacobian. The mathematical relationship is $v \frac{\partial f(x)}{\partial x}$. The L-operator is also supported for generic tensors (not only for vectors).
Parameters:  inputs (list of Variable) –
 outputs (list of Variable) –
 output_grads (list of Variable) –
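As a small numeric illustration of "a row vector times the Jacobian", the sketch below uses plain Python rather than Theano graphs. The function f(x0, x1) = (x0 * x1, x0 + x1) and the helper names `jacobian` and `l_op` are hypothetical and chosen only for this example; they are not part of the Theano API.

```python
def jacobian(x):
    """Jacobian of the hypothetical function f(x0, x1) = (x0 * x1, x0 + x1)."""
    x0, x1 = x
    return [[x1, x0],      # d(x0 * x1)/dx0, d(x0 * x1)/dx1
            [1.0, 1.0]]    # d(x0 + x1)/dx0, d(x0 + x1)/dx1


def l_op(v, jac):
    """Row vector v times Jacobian jac: result[j] = sum_i v[i] * jac[i][j]."""
    n_outputs, n_inputs = len(jac), len(jac[0])
    return [sum(v[i] * jac[i][j] for i in range(n_outputs))
            for j in range(n_inputs)]


x = [2.0, 3.0]
output_grads = [1.0, 10.0]   # plays the role of the row vector v
input_grads = l_op(output_grads, jacobian(x))
print(input_grads)
```

The row vector has one entry per output and the result has one entry per input, which mirrors how output_grads map to gradients with respect to inputs.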


class
theano.gpuarray.dnn.
GpuDnnSoftmax
(algo, mode)[source]¶ Op for the cuDNN Softmax.
Parameters:  algo ({'fast', 'accurate', 'log'}) – Indicating whether, respectively, computations should be optimized for speed, for accuracy, or if cuDNN should rather compute the log-softmax instead.
 mode ({'instance', 'channel'}) – Indicating whether the softmax should be computed per image across ‘c01’ or per spatial location ‘01’ per image across ‘c’.
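The two mode values differ only in which axes are normalized together. The NumPy sketch below is a CPU reference under stated assumptions, not the cuDNN implementation: `softmax_reference` is a hypothetical helper, inputs are assumed to be 4d in 'bc01' layout, and 'fast' is modelled as the plain formula without the max subtraction that 'accurate' uses for numerical stability.

```python
import numpy as np


def softmax_reference(x, algo="accurate", mode="channel"):
    """CPU reference for a 4d 'bc01' tensor (illustrative only)."""
    x = np.asarray(x, dtype=np.float64)
    # 'instance': normalize over c, 0 and 1 for each image.
    # 'channel': normalize over c at each spatial location of each image.
    axes = (1, 2, 3) if mode == "instance" else (1,)
    if algo == "fast":
        shifted = x                                   # no stabilization
    else:
        shifted = x - x.max(axis=axes, keepdims=True)  # avoids overflow in exp
    log_norm = np.log(np.exp(shifted).sum(axis=axes, keepdims=True))
    log_softmax = shifted - log_norm
    return log_softmax if algo == "log" else np.exp(log_softmax)


x = np.arange(2 * 3 * 2 * 2, dtype=np.float64).reshape(2, 3, 2, 2) / 10.0
per_channel = softmax_reference(x, mode="channel")
per_instance = softmax_reference(x, mode="instance")
print(per_channel.sum(axis=1))           # ones at every spatial location
print(per_instance.sum(axis=(1, 2, 3)))  # one per image
```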

L_op
(inp, outputs, grads)[source]¶ Construct a graph for the L-operator.
This method is primarily used by Lop and dispatches to Op.grad by default.
The L-operator computes a row vector times the Jacobian. The mathematical relationship is $v \frac{\partial f(x)}{\partial x}$. The L-operator is also supported for generic tensors (not only for vectors).
Parameters:  inputs (list of Variable) –
 outputs (list of Variable) –
 output_grads (list of Variable) –

class
theano.gpuarray.dnn.
GpuDnnSoftmaxBase
(algo, mode)[source]¶ Op for the cuDNN Softmax.
Parameters:  algo ({'fast', 'accurate', 'log'}) – Indicating whether, respectively, computations should be optimized for speed, for accuracy, or if cuDNN should rather compute the log-softmax instead.
 mode ({'instance', 'channel'}) – Indicating whether the softmax should be computed per image across ‘c01’ or per spatial location ‘01’ per image across ‘c’.

class
theano.gpuarray.dnn.
GpuDnnSoftmaxGrad
(algo, mode)[source]¶ Op for the cuDNN SoftmaxGrad.
Parameters:  algo – ‘fast’, ‘accurate’ or ‘log’ indicating whether, respectively, computations should be optimized for speed, for accuracy, or if cuDNN should rather compute the gradient of the log-softmax instead.
 mode – ‘instance’ or ‘channel’ indicating whether the softmax should be computed per image across ‘c01’ or per spatial location ‘01’ per image across ‘c’.

class
theano.gpuarray.dnn.
GpuDnnTransformerGradI
[source]¶ Gradient of inputs Op for cuDNN Spatial Transformer.

class
theano.gpuarray.dnn.
GpuDnnTransformerGradT
[source]¶ Gradient of affine transformations Op for cuDNN Spatial Transformer.

class
theano.gpuarray.dnn.
GpuDnnTransformerGrid
[source]¶ Grid generator Op for cuDNN Spatial Transformer.

grad
(inputs, grads)[source]¶ Construct a graph for the gradient with respect to each input variable.
Each returned Variable represents the gradient with respect to that input computed based on the symbolic gradients with respect to each output. If the output is not differentiable with respect to an input, then this method should return an instance of type NullType for that input.
Parameters:  inputs (list of Variable) – The input variables.
 output_grads (list of Variable) – The gradients of the output variables.
Returns: grads – The gradients with respect to each Variable in inputs.
Return type: list of Variable

make_node
(theta, out_dims)[source]¶ Create a grid generator node for a cuDNN Spatial Transformer
Parameters:  theta (tensor) – Affine transformation tensor containing one affine transformation matrix per image. theta is usually generated by the localization network.
 out_dims (tuple) – Dimensions of the transformed inputs, containing four elements and given by (N, C, H, W), where N is the number of inputs, C the number of channels, and H and W are the height and width of each input.


class
theano.gpuarray.dnn.
GpuDnnTransformerSampler
[source]¶ Grid sampler Op for cuDNN Spatial Transformer.

grad
(inputs, grads)[source]¶ Construct a graph for the gradient with respect to each input variable.
Each returned Variable represents the gradient with respect to that input computed based on the symbolic gradients with respect to each output. If the output is not differentiable with respect to an input, then this method should return an instance of type NullType for that input.
Parameters:  inputs (list of Variable) – The input variables.
 output_grads (list of Variable) – The gradients of the output variables.
Returns: grads – The gradients with respect to each Variable in inputs.
Return type: list of Variable

make_node
(img, grid)[source]¶ Create a grid sampler node for a cuDNN Spatial Transformer
Parameters:  img (tensor) – Images from which the pixels will be sampled. The implementation assumes the tensor is in NCHW format, where N is the number of images, C is the number of color channels, H is the height of the inputs, and W is the width of the inputs.
 grid (GpuDnnTransformerGrid) – Grid that contains the coordinates of the pixels to be sampled from the input images.


class
theano.gpuarray.dnn.
MakerCDataType
(ctype, freefunc=None, headers=(), header_dirs=(), libraries=(), lib_dirs=(), compile_args=(), extra_support_code='', version=None)[source]¶ This CDataType provides a make_value method.
It also has CDataType._fn field that caches a compiled function used by CDataType.make_value.
This was a very lame hack that was removed from CDataType itself.

class
theano.gpuarray.dnn.
RNNBlock
(dtype, hidden_size, num_layers, rnn_mode, input_mode='linear', direction_mode='unidirectional', context_name=None)[source]¶ An object that allows us to use the cuDNN RNN implementation. TODO: add a usage example. For now, see the Theano tests test_dnn_rnn_gru() and test_dnn_rnn_lstm() in the file theano/gpuarray/tests/test_dnn.py.
Parameters:  dtype (data type of computation) –
 hidden_size (int) – hidden layer dimension.
 num_layers (int) – number of recurrent layers.
 rnn_mode ({'rnn_relu', 'rnn_tanh', 'lstm', 'gru'}) –
rnn_relu: A single-gate recurrent neural network with a ReLU activation function.
$h_t = \mathrm{ReLU}(W_i x_t + U_i h_{t-1} + b_{wi} + b_{Ri})$
rnn_tanh: A single-gate recurrent neural network with a tanh activation function.
$h_t = \tanh(W_i x_t + U_i h_{t-1} + b_{wi} + b_{Ri})$
lstm: A four-gate Long Short-Term Memory network with no peephole connections.
gru: A three-gate network consisting of Gated Recurrent Units.
 input_mode ({'linear', 'skip'}) – linear: The input will be multiplied by a biased matrix. skip: No operation is performed on the input. The size must match the hidden size.
 direction_mode ({'unidirectional', 'bidirectional'}) – unidirectional: The network operates recurrently from the first input to the last. bidirectional: The network operates from first to last, then from last to first, and concatenates the results at each layer.
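To make the rnn_relu recurrence concrete, here is a minimal pure-Python sketch of single time steps for one layer. It is not how RNNBlock stores or applies its parameters (the parameter block is opaque); the names `matvec`, `rnn_relu_step`, `W`, `U`, `b_w` and `b_r` are hypothetical stand-ins for the symbols in the formula.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def rnn_relu_step(x_t, h_prev, W, U, b_w, b_r):
    """One step of h_t = ReLU(W x_t + U h_{t-1} + b_w + b_r)."""
    wx = matvec(W, x_t)
    uh = matvec(U, h_prev)
    return [max(0.0, a + b + c + d) for a, b, c, d in zip(wx, uh, b_w, b_r)]


# Toy parameters with hidden_size = input_size = 2.
W = [[1.0, 0.0], [0.0, -1.0]]   # input weights
U = [[0.5, 0.0], [0.0, 0.5]]    # recurrent weights
b_w = [0.0, 0.0]
b_r = [0.0, 0.0]

h0 = [0.0, 0.0]                                       # initial hidden state (hx)
h1 = rnn_relu_step([2.0, 3.0], h0, W, U, b_w, b_r)
h2 = rnn_relu_step([1.0, -1.0], h1, W, U, b_w, b_r)
print(h1, h2)
```

Replacing `max(0.0, ...)` with `math.tanh(...)` gives the rnn_tanh variant.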

apply
(w, x, hx, cx=None)[source]¶ Apply the RNN to some data
Parameters:  w – opaque parameter block
 x – input
 hx – initial hidden state
 cx – initial cell state (for LSTM)

get_param_size
(input_size)[source]¶ Get the size of the shared variable for the parameters of the RNN.
This will return a size (in items) necessary to store all the parameters for the RNN. You should allocate a variable of that size to store those parameters. The order and layout of the parameters is opaque.
Parameters: input_size ((int, int)) – Size of the input blocks

split_params
(w, layer, input_size)[source]¶ Split the opaque parameter block into components.
Parameters:  w (GpuArraySharedVariable) – opaque parameter block
 layer (int) – ID of the layer
 input_size ((int, int)) – Size of the input blocks

theano.gpuarray.dnn.
dnn_batch_normalization_test
(inputs, gamma, beta, mean, var, mode='per-activation', epsilon=0.0001)[source]¶ Performs batch normalization of the given inputs, using the given mean and variance.
Parameters:  mode ({'per-activation', 'spatial'}) – Whether to normalize per activation or share normalization factors across spatial dimensions (i.e., all dimensions past the second).
 gamma (tensor) – Scale factors. Must match the dimensionality of inputs, but have sizes of 1 for all axes normalized over (i.e., in the first dimension for mode='per-activation', and additionally in all dimensions past the second for mode='spatial').
 beta (tensor) – Biases. Must match the tensor layout of gamma.
 mean (tensor) – Means. Usually these are running averages computed during training. Must match the tensor layout of gamma.
 var (tensor) – Variances. Usually these are running averages computed during training. Must match the tensor layout of gamma.
 epsilon (float) – Epsilon value used in the batch normalization formula. Minimum allowed value is 1e-5 (imposed by cuDNN).
Returns: out – Batch-normalized inputs.
Return type: tensor
Notes
Requires cuDNN 5 and Theano 0.9dev2 or more recent.
For 4d tensors, the returned value is equivalent to:
 axes = (0,) if mode == 'per-activation' else (0, 2, 3)
 gamma, beta, mean, var = (T.addbroadcast(t, *axes)
                           for t in (gamma, beta, mean, var))
 out = (inputs - mean) * gamma / T.sqrt(var + epsilon) + beta
For 5d tensors, the axes would be (0, 2, 3, 4).
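The layout rule for gamma, beta, mean and var can be written as a shape computation. The helper below is a hypothetical illustration (`bn_param_shape` is not a Theano function): it returns the shape those tensors are expected to have for a given input shape, with size 1 on every axis that is normalized over.

```python
def bn_param_shape(input_shape, mode="per-activation"):
    """Expected shape of gamma/beta/mean/var for a given input shape."""
    if mode == "per-activation":
        axes = (0,)                                        # batch axis only
    elif mode == "spatial":
        axes = (0,) + tuple(range(2, len(input_shape)))    # batch + spatial axes
    else:
        raise ValueError("mode must be 'per-activation' or 'spatial'")
    return tuple(1 if i in axes else n for i, n in enumerate(input_shape))


print(bn_param_shape((8, 3, 32, 32), "per-activation"))
print(bn_param_shape((8, 3, 32, 32), "spatial"))
print(bn_param_shape((8, 3, 4, 32, 32), "spatial"))
```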

theano.gpuarray.dnn.
dnn_batch_normalization_train
(inputs, gamma, beta, mode='per-activation', epsilon=0.0001, running_average_factor=0.1, running_mean=None, running_var=None)[source]¶ Performs batch normalization of the given inputs, using the mean and variance of the inputs.
Parameters:  mode ({'per-activation', 'spatial'}) – Whether to normalize per activation or share normalization factors across spatial dimensions (i.e., all dimensions past the second).
 gamma (tensor) – Learnable scale factors. Must match the dimensionality of inputs, but have sizes of 1 for all axes normalized over (i.e., in the first dimension for mode='per-activation', and additionally in all dimensions past the second for mode='spatial').
 beta (tensor) – Learnable biases. Must match the tensor layout of gamma.
 epsilon (float) – Epsilon value used in the batch normalization formula. Minimum allowed value is 1e-5 (imposed by cuDNN).
 running_average_factor (float) – Factor for updating the values of running_mean and running_var. If the factor is close to one, the running averages will update quickly; if the factor is close to zero, they will update slowly.
 running_mean (tensor or None) – Previous value of the running mean. If this is given, the new value running_mean * (1 - r_a_factor) + batch mean * r_a_factor will be returned as one of the outputs of this function. running_mean and running_var should either both be given or both be None.
 running_var (tensor or None) – Previous value of the running variance. If this is given, the new value running_var * (1 - r_a_factor) + (m / (m - 1)) * batch var * r_a_factor will be returned as one of the outputs of this function, where m is the product of lengths of the averaged-over dimensions. running_mean and running_var should either both be given or both be None.
Returns:  out (tensor) – Batch-normalized inputs.
 mean (tensor) – Means of inputs across the normalization axes.
 invstd (tensor) – Inverse standard deviations of inputs across the normalization axes.
 new_running_mean (tensor) – New value of the running mean (only if both running_mean and running_var were given).
 new_running_var (tensor) – New value of the running variance (only if both running_var and running_mean were given).
Notes
Requires cuDNN 5 and Theano 0.9dev2 or more recent.
For 4d tensors, returned values are equivalent to:
 axes = 0 if mode == 'per-activation' else (0, 2, 3)
 mean = inputs.mean(axes, keepdims=True)
 var = inputs.var(axes, keepdims=True)
 invstd = T.inv(T.sqrt(var + epsilon))
 out = (inputs - mean) * gamma * invstd + beta
 m = T.cast(T.prod(inputs.shape) / T.prod(mean.shape), 'float32')
 running_mean = running_mean * (1 - running_average_factor) + \
                mean * running_average_factor
 running_var = running_var * (1 - running_average_factor) + \
               (m / (m - 1)) * var * running_average_factor
For 5d tensors, the axes are (0, 2, 3, 4).
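For checking values on the CPU, the same computation can be sketched in NumPy. This is a reference under stated assumptions, not the cuDNN code path: `batch_norm_train_reference` is a hypothetical helper, it follows the equivalent Theano expressions given in these notes, and it handles the 5d case by normalizing over every axis past the second in 'spatial' mode.

```python
import numpy as np


def batch_norm_train_reference(inputs, gamma, beta, mode="per-activation",
                               epsilon=1e-4, running_average_factor=0.1,
                               running_mean=None, running_var=None):
    """NumPy reference mirroring the documented equivalent expressions."""
    if mode == "per-activation":
        axes = (0,)
    else:  # 'spatial'
        axes = (0,) + tuple(range(2, inputs.ndim))
    mean = inputs.mean(axis=axes, keepdims=True)
    var = inputs.var(axis=axes, keepdims=True)       # biased batch variance
    invstd = 1.0 / np.sqrt(var + epsilon)
    out = (inputs - mean) * gamma * invstd + beta
    results = [out, mean, invstd]
    if running_mean is not None and running_var is not None:
        m = inputs.size / mean.size                  # elements averaged per statistic
        f = running_average_factor
        results.append(running_mean * (1 - f) + mean * f)
        results.append(running_var * (1 - f) + (m / (m - 1)) * var * f)
    return tuple(results)


rng = np.random.RandomState(0)
x = rng.randn(16, 3, 4, 4)
gamma = np.ones((1, 3, 1, 1))
beta = np.zeros((1, 3, 1, 1))
out, mean, invstd, new_mean, new_var = batch_norm_train_reference(
    x, gamma, beta, mode="spatial",
    running_mean=np.zeros((1, 3, 1, 1)), running_var=np.ones((1, 3, 1, 1)))
print(out.mean(axis=(0, 2, 3)), new_var.ravel())
```

The factor m / (m - 1) converts the biased batch variance into the unbiased estimate before it is folded into the running variance.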

theano.gpuarray.dnn.
dnn_conv
(fgraph, img, kerns, border_mode='valid', subsample=(1, 1), dilation=(1, 1), conv_mode='conv', direction_hint=None, workmem=None, algo=None, precision=None, num_groups=1)[source]¶ GPU convolution using cuDNN from NVIDIA.
The memory layout to use is ‘bc01’, that is ‘batch’, ‘channel’, ‘first dim’, ‘second dim’ in that order.
Parameters:  fgraph (FunctionGraph) – The function graph containing img.
 img – Images to do the convolution over.
 kerns – Convolution filters.
 border_mode – One of ‘valid’, ‘full’, ‘half’; additionally, the padding size could be directly specified by an integer or a pair of integers.
 subsample – Perform subsampling of the output (default: (1, 1)).
 dilation – Filter dilation factor. A dilation factor of d is equivalent to a convolution with d - 1 zeros inserted between neighboring filter values.
 conv_mode – Perform convolution (kernels flipped) or cross-correlation. One of ‘conv’, ‘cross’ (default: ‘conv’).
 direction_hint – Used by graph optimizers to change algorithm choice. By default, GpuDnnConv will be used to carry out the convolution. If border_mode is ‘valid’, subsample is (1, 1), dilation is (1, 1), and direction_hint is ‘bprop weights’, it will use GpuDnnConvGradW. If border_mode is ‘full’, subsample is (1, 1), dilation is (1, 1), and direction_hint is not ‘forward!’, it will use GpuDnnConvGradI. This parameter is used internally by graph optimizers and may be removed at any time without a deprecation period. You have been warned.
 algo ({'none', 'small', 'large', 'fft', 'guess_once', 'guess_on_shape_change', 'time_once', 'time_on_shape_change'}) – Convolution implementation to use. Some of its values may require certain versions of cuDNN to be installed. Default is the value of config.dnn__conv__algo_fwd.
 precision ({'as_input_f32', 'as_input', 'float16', 'float32', 'float64'}) – Description of the dtype in which the computation of the convolution should be done. Possible values are ‘as_input’, ‘float16’, ‘float32’ and ‘float64’. Default is the value of config.dnn__conv__precision.
 num_groups – Divides the image, kernel and output tensors into num_groups separate groups, each of which carries out its convolution separately.
Warning
The cuDNN library only works with GPUs that have a compute capability of 3.0 or higher. This means that older GPUs will not work with this Op.
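The interaction of border_mode, subsample and dilation is easiest to see through the output size along one spatial axis. The pure-Python sketch below uses the usual convolution shape arithmetic; `dilate_filter` and `conv_output_length` are hypothetical helpers for illustration, not Theano functions, and an integer border_mode is treated as explicit symmetric padding.

```python
def dilate_filter(values, d):
    """Insert d - 1 zeros between neighboring filter values."""
    out = []
    for i, v in enumerate(values):
        if i > 0:
            out.extend([0] * (d - 1))
        out.append(v)
    return out


def conv_output_length(size, kernel, border_mode="valid", subsample=1, dilation=1):
    """Output length along one axis for the given convolution settings."""
    k_eff = (kernel - 1) * dilation + 1     # length of the dilated filter
    if border_mode == "valid":
        pad = 0
    elif border_mode == "full":
        pad = k_eff - 1
    elif border_mode == "half":
        pad = k_eff // 2
    elif isinstance(border_mode, int):
        pad = border_mode                   # explicit padding size
    else:
        raise ValueError("unsupported border_mode: %r" % (border_mode,))
    return (size + 2 * pad - k_eff) // subsample + 1


print(dilate_filter([1, 2, 3], 2))
for mode in ("valid", "full", "half"):
    print(mode, conv_output_length(32, 3, mode))
```

With an odd filter size, 'half' keeps the output the same size as the input when subsample is 1.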

theano.gpuarray.dnn.
dnn_conv3d
(fgraph, img, kerns, border_mode='valid', subsample=(1, 1, 1), dilation=(1, 1, 1), conv_mode='conv', direction_hint=None, algo=None, precision=None, num_groups=1)[source]¶ GPU convolution using cuDNN from NVIDIA.
The memory layout to use is ‘bc012’, that is ‘batch’, ‘channel’, ‘first dim’, ‘second dim’, ‘third dim’ in that order.
Parameters:  fgraph (FunctionGraph) – The FunctionGraph containing img.
 img – Images to do the convolution over.
 kerns – Convolution filters.
 border_mode – One of ‘valid’, ‘full’, ‘half’; additionally, the padding size could be directly specified by an integer or a pair of integers.
 subsample – Perform subsampling of the output (default: (1, 1, 1)).
 dilation – Filter dilation factor. A dilation factor of d is equivalent to a convolution with d - 1 zeros inserted between neighboring filter values.
 conv_mode – Perform convolution (kernels flipped) or cross-correlation. One of ‘conv’, ‘cross’ (default: ‘conv’).
 direction_hint – Used by graph optimizers to change algorithm choice. By default, GpuDnnConv will be used to carry out the convolution. If border_mode is ‘valid’, subsample is (1, 1, 1), dilation is (1, 1, 1), and direction_hint is ‘bprop weights’, it will use GpuDnnConvGradW. If border_mode is ‘full’, subsample is (1, 1, 1), dilation is (1, 1, 1), and direction_hint is not ‘forward!’, it will use GpuDnnConvGradI. This parameter is used internally by graph optimizers and may be removed at any time without a deprecation period. You have been warned.
 algo – Convolution implementation to use. Only 'none' is implemented for conv3d. Default is the value of config.dnn__conv__algo_fwd.
 precision ({'as_input_f32', 'as_input', 'float16', 'float32', 'float64'}) – Description of the dtype in which the computation of the convolution should be done. Possible values are ‘as_input’, ‘float16’, ‘float32’ and ‘float64’. Default is the value of config.dnn__conv__precision.
 num_groups – Divides the image, kernel and output tensors into num_groups separate groups, each of which carries out its convolution separately.
Warning
The cuDNN library only works with GPUs that have a compute capability of 3.0 or higher. This means that older GPUs will not work with this Op.

theano.gpuarray.dnn.
dnn_gradinput
(kerns, topgrad, img_shp, border_mode='valid', subsample=(1, 1), dilation=(1, 1), conv_mode='conv', precision=None, algo=None, num_groups=1)[source]¶ TODO: document this

theano.gpuarray.dnn.
dnn_gradinput3d
(kerns, topgrad, img_shp, border_mode='valid', subsample=(1, 1, 1), dilation=(1, 1, 1), conv_mode='conv', precision=None, algo=None, num_groups=1)[source]¶ 3d version of dnn_gradinput.

theano.gpuarray.dnn.
dnn_gradweight
(img, topgrad, kerns_shp, border_mode='valid', subsample=(1, 1), dilation=(1, 1), conv_mode='conv', precision=None, algo=None, num_groups=1)[source]¶ TODO: document this

theano.gpuarray.dnn.
dnn_gradweight3d
(img, topgrad, kerns_shp, border_mode='valid', subsample=(1, 1, 1), dilation=(1, 1, 1), conv_mode='conv', precision=None, algo=None, num_groups=1)[source]¶ 3d version of dnn_gradweight

theano.gpuarray.dnn.
dnn_pool
(img, ws, stride=None, mode='max', pad=None)[source]¶ GPU pooling using cuDNN from NVIDIA.
The memory layout to use is ‘bc01’, that is ‘batch’, ‘channel’, ‘first dim’, ‘second dim’ in that order.
ws, stride and pad must have the same length.
Parameters:  img – Images to do the pooling over.
 ws (tuple) – Subsampling window size. Should have 2 or 3 elements.
 stride (tuple) – Subsampling stride (default: (1, 1) or (1, 1, 1)).
 mode ({'max', 'average_inc_pad', 'average_exc_pad', 'sum', 'max_deterministic'}) – NB: ‘max_deterministic’ is supported since cuDNN v6.
 pad (tuple) – (padX, padY) or (padX, padY, padZ) default: (0, 0) or (0, 0, 0)
Warning
The cuDNN library only works with GPUs that have a compute capability of 3.0 or higher. This means that older GPUs will not work with this Op.
Notes
This Op implements the ignore_border=True of max_pool_2d.
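The effect of ignore_border=True is that windows which would run past the edge of the image are dropped. The pure-Python sketch below shows the resulting output size and a tiny max pooling on nested lists; `pool_output_length` and `max_pool_2d` are hypothetical helpers, padding is left out of the pooling loop for brevity, and none of this is the cuDNN code path.

```python
def pool_output_length(size, ws, stride, pad=0):
    """Output length along one axis when partial windows are ignored."""
    return (size + 2 * pad - ws) // stride + 1


def max_pool_2d(img, ws, stride):
    """Max pooling over a 2d nested list, without padding."""
    rows = pool_output_length(len(img), ws[0], stride[0])
    cols = pool_output_length(len(img[0]), ws[1], stride[1])
    return [[max(img[r * stride[0] + i][c * stride[1] + j]
                 for i in range(ws[0]) for j in range(ws[1]))
             for c in range(cols)]
            for r in range(rows)]


# 5x5 image holding the values 0..24 in row-major order.
img = [[r * 5 + c for c in range(5)] for r in range(5)]
pooled = max_pool_2d(img, ws=(2, 2), stride=(2, 2))
print(pooled)
```

The fifth row and column never fall inside a complete 2x2 window, so they do not contribute to the output.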

theano.gpuarray.dnn.
dnn_spatialtf
(img, theta, scale_width=1, scale_height=1)[source]¶ GPU spatial transformer using cuDNN from NVIDIA.
Parameters:  img (tensor) – Images to which the transformations will be applied. The implementation assumes the tensor is in NCHW format, where N is the number of images, C is the number of color channels, H is the height of the inputs, and W is the width of the inputs.
 theta (tensor) – Affine transformation tensor containing one affine transformation matrix per image. theta is usually generated by the localization network.
 scale_height (float) – A float specifying the scaling factor for the height of the output image. A value of 1 will keep the original height of the input. Values larger than 1 will upsample the input. Values below 1 will downsample the input.
 scale_width (float) – A float specifying the scaling factor for the width of the output image. A value of 1 will keep the original width of the input. Values larger than 1 will upsample the input. Values below 1 will downsample the input.
Returns: out – Transformed images with width and height properly scaled.
Return type: tensor
Notes
Currently, cuDNN only supports 2D transformations with 2x3 affine transformation matrices.
Bilinear interpolation is the only grid sampler method available.
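To show what a 2x3 affine matrix per image means for the sampling grid, the NumPy sketch below generates grid coordinates on the CPU. It is an illustration under assumptions, not the cuDNN grid generator: `affine_grid` is a hypothetical helper, coordinates are assumed to be normalized to [-1, 1] with x along the width and y along the height, and the output size is simply passed in.

```python
import numpy as np


def affine_grid(theta, out_height, out_width):
    """Apply one 2x3 affine matrix per image to a regular target grid.

    theta has shape (N, 2, 3); the result has shape (N, H, W, 2) and holds
    the (x, y) source coordinate for every output pixel.
    """
    xs = np.linspace(-1.0, 1.0, out_width)
    ys = np.linspace(-1.0, 1.0, out_height)
    grid_x, grid_y = np.meshgrid(xs, ys)                  # each of shape (H, W)
    ones = np.ones_like(grid_x)
    coords = np.stack([grid_x, grid_y, ones], axis=-1)    # (H, W, 3), homogeneous
    return np.einsum("nij,hwj->nhwi", theta, coords)


identity = np.array([[1.0, 0.0, 0.0],
                     [0.0, 1.0, 0.0]])
shift_x = np.array([[1.0, 0.0, 0.5],     # translate sampling points along x
                    [0.0, 1.0, 0.0]])
theta = np.stack([identity, shift_x])    # N = 2 images

# Downsample a 4x6 input by a factor of 0.5 in both directions.
grid = affine_grid(theta, out_height=int(4 * 0.5), out_width=int(6 * 0.5))
print(grid.shape)
print(grid[0, 0, 0], grid[0, -1, -1])
```

A sampler would then read the input at these coordinates with bilinear interpolation.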

theano.gpuarray.dnn.
version
(raises=True)[source]¶ Return the version of cuDNN that Theano is currently linked with.
This also checks that the header version matches the runtime version.
Parameters: raises (bool) – If True, raise an exception if cuDNN is not present. Otherwise, return -1.
A RuntimeError is always raised if the header and library versions are not the same.