# `tensor.elemwise` – Tensor Elemwise

class `theano.tensor.elemwise.All`(axis=None)

Applies logical and to all the values of a tensor along the specified axis(es).

`make_node`(input)

Create an “apply” node for the inputs, in that order.

class `theano.tensor.elemwise.Any`(axis=None)

Applies bitwise or to all the values of a tensor along the specified axis(es).

`make_node`(input)

Create an “apply” node for the inputs, in that order.

class `theano.tensor.elemwise.CAReduce`(scalar_op, axis=None)

CAReduce = Commutative Associative Reduce. Reduces a scalar operation along the specified axis(es). The scalar op should be both commutative and associative.

The output will have the same shape as the input minus the reduced dimensions. It will contain the result of accumulating all values over the reduced dimensions using the specified scalar op.

Parameters
• scalar_op – A binary scalar op with only one output. It must be commutative and associative.

• axis – The dimension along which we want to reduce, or a list of dimensions that we want to reduce. If None, all dimensions are reduced.

Notes

```
CAReduce(add)      # sum (i.e., acts like the numpy sum operation)
CAReduce(mul)      # product
CAReduce(maximum)  # max
CAReduce(minimum)  # min
CAReduce(or_)      # any # not lazy
CAReduce(and_)     # all # not lazy
CAReduce(xor)      # parity: a result bit is 1 if an odd number of inputs
                   # had a 1 at that position, and 0 if an even number did
```

In order to (eventually) optimize memory usage patterns, CAReduce makes zero guarantees on the order in which it iterates over the dimensions and the elements of the array(s). Therefore, to ensure consistent results, the scalar operation represented by the reduction must be both commutative and associative (e.g. add, multiply, maximum, binary or/and/xor, but not subtract, divide or power).
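
The requirement can be illustrated without Theano. The following is a minimal sketch in plain Python: `reduce_all_orders` is a hypothetical helper (not part of Theano) that folds the same values in every possible iteration order and collects the distinct results. It only varies the order of the elements, not how they are grouped, so it shows why a non-commutative op is unsafe rather than proving that an op is safe.

```python
from functools import reduce
from itertools import permutations
import operator


def reduce_all_orders(op, values):
    """Left-fold `values` with `op` in every iteration order.

    Hypothetical helper for illustration only; returns the set of
    distinct results obtained over all orderings.
    """
    return {reduce(op, perm) for perm in permutations(values)}


values = [8, 3, 5]

# add and max give one result regardless of the iteration order.
add_results = reduce_all_orders(operator.add, values)
max_results = reduce_all_orders(max, values)

# subtract gives several different results depending on the order.
sub_results = reduce_all_orders(operator.sub, values)
```

With an op such as subtract, a reduction that is free to pick its own iteration order could return any of the values in `sub_results`.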

`c_code`(node, name, inames, onames, sub)

Required: return the C implementation of an Op.

Returns C code that does the computation associated to this Op, given names for the inputs and outputs.

Parameters
• node (Apply instance) – The node for which we are compiling the current c_code. The same Op may be used in more than one node.

• name (str) – A name that is automatically assigned and guaranteed to be unique.

• inames (list of strings) – There is a string for each input of the function, and the string is the name of a C variable pointing to that input. The type of the variable depends on the declared type of the input. There is a corresponding Python variable that can be accessed by prepending “py_” to the name in the list.

• onames (list of strings) – Each string is the name of a C variable where the Op should store its output. The type depends on the declared type of the output. There is a corresponding Python variable that can be accessed by prepending “py_” to the name in the list. In some cases the outputs will be preallocated and the value of the variable may be pre-filled. The value for an unallocated output is type-dependent.

• sub (dict of strings) – Extra symbols defined in CLinker sub symbols (such as ‘fail’).

Raises

MethodNotDefined – The subclass does not override this method.

`c_code_cache_version_apply`(node)

Return a tuple of integers indicating the version of this Op.

An empty tuple indicates an ‘unversioned’ Op that will not be cached between processes.

The cache mechanism may erase cached modules that have been superseded by newer versions. See ModuleCache for details.

Notes

This function overrides c_code_cache_version unless it explicitly calls c_code_cache_version. The default implementation simply calls c_code_cache_version and ignores the node argument.

`c_headers`()

Optional: Return a list of header files required by code returned by this class.

Examples

These strings will be prefixed with “#include ” and inserted at the beginning of the c source code.

Strings in this list that start with neither ‘<’ nor ‘"’ will be enclosed in double quotes.
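
The prefixing and quoting rule just described can be sketched in a few lines of plain Python. `include_lines` is a hypothetical helper written for this illustration; it is not Theano's implementation, and the header names are made up.

```python
def include_lines(headers):
    """Turn header names into "#include" lines.

    Hypothetical helper: names that start with neither '<' nor '"'
    are wrapped in double quotes, the others are used unchanged.
    """
    lines = []
    for header in headers:
        if not (header.startswith("<") or header.startswith('"')):
            header = '"%s"' % header
        lines.append("#include " + header)
    return lines


# Example header list, as a c_headers() method might return it (assumed names).
example = include_lines(["<math.h>", "my_header.h", '"already_quoted.h"'])
```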

Raises

MethodNotDefined – Subclass does not implement this method.

`make_node`(input)

Create an “apply” node for the inputs, in that order.

`perform`(node, inp, out)

Required: Calculate the function on the inputs and put the results in the output storage. Return None.

Parameters
• node (Apply instance) – Contains the symbolic inputs and outputs.

• inp (list) – The inputs: a sequence of inputs (immutable).

• out (list) – The output storage: a list of mutable 1-element lists (do not change the length of these lists).

Notes

The output_storage list might contain data. If an element of output_storage is not None, it has to be of the right type; for instance, for a TensorVariable, it has to be a NumPy ndarray with the right number of dimensions and the correct dtype. Its shape and stride pattern can be arbitrary. It is not guaranteed that it was produced by a previous call to impl; it could have been allocated by another Op. impl is free to reuse it as it sees fit, or to discard it and allocate new memory.

Raises

MethodNotDefined – The subclass does not override this method.
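
The output-storage convention described for `perform` can be sketched without Theano. In this minimal illustration, `perform_sum` is a hypothetical function that follows the same calling convention: it reads its inputs, writes the result into the single cell of each 1-element output list, and returns None. Plain Python lists stand in for ndarrays, and the `node` argument is unused.

```python
def perform_sum(node, inputs, output_storage):
    """Hypothetical perform-style function: sum the single input.

    `output_storage` is a list of mutable 1-element lists. The result is
    stored in cell 0 of the first one; the lists themselves are never
    resized. Whatever was in the cell before is simply overwritten here.
    """
    (x,) = inputs              # one input expected
    (out,) = output_storage    # one output expected
    out[0] = sum(x)
    # No return statement: the function returns None by convention.


storage = [[None]]             # one output, not yet allocated
result = perform_sum(None, [[1, 2, 3]], storage)
```

After the call, the caller reads the value from `storage[0][0]` rather than from the return value.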

class `theano.tensor.elemwise.CAReduceDtype`(scalar_op, axis=None, dtype=None, acc_dtype=None)

Reduces a scalar operation along the specified axis(es).

This subclass of CAReduce accepts an additional “dtype” parameter, that specifies which dtype the output should be.

It also accepts an optional “acc_dtype”, which specifies the dtype that will be used for the accumulation.

So, the accumulation will be done into a tensor of dtype “acc_dtype”, then it will be cast into “dtype” and returned.

If no dtype is provided, one will be inferred so as not to lose too much precision.

Parameters
• scalar_op – A binary scalar op with only one output. It must be commutative and associative.

• axis – The dimension along which we want to reduce, or a list of dimensions that we want to reduce. If None, all dimensions are reduced.

• dtype – The dtype of the returned tensor. If None, then we use the default dtype, which is the same as the input tensor’s dtype except when:

  - the input dtype is a signed integer of precision < 64 bit, in which case we use int64;

  - the input dtype is an unsigned integer of precision < 64 bit, in which case we use uint64.

  This default dtype does _not_ depend on the value of “acc_dtype”. This behavior is similar in spirit to that of numpy (except numpy uses the default machine integer while we always use 64 bit integers to avoid platform-dependent behavior).

• acc_dtype – The dtype of the internal accumulator. If None (default), we use the dtype in the list below, or the input dtype if its precision is higher:

  - for int dtypes, we use at least int64;

  - for uint dtypes, we use at least uint64;

  - for float dtypes, we use at least float64;

  - for complex dtypes, we use at least complex128.
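
The default rules for “dtype” and “acc_dtype” can be written out as a small lookup. This is a sketch under stated assumptions: `default_out_dtype` and `default_acc_dtype` are hypothetical helpers operating on dtype names as strings, they only cover the int, uint, float and complex families named above, and they are not Theano's own code.

```python
import re

# Minimum accumulator width per dtype family, taken from the list above.
_ACC_FLOOR = {"int": 64, "uint": 64, "float": 64, "complex": 128}


def _split(dtype):
    """Split a dtype name such as 'uint16' into ('uint', 16)."""
    match = re.fullmatch(r"(uint|int|float|complex)(\d+)", dtype)
    if match is None:
        raise ValueError("unsupported dtype name: %r" % (dtype,))
    return match.group(1), int(match.group(2))


def default_out_dtype(in_dtype):
    """Default output dtype: integers below 64 bit are widened to 64 bit."""
    kind, bits = _split(in_dtype)
    if kind in ("int", "uint") and bits < 64:
        return kind + "64"
    return in_dtype


def default_acc_dtype(in_dtype):
    """Default accumulator dtype: at least the family's floor width."""
    kind, bits = _split(in_dtype)
    return "%s%d" % (kind, max(bits, _ACC_FLOOR[kind]))
```

Note how the two defaults differ for floats: under these rules a float32 input is accumulated in float64 but returned as float32.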

`make_node`(input)

Create an “apply” node for the inputs, in that order.

class `theano.tensor.elemwise.DimShuffle`(input_broadcastable, new_order, inplace=True)

Allows reordering the dimensions of a tensor, or inserting or removing broadcastable dimensions.

In the following examples, ‘x’ means that we insert a broadcastable dimension and a numerical index represents the dimension of the same rank in the tensor passed to perform.

Parameters

• new_order – A list representing the relationship between the input’s dimensions and the output’s dimensions. Each element of the list can either be an index or ‘x’. Indices must be encoded as python integers, not theano symbolic integers.

• inplace (bool, optional) – If True (default), the output will be a view of the input.

Notes

If j = new_order[i] is an index, the output’s ith dimension will be the input’s jth dimension. If new_order[i] is ‘x’, the output’s ith dimension will be 1 and Broadcast operations will be allowed to do broadcasting over that dimension.

If input.broadcastable[i] == False then i must be found in new_order. Broadcastable dimensions, on the other hand, can be discarded.

```
DimShuffle((False, False, False), ['x', 2, 'x', 0, 1])
```

This op will only work on 3d tensors with no broadcastable dimensions. The first dimension will be broadcastable, then we will have the third dimension of the input tensor as the second of the resulting tensor, etc. If the tensor has shape (20, 30, 40), the resulting tensor will have dimensions (1, 40, 1, 20, 30). (AxBxC tensor is mapped to 1xCx1xAxB tensor)

```
DimShuffle((True, False), [1])
```

This op will only work on 2d tensors with the first dimension broadcastable. The second dimension of the input tensor will be the first dimension of the resulting tensor. If the tensor has shape (1, 20), the resulting tensor will have shape (20, ).

Examples

```
DimShuffle((), ['x'])  # make a 0d (scalar) into a 1d vector
DimShuffle((False, False), [0, 1])  # identity
DimShuffle((False, False), [1, 0])  # swaps the 1st and 2nd dimensions
DimShuffle((False,), ['x', 0])  # make a row out of a 1d vector (N to 1xN)
DimShuffle((False,), [0, 'x'])  # make a column out of a 1d vector (N to Nx1)
DimShuffle((False, False, False), [2, 0, 1])  # AxBxC to CxAxB
DimShuffle((False, False), [0, 'x', 1])  # AxB to Ax1xB
DimShuffle((False, False), [1, 'x', 0])  # AxB to Bx1xA
```

The reordering of the dimensions can be done with the numpy.transpose function. Adding or removing dimensions can be done with reshape.
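
The mapping from `new_order` to the output shape can be checked with a short shape-only sketch. `dimshuffle_shape` is a hypothetical helper (not a Theano function) that applies the rules stated in the notes above: an index copies the corresponding input dimension, ‘x’ inserts a dimension of size 1, and only broadcastable input dimensions may be dropped.

```python
def dimshuffle_shape(shape, input_broadcastable, new_order):
    """Return the output shape a DimShuffle with `new_order` would give.

    Hypothetical, shape-only illustration; no data is moved.
    """
    if len(shape) != len(input_broadcastable):
        raise ValueError("shape and broadcastable pattern differ in length")
    for i, bcast in enumerate(input_broadcastable):
        if bcast and shape[i] != 1:
            raise ValueError("broadcastable dimension %d must have size 1" % i)
        if not bcast and i not in new_order:
            raise ValueError("cannot drop non-broadcastable dimension %d" % i)
    # 'x' becomes a new dimension of size 1; an index j picks input dim j.
    return tuple(1 if j == "x" else shape[j] for j in new_order)


# The two worked cases from the notes above.
shape_a = dimshuffle_shape((20, 30, 40), (False, False, False), ["x", 2, "x", 0, 1])
shape_b = dimshuffle_shape((1, 20), (True, False), [1])
```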

`R_op`(inputs, eval_points)

This method is primarily used by tensor.Rop

Suppose the op outputs

[ f_1(inputs), …, f_n(inputs) ]

Parameters
• inputs (a Variable or list of Variables) –

• eval_points – A Variable or list of Variables with the same length as inputs. Each element of eval_points specifies the value of the corresponding input at the point where the R op is to be evaluated.

Returns

rval[i] should be Rop(f=f_i(inputs), wrt=inputs, eval_points=eval_points)

Return type

list of n elements

`make_node`(_input)

Create an “apply” node for the inputs, in that order.

`perform`(node, inp, out, params)

Required: Calculate the function on the inputs and put the results in the output storage. Return None.

Parameters
• node (Apply instance) – Contains the symbolic inputs and outputs.

• inp (list) – The inputs: a sequence of inputs (immutable).

• out (list) – The output storage: a list of mutable 1-element lists (do not change the length of these lists).

Notes

The output_storage list might contain data. If an element of output_storage is not None, it has to be of the right type; for instance, for a TensorVariable, it has to be a NumPy ndarray with the right number of dimensions and the correct dtype. Its shape and stride pattern can be arbitrary. It is not guaranteed that it was produced by a previous call to impl; it could have been allocated by another Op. impl is free to reuse it as it sees fit, or to discard it and allocate new memory.

Raises

MethodNotDefined – The subclass does not override this method.

class `theano.tensor.elemwise.Elemwise`(scalar_op, inplace_pattern=None, name=None, nfunc_spec=None, openmp=None)

Generalizes a scalar op to tensors.

All the inputs must have the same number of dimensions. When the Op is performed, for each dimension, each input’s size for that dimension must be the same. As a special case, it can also be 1 but only if the input’s broadcastable flag is True for that dimension. In that case, the tensor is (virtually) replicated along that dimension to match the size of the others.
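
The size rule in the paragraph above can be expressed as a shape check. The following is a sketch under stated assumptions: `elemwise_output_shape` is a hypothetical helper (not Theano's code) that takes runtime shapes plus the corresponding broadcastable patterns and either returns the output shape or raises when the rule is violated.

```python
def elemwise_output_shape(shapes, broadcastables):
    """Output shape of an elementwise op, following the rule above.

    Hypothetical illustration. `shapes` are the runtime shapes of the
    inputs; `broadcastables` are their broadcastable patterns (tuples
    of bools of the same length).
    """
    ndim = len(shapes[0])
    if any(len(s) != ndim for s in shapes):
        raise ValueError("all inputs must have the same number of dimensions")
    out = []
    for d in range(ndim):
        sizes = set()
        for shape, bcast in zip(shapes, broadcastables):
            if bcast[d]:
                # A broadcastable dimension must have size 1 and is
                # (virtually) replicated to match the other inputs.
                if shape[d] != 1:
                    raise ValueError("broadcastable dimension must have size 1")
            else:
                sizes.add(shape[d])
        if len(sizes) > 1:
            raise ValueError("size mismatch on dimension %d" % d)
        out.append(sizes.pop() if sizes else 1)
    return tuple(out)


shape_mul = elemwise_output_shape([(10, 5), (1, 5)],
                                  [(False, False), (True, False)])
shape_div = elemwise_output_shape([(1, 5), (10, 1)],
                                  [(True, False), (False, True)])
```

Note that a size of 1 alone is not enough in this sketch: if the first dimension of the `(1, 5)` input were not flagged broadcastable, it would be treated as a genuine mismatch with 10.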

The dtypes of the outputs mirror those of the scalar Op that is being generalized to tensors. In particular, if the calculations for an output are done inplace on an input, the output type must be the same as the corresponding input type (see the doc of scalar.ScalarOp for help on controlling the output type).

Parameters
• scalar_op – An instance of a subclass of scalar.ScalarOp which works only on scalars.

• inplace_pattern – A dictionary that maps the index of an output to the index of an input so the output is calculated inplace using the input’s storage. (Just like destroy_map, but without the lists.)

• nfunc_spec – Either None or a tuple of three elements, (nfunc_name, nin, nout) such that getattr(numpy, nfunc_name) implements this operation, takes nin inputs and nout outputs. Note that nin cannot always be inferred from the scalar op’s own nin field because that value is sometimes 0 (meaning a variable number of inputs), whereas the numpy function may not have varargs.
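
The remark that `inplace_pattern` is like a destroy map “without the lists” can be made concrete with a one-line conversion. `to_destroy_map` is a hypothetical helper written for this illustration; it assumes that a destroy map maps each output index to a list of input indices.

```python
def to_destroy_map(inplace_pattern):
    """Wrap each input index of an inplace pattern in a list.

    Hypothetical helper: {output_index: input_index} becomes
    {output_index: [input_index]}.
    """
    return {out: [inp] for out, inp in inplace_pattern.items()}


# {0: 1}: output 0 is computed inplace on input 1 (y += x in the notes below).
destroy_map = to_destroy_map({0: 1})
```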

Notes

• Elemwise(add) represents + on tensors (x + y).

• Elemwise(add, {0 : 0}) represents the += operation (x += y).

• Elemwise(add, {0 : 1}) represents += on the second argument (y += x).

• Elemwise(mul)(rand(10, 5), rand(1, 5)): the second input is completed along the first dimension to match the first input.

• Elemwise(true_div)(rand(10, 5), rand(10, 1)): same, but along the second dimension.

• Elemwise(int_div)(rand(1, 5), rand(10, 1)): the output has size (10, 5).

• Elemwise(log)(rand(3, 4, 5))

`R_op`(inputs, eval_points)

This method is primarily used by tensor.Rop

Suppose the op outputs

[ f_1(inputs), …, f_n(inputs) ]

Parameters
• inputs (a Variable or list of Variables) –

• eval_points – A Variable or list of Variables with the same length as inputs. Each element of eval_points specifies the value of the corresponding input at the point where the R op is to be evaluated.

Returns

rval[i] should be Rop(f=f_i(inputs), wrt=inputs, eval_points=eval_points)

Return type

list of n elements

`c_code`(node, nodename, inames, onames, sub)

Required: return the C implementation of an Op.

Returns C code that does the computation associated to this Op, given names for the inputs and outputs.

Parameters
• node (Apply instance) – The node for which we are compiling the current c_code. The same Op may be used in more than one node.

• nodename (str) – A name that is automatically assigned and guaranteed to be unique.

• inames (list of strings) – There is a string for each input of the function, and the string is the name of a C variable pointing to that input. The type of the variable depends on the declared type of the input. There is a corresponding Python variable that can be accessed by prepending “py_” to the name in the list.

• onames (list of strings) – Each string is the name of a C variable where the Op should store its output. The type depends on the declared type of the output. There is a corresponding Python variable that can be accessed by prepending “py_” to the name in the list. In some cases the outputs will be preallocated and the value of the variable may be pre-filled. The value for an unallocated output is type-dependent.

• sub (dict of strings) – Extra symbols defined in CLinker sub symbols (such as ‘fail’).

Raises

MethodNotDefined – The subclass does not override this method.

`c_code_cache_version_apply`(node)

Return a tuple of integers indicating the version of this Op.

An empty tuple indicates an ‘unversioned’ Op that will not be cached between processes.

The cache mechanism may erase cached modules that have been superseded by newer versions. See ModuleCache for details.

Notes

This function overrides c_code_cache_version unless it explicitly calls c_code_cache_version. The default implementation simply calls c_code_cache_version and ignores the node argument.

`c_headers`()

Return the header file name “omp.h” if OpenMP is supported.

`c_support_code`()

Optional: Return utility code (a string, or a list of strings) for use by a Variable or Op to be included at global scope prior to the rest of the code for this class.

QUESTION: How many times will this support code be emitted for a graph with many instances of the same type?

Raises

MethodNotDefined – Subclass does not implement this method.

`c_support_code_apply`(node, nodename)

Optional: return utility code for use by an Op that will be inserted at global scope, that can be specialized for the support of a particular Apply node.

Parameters
• node (an Apply instance in the graph being compiled) –

• nodename (str) – A string or number that serves to uniquely identify this node. Symbol names defined by this support code should include the name, so that they can be called from the c_code, and so that they do not cause name collisions.

Notes

This function is called in addition to c_support_code and will supplement whatever is returned from there.

Raises

MethodNotDefined – Subclass does not implement this method.

`get_output_info`(dim_shuffle, *inputs)

Return the outputs’ dtypes and broadcastable patterns, and the dimshuffled inputs.

`make_node`(*inputs)

If the inputs have different numbers of dimensions, their shapes are left-completed with 1s to the greatest number of dimensions using DimShuffle.
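
The left-completion can be sketched at the level of shapes. Both functions below are hypothetical helpers written for this illustration, not Theano's code: `left_pad_new_order` builds the `new_order` list that a DimShuffle would need to prepend the missing dimensions, and `left_pad_shapes` shows the resulting shapes.

```python
def left_pad_new_order(ndim_in, ndim_out):
    """new_order that prepends broadcastable dims up to `ndim_out`.

    Hypothetical helper: e.g. a 1d input padded to 3d gives ['x', 'x', 0].
    """
    if ndim_in > ndim_out:
        raise ValueError("cannot pad to fewer dimensions")
    return ["x"] * (ndim_out - ndim_in) + list(range(ndim_in))


def left_pad_shapes(shapes):
    """Left-complete every shape with 1s to the greatest rank."""
    ndim = max(len(s) for s in shapes)
    return [(1,) * (ndim - len(s)) + tuple(s) for s in shapes]


padded = left_pad_shapes([(3, 4, 5), (5,)])
order = left_pad_new_order(1, 3)
```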

`perform`(node, inputs, output_storage)

Required: Calculate the function on the inputs and put the results in the output storage. Return None.

Parameters
• node (Apply instance) – Contains the symbolic inputs and outputs.

• inputs (list) – Sequence of inputs (immutable).

• output_storage (list) – List of mutable 1-element lists (do not change the length of these lists)

Notes

The output_storage list might contain data. If an element of output_storage is not None, it has to be of the right type; for instance, for a TensorVariable, it has to be a NumPy ndarray with the right number of dimensions and the correct dtype. Its shape and stride pattern can be arbitrary. It is not guaranteed that it was produced by a previous call to impl; it could have been allocated by another Op. impl is free to reuse it as it sees fit, or to discard it and allocate new memory.

Raises

MethodNotDefined – The subclass does not override this method.

`prepare_node`(node, storage_map, compute_map, impl)

Make any special modifications that the Op needs before doing make_thunk().

This can modify the node inplace and should return nothing.

It can be called multiple times with different impl. It is the op’s responsibility not to re-prepare the node when it isn’t good to do so.

`python_constant_folding`(node)

Return True if we do not want to compile C code when doing constant folding of this node.

class `theano.tensor.elemwise.MulWithoutZeros`(output_types_preference=None, name=None)

`c_code`(node, name, inp, out, sub)

Required: return the C implementation of an Op.

Returns C code that does the computation associated to this Op, given names for the inputs and outputs.

Parameters
• node (Apply instance) – The node for which we are compiling the current c_code. The same Op may be used in more than one node.

• name (str) – A name that is automatically assigned and guaranteed to be unique.

• inp (list of strings) – There is a string for each input of the function, and the string is the name of a C variable pointing to that input. The type of the variable depends on the declared type of the input. There is a corresponding Python variable that can be accessed by prepending “py_” to the name in the list.

• out (list of strings) – Each string is the name of a C variable where the Op should store its output. The type depends on the declared type of the output. There is a corresponding Python variable that can be accessed by prepending “py_” to the name in the list. In some cases the outputs will be preallocated and the value of the variable may be pre-filled. The value for an unallocated output is type-dependent.

• sub (dict of strings) – Extra symbols defined in CLinker sub symbols (such as ‘fail’).

Raises

MethodNotDefined – The subclass does not override this method.

`c_code_cache_version`()

Return a tuple of integers indicating the version of this Op.

An empty tuple indicates an ‘unversioned’ Op that will not be cached between processes.

The cache mechanism may erase cached modules that have been superseded by newer versions. See ModuleCache for details.

See also `c_code_cache_version_apply()`

class `theano.tensor.elemwise.Prod`(axis=None, dtype=None, acc_dtype=None, no_zeros_in_input=False)

Multiplies all the values of a tensor along the specified axis(es).

Equivalent to CAReduce(scalar.prod, axis = axis), with the difference that this defines the gradient of prod wrt its tensor input.

`L_op`(inp, out, grads)

The grad of this Op could be very easy, if it were not for the case where zeros are present in a given “group” (i.e., the elements reduced together to form the product).

If no zeros are found in the elements of the product, then the partial derivative of the product relative to one of the elements (one of the inputs) is simply the product of the other elements. That’s easy to see from the chain rule.

Now the trick (with no zeros) is to take the overall product, then for every original element, the partial derivative is given by this product divided by the element itself (which equals the product of the other terms). This is easy to do by broadcasting the original product.

(Note that we also need to broadcast-multiply by the “incoming gradient”, ie. the gradient of the cost relative to the output/product).

With zeros, things get more complicated. For a given group, we have 3 cases:

• No zeros in the group. Use previous trick.

• If only one zero is present, then the gradient for that element is non-zero, but is zero for all others.

• If more than one zero is present, then all the derivatives are zero.

For the last two cases (with 1 or more zeros), we can’t use the division trick, as this gives divisions by 0.

Implementing that case-by-case logic is not trivial, so a bunch of hacks are piled up here to do it. Notably, for the “only one zero” case, there’s a special Op that computes the product of the elements in the group, minus the zero (see ProdWithoutZeros). The trick is then to use the division trick for groups with no zero, to use the ProdWithoutZeros op where there’s only one zero, and to output a derivative of zero for any element that is part of a group with more than one zero.

I do this by first counting the number of zeros in each group (see the “T.eq()” bits), then taking this or that behavior (see T.switch) based on the result of this count.
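
The three cases can be followed on a single group with a plain-Python sketch. This is an illustration of the case analysis only, not the symbolic implementation: `prod_grad` is a hypothetical function, it works on one flat list of numbers, and `out_grad` stands for the “incoming gradient” mentioned above.

```python
from functools import reduce
import operator


def prod(values):
    """Product of an iterable of numbers (1.0 for an empty iterable)."""
    return reduce(operator.mul, values, 1.0)


def prod_grad(xs, out_grad=1.0):
    """Gradient of prod(xs) w.r.t. each element, for one group.

    Hypothetical illustration of the three cases described above.
    """
    n_zeros = sum(1 for x in xs if x == 0)
    if n_zeros == 0:
        # Division trick: overall product divided by each element.
        total = prod(xs)
        return [out_grad * total / x for x in xs]
    if n_zeros == 1:
        # Only the zero element has a non-zero derivative: the product
        # of all the other (non-zero) elements.
        without_zero = prod(x for x in xs if x != 0)
        return [out_grad * without_zero if x == 0 else 0.0 for x in xs]
    # Two or more zeros: every derivative is zero.
    return [0.0 for _ in xs]


grad_no_zero = prod_grad([2.0, 3.0, 4.0])
grad_one_zero = prod_grad([2.0, 0.0, 4.0])
grad_two_zeros = prod_grad([0.0, 0.0, 4.0])
```

The symbolic version has to select between these behaviors per group at run time, which is where the zero count and the switch mentioned above come in.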

`c_code_cache_version`()

Return a tuple of integers indicating the version of this Op.

An empty tuple indicates an ‘unversioned’ Op that will not be cached between processes.

The cache mechanism may erase cached modules that have been superseded by newer versions. See ModuleCache for details.

See also `c_code_cache_version_apply()`

class `theano.tensor.elemwise.ProdWithoutZeros`(axis=None, dtype=None, acc_dtype=None)

class `theano.tensor.elemwise.Sum`(axis=None, dtype=None, acc_dtype=None)

Sums all the values of a tensor along the specified axis(es).

Equivalent to CAReduceDtype(scalar.add, axis=axis, dtype=dtype), with the difference that this defines the gradient of sum wrt its tensor input.

Parameters
• axis – Axis(es) along which the tensor should be summed (use None to sum over all axes, and a list or tuple to sum along more than one axis).

• dtype – The dtype of the internal accumulator and returned tensor. If None, then we use the default dtype, which is the same as the input tensor’s dtype except when:

  - the input dtype is a signed integer of precision < 64 bit, in which case we use int64;

  - the input dtype is an unsigned integer of precision < 64 bit, in which case we use uint64.

  This value does not depend on the value of “acc_dtype”.

• acc_dtype – The dtype of the internal accumulator. If None (default), we use the dtype in the list below, or the input dtype if its precision is higher:

  - for int dtypes, we use at least int64;

  - for uint dtypes, we use at least uint64;

  - for float dtypes, we use at least float64;

  - for complex dtypes, we use at least complex128.

`R_op`(inputs, eval_points)

This method is primarily used by tensor.Rop

Suppose the op outputs

[ f_1(inputs), …, f_n(inputs) ]

Parameters
• inputs (a Variable or list of Variables) –

• eval_points – A Variable or list of Variables with the same length as inputs. Each element of eval_points specifies the value of the corresponding input at the point where the R op is to be evaluated.

Returns

rval[i] should be Rop(f=f_i(inputs), wrt=inputs, eval_points=eval_points)

Return type

list of n elements