blocksparse – Block sparse dot operations (gemv and outer)

class theano.tensor.nnet.blocksparse.SparseBlockGemv(inplace=False)[source]

This op computes the dot product of specified pieces of vectors and matrices, returning pieces of vectors:

for b in range(batch_size):
    for j in range(o.shape[1]):
        for i in range(h.shape[1]):
            o[b, j, :] += numpy.dot(h[b, i], W[iIdx[b, i], oIdx[b, j]])

where h, W and o are defined in the docstring of make_node, iIdx and oIdx are its inputIdx and outputIdx arguments, and b indexes the examples in the minibatch.
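
The loop above can be run directly with NumPy. The following is a minimal sketch, not Theano's implementation: the function name sparse_block_gemv_ref and all sizes are hypothetical, chosen only for illustration. It also shows a vectorized formulation that should give the same result.

```python
import numpy as np

def sparse_block_gemv_ref(o, W, h, iIdx, oIdx):
    """Loop-for-loop NumPy transcription of the pseudocode (hypothetical helper)."""
    o = o.copy()  # leave the caller's array untouched
    batch_size = o.shape[0]
    for b in range(batch_size):
        for j in range(o.shape[1]):
            for i in range(h.shape[1]):
                o[b, j, :] += np.dot(h[b, i], W[iIdx[b, i], oIdx[b, j]])
    return o

# Hypothetical sizes, chosen only for illustration.
batch, iBlocks, oBlocks, iSize, oSize, iWin, oWin = 2, 4, 5, 3, 2, 2, 3
rng = np.random.default_rng(0)
W = rng.standard_normal((iBlocks, oBlocks, iSize, oSize))
h = rng.standard_normal((batch, iWin, iSize))
iIdx = rng.integers(0, iBlocks, size=(batch, iWin))
oIdx = rng.integers(0, oBlocks, size=(batch, oWin))
o = np.zeros((batch, oWin, oSize))

out = sparse_block_gemv_ref(o, W, h, iIdx, oIdx)

# Vectorized equivalent: gather the selected weight blocks, then sum over
# the input window (i) and the input block size (k).
W_sel = W[iIdx[:, :, None], oIdx[:, None, :]]  # (batch, iWin, oWin, iSize, oSize)
out_vec = o + np.einsum("bik,bijkl->bjl", h, W_sel)
```

Only the iWin x oWin blocks named by the index arrays are touched for each example, which is where the saving over a dense product comes from.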

[Figure: blocksparse.png]

grad(inputs, grads)[source]

Construct a graph for the gradient with respect to each input variable.

Each returned Variable represents the gradient with respect to that input computed based on the symbolic gradients with respect to each output. If the output is not differentiable with respect to an input, then this method should return an instance of type NullType for that input.

Parameters
  • inputs (list of Variable) – The input variables.

  • output_grads (list of Variable) – The gradients of the output variables; passed as grads in the signature above.

Returns

grads – The gradients with respect to each Variable in inputs.

Return type

list of Variable

make_node(o, W, h, inputIdx, outputIdx)[source]

Compute the dot product of the specified pieces of vectors and matrices.

The types given for the parameters below are their expected shapes, relative to each other.

Parameters
  • o (batch, oWin, oSize) – output vector

  • W (iBlocks, oBlocks, iSize, oSize) – weight matrix

  • h (batch, iWin, iSize) – input from lower layer (sparse)

  • inputIdx (batch, iWin) – indexes of the input blocks

  • outputIdx (batch, oWin) – indexes of the output blocks

Returns

dot(W[i, j], h[i]) + o[j]

Return type

(batch, oWin, oSize)

Notes

  • batch is the number of examples in a minibatch (batch size).

  • iBlocks is the total number of blocks in the input (from lower layer).

  • iSize is the size of each of these input blocks.

  • iWin is the number of blocks that will be used as inputs. Which blocks will be used is specified in inputIdx.

  • oBlocks is the number of possible output blocks.

  • oSize is the size of each of these output blocks.

  • oWin is the number of output blocks that will actually be computed. Which blocks will be computed is specified in outputIdx.
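
To make the shape relationships above concrete, here is a small sketch of a consistency check written with NumPy. The helper gemv_dims is hypothetical and not part of Theano; it only reads the named sizes off the arrays and verifies that they agree with each other.

```python
import numpy as np

def gemv_dims(o, W, h, inputIdx, outputIdx):
    """Hypothetical helper: derive the named sizes and check they agree."""
    batch, oWin, oSize = o.shape
    iBlocks, oBlocks, iSize, _ = W.shape
    iWin = inputIdx.shape[1]
    expected = {
        "W": (W.shape, (iBlocks, oBlocks, iSize, oSize)),
        "h": (h.shape, (batch, iWin, iSize)),
        "inputIdx": (inputIdx.shape, (batch, iWin)),
        "outputIdx": (outputIdx.shape, (batch, oWin)),
    }
    for name, (got, want) in expected.items():
        if got != want:
            raise ValueError(f"{name}: shape {got}, expected {want}")
    # Index values must name existing blocks of W.
    if inputIdx.size and not (0 <= inputIdx.min() and inputIdx.max() < iBlocks):
        raise ValueError("inputIdx refers to a block outside [0, iBlocks)")
    if outputIdx.size and not (0 <= outputIdx.min() and outputIdx.max() < oBlocks):
        raise ValueError("outputIdx refers to a block outside [0, oBlocks)")
    return dict(batch=batch, iBlocks=iBlocks, iSize=iSize, iWin=iWin,
                oBlocks=oBlocks, oSize=oSize, oWin=oWin)

# Illustrative sizes only: 8 examples, 4 of 10 input blocks, 3 of 20 output blocks.
dims = gemv_dims(
    o=np.zeros((8, 3, 16)),
    W=np.zeros((10, 20, 32, 16)),
    h=np.zeros((8, 4, 32)),
    inputIdx=np.zeros((8, 4), dtype=int),
    outputIdx=np.zeros((8, 3), dtype=int),
)
```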

perform(node, inp, out_)[source]

Required: Calculate the function on the inputs and put the variables in the output storage. Return None.

Parameters
  • node (Apply) – The symbolic Apply node that represents this computation.

  • inputs (Sequence) – Immutable sequence of non-symbolic/numeric inputs, passed as inp in the signature above. These are the values of each Variable in node.inputs.

  • output_storage (list of list) – List of mutable single-element lists, passed as out_ in the signature above (do not change the length of these lists). Each sub-list corresponds to the value of each Variable in node.outputs. The primary purpose of this method is to set the values of these sub-lists.

  • params (tuple) – A tuple containing the values of each entry in __props__.

Notes

The output_storage list might contain data. If an element of output_storage is not None, it has to be of the right type, for instance, for a TensorVariable, it has to be a NumPy ndarray with the right number of dimensions and the correct dtype. Its shape and stride pattern can be arbitrary. It is not guaranteed that such pre-set values were produced by a previous call to this PureOp.perform; they could’ve been allocated by another PureOp’s perform method. A PureOp is free to reuse output_storage as it sees fit, or to discard it and allocate new memory.

Raises

MethodNotDefined – The subclass does not override this method.
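
The output_storage convention described above can be illustrated without Theano. The sketch below is a hypothetical stand-in (perform_sketch is not the op's real method, and node is unused, so None is passed); it shows only the calling convention: read the numeric inputs, write the result into the single cell of the first output, and return None.

```python
import numpy as np

def perform_sketch(node, inp, out_):
    """Hypothetical stand-in that follows the perform calling convention."""
    o, W, h, iIdx, oIdx = inp
    result = o.copy()  # the inputs are treated as read-only in this sketch
    for b in range(result.shape[0]):
        for j in range(result.shape[1]):
            for i in range(h.shape[1]):
                result[b, j, :] += np.dot(h[b, i], W[iIdx[b, i], oIdx[b, j]])
    # Set the one cell belonging to output 0; the list lengths stay unchanged.
    out_[0][0] = result

# Tiny illustrative inputs: one example, one input block, one output block.
o = np.zeros((1, 1, 2))
W = np.ones((1, 1, 3, 2))
h = np.array([[[1.0, 2.0, 3.0]]])
iIdx = np.zeros((1, 1), dtype=int)
oIdx = np.zeros((1, 1), dtype=int)

# One single-element list per output; the cell may start as None or hold old data.
output_storage = [[None]]
returned = perform_sketch(None, (o, W, h, iIdx, oIdx), output_storage)
```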

class theano.tensor.nnet.blocksparse.SparseBlockOuter(inplace=False)[source]

This op computes the outer products of two sets of pieces of vectors, updating a full matrix with the results:

for b in range(batch_size):
    for i in range(x.shape[1]):
        for j in range(y.shape[1]):
            o[xIdx[b, i], yIdx[b, j]] += (alpha * outer(x[b, i], y[b, j]))

This op is involved in the gradient of SparseBlockGemv.
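
One way to see the connection to the gradient is the NumPy sketch below. It is an illustration under stated assumptions, not Theano's code: both reference functions are hypothetical, the default alpha=1.0 is assumed here for convenience, and g stands in for the gradient with respect to the output of SparseBlockGemv. Scattering outer(h[b, i], g[b, j]) into the selected blocks gives a candidate gradient with respect to W, which is then checked numerically.

```python
import numpy as np

def sparse_block_gemv_ref(o, W, h, iIdx, oIdx):
    """Hypothetical NumPy reference for the SparseBlockGemv loop."""
    o = o.copy()
    for b in range(o.shape[0]):
        for j in range(o.shape[1]):
            for i in range(h.shape[1]):
                o[b, j, :] += np.dot(h[b, i], W[iIdx[b, i], oIdx[b, j]])
    return o

def sparse_block_outer_ref(o, x, y, xIdx, yIdx, alpha=1.0):
    """Hypothetical NumPy reference for the SparseBlockOuter loop.

    alpha=1.0 is an assumption made for this sketch.
    """
    o = o.copy()
    for b in range(x.shape[0]):
        for i in range(x.shape[1]):
            for j in range(y.shape[1]):
                o[xIdx[b, i], yIdx[b, j]] += alpha * np.outer(x[b, i], y[b, j])
    return o

# Illustrative sizes only.
batch, iBlocks, oBlocks, iSize, oSize, iWin, oWin = 2, 3, 4, 3, 2, 2, 3
rng = np.random.default_rng(1)
W = rng.standard_normal((iBlocks, oBlocks, iSize, oSize))
h = rng.standard_normal((batch, iWin, iSize))
iIdx = rng.integers(0, iBlocks, size=(batch, iWin))
oIdx = rng.integers(0, oBlocks, size=(batch, oWin))
g = rng.standard_normal((batch, oWin, oSize))  # stand-in output gradient

# Candidate gradient with respect to W, accumulated block by block.
grad_W = sparse_block_outer_ref(np.zeros_like(W), h, g, iIdx, oIdx)

def loss(W_):
    # Scalar loss whose gradient with respect to the gemv output is g.
    return np.sum(g * sparse_block_gemv_ref(np.zeros_like(g), W_, h, iIdx, oIdx))

# The loss is linear in W, so a step D changes it by sum(grad_W * D).
D = rng.standard_normal(W.shape)
lhs = loss(W + D) - loss(W)
rhs = np.sum(grad_W * D)
```

Because the accumulation uses +=, blocks that are selected more than once (repeated entries in the index arrays) receive the sum of all their contributions.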

make_node(o, x, y, xIdx, yIdx, alpha=None)[source]

Compute the outer product of the specified pieces of vectors and add the results to the specified blocks of a matrix.

The types given for the parameters below are their expected shapes, relative to each other.

Parameters
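  • o (xBlocks, yBlocks, xSize, ySize) – full matrix updated with the results

  • x (batch, xWin, xSize) – first set of pieces of vectors

  • y (batch, yWin, ySize) – second set of pieces of vectors

  • xIdx (batch, xWin) – indexes of the x blocks

  • yIdx (batch, yWin) – indexes of the y blocks

  • alpha – scalar that scales each outer product before it is added to o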

Returns

outer(x[i], y[j]) + o[i, j]

Return type

(xBlocks, yBlocks, xSize, ySize)

Notes

  • batch is the number of examples in a minibatch (batch size).

  • xBlocks is the total number of blocks in x.

  • xSize is the size of each of these x blocks.

  • xWin is the number of blocks that will be used as x. Which blocks will be used is specified in xIdx.

  • yBlocks is the number of possible y blocks.

  • ySize is the size of each of these y blocks.

  • yWin is the number of y blocks that will actually be computed. Which blocks will be computed is specified in yIdx.

perform(node, inp, out_)[source]

Required: Calculate the function on the inputs and put the variables in the output storage. Return None.

Parameters
  • node (Apply) – The symbolic Apply node that represents this computation.

  • inputs (Sequence) – Immutable sequence of non-symbolic/numeric inputs, passed as inp in the signature above. These are the values of each Variable in node.inputs.

  • output_storage (list of list) – List of mutable single-element lists, passed as out_ in the signature above (do not change the length of these lists). Each sub-list corresponds to the value of each Variable in node.outputs. The primary purpose of this method is to set the values of these sub-lists.

  • params (tuple) – A tuple containing the values of each entry in __props__.

Notes

The output_storage list might contain data. If an element of output_storage is not None, it has to be of the right type, for instance, for a TensorVariable, it has to be a NumPy ndarray with the right number of dimensions and the correct dtype. Its shape and stride pattern can be arbitrary. It is not guaranteed that such pre-set values were produced by a previous call to this PureOp.perform; they could’ve been allocated by another PureOp’s perform method. A PureOp is free to reuse output_storage as it sees fit, or to discard it and allocate new memory.

Raises

MethodNotDefined – The subclass does not override this method.

theano.tensor.nnet.blocksparse.sparse_block_dot(W, h, inputIdx, b, outputIdx)[source]

Compute the dot product (plus bias) of the specified pieces of vectors and matrices. See SparseBlockGemv to get more information.

The types given for the parameters below are their expected shapes, relative to each other.

Parameters
  • W (iBlocks, oBlocks, iSize, oSize) – weight matrix

  • h (batch, iWin, iSize) – input from lower layer (sparse)

  • inputIdx (batch, iWin) – indexes of the input blocks

  • b (oBlocks, oSize) – bias vector

  • outputIdx (batch, oWin) – indexes of the output blocks

Returns

dot(W[i, j], h[i]) summed over the selected input blocks i, plus b[j], where b[j] is added only once per output block

Return type

(batch, oWin, oSize)

Notes

  • batch is the number of examples in a minibatch (batch size).

  • iBlocks is the total number of blocks in the input (from lower layer).

  • iSize is the size of each of these input blocks.

  • iWin is the number of blocks that will be used as inputs. Which blocks will be used is specified in inputIdx.

  • oBlocks is the number of possible output blocks.

  • oSize is the size of each of these output blocks.

  • oWin is the number of output blocks that will actually be computed. Which blocks will be computed is specified in outputIdx.
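
A NumPy sketch of what sparse_block_dot computes is given below. It is one possible reading of the description above, not the library's implementation; the function name sparse_block_dot_ref and all sizes are hypothetical. The output starts from the bias of each selected output block, so b[j] enters once regardless of how many input blocks are accumulated.

```python
import numpy as np

def sparse_block_dot_ref(W, h, inputIdx, b, outputIdx):
    """Hypothetical NumPy reference for sparse_block_dot."""
    # Bias of every selected output block: shape (batch, oWin, oSize).
    o = b[outputIdx].copy()
    for n in range(h.shape[0]):                  # examples in the minibatch
        for j in range(outputIdx.shape[1]):      # selected output blocks
            for i in range(inputIdx.shape[1]):   # selected input blocks
                o[n, j] += h[n, i] @ W[inputIdx[n, i], outputIdx[n, j]]
    return o

# Illustrative sizes only.
batch, iBlocks, oBlocks, iSize, oSize, iWin, oWin = 2, 4, 5, 3, 2, 2, 3
rng = np.random.default_rng(2)
W = rng.standard_normal((iBlocks, oBlocks, iSize, oSize))
h = rng.standard_normal((batch, iWin, iSize))
b = rng.standard_normal((oBlocks, oSize))
inputIdx = rng.integers(0, iBlocks, size=(batch, iWin))
outputIdx = rng.integers(0, oBlocks, size=(batch, oWin))

out = sparse_block_dot_ref(W, h, inputIdx, b, outputIdx)

# With a zero input only the bias remains, once per selected output block.
bias_only = sparse_block_dot_ref(W, np.zeros_like(h), inputIdx, b, outputIdx)
```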