cl-waffe

Neural Networks

model-list

Holds submodules in a list.

The models it contains are properly tracked by find-variables.

Note: This Layer is exported from Package cl-waffe.

Parameters

(model-list list)

list (list)
a list of models

This model can also be created by mlist

(mlist models) ; -> [Model: MODEL-LIST]

Forward

(call (Model-List) index &rest args)

Note that index must be a waffetensor.

To avoid this, mth is available.

(call (mth 0 (Model-List)) &rest args)
index (waffetensor whose data is a fixnum)
the index of the model to call
args (list)
arguments for the index-th model

Example

(setq models (Model-List (list (linearlayer 10 1)(linearlayer 10 1))))
(call models (const 0)(!randn `(10 10)))
(call (mth 0 models)(!randn `(10 10)))

Linearlayer

Applies a linear transformation to the incoming data: (setq y (!add (!matmul x weight) bias))

Parameters

(LinearLayer in-features out-features &optional (bias T))
in-features (fixnum)
size of each input sample
out-features (fixnum)
size of each output sample
bias (boolean)
If set to nil, the layer will not learn an additive bias. Default: t

Shape

LinearLayer: (batch-size in-features) -> (batch-size out-features)

Input
x (Tensor) where x has the shape (batch-size in-features)
Output
Output: a tensor to which the LinearLayer has been applied, with the shape (batch-size out-features)

Forward

(call (LinearLayer 10 1) x)
x
the input tensor

Example

(call (LinearLayer 10 1)(!randn `(10 10)))
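
For instance, the output shape follows the rule above. A minimal sketch of checking it, assuming !shape is available:

(setq layer (LinearLayer 10 3))
(setq out (call layer (!randn `(32 10)))) ; x has the shape (batch-size=32 in-features=10)
(!shape out) ; => (32 3), i.e. (batch-size out-features)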

DenseLayer

Applies a LinearLayer followed by the activation specified by activation.

Parameters

(DenseLayer in-features out-features &optional (bias t)(activation :relu))
in-features (fixnum)
size of each input sample
out-features (fixnum)
size of each output sample
bias (boolean)
If set to nil, the layer will not learn an additive bias. Default: t
activation (keyword or function)
One of :relu, :sigmoid or :tanh. If a function is given, it is called as the activation.

Shape

DenseLayer: (batch-size in-features) -> (batch-size out-features)
Input
x (Tensor) where x has the shape (batch-size in-features)
Output
Output: a tensor to which the DenseLayer has been applied, with the shape (batch-size out-features)

Forward

(call (DenseLayer 10 1) x)
x
the input tensor

Example

(call (DenseLayer 10 1)(!randn `(10 10)))
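
The activation can also be changed from the default :relu. A sketch using the documented keywords, plus (as an assumption about the calling convention) a function passed as the activation:

(call (DenseLayer 10 1 t :sigmoid) (!randn `(10 10))) ; :sigmoid instead of the default :relu
;; passing a function as the activation (assumed to be called on the LinearLayer's output)
(call (DenseLayer 10 1 t #'!tanh) (!randn `(10 10)))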

Dropout

When *no-grad* is nil, dropout randomly zeroes some elements of the given tensor by sampling a Bernoulli distribution with probability dropout-rate.

Furthermore, the outputs are scaled by (/ (- 1 (self dropout-rate))) (i.e. this is an inverted dropout). Because the scaling is done at training time, when *no-grad* is t (i.e. during prediction) dropout simply returns the given tensor.

Parameters

(dropout &optional (dropout-rate 0.5))
dropout-rate
Dropout samples a Bernoulli distribution based on dropout-rate.

Shape

Dropout: (Any) -> (the same as the input)
Input
Any shape is OK
Output
The same shape as the given input.

Forward

(setq x (!randn `(10 10)))
;#Const(((-0.59... -0.09... ~ 0.289... 0.390...)        
;                 ...
;        (1.447... 1.032... ~ -0.66... -0.55...)) :mgl t :shape (10 10))
(call (Dropout 0.5) x)
;#Const(((0.0 -0.19... ~ 0.0 0.0)        
;                 ...
;        (2.895... 2.064... ~ 0.0 -1.10...)) :mgl t :shape (10 10))
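
As noted above, dropout is the identity when *no-grad* is t. A minimal sketch of prediction-time behaviour, assuming the with-no-grad macro:

(setq layer (Dropout 0.5))
(with-no-grad
  (call layer x)) ; x (defined above) is returned as-is: nothing is zeroed and no scaling is applied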

BatchNorm2d

Applies BatchNorm2D.

Parameters

(BatchNorm2D in-features &key (affine t)(epsilon 1.0e-7))
in-features
the expected size of the input features
affine
If t, the model has trainable affine layers.
epsilon
the value added to the denominator for numerical stability. Default: 1.0e-7

Shape

(call (BatchNorm2D in-features) x)
BatchNorm2D: (any in-features) -> (the same shape as the input)

Example

(setq model (BatchNorm2D 10))
(call model (!randn `(30 10)))

LayerNorm

Embedding

A simple lookup table object to store embedding vectors for NLP Models.

Parameter

(Embedding vocab-size embedding-size &key pad-idx)
vocab-size
(fixnum) size of the dictionary of embeddings
embedding-size
(fixnum) the size of each embedding tensor
pad-idx
If specified, the entries at pad-idx do not contribute to the gradient. If nil, it is ignored.

Shape

(call (Embedding 10 10) x)

Embedding: (batch-size sentence-length) -> (batch-size sentence-len embedding-dim)

x
the input x, where each element is a single-float (e.g. 1.0, 2.0, ...)

Example

(setq model (cl-waffe.nn:Embedding 10 20))

(call model (!ones `(1 10)))
#Const((((-0.01... -0.01... ~ 0.013... 0.002...)         
                   ...
         (-0.01... -0.01... ~ 0.013... 0.002...))) :mgl t :shape (1 10 20))
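
A sketch using the pad-idx keyword from the signature above, assuming index 0 is the padding token:

(setq model (cl-waffe.nn:Embedding 10 20 :pad-idx 0))
(call model (!ones `(1 10))) ; lookups at index 0 (the padding entry) do not contribute to the gradient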

RNN

Applies a multi-layer RNN with tanh or ReLU.

Note: this implementation relies on the backward of (setf !aref).

Parameters

(RNN input-size hidden-size &key (num-layers 1)(activation :tanh)(bias t)(dropout nil)(biredical nil))
input-size
The number of expected features in the input x
hidden-size
The number of features in hidden-layer
num-layers
Number of recurrent layers
activation
Can be either :tanh or :relu
bias
(boolean) If t, the model has a trainable bias.
dropout
(boolean) If t, the model has a dropout layer.
biredical
(boolean) If t, the RNN becomes bidirectional.

Shape

(call (RNN 10 10) x &optional (hs nil))

RNN : (batch-size sentence-length input-size) -> (batch-size sentence-length hidden-size)

x
the input x, whose shape is (batch-size sentence-length input-size)
hs
The previous hidden state. If nil, the model creates a new one.

Example

(setq model (RNN 10 20))
(setq embedding (Embedding 10 10))
(call model
  (call embedding (!ones `(10 10))))

;#Const((((-1.46... -1.46... ~ -5.53... 1.766...)         
;                   ...
;         (-1.46... -1.46... ~ -5.53... 1.766...))        
;                 ...
;        ((-1.46... -1.46... ~ -5.53... 1.766...)         
;                   ...
;         (-1.46... -1.46... ~ -5.53... 1.766...))) :mgl t :shape (10 10 20))
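
The hidden state can also be supplied explicitly through hs. A sketch under the assumption that hs is a (batch-size hidden-size) tensor; check the implementation for the exact layout:

(setq model (RNN 10 20))
(setq x (call (Embedding 10 10) (!ones `(10 10))))
;; hypothetical initial hidden state of shape (batch-size hidden-size); this shape is an assumption
(setq hs (!zeros `(10 20)))
(call model x hs)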
  

LSTM

GRU

MaxPooling

AvgPooling

Conv1D

Conv2D

Transformer

TransformerEncoderLayer

TransformerDecoderLayer

CrossEntropy

(cross-entropy x y &optional (delta 1.0e-7) (epsilon 0.0))

This criterion computes the cross entropy loss between x and y.

If epsilon is greater than 0.0, smooth-labeling is enabled.

If avoid-overflow is t, x's average is subtracted from x in order to avoid overflow.

delta is the value used for (!log (!add x delta)).

x is a probability distribution.

y is a probability distribution or labels.

If y is given as labels, it is converted to a probability distribution.
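
A minimal sketch of calling it, assuming cross-entropy is exported from cl-waffe.nn and that !softmax is available for building a probability distribution:

(setq x (!softmax (!randn `(10 5)))) ; predicted distribution, shape (batch-size num-classes)
(setq y (!softmax (!randn `(10 5)))) ; target distribution of the same shape
(cl-waffe.nn:cross-entropy x y)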

SoftMaxCrossEntropy

(softmax-cross-entropy x y &key (avoid-overflow t) (delta 1.0e-7) (epsilon 0.0))

This criterion computes the softmax cross entropy loss between x and y.

If epsilon is greater than 0.0, smooth-labeling is enabled.

If avoid-overflow is t, x's average is subtracted from x in order to avoid overflow.

delta is the value used for (!log (!add x delta)).

x is a probability distribution.

y is a probability distribution or labels.

If y is given as labels, it is converted to a probability distribution.
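
A minimal sketch, again assuming the symbol is exported from cl-waffe.nn:

(setq x (!randn `(10 5)))            ; model output, shape (batch-size num-classes)
(setq y (!softmax (!randn `(10 5)))) ; target probability distribution
(cl-waffe.nn:softmax-cross-entropy x y)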

MSE

(mse p y)

Computes MSE Loss.

mse is defined as (!mean (!pow (!sub p y) 2) 1)
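
A minimal sketch, assuming mse is exported from cl-waffe.nn:

(setq p (!randn `(10 3))) ; predictions
(setq y (!randn `(10 3))) ; targets
(cl-waffe.nn:mse p y)     ; equivalent to (!mean (!pow (!sub p y) 2) 1)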

L1Norm

L2Norm

BinaryCrossEntropy

KLdivLoss

CosineSimilarity