Neural Networks
model-list
Holds submodules in a list.
The models it contains are properly tracked by find-variables.
Note: This layer is exported from the package cl-waffe.
Parameters
(model-list list)
- list (list)
- a list of models
This model can also be created by mlist
(mlist models) ; -> [Model: MODEL-LIST]
Forward
(call (Model-List) index &rest args)
Note that index must be a waffetensor. To avoid wrapping the index in a tensor, mth is available.
(call (mth 0 (Model-List)) &rest args)
- index (waffetensor whose data is a fixnum)
- the index of the model to call
- args (list)
- arguments for the index-th model
Example
(setq models (Model-List (list (linearlayer 10 1)(linearlayer 10 1))))
(call models (const 0)(!randn `(10 10)))
(call (mth 0 models)(!randn `(10 10)))
LinearLayer
Applies a linear transformation to the incoming data: (setq y (!add (!matmul x weight) bias))
Parameters
(LinearLayer in-features out-features &optional (bias T))
- in-features (fixnum)
- size of each input sample
- out-features (fixnum)
- size of each output sample
- bias (boolean)
- If set to nil, the layer will not learn an additive bias. Default: t
Shape
LinearLayer: (batch-size in-features) -> (batch-size out-features)
- Input
- x (Tensor) of shape (batch-size in-features)
- Output
- Output: a tensor with the linear transformation applied, of shape (batch-size out-features)
Forward
(call (LinearLayer 10 1) x)
- x
- the input tensor
Example
(call (LinearLayer 10 1)(!randn `(10 10)))
DenseLayer
Applies a LinearLayer followed by the activation specified in activation.
Parameters
(DenseLayer in-features out-features &optional (bias t)(activation :relu))
- in-features (fixnum)
- size of each input sample
- out-features (fixnum)
- size of each output sample
- bias (boolean)
- If set to nil, the layer will not learn an additive bias. Default: t
- activation (keyword or function)
- The available keywords are :relu, :sigmoid and :tanh. If a function is given, it is called as the activation (see the sketch after the example below).
Shape
DenseLayer: (batch-size in-features) -> (batch-size out-features)
- Input
- x (Tensor) of shape (batch-size in-features)
- Output
- Output: a tensor with the DenseLayer applied, of shape (batch-size out-features)
Forward
(call (DenseLayer 10 1) x)
- x
- the input tensor
Example
(call (DenseLayer 10 1)(!randn `(10 10)))
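A function can also be supplied as the activation (see the parameter list above). A minimal sketch, assuming #'!tanh exists as a tensor-level activation and that the supplied function receives the layer's output tensor:
(call (DenseLayer 10 1 t #'!tanh)(!randn `(10 10))) ; hypothetical: any one-argument function on tensors should fit here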
Dropout
When *no-grad* is nil, Dropout randomly zeroes some elements of the given tensor by sampling a Bernoulli mask based on dropout-rate.
Furthermore, the remaining outputs are scaled by (/ (- 1 (self dropout-rate))) (i.e. this is an inverted dropout). Because the scaling is already done at training time, when *no-grad* is t (i.e. during inference) Dropout simply returns the given tensor unchanged (see the sketch after the example below).
Parameters
(dropout &optional (dropout-rate 0.5))
- dropout-rate
- Dropout samples a Bernoulli mask based on dropout-rate.
Shape
Dropout: (Any) -> (the same as the input)
- Input
- any shape is OK
- Output
- the same shape as the given input
Forward
(setq x (!randn `(10 10)))
;#Const(((-0.59... -0.09... ~ 0.289... 0.390...)
; ...
; (1.447... 1.032... ~ -0.66... -0.55...)) :mgl t :shape (10 10))
(call (Dropout 0.5) x)
;#Const(((0.0 -0.19... ~ 0.0 0.0)
; ...
; (2.895... 2.064... ~ 0.0 -1.10...)) :mgl t :shape (10 10))
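As described above, Dropout is the identity when *no-grad* is t. A minimal sketch, assuming *no-grad* can simply be rebound here (the library may provide its own way to switch into inference mode):
(let ((*no-grad* t))
  (call (Dropout 0.5) x)) ; returns x unchanged: no masking, no rescaling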
BatchNorm2d
Applies BatchNorm2D.
Parameters
(BatchNorm2D in-features &key (affine t)(epsilon 1.0e-7))
- in-features
- the expected size of the input features
- affine
- if t, the model has trainable affine layers.
- epsilon
- the value added to the denominator for numerical stability. Default: 1.0e-7
Shape
(call (BatchNorm2D) x)
BatchNorm2D: (any in-features) -> (the same shape as the input)
Example
(setq model (BatchNorm2D 10))
(call model (!randn `(30 10)))
LayerNorm
Embedding
A simple lookup table object to store embedding vectors for NLP Models.
Parameters
(Embedding vocab-size embedding-size &key pad-idx)
- vocab-size
- (fixnum) size of the dictionary of embeddings
- embedding-size
- (fixnum) the size of each embedding tensor
- pad-idx
- If specified, the entries at pad-idx do not contribute to the gradient. If nil, this option is ignored. (See the sketch after the example below.)
Shape
(call (Embedding 10 10) x)
Embedding: (batch-size sentence-length) -> (batch-size sentence-length embedding-size)
- x
- the input x, where each element is a single-float (e.g. 1.0, 2.0, ...)
Example
(setq model (cl-waffe.nn:Embedding 10 20))
(call model (!ones `(1 10)))
#Const((((-0.01... -0.01... ~ 0.013... 0.002...)
...
(-0.01... -0.01... ~ 0.013... 0.002...))) :mgl t :shape (1 10 20))
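A sketch with pad-idx specified; the value 0 here is only an illustrative choice of padding index:
(setq model (cl-waffe.nn:Embedding 10 20 :pad-idx 0))
(call model (!ones `(1 10))) ; indices equal to 0 (the padding index) would not contribute to the gradient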
RNN
Applies a multi-layer RNN with tanh or ReLU.
Note: the implementation assumes that the backward of (setf !aref) contributes to the gradient computation.
Parameters
(RNN input-size hidden-size &key (num-layers 1)(activation :tanh)(bias t)(dropout nil)(biredical nil))
- input-size
- the number of expected features in the input x
- hidden-size
- The number of features in hidden-layer
- num-layers
- the number of recurrent layers
- activation
- Can be either :tanh or :relu
- bias
- (boolean) If t, the model has a trainable bias.
- dropout
- (boolean) If t, the model has a dropout layer.
- biredical
- (boolean) If t, the model becomes a bidirectional RNN.
Shape
(call (RNN 10 10) x &optional (hs nil))
RNN : (batch-size sentence-length input-size) -> (batch-size sentence-length hidden-size)
- x
- the input x, of shape (batch-size sentence-length input-size)
- hs
- the last hidden state. If nil, the model creates a new one.
Example
(setq model (RNN 10 20))
(setq embedding (Embedding 10 10))
(call model
(call embedding (!ones `(10 10))))
;#Const((((-1.46... -1.46... ~ -5.53... 1.766...)
; ...
; (-1.46... -1.46... ~ -5.53... 1.766...))
; ...
; ((-1.46... -1.46... ~ -5.53... 1.766...)
; ...
; (-1.46... -1.46... ~ -5.53... 1.766...))) :mgl t :shape (10 10 20))
LSTM
GRU
MaxPooling
AvgPooling
Conv1D
Conv2D
Transformer
TransformerEncoderLayer
TransformerDecoderLayer
CrossEntropy
cross-entropy
(x y &optional (delta 1.0e-7) (epsilon 0.0))
This criterion computes the cross entropy loss between x and y.
If epsilon is greater than 0.0, label smoothing is enabled.
If avoid-overflow is t, x's average is subtracted from x in order to avoid overflow.
delta is the value used for (!log (!add x delta)).
x is a probability distribution.
y is a probability distribution or labels.
If y is labels, it is converted into a probability distribution.
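A minimal usage sketch; the shapes and the use of !softmax to construct probability distributions are assumptions, not part of the definition above:
(setq x (!softmax (!randn `(10 3)))) ; predicted distribution over 3 classes for 10 samples
(setq y (!softmax (!randn `(10 3)))) ; target distribution (labels could also be passed, as noted above)
(cross-entropy x y)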
SoftMaxCrossEntropy
softmax-cross-entropy
(x y &key (avoid-overflow t) (delta 1.0e-7) (epsilon 0.0))
This criterion computes the softmax cross entropy loss between x and y.
If epsilon is greater than 0.0, label smoothing is enabled.
If avoid-overflow is t, x's average is subtracted from x in order to avoid overflow.
delta is the value used for (!log (!add x delta)).
x is a probability distribution.
y is a probability distribution or labels.
If y is labels, it is converted into a probability distribution.
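A usage sketch under the same shape assumptions as the cross-entropy example above; the keyword values shown are simply the documented defaults:
(setq x (!softmax (!randn `(10 3))))
(setq y (!softmax (!randn `(10 3))))
(softmax-cross-entropy x y :avoid-overflow t :epsilon 0.0)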
MSE
mse
(p y)
Computes the MSE (mean squared error) loss between p and y.
mse is defined as (!mean (!pow (!sub p y) 2) 1)
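A minimal usage sketch; the shapes here are assumptions:
(setq p (!randn `(10 3))) ; predictions
(setq y (!randn `(10 3))) ; targets
(mse p y) ; per the definition above, the squared error is averaged along the second axis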