cl-waffe

MNIST Tutorial

First

Thank you for having an interest in my framework.

In this section, we define a simple MLP with cl-waffe and train it on MNIST.

Let's get started!

All the code below is in the Official Repository.

After you clone the cl-waffe repository, please run these commands:

$ cd ./examples
$ sh ./install.sh # script for downloading the training data
$ cd ..

$ ./run-test-model.ros mnist

And you can try cl-waffe quickly!

Define Your Model

Define the structure of the network using cl-waffe

defmodel(name args &key (parameters nil) forward (optimize nil) (document "A model, defined by cl-waffe."))

This macro defines a cl-waffe model as name.

At the same time, a constructor named name is defined, and you can initialize your model like:

(cl-waffe.nn:LinearLayer 100 20) ; => [Model: Linearlayer]

name
Your model's name, which is also the constructor's name.
args
The arguments of the constructor.
parameters

The parameters your model has.

Every time you initialize the model, the parameters are initialized.

Note that defmodel behaves like a class.

The arguments are in the same format as defstruct's slot definitions.

Format Example: ((param-name param-initial-value &key (type your-type))). See the sketch after this list.

optimize
When t, your forward slot is defined with (declare (optimize (speed 3) (space 0) (debug 0))). This speeds up training once you have finished debugging.
forward

Define the forward propagation of your model here.

During backward, automatic differentiation is applied.
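
As promised above, here is a minimal sketch of the parameters format, including the :type option. The model Scale and its slot coeff are made up for illustration, and !mul is assumed to be cl-waffe's elementwise multiplication:

(defmodel Scale (init-value)
  ; coeff is (re)initialized every time (Scale init-value) is called.
  :parameters ((coeff (const init-value) :type waffetensor))
  :forward ((x)
            (!mul x (self coeff))))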

The defmodel macro is the most basic unit when defining your network in cl-waffe.

Let's check an example and define a 3-layer MLP.

; ensure (use-package :cl-waffe) and (use-package :cl-waffe.nn)

(defmodel MLP (activation)
  :parameters ((layer1   (denselayer (* 28 28) 512 T activation))
	       (layer2   (denselayer 512 256 T activation))
	       (layer3   (linearlayer 256 10 T)))
  :forward ((x)
            (call (self layer3)
	          (call (self layer2)
		        (call (self layer1) x)))))

Looking at :parameters, cl-waffe.nn exports denselayer and linearlayer, whose constructors are `(in-features out-features &optional (bias T) (activation :relu))`.

When an MLP is initialized, layer1 through layer3 are initialized as well.

In :forward, define your forward propagation.

You can access your model's parameters through the macro (self name); this is just slot-value, so it is setfable.

You can invoke a model's :forward step by using the function call.

call(model &rest args)

Calls the forward step defined in defnode, defmodel, or defoptimizer.

All forward steps must be called through this function; otherwise the returned tensor has neither computation nodes nor the thread-data that supports performance.

Building computation nodes is skipped when *no-grad* is t (see the sketch after the example below).

model
Your initialized model/node/optimizer object.
args
The arguments that :forward needs.

Example:

(defnode Add nil
  :optimize t
  :parameters nil
  :forward  ((x y)
             (sysconst (+ (data x) (data y))))
  :backward ((dy) (list dy dy)))

(call (Add) (const 1.0) (const 1.0))
; => Const(2.0)

Output: a WaffeTensor, or a list of WaffeTensors.
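
Since computation nodes are not built while *no-grad* is t, inference can be wrapped as follows. This is a sketch assuming the with-no-grad macro, which binds *no-grad* to t within its body:

; Sketch: no computation nodes are recorded inside the body.
(with-no-grad
  (call (Add) (const 1.0) (const 1.0))) ; => Const(2.0)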

Whether you are a Lisper or not, you may find MLP's :forward too redundant.

So the macro `(with-calling-layers)` is exported, and you can use it to rewrite :forward concisely.

with-calling-layers(input &rest layers)

This macro calls the given layers sequentially.

The argument input must be a tensor.

Each layer is looked up through the (self) macro, and the input variable is destructively updated with each layer's returned value.

Note: this macro assumes that each model returns a single tensor, not a list.
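
Conceptually, (with-calling-layers x (layer1 x) (layer2 x)) behaves roughly like the following sketch (the behavior, not the literal macroexpansion):

(progn
  (setq x (call (self layer1) x))
  (setq x (call (self layer2) x))
  x)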

(defmodel MLP (activation)
   :parameters ((layer1   (denselayer (* 28 28) 512 T activation))
   	        (layer2   (denselayer 512 256 T activation))
	        (layer3   (linearlayer 256 10 T)))
   :forward ((x)
	     (with-calling-layers x
	       (layer1 x)
 	       (layer2 x)
               (layer3 x))))

If the layers take additional arguments, place the input variable where the previous layer's output should go:

(with-calling-layers x
     (layer1 x 1 1)
     (layer2 1 x 2)
     (layer3 x y))

Output: the value returned by the last layer.

You can see that MLP takes activation, a symbol indicating which activation function to use.

Finally, this is how MLP is defined.

(defmodel MLP (activation)
  :parameters ((layer1   (denselayer (* 28 28) 512 T activation))
	       (layer2   (denselayer 512 256 T activation))
	       (layer3   (linearlayer 256 10 T)))
  :forward ((x)
	    (with-calling-layers x
	      (layer1 x)
 	      (layer2 x)
	      (layer3 x))))

(setq model (MLP :relu)) ; => [Model: MLP]
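
To sanity-check the architecture, you can feed a dummy batch and inspect the output shape. This is a sketch assuming !randn builds a random tensor from a shape list, as in cl-waffe:

(setq model (MLP :relu))
(!shape (call model (!randn '(100 784)))) ; => (100 10), one row of logits per sample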

Define Your Dataset

Define the structure of the datasets available to the cl-waffe API.

defdataset(name args &key parameters next length (document "A dataset structure defined by cl-waffe."))

Defines a dataset. (This is similar to PyTorch's DataLoader.)

The slots you define can be invoked by using (get-dataset dataset index) and (get-dataset-length dataset).

parameters
The parameters the dataset holds.
next
Invoked when (get-dataset dataset index) is called. Return a WaffeTensor for the next batch, in whatever form your task needs.
length
This form must return the total length of your dataset as a fixnum. (Not the number of batches, and not the current index.)

(defdataset Mnistdata (train valid batch-size)
  :parameters ((train train) (valid valid) (batch-size batch-size))
  :next    ((index)
            (list (!set-batch (self train) index (self batch-size))
                  (!set-batch (self valid) index (self batch-size))))
  :length (() (car (!shape (self train)))))

cl-waffe expects index to be 1, 2, 3, ..., (dataset-maxlen).

So please handle batch sizes yourself in args and the :next slot.
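
For example, fetching a single batch from the Mnistdata defined above looks like this sketch, where train-tensor and valid-tensor stand for your loaded data:

; Sketch: build the dataset, then fetch the batch at index 1.
(let ((dataset (Mnistdata train-tensor valid-tensor 100)))
  (get-dataset dataset 1)) ; => (list x-batch y-batch), as built in :next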

It is not always necessary to define a Dataset, but it is required to use the trainer described below.

In practice, the dataset format is similar across tasks, so I will use the default dataloader that the framework provides.

waffedataset
Constructor: (waffedataset train valid &key (batch-size 1) &aux (train train) (valid valid) (batch-size batch-size))
Predicate: waffedataset-p
Copier: copy-waffedataset
Print Function: print-dataset

cl-waffe's Dataset: WaffeDataSet

This structure is a cl-waffe object.
Overview
The standard dataset for 2d training data.
How to Initialize
(WaffeDataSet train valid &key (batch-size 1)) => [DATASET: WaffeDataSet]
get-dataset
(get-dataset WaffeDataSet index) ; => Next Batch
get-dataset-length
(get-dataset-length WaffeDataSet) ; => Total length of WaffeDataSet
Object's slots
  • train
    Type: cl-waffe:waffetensor
    Read Only: nil
    Accessor: cl-waffe::waffedataset-train
    Initform: cl-waffe:train
  • valid
    Type: cl-waffe:waffetensor
    Read Only: nil
    Accessor: cl-waffe::waffedataset-valid
    Initform: cl-waffe::valid
  • batch-size
    Type: fixnum
    Read Only: nil
    Accessor: cl-waffe::waffedataset-batch-size
    Initform: cl-waffe::batch-size
  • length
    Type: boolean
    Read Only: nil
    Accessor: cl-waffe::waffedataset-length
    Initform: t
  • dataset-next
    Type: boolean
    Read Only: nil
    Accessor: cl-waffe::waffedataset-dataset-next
    Initform: t

Write your own program to load your dataset and initialize the dataloader.

However, the package cl-waffe.io exports functions to read data in libsvm format, since, as far as I know, there is no unified library for reading data for different tasks in Common Lisp. (This package is temporary, and its APIs will change without notice in the near future.)

Finally, this is how the dataset is created:

; ensure (use-package :cl-waffe.io) and (use-package :cl-waffe)
; ./examples/install.sh downloads MNIST into examples/tmp.
; Please change the pathname of MNIST yourself if necessary.

(multiple-value-bind (datamat target)
    (read-libsvm-data "examples/tmp/mnist.scale" 784 10 :most-min-class 0)
  (defparameter mnist-dataset datamat)
  (defparameter mnist-target target))

(multiple-value-bind (datamat target)
    (read-libsvm-data "examples/tmp/mnist.scale.t" 784 10 :most-min-class 0)
  (defparameter mnist-dataset-test datamat)
  (defparameter mnist-target-test target))

(defparameter train (WaffeDataSet mnist-dataset mnist-target :batch-size 100))
(defparameter valid (WaffeDataSet mnist-dataset-test mnist-target-test :batch-size 100))
    

Train Your Model

The model is automatically trained using the train function and deftrainer macro.

The function train starts training automatically, given a trainer object defined by deftrainer.

Of course, an API is provided for manual definition.

deftrainer(name args &key model optimizer optimizer-args step-model predict (document "A trainer structure defined by cl-waffe."))

Defines a trainer, which is made to be passed to the train function.

The slots you define can be invoked by using (step-model trainer &rest args) and (predict trainer &rest args). See below.

model
A model defined by (defmodel) which you want to train.
optimizer
An optimizer defined by (defoptimizer).
optimizer-args
Arguments for the optimizer.
step-model
For each batch step, :step-model is called by the (train) function. Describe here the forward step, backward, zero-grad, and update for training.
predict
Code for predicting.

The macros below are defined via macrolet, and you can use them in :step-model and :predict:

(self name)
Accesses the trainer's parameters.
(model)
Accesses the trainer's model, defined by the :model keyword.
(zero-grad)
Finds all of the model's parameters and constants and initializes their gradients (i.e., calls the optimizer's backward).
(update)
Finds all of the model's parameters, calls the optimizer, and updates each parameter's data (i.e., calls the optimizer's forward).

The deftrainer macro is defined in order to integrate the following tasks:

  1. calling models
  2. calling criterions
  3. calling backward
  4. calling optimizer
  5. calling zero-grad
  6. defining predict

Example:

(deftrainer MLPTrainer (activation lr)
  :model          (MLP activation)
  :optimizer      cl-waffe.optimizers:Adam ; Note: :optimizer takes a single optimizer name.
  :optimizer-args (:lr lr) ; these arguments are passed directly to the optimizer.
  :step-model ((x y)
	       (zero-grad) ; call zero-grad
	       (let ((out (cl-waffe.nn:softmax-cross-entropy (call (model) x) y))) ; get criterion
		 (backward out) ; backward
		 (update) ; call optimizer
		 out)) ; return loss
  :predict ((x) (call (model) x))) ; for predict

(setq trainer (MLPTrainer :relu 1e-4)) ; init your trainer

; Train:   (step-model trainer model-input-x model-input-y)
; Predict: (predict trainer model-input-x)

Define and initialize your trainer like this:

(deftrainer MLPTrainer (activation lr)
  :model          (MLP activation)
  :optimizer      cl-waffe.optimizers:Adam
  :optimizer-args (:lr lr)
  :step-model ((x y)
               (zero-grad)
               (let ((out (cl-waffe.nn:softmax-cross-entropy (call (model) x) y)))
                 (backward out)
                 (update)
                 out))
  :predict ((x) (call (model) x)))

(setq trainer (MLPTrainer :relu 1e-4))

So everything is now ready to go.

All you have to do is pass your trainer and dataset to train.

train(trainer dataset &key (valid-dataset nil) (valid-each 100) (enable-animation t) (epoch 1) (batch-size 1) (max-iterate nil) (verbose t) (stream t) (progress-bar-freq 1) (save-model-path nil) (width 45) (random nil) (height 10) (print-each 10))

Trains the given trainer. If valid-dataset is given, it is also used for validation.

trainer
Trainer you defined by deftrainer
dataset
Dataset you defined by defdataset
valid-dataset
If a dataset is given, it is used for validation; if nil, validation is skipped.
enable-animation
Ignored.
epoch
The number of training epochs; default is 1.
batch-size
The batch size for training; default is 1.
verbose
If t, logs are written to stream.

This function is temporary, and the other arguments are currently ignored.

It still has a lot of TODOs.
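
A minimal invocation, using WaffeDataSet objects like the ones defined above (named train and test as in the full script below), looks like this:

(train trainer train :epoch 1 :batch-size 100 :valid-dataset test :verbose t)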

So, the whole code looks like this:

(defpackage :mnist-example
  (:use :cl :cl-waffe :cl-waffe.nn :cl-waffe.io))

(in-package :mnist-example)

; set batch as 100
(defparameter batch-size 100)

; Define Model Using defmodel
(defmodel MLP (activation)
  :parameters ((layer1   (denselayer (* 28 28) 512 T activation))
	       (layer2   (denselayer 512 256 T activation))
	       (layer3   (linearlayer 256 10 T)))
  :forward ((x)
	    (with-calling-layers x
	      (layer1 x)
 	      (layer2 x)
	      (layer3 x))))

; Define Trainer Using deftrainer
(deftrainer MLPTrainer (activation lr)
  :model          (MLP activation)
  :optimizer      cl-waffe.optimizers:Adam
  :optimizer-args (:lr lr)
  :step-model ((x y)
	       (zero-grad)
	       (let ((out (cl-waffe.nn:softmax-cross-entropy (call (model) x) y)))
		 (backward out)
		 (update)
		 out))
  :predict ((x) (call (model) x)))

; Initialize your trainer
(defparameter trainer (MLPTrainer :relu 1e-4))

; Loading MNIST Dataset Using cl-waffe.io
(format t "Loading examples/tmp/mnist.scale ...~%")
  
(multiple-value-bind (datamat target)
    (read-libsvm-data "examples/tmp/mnist.scale" 784 10 :most-min-class 0)
  (defparameter mnist-dataset datamat)
  (defparameter mnist-target target))

(format t "Loading examples/tmp/mnist.scale.t~%")

(multiple-value-bind (datamat target)
    (read-libsvm-data "examples/tmp/mnist.scale.t" 784 10 :most-min-class 0)
  (defparameter mnist-dataset-test datamat)
  (defparameter mnist-target-test target))

; Initialize Your Dataset
(defparameter train (WaffeDataSet mnist-dataset
                                  mnist-target
			          :batch-size batch-size))

(defparameter test (WaffeDataSet mnist-dataset-test
			         mnist-target-test
			         :batch-size 100))
(time (train
         trainer
	 train
	 :epoch 30
	 :batch-size batch-size
	 :valid-dataset test
         :verbose t
	 :random t
	 :print-each 100))

; Accuracy should be approximately 0.9685294

You can either define a package and copy this code, or run $ ./run-test-model.ros mnist. (This requires Roswell.)
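
After training finishes, you can run inference through the trainer's :predict slot. This is a sketch; !set-batch is used, as in the dataset example above, to slice out 100 test rows:

; Sketch: predict logits for 100 test samples.
(predict trainer (!set-batch mnist-dataset-test 1 100)) ; => a (100 10) tensor of logits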