This UGen trains a self-organising map (SOM), a well-known type of neural net that learns to map a high-dimensional feed of input data onto a lower-dimensional array of "nodes". The neural net is stored in a Buffer. Once trained, the net can be analysed, or used to transform other incoming data (using SOMRd).
bufnum
A reference to the buffer where the map will be created. Initialising the map is up to you (see below).
inputdata
An Array of the input signals for the net to learn.
netsize
The size of the neural net along one dimension.
numdims
The dimensionality of the neural net. Choose from 1, 2, 3, or 4.
traindur
The length of the training period: the number of data frames that will come in before the net gradually "freezes" into its final state.
nhood
The initial size of the neighbourhood used in training. (The size always shrinks to zero as the training progresses.) The size is expressed as a fraction of netsize. Default is 0.5, and you probably don't need to change it.
gate
A simple on-or-off control. When off (gate<=0) the incoming data is ignored.
initweight
How heavily the algorithm weights the data points at first (the weighting always tails off to zero as the training progresses, in a reciprocal curve). Default is 1, and you usually don't need to change this.
The UGen outputs an array of three values: the number of data points still to come in before training finishes; the "reconstruction error" of the most recently input data point (the squared distance between it and its nearest node); and the frame index of the most recent matching node. The reconstruction error will vary a lot, but should in general decrease as the net comes closer to mapping the data well. The frame index is a direct index into the Buffer, and can be converted to a multidimensional location in the SOM structure using SOMRd.bufIndexToCoords.
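The three outputs can be unpacked and monitored while training runs. The following is a minimal sketch rather than a definitive recipe: it assumes a control-rate constructor SOMTrain.kr whose arguments follow the order of the parameter list above, and it assumes ~som is a Buffer you have already allocated and initialised for a one-dimensional net of 20 nodes with two channels. The SynthDef name and the two input signals are hypothetical placeholders.

```supercollider
(
// Sketch only: assumes SOMTrain.kr takes its arguments in the order documented above,
// and that ~som is an initialised Buffer with 20 frames and 2 channels.
SynthDef(\somTrainMonitor, { |bufnum|
    var in, remain, err, idx;
    // Two hypothetical control signals for the net to learn
    in = [SinOsc.kr(0.3), LFNoise1.kr(1)];
    // bufnum, inputdata, netsize, numdims, traindur, nhood, gate, initweight
    # remain, err, idx = SOMTrain.kr(bufnum, in, 20, 1, 5000, 0.5, 1, 1);
    // Post each output twice a second
    remain.poll(2, "frames remaining");
    err.poll(2, "reconstruction error");
    idx.poll(2, "nearest node index");
}).play(s, [\bufnum, ~som.bufnum]);
)
```

The posted reconstruction error is expected to fluctuate from frame to frame; it is the overall downward trend that indicates the map is fitting the data.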
Note: this UGen does not cope well if the buffer is freed or changed while it is running. For efficiency, it does not keep checking the buffer, so avoid freeing or modifying the buffer while training is in progress.
An SOM will try to fit a smooth-ish surface to the given data, so as a test case let's create a single sine-wave undulation in a Buffer, and see whether a one-dimensional SOM can fit nicely to that sine wave. (For higher-dimensional examples see SOMTrain_2D_example and SOMTrain_3D_example.)
The number of nodes in the net is defined by netsize AND numdims.
If the netsize is 10 and the numdims is 2, the actual number of nodes is 10 x 10 = 100.
Or if numdims is 3, the number of nodes is 10 x 10 x 10 = 1000.
The buffer must contain the same number of frames as the number of nodes; and it must have the same number of channels as the inputdata array. The SOMTrain class provides a convenience function for allocating a buffer of the right size:
~asuitablebuffer = SOMTrain.allocBuf(s, netsize, numdims, numinputdims); // Calls Buffer.alloc on your behalf
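If you prefer to allocate the buffer by hand, the sizing rule above can be sketched as follows. The values of netsize, numdims and numinputdims are hypothetical examples, and the snippet illustrates the stated requirement (one frame per node, one channel per input signal) rather than the internals of allocBuf. It assumes the default server s is booted.

```supercollider
(
// Hypothetical example values: a 10 x 10 map learning 3 input signals
var netsize = 10, numdims = 2, numinputdims = 3;
// Total number of nodes: netsize raised to the power numdims
var numnodes = (netsize ** numdims).asInteger;
numnodes.postln; // 100
// One frame per node, one channel per input dimension
~som = Buffer.alloc(s, numnodes, numinputdims);
)
```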
The values held by the SOM nodes must usually be initialised to some state before the training begins. In the literature on SOMs there are a couple of common options such as random initialisation, or initialisation using the principal components of the input data. It's up to you - since the map is just a regular Buffer object, you can use Buffer's loadCollection methods (etc) to fill the map in any way. However, the SOMTrain class provides a couple of conveniences:
SOMTrain.initBufRand(b); // Each node gets an independent randomly-distributed co-ordinate (i.e. ignores where its neighbours might be)
SOMTrain.initBufGrid(b, netsize, numdims, spinmatrix); // The nodes are initialised as a grid, oriented according to the supplied rotation matrix (nested array, size [numinputdims][numdims])
SOMTrain.initBufGridRand(b, netsize, numdims); // The nodes are initialised as a grid, randomly oriented in the input space
The first of these options is simple but not recommended: starting with a completely random arrangement of nodes is quite likely to lead to a resulting map with "twists" in it, which may be an unhelpful local-minimum solution.
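Because the map is a regular Buffer, you can also fill it yourself. The sketch below lays the nodes of a one-dimensional net along a straight line in a two-dimensional input space using Buffer's loadCollection. It is an illustration under stated assumptions, not a description of what initBufGrid does: ~som is assumed to have been allocated with 20 frames and 2 channels, and the choice of a horizontal line at 0.5 is arbitrary.

```supercollider
(
// Assumes ~som has 20 frames and 2 channels (a 1D net of 20 nodes, 2 input signals)
var netsize = 20;
// Place node i at [i / (netsize - 1), 0.5]: a straight line across the unit square
var coords = Array.fill(netsize, { |i| [i / (netsize - 1), 0.5] });
// Multichannel buffer data is interleaved frame by frame, so flatten the nested array
~som.loadCollection(coords.flat);
)
```

An ordered starting arrangement like this keeps neighbouring nodes close together in the input space from the outset, which is the property the random option lacks.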