G-LVQ tries to overcome the limitations of LVQ by using genetic algorithms. But traditional GAs have also got some limitations, so that some improvements had to be added to the traditional GA paradigm, but remaining at the same time within the GA spirit.
The best solution to the problem of LVQ optimization is to optimize a LVQ network globally. This means optimizing the initial weight vectors, the learning algorithm, and the number of weight vectors as well. Nevertheless, the learning algorithm seems to be optimal enough, so that it will not taken into account.
In order to use a GA on a LVQ the initial weight vectors must, then, be coded into a chromosome. All the components of each weight vector, along with its label, are coded together. As is usual in GAs, a population of chromosomes is used, each one is evaluated by decoding a net from the chromosome, and training and testing it on the training sample, and a new generation is created from the fittest.
Usual GA operators, like mutation and uniform crossover, are used. But, besides, bearing in mind the length of the LVQ network must be also optimized, there must be some operators for altering the length of the chromosome. These operators add or eliminate the representation of a whole weight vector (including label), and are, namely:
With respect to the fitness criterium, it is usual in classification problems that adding more vectors to the classifier improves classification accuracy, but at a cost (the cost of computing the class raises with the number of vectors, since the input vector must be compared with each of the reference vectors). Besides, it is desirable that the classifier itself is close (in the euclidean sense) to the samples, in such a way that it becomes a good representation of the training sample. Thus, there are three quantities to optimize: classification accuracy, length of the dictionnary (neural net size) and distortion (or proximity from the neural net to the classified samples).
Usual scalar fitness cannot cope with this, since it is difficult to join all these criteria in a single fitness formula. Thus, G-LVQ uses a vector fitness, in such a way that two members of the population are compared in turn for each of those quantities. Accuracy is usually considered the most important, being distortion and neural net size less important. However, the order of evaluation can be altered by the user.
On the other hand, G-LVQ algorithm is intended to be executed in parallel machines, like PVM or a Connection Machine. It has been designed to perform al GA operations locally, in such a way that the chromosomes in the population are considered to be in a grid. All comparison and other operations take place locally. For instance, each member of the population is only compared with those in the neighboring nodes of the grid, and mates also with one of the chromosomes in those nodes. Routing time is reduced, and besides, this modification keeps the original spirit of GAs, distilled on the phrase ``Distrust central authority'' .