Multiple Expert Models

We have looked at dividing problems into subtasks based on the structure of what is being predicted.

There are also problems that can be naturally divided into subproblems based on the input data. This might be as simple as partitioning the input feature space, or something more elaborate, such as dividing an image into foreground and background and developing a model for each.
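As a concrete illustration of partitioning the input space, the sketch below fits a separate model to each region of the input and routes new examples to the model responsible for their region. The scikit-learn models, the threshold on the first feature, and the synthetic data are illustrative assumptions, not part of any specific method.

```python
# Sketch: hard partition of the input space, one model per region.
# The threshold on feature 0 is a hypothetical choice for illustration.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression

X, y = make_regression(n_samples=500, n_features=5, noise=0.1, random_state=1)

threshold = 0.0
low_mask = X[:, 0] <= threshold  # region 1: feature 0 at or below threshold
high_mask = ~low_mask            # region 2: feature 0 above threshold

# Fit one model per region of the input space.
model_low = LinearRegression().fit(X[low_mask], y[low_mask])
model_high = LinearRegression().fit(X[high_mask], y[high_mask])

def predict(X_new):
    """Route each example to the model for its region."""
    preds = np.empty(len(X_new))
    mask = X_new[:, 0] <= threshold
    if mask.any():
        preds[mask] = model_low.predict(X_new[mask])
    if (~mask).any():
        preds[~mask] = model_high.predict(X_new[~mask])
    return preds

print(predict(X[:5]))
```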

A more general approach for this, which originated in the field of neural networks, is referred to as a mixture of experts (MoE).

The approach involves four elements: dividing the learning task into subtasks, developing an expert model for each subtask, using a gating model to decide or learn which expert to trust for each example, and pooling the outputs of the experts and the gating model to make a final prediction.
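The pooling step can be written in a few lines. The sketch below, using plain NumPy with hypothetical expert predictions and gating weights for a single example, shows how the expert outputs are combined under the gating weights:

```python
# Sketch of the pooling step: combine expert outputs using gating weights.
import numpy as np

# Hypothetical predictions for one example from three experts.
expert_preds = np.array([2.1, 3.4, 0.7])

# Hypothetical gating weights for that example (sum to 1, e.g. via softmax).
gate_weights = np.array([0.7, 0.2, 0.1])

# Final prediction is the gate-weighted sum of the expert predictions.
final_pred = np.sum(gate_weights * expert_preds)
print(final_pred)  # 2.22
```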

  • MoE: A technique that develops an expert model for each subtask and learns how much to trust each expert when making a prediction for a specific example. Two aspects make the method distinctive. The first is the explicit partitioning of the input feature space, and the second is the use of a gating network (or gating model) that learns which expert to trust in each situation, e.g., for each input case. Each contributing expert in a mixture of experts model can address the whole problem, at least partially or to some degree: although an expert may not be tailored to a specific input, it can still produce a prediction for examples outside its area of expertise, and the gating model determines how much weight that prediction receives.
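To make the gating idea concrete, here is a minimal mixture-of-experts sketch in PyTorch; the module structure, layer sizes, and number of experts are illustrative assumptions rather than a reference implementation. A softmax gating network produces a weight per expert for each input, and the final prediction is the gate-weighted sum of the expert outputs.

```python
# Minimal mixture-of-experts sketch in PyTorch (illustrative sizes and choices).
import torch
import torch.nn as nn

class MixtureOfExperts(nn.Module):
    def __init__(self, n_inputs, n_outputs, n_experts=4, hidden=32):
        super().__init__()
        # One small feed-forward expert per subtask/region of the input.
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(n_inputs, hidden), nn.ReLU(),
                          nn.Linear(hidden, n_outputs))
            for _ in range(n_experts)
        ])
        # Gating network: maps each input to one weight per expert.
        self.gate = nn.Linear(n_inputs, n_experts)

    def forward(self, x):
        weights = torch.softmax(self.gate(x), dim=-1)                # (batch, n_experts)
        outputs = torch.stack([e(x) for e in self.experts], dim=1)   # (batch, n_experts, n_outputs)
        # Final prediction: gate-weighted sum (pooling) of expert outputs.
        return (weights.unsqueeze(-1) * outputs).sum(dim=1)          # (batch, n_outputs)

# Usage: the gate and the experts are trained jointly, end to end.
model = MixtureOfExperts(n_inputs=10, n_outputs=1)
x = torch.randn(8, 10)
print(model(x).shape)  # torch.Size([8, 1])
```

Because the gate's weights depend on the input, different experts dominate in different regions of the feature space, which is what lets each expert specialize on its subtask.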