# Extended Compact Genetic Algorithm #

## Name #

Extended Compact Genetic Algorithm (eCGA)

## Taxonomy #

The Extended Compact Genetic Algorithm is a probabilistic model-building genetic algorithm that belongs to the field of Evolutionary Computation, a subfield of Computational Intelligence. It is closely related to the Compact Genetic Algorithm (cGA) and the Bayesian Optimization Algorithm (BOA).

- Computational Intelligence
  - Evolutionary Computation
    - Evolutionary Algorithms
      - Genetic Algorithms
        - Probabilistic Model-Building Genetic Algorithms
          - Extended Compact Genetic Algorithm (eCGA)

## Strategy #

eCGA aims to identify problem structure and exploit it to solve optimization problems efficiently. Instead of applying crossover and mutation to individuals directly, it maintains a probabilistic model of promising solutions and samples new candidate solutions from that model.

### Probabilistic Modeling #

eCGA represents the population as a probability distribution over the search space. The probabilistic model captures the dependencies between problem variables, allowing the algorithm to exploit problem structure and generate high-quality solutions.

### Model Building and Sampling #

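In each generation, eCGA builds a probabilistic model from a selected set of promising solutions. The model is typically a marginal product model (MPM), which partitions the problem variables into disjoint groups and models each group with its own joint probability distribution, treating different groups as independent. The partition is usually found by a greedy search that trades model complexity against fit using a minimum description length (MDL) criterion. New candidate solutions are then sampled from this model.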
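The following is a minimal sketch of fitting and sampling a marginal product model for binary strings, assuming the partition of variables is already known. The names `fit_mpm` and `sample_mpm` are illustrative and do not come from any library.

```python
import random
from collections import Counter

def fit_mpm(selected, partition):
    """Estimate one joint frequency table per variable group."""
    n = len(selected)
    model = []
    for group in partition:
        # Count how often each bit pattern occurs at the group's positions.
        counts = Counter(tuple(ind[i] for i in group) for ind in selected)
        model.append({config: c / n for config, c in counts.items()})
    return model

def sample_mpm(model, partition, length, rng):
    """Draw one bit string by sampling every group independently."""
    individual = [0] * length
    for group, table in zip(partition, model):
        configs = list(table)
        weights = [table[c] for c in configs]
        chosen = rng.choices(configs, weights=weights, k=1)[0]
        for index, bit in zip(group, chosen):
            individual[index] = bit
    return individual

# Hypothetical selected set in which bits (0, 1) and bits (2, 3) move together.
selected = [[1, 1, 0, 0], [1, 1, 1, 1], [0, 0, 1, 1], [0, 0, 0, 0]]
partition = [(0, 1), (2, 3)]
model = fit_mpm(selected, partition)
child = sample_mpm(model, partition, 4, random.Random(0))
```

Because each group is sampled as a unit, `child` always has equal bits within each pair here, which a model with one independent probability per variable could not guarantee.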

### Selection and Model Updating #

After the sampled solutions are evaluated, eCGA selects the most promising ones by fitness. The selected solutions are the data from which the next generation's probabilistic model is built, which steers the search toward regions that have already produced good solutions.
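As one possible selection scheme, the sketch below uses truncation selection on a maximization problem. This is an assumption made for illustration; tournament selection is another common choice, and `truncation_select` is a hypothetical helper name.

```python
def truncation_select(population, fitness, proportion=0.5):
    """Keep the best `proportion` of the population (maximization)."""
    keep = max(1, int(len(population) * proportion))
    ranked = sorted(population, key=fitness, reverse=True)
    return ranked[:keep]

# Toy population scored by its number of ones.
population = [[0, 0, 0], [1, 0, 1], [1, 1, 1], [0, 1, 0]]
selected = truncation_select(population, fitness=sum, proportion=0.5)
```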

## Procedure #

Data Structures:

- Probabilistic Model: A data structure representing the probability distribution over the search space, typically a marginal product model.
- Population: A set of candidate solutions.
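For reference, a marginal product model can be written as a product of group marginals. The notation here is an assumption introduced for illustration: $G_1, \dots, G_m$ are disjoint groups that together cover all $n$ variables, and $x_{G_k}$ denotes the values of $x$ at the positions in $G_k$.

$$
p(x) = \prod_{k=1}^{m} p_k\left(x_{G_k}\right),
\qquad
G_j \cap G_k = \emptyset \ \ (j \neq k),
\qquad
\bigcup_{k=1}^{m} G_k = \{1, \dots, n\}
$$

For binary variables, the table for a group of size $s$ has $2^{s} - 1$ free parameters. When every group has size 1, the model reduces to one independent probability per variable.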

Parameters:

- Population Size: The number of candidate solutions maintained in each generation.
- Selection Pressure: The proportion of the population selected for model building and updating.
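- Learning Rate: The rate at which the probabilistic model is updated from the selected solutions. This applies only to variants that update the model incrementally; when the model is rebuilt from the selected solutions in each generation, as in step 4.2 of the procedure below, no learning rate is needed.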

1. Initialize the probabilistic model
   1. Set initial probabilities for each problem variable
2. Generate an initial population by sampling from the probabilistic model
3. Evaluate the fitness of each candidate solution in the population
4. While the termination criterion is not met, repeat:
   1. Select a set of promising solutions from the population based on their fitness
   2. Build a new probabilistic model based on the selected solutions
   3. Sample a new population from the updated probabilistic model
   4. Evaluate the fitness of each new candidate solution
   5. Replace the old population with the new population
5. Return the best solution found
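The steps above can be sketched end to end for binary strings as follows. This is a simplified illustration under stated assumptions, not a reference implementation: it uses truncation selection, a fixed generation budget, a greedy pairwise merge guided by an MDL-style cost, and no fitness caching. All function names are hypothetical.

```python
import math
import random
from collections import Counter

def group_cost(selected, group):
    """MDL-style cost of one group: model bits plus compressed-data bits."""
    n = len(selected)
    counts = Counter(tuple(ind[i] for i in group) for ind in selected)
    entropy = -sum(c / n * math.log2(c / n) for c in counts.values())
    model_bits = math.log2(n + 1) * (2 ** len(group) - 1)
    return model_bits + n * entropy

def build_partition(selected, length):
    """Greedily merge the pair of groups that lowers the total cost most."""
    groups = [(i,) for i in range(length)]
    while len(groups) > 1:
        best_gain, best_pair = 0.0, None
        for a in range(len(groups)):
            for b in range(a + 1, len(groups)):
                separate = group_cost(selected, groups[a]) + group_cost(selected, groups[b])
                gain = separate - group_cost(selected, groups[a] + groups[b])
                if gain > best_gain:
                    best_gain, best_pair = gain, (a, b)
        if best_pair is None:
            break  # no merge reduces the cost any further
        a, b = best_pair
        merged = groups[a] + groups[b]
        groups = [g for k, g in enumerate(groups) if k not in (a, b)] + [merged]
    return groups

def sample(selected, groups, length, rng):
    """Copy each group from a random selected individual, which is the same
    as sampling that group from its empirical joint distribution."""
    child = [0] * length
    for group in groups:
        donor = rng.choice(selected)
        for i in group:
            child[i] = donor[i]
    return child

def ecga(fitness, length, pop_size=200, proportion=0.5, generations=30, seed=0):
    rng = random.Random(seed)
    # Steps 1-2: a uniform initial model is equivalent to random bit strings.
    population = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    best = max(population, key=fitness)                      # step 3
    for _ in range(generations):                             # step 4
        ranked = sorted(population, key=fitness, reverse=True)
        selected = ranked[:max(2, int(pop_size * proportion))]   # 4.1
        groups = build_partition(selected, length)               # 4.2
        population = [sample(selected, groups, length, rng)      # 4.3 and 4.5
                      for _ in range(pop_size)]
        candidate = max(population, key=fitness)                 # 4.4
        if fitness(candidate) > fitness(best):
            best = candidate
    return best                                              # step 5

# Usage on OneMax (fitness = number of ones), chosen only as a toy problem.
best = ecga(fitness=sum, length=12, pop_size=100, generations=10)
print(sum(best), best)
```

Recomputing every pairwise merge cost in each pass keeps the sketch short but is wasteful; a practical implementation would cache group costs between merges.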

## Considerations #

Advantages:

- Exploits problem structure by capturing dependencies between variables
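- Can require fewer fitness evaluations than a traditional genetic algorithm on problems whose variables interact in groups, because sampling from the model keeps those groups intact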
- Suitable for problems with large search spaces and complex variable interactions

Disadvantages:

- The performance depends on the choice of the probabilistic model
- Building the probabilistic model can be computationally expensive
- May prematurely converge to suboptimal solutions if the model is not expressive enough

## Heuristics #

### Population Size #

- Start with a population size that is at least 10 times the problem size (number of variables)
- Increase the population size for problems with complex variable interactions or large search spaces

### Selection Pressure #

- Use a moderate selection pressure (e.g., 50% of the population) to balance exploration and exploitation
- Higher selection pressure can lead to faster convergence but may result in premature convergence to suboptimal solutions

### Learning Rate #

- If the model is updated incrementally, set the learning rate to a small value (e.g., 0.1) to avoid rapid changes in the probabilistic model
- Gradually increase the learning rate over generations to refine the search direction

### Probabilistic Model #

- Use a marginal product model as the default choice for the probabilistic model
- Consider more expressive models, such as Bayesian networks, for problems with complex variable interactions
- Adapt the model complexity based on the problem size and available computational resources
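To make the trade-off between model complexity and fit concrete, the MDL-style score commonly used to compare marginal product models over binary variables can be written as below. The symbols are assumptions introduced here: $N$ is the number of selected solutions, $S_i$ is the size of group $i$, and $H(M_i)$ is the entropy in bits of the empirical distribution over group $i$. Some presentations use $\log_2 N$ in place of $\log_2(N+1)$.

$$
C_m = \log_2(N+1) \sum_{i} \left(2^{S_i} - 1\right),
\qquad
C_p = N \sum_{i} H(M_i),
\qquad
C = C_m + C_p
$$

A merge of two groups is worthwhile only if the reduction in $C_p$ outweighs the increase in $C_m$, which grows exponentially with group size. This is why larger groups need more data to be justified.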

### Termination Criterion #

- Set a maximum number of generations or fitness evaluations based on the available computational budget
- Terminate the algorithm if the best solution’s fitness has not improved for a specified number of generations
- Consider problem-specific termination criteria, such as reaching a target fitness value or solution quality
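The three criteria above can be combined in a small helper, sketched here for a maximization problem. The class name `StagnationStop` and its parameters are hypothetical.

```python
class StagnationStop:
    """Stop on a generation budget, a target fitness, or lack of improvement."""

    def __init__(self, patience, max_generations, target=None):
        self.patience = patience
        self.max_generations = max_generations
        self.target = target
        self.best = None
        self.stale = 0        # generations since the last improvement
        self.generation = 0

    def should_stop(self, best_fitness):
        """Record this generation's best fitness; return True to stop."""
        self.generation += 1
        if self.best is None or best_fitness > self.best:
            self.best = best_fitness
            self.stale = 0
        else:
            self.stale += 1
        if self.target is not None and self.best >= self.target:
            return True
        return self.stale >= self.patience or self.generation >= self.max_generations

# The run stops once three generations in a row bring no improvement.
stop = StagnationStop(patience=3, max_generations=100)
history = [5, 6, 6, 6, 6, 7]
stopped_at = next(g for g, f in enumerate(history, start=1) if stop.should_stop(f))
```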