## Overview

• Theorem explains why the GA is working.
• Schema - a template that identifies a subset of strings with similarities at certain string position.
• `*` the wildcard means 0 or 1 (Doesn’t matter)
``````S1 = (*01*00110) - this schema is represented by 4 strings (only changes filled):
V1 = (0  0     )
V2 = (0  1     )
V3 = (1  0     )
V4 = (1  1     )
``````
• Schema Order `o(S)` is defined as the number of fixed positions in the template:
``````S1 =  (*1***1**0) o(S1) = 3
S2 = (*1**01*111) o(S2) = 6
``````
• Schema Length `δ(S)` is the distance between first and last fixed position.
``````S1 =  (*1***1**0) δ(S1) = 9 - 2 = 7
S2 = (*1**01*111) δ(S2) = 10 - 2 = 8
``````
• Schema Fitness is the average fitness of all strings matching the schema.
``````f(V) - Fitness of individual
f(S1) - Fitness of schema1

S1 = (*01*00110)
V1 = (001000110) f(V1) = 0.2
V2 = (001100110) f(V2) = 0.5
V3 = (101000110) f(V3) = 0.8
V4 = (101100110) f(V4) = 0.5

f(schema) = sum(fitness variables from schema) / (num variables from schema)
f(S1) = [f(V1) + f(V2) + f(V3) + f(V4)] / 4 = 4 / 4 = 1
``````

## Effect of GA on a schema

### Effect of Selection on Schema

• assume fitness proportional selection
``````f(S1) - Fitness of schema1
f(all) - Average fitness of all individuals
M(S1) - Excepected number of individuals from schema

M(S1) = f(S1) / f(all)
``````

Schemas with fitness `f(S1) > f(all)` are likely to appear more in the next generation

### Effect of Crossover on Schema

• assume the single-point crossover

Schema `S1` survives crossover operation if at least one from a parent and offspring is an instance of the schema `S1`

Crossover survival example:

``````S1 = (* 1 0  * *)

P1 = (1 1 0| 1 0) ∈ S1
P2 = (1 0 1| 1 1) ∉ S1
V1 = (1 1 0  1 1) ∈ S1 => Schema SURVIVED
V2 = (1 0 1  1 0) ∉ S1

--------------------------------------

P1 = (1 1| 0 1 0) ∈ S1
P2 = (1 0| 1 1 1) ∉ S1
V1 = (1 1  1 1 1) ∉ S1
V2 = (1 0  0 1 0) ∉ S1
Schema DESTROYED
``````

Calculate probability if crossover occurs within the defining length bits:

``````m - total length
δ(S1) - Schema defining length of schema S1
Pd(S1) - Probability that crossover happen within defining length of schema S1

Pd(S1) = δ(S1) / (m - 1)

S1 = (*10**)
l = 5
δ(S1) = 5 - 2 = 3

Pd(S1) = 3 / (5 - 1) = 3/4
``````

Schemas with low order (number of fixed positions) are more likely to survive.

### Effect of Mutation on Schema

• assume mutation is applied gene by gene
``````Pm - probability of particular gene mutation
o(S1) - Order of schema S1
Ps(S1) - Probability that schema S1 survives mutation

Ps(S1) = (1 - Pm)^o(S1)

S1 = (*10**)
o(S1) = 2
Pm = 0.01 - Choosen by me

Ps(S1) = (1 - 0.01)^2 = 0.9801
``````

For schema `S1` to survive all fixed bits must remain unchanged.

## Conclusion

### Definitions

Schema theorem - schema with above-average fitness, short defining length and lower order is more likely to survive

Building-block hypothesis - GA works well when short, low-order, highly-fit schemas recombine to form more highly fit, higher-order schemas.

### Criticism

Schema theorem assumes that GA maintains an infinitely large population.