Concept Learning
A concept is a boolean-valued function over a set of input instances (each described by a set of input attributes). Concept learning is a form of supervised learning: infer an unknown boolean-valued function from training examples.
Hypothesis
There is a trade-off between expressive power and smaller hypothesis space. Large hypothesis spaces are bad, because search is going to take a long time, and also requires more data. Humans exploit structure in the hypothesis space to guide search and learn faster.
A hypothesis \(h\) is consistent with a set of training examples \(D\) iff \(h(x) = c(x)\) for all \(\langle x, c(x) \rangle \in D\).
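A minimal sketch of this consistency check in Python. The representation is an assumption for illustration: a hypothesis is a tuple of attribute constraints, where `'?'` matches any value, `None` matches no value, and a literal matches only itself (the usual conjunctive representation).

```python
# Hypothesis representation (assumed for illustration):
# a tuple of constraints, one per attribute:
#   '?'     -> any value satisfies this constraint
#   None    -> no value satisfies this constraint
#   literal -> only that exact value satisfies it

def satisfies(h, x):
    """True iff instance x satisfies hypothesis h."""
    return all(c == '?' or c == a for c, a in zip(h, x))

def consistent(h, examples):
    """True iff h(x) == c(x) for every <x, c(x)> pair in examples."""
    return all(satisfies(h, x) == label for x, label in examples)
```

For example, `('Sunny', '?')` is consistent with a positive `('Sunny', 'Warm')` and a negative `('Rainy', 'Warm')`, while the all-wildcard `('?', '?')` is not, since it also accepts the negative example.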
Inductive Learning
Any hypothesis found to approximate the target function well over a sufficiently large set of training examples will also approximate the target function well over other, unobserved examples.
Concept Learning is Search
The goal is to search for a hypothesis \(h \in H\) that is consistent with \(D\).
Exploit Structure in Concept Learning
\(h_j\) is more general than or equal to \(h_k\) (denoted \(h_j \ge_g h_k\)) iff any input instance \(x\) that satisfies \(h_j\) also satisfies \(h_k\).
This relation is a partial order.
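For conjunctive hypotheses, the \(\ge_g\) relation can be checked attribute by attribute. A sketch under the assumed tuple representation (`'?'` = any value, `None` = no value; conventions are illustrative, not from the text):

```python
def more_general_eq(hj, hk):
    """True iff hj >=_g hk: every instance satisfying hk also satisfies hj.
    Checked per attribute: hj's constraint must admit everything hk's does."""
    def covers(cj, ck):
        # '?' admits everything; anything admits None (which matches nothing);
        # otherwise the literals must agree.
        return cj == '?' or ck is None or cj == ck
    return all(covers(cj, ck) for cj, ck in zip(hj, hk))
```

Note the partial order: `('?', 'Warm')` \(\ge_g\) `('Sunny', 'Warm')`, but `('Sunny', '?')` and `('?', 'Warm')` are incomparable in either direction.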
Find-S Algorithm
Intuition: Start with the most specific hypothesis \(h\). Whenever it wrongly classifies a positive training example, “minimally” generalize \(h\) to satisfy that input instance.
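The intuition above can be sketched as follows, again assuming the conjunctive tuple representation (`'?'` = any value, `None` = no value) for illustration:

```python
def find_s(examples, n_attrs):
    """Find-S: start from the most specific hypothesis and minimally
    generalize it over the positive examples; negatives are ignored."""
    h = [None] * n_attrs              # most specific hypothesis: matches nothing
    for x, label in examples:
        if not label:
            continue                  # Find-S ignores negative examples
        for i, a in enumerate(x):
            if h[i] is None:
                h[i] = a              # first positive: adopt its attribute value
            elif h[i] != a:
                h[i] = '?'            # disagreement: generalize to wildcard
    return tuple(h)
```

On positives `('Sunny', 'Warm', 'Normal')` and `('Sunny', 'Warm', 'High')`, the third attribute disagrees and is generalized, yielding `('Sunny', 'Warm', '?')`.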
Limitations
- Can’t tell whether Find-S has learnt the target concept
- Can’t tell when training examples are inconsistent
- Picks a maximally specific \(h\)
- Depending on \(H\), there may be several solutions
Version Space
\begin{equation*} VS_{H,D} = \{h \in H \mid h \text{ is consistent with } D\} \end{equation*}
- If \(c \in H\), then \(D\) can reduce \(VS_{H,D}\) to \(\{c\}\).
- If \(D\) is insufficient, then \(VS_{H,D}\) represents the uncertainty about what the target concept is
- \(VS_{H,D}\) contains all consistent hypotheses, including maximally specific hypotheses
The general boundary G of \(VS_{H,D}\) is the set of maximally general members of \(H\) consistent with \(D\).
The specific boundary S of \(VS_{H,D}\) is the set of maximally specific members of \(H\) consistent with \(D\).
\begin{equation*} VS_{H,D} = \{h \in H \mid \exists s \in S,\ \exists g \in G:\ g \ge_g h \ge_g s \} \end{equation*}
List-Then-Eliminate Algorithm
Iterate through all hypotheses in \(H\), and eliminate any hypothesis found inconsistent with any training example. This algorithm is often prohibitively expensive, since it enumerates every hypothesis in \(H\).
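A brute-force sketch in Python, assuming conjunctive hypotheses over finite attribute domains (`'?'` = any value; the maximally specific matches-nothing hypothesis is omitted for brevity). The enumeration is exponential in the number of attributes, which is exactly why the algorithm is usually impractical:

```python
from itertools import product

def list_then_eliminate(values_per_attr, examples):
    """Enumerate every conjunctive hypothesis and keep the consistent ones.
    values_per_attr: list of possible values for each attribute.
    examples: list of (instance_tuple, boolean_label) pairs."""
    def satisfies(h, x):
        return all(c == '?' or c == a for c, a in zip(h, x))
    def consistent(h):
        return all(satisfies(h, x) == label for x, label in examples)
    # Each attribute constraint is either a concrete value or the wildcard '?'.
    space = product(*[list(vals) + ['?'] for vals in values_per_attr])
    return [h for h in space if consistent(h)]
```

With two binary attributes, one positive `('Sunny', 'Warm')`, and one negative `('Rainy', 'Cold')`, the surviving version space is `('Sunny', 'Warm')`, `('Sunny', '?')`, and `('?', 'Warm')`.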
Candidate-Elimination Algorithm
Start with the most general and most specific hypotheses. Each training example “minimally” generalizes \(S\) and specializes \(G\) to remove inconsistent hypotheses from the version space.
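A sketch of the boundary updates in Python, assuming conjunctive hypotheses as tuples (`'?'` = any value, `None` = no value) over finite attribute domains. This is illustrative, not a complete treatment of the general algorithm:

```python
def candidate_elimination(values_per_attr, examples):
    """Maintain the specific boundary S and general boundary G of the
    version space over conjunctive hypotheses."""
    n = len(values_per_attr)

    def satisfies(h, x):
        return all(c == '?' or c == a for c, a in zip(h, x))

    def more_general_eq(hj, hk):
        return all(cj == '?' or ck is None or cj == ck
                   for cj, ck in zip(hj, hk))

    S = [tuple([None] * n)]   # most specific hypothesis: matches nothing
    G = [tuple(['?'] * n)]    # most general hypothesis: matches everything

    for x, label in examples:
        if label:
            # Positive example: drop inconsistent g's, minimally generalize S.
            G = [g for g in G if satisfies(g, x)]
            new_S = []
            for s in S:
                if satisfies(s, x):
                    new_S.append(s)
                else:
                    h = tuple(a if c is None or c == a else '?'
                              for c, a in zip(s, x))
                    if any(more_general_eq(g, h) for g in G):
                        new_S.append(h)
            # Keep only maximally specific members.
            S = [s for s in new_S
                 if not any(t != s and more_general_eq(s, t) for t in new_S)]
        else:
            # Negative example: drop inconsistent s's, minimally specialize G.
            S = [s for s in S if not satisfies(s, x)]
            new_G = []
            for g in G:
                if not satisfies(g, x):
                    new_G.append(g)
                else:
                    for i, c in enumerate(g):
                        if c != '?':
                            continue
                        for v in values_per_attr[i]:
                            if v == x[i]:
                                continue
                            h = g[:i] + (v,) + g[i + 1:]
                            if any(more_general_eq(h, s) for s in S):
                                new_G.append(h)
            # Keep only maximally general members.
            G = [g for g in new_G
                 if not any(t != g and more_general_eq(t, g) for t in new_G)]
    return S, G
```

On a positive `('Sunny', 'Warm')` followed by a negative `('Rainy', 'Cold')`, \(S\) generalizes to `('Sunny', 'Warm')` while \(G\) specializes to `('Sunny', '?')` and `('?', 'Warm')`, bounding the same three-hypothesis version space that List-Then-Eliminate would enumerate.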