Jethro's Braindump

Concept Learning

A concept is a boolean-valued function over a set of input instances (each comprising input attributes). Concept learning is a form of supervised learning. Infer an unknown boolean-valued function from training-examples.

Hypothesis

There is a trade-off between expressive power and smaller hypothesis space. Large hypothesis spaces are bad, because search is going to take a long time, and also requires more data. Humans exploit structure in the hypothesis space to guide search and learn faster.

A hypothesis \(h\) is consistent with a set of training examples \(D\) iff \(h(x) = c(x)\) for all \(<x,c(x)> \in D\).

Inductive Learning

Any hypothesis found to approximate the target function well over a sufficient large set of training examples will also approximate the target function well over other unobserved examples.

The goal is to search for a hypothesis \(h \in H\) that is consistent with \(D\).

Exploit Structure in Concept Learning

\(h_j\) is more general than or equal to \(h_k\) (denoted \(h_j \ge_{g} ph_k\)) iff any input instance \(x\) that satisfies \(h_j\) also satisfies \(h_k\).

This is relation is a partial order.

Find-S Algorithm

Intuition: Start with the most specific hypothesis \(h\). Whenever it wrongly classifies a positive training example, we “minimally” generalize it to satisfy its input instance.

Limitations

  1. Can’t tell whether Find-S has learnt the target concept
  2. Can’t tell when training examples are inconsistent
  3. Picks a maximally specific \(h\)
  4. Depending on \(H\), there may be several solutions

Version Space

\begin{equation*} VS_{H,D} = {h \in H | h \text{ is consistent with }D} \end{equation*}

  • If \(c \in H\), then D can reduce \(VS_{H,D}\) to \({c}\).
  • If D is insufficient, then \(VS_{H,D}\) represents the uncertainty of what the target concept is
  • \(VS_{H,D}\) contains all consistent hypotheses, including maximally specific hypotheses

The general boundary G of \(VS_{H,D}\) is the set of maximally general members of \(H\) consistent with \(D\).

The specific boundary S of \(VS_{H,D}\) is the set of maximally general members of \(H\) consistent with \(D\).

\begin{equation*} VS_{H,D} = {h \in H | \exists s \in S \exists g \in G g \ge_g h \ge_g s } \end{equation*}

List-Then-Eliminate Algorithm

Iterate through all hypotheses in \(H\), and eliminate any hypothesis found inconsistent with any training example. This algorithm is often prohibitively expensive.

Candidate-Elimination Algorithm

Start with most general and specific hypotheses. Each training example “minimally” generalizes S and specializes G to remove inconsistent hypotheses from version space.