\slidehead{Outline \red \ } \bk \centerline{[read Chapter 2]} \centerline{[suggested exercises 2.2, 2.3, 2.4, 2.6]} \bigskip \bi \item Learning from examples \item General-to-specific ordering over hypotheses \item Version spaces and candidate elimination algorithm \item Picking new examples \item The need for inductive bias \ei \bigskip Note: this simple approach assumes noise-free data, but illustrates the key concepts \slidehead{Training Examples for {\em EnjoySport} \red \ } \bk \begin{center} \normalsize \begin{tabular}{|ccccccc|} \hline Sky & Temp & Humid & Wind & Water & Forecst & EnjoySpt \\ \hline Sunny & Warm & Normal & Strong & Warm & Same & Yes \\ Sunny & Warm & High & Strong & Warm & Same & Yes \\ Rainy & Cold & High & Strong & Warm & Change & No \\ Sunny & Warm & High & Strong & Cool & Change & Yes \\ \hline \end{tabular} \bigskip \bigskip What is the general concept? \end{center} \slidehead{Representing Hypotheses \red \ } \bk Many possible representations \bigskip Here, $h$ is a conjunction of constraints on attributes \bigskip Each constraint can be \bi \item a specific value (e.g., ``$Water=Warm$'') \item don't care (e.g., ``$Water=?$'') \item no value allowed (e.g., ``$Water=\emptyset$'') \ei For example, \begin{tabular}{cccccc} Sky & AirTemp & Humid & Wind & Water & Forecst \\ $\langle Sunny$ & $?$ & $?$ & $Strong$ & $?$ & $Same \rangle$ \end{tabular} \slidehead{Prototypical Concept Learning Task \red \ } \bk \bi \item {\bf Given:} \bi \item Instances $X$: Possible days, each described by the attributes {\em Sky, AirTemp, Humidity, Wind, Water, Forecast} \item Target function $c$: $EnjoySport: X \rightarrow \{0,1\}$ \item Hypotheses $H$: Conjunctions of literals. E.g. \[\langle ?, Cold, High, ?, ?, ? \rangle.
\] \item Training examples $D$: Positive and negative examples of the target function \[\langle x_1, c(x_1) \rangle , \ldots \langle x_m, c(x_m) \rangle\] \ei \item {\bf Determine:} A hypothesis $h$ in $H$ such that $h(x)=c(x)$ for all $x$ in $D$. \ei \newpage \vspace*{2in} \begin{description} \item {\bf The inductive learning hypothesis:} Any hypothesis found to approximate the target function well over a sufficiently large set of training examples will also approximate the target function well over other unobserved examples. \end{description} \slidehead{Instance, Hypotheses, and More-General-Than \red \ } \bk \hbox{ \psfig{file=./bookps/vs-gen-spec.ps,width=7in}} \slidehead{\pgm{Find-S} Algorithm \red \ } \bk \be \item Initialize $h$ to the most specific hypothesis in $H$ \item For each positive training instance $x$ \bi \item For each attribute constraint $a_i$ in $h$ \bi \item[] If the constraint $a_i$ in $h$ is satisfied by $x$ \item[] Then do nothing \item[] Else replace $a_i$ in $h$ by the next more general constraint that is satisfied by $x$ \ei \ei \item Output hypothesis $h$ \ee \slidehead{Hypothesis Space Search by \pgm{Find-S} \red \ } \bk \hbox{ \psfig{file=./bookps/finds-gen-spec.ps,width=8in}} \slidehead{Complaints about \pgm{Find-S} \red \ } \bk \bi \item Can't tell whether it has learned the concept \item Can't tell when the training data are inconsistent \item Picks a maximally specific $h$ (why?) \item Depending on $H$, there might be several!
\ei \slidehead{Version Spaces \red \ } \bk \begin{quote} A hypothesis $h$ is {\bf consistent} with a set of training examples $D$ of target concept $c$ if and only if $h(x)=c(x)$ for each training example $\langle x, c(x) \rangle$ in $D$. \[Consistent(h,D) \equiv (\forall \langle x, c(x) \rangle \in D)\ h(x)=c(x) \] \end{quote} \bigskip \bigskip \begin{quote} The {\bf version space}, $VS_{H,D}$, with respect to hypothesis space $H$ and training examples $D$, is the subset of hypotheses from $H$ consistent with all training examples in $D$. \[VS_{H,D} \equiv \{h \in H | Consistent(h,D)\} \] \end{quote} \slidehead{The \pgm{List-Then-Eliminate} Algorithm: \red \ } \bk \be \item $VersionSpace \leftarrow$ a list containing every hypothesis in $H$ \item For each training example, $\langle x, c(x) \rangle$ \item[] remove from $VersionSpace$ any hypothesis $h$ for which $h(x) \neq c(x)$ \item Output the list of hypotheses in $VersionSpace$ \ee \slidehead{Example Version Space \red \ } \bk \hbox{\psfig{file=./bookps/figure-vs3.ps,width=7in}} \slidehead{Representing Version Spaces \red \ } \bk \begin{description} \item The {\bf General boundary}, $G$, of version space $VS_{H,D}$ is the set of its maximally general members \bigskip \item The {\bf Specific boundary}, $S$, of version space $VS_{H,D}$ is the set of its maximally specific members \bigskip \item Every member of the version space lies between these boundaries \[VS_{H,D} = \{h \in H | (\exists s \in S)(\exists g \in G) (g \geq h \geq s)\}\] where $x \geq y$ means $x$ is more general than or equal to $y$ \end{description} \slidehead{Candidate Elimination Algorithm \red \ } \bk $G \la$ maximally general hypotheses in $H$

$S \la$ maximally specific hypotheses in $H$

For each
training example $d$, do \bi \item If $d$ is a positive example \bi \item Remove from $G$ any hypothesis inconsistent with $d$ \item For each hypothesis $s$ in $S$ that is not consistent with $d$ \bi \item Remove $s$ from $S$ \item Add to $S$ all minimal generalizations $h$ of $s$ such that \be \item $h$ is consistent with $d$, and \item some member of $G$ is more general than $h$ \ee \item Remove from $S$ any hypothesis that is more general than another hypothesis in $S$ \ei \ei \item If $d$ is a negative example \bi \item Remove from $S$ any hypothesis inconsistent with $d$ \item For each hypothesis $g$ in $G$ that is not consistent with $d$ \bi \item Remove $g$ from $G$ \item Add to $G$ all minimal specializations $h$ of $g$ such that \be \item $h$ is consistent with $d$, and \item some member of $S$ is more specific than $h$ \ee \item Remove from $G$ any hypothesis that is less general than another hypothesis in $G$ \ei \ei \ei \slidehead{Example Trace \red \ } \bk \hbox{ \psfig{file=./bookps/vsexamp-initialize.ps,width=7in}} \slidehead{What Next Training Example? \red \ } \bk \hbox{\psfig{file=./bookps/figure-vs3.ps,width=7in}} \slidehead{How Should These Be Classified? \red \ } \bk \hbox{\psfig{file=./bookps/figure-vs3.ps,width=7in}} \bigskip \bi \item[] \[ \langle Sunny\ Warm\ Normal\ Strong\ Cool\ Change \rangle \] \item[] \[ \langle Rainy\ Cool\ Normal\ Light\ Warm\ Same \rangle \] \item[] \[ \langle Sunny\ Warm\ Normal\ Light\ Warm\ Same \rangle \] \ei \slidehead{What Justifies this Inductive Leap? \red \ } \bk \[+\ \ \langle Sunny\ Warm\ Normal\ Strong\ Cool\ Change \rangle \] \[+\ \ \langle Sunny\ Warm\ Normal\ Light\ Warm\ Same \rangle \] \rule{7in}{.2mm}\vskip -3mm \[S:\ \ \langle Sunny\ Warm\ Normal\ ?\ ?\ ?
\rangle \] \bigskip Why believe we can classify the unseen instance \[ \langle Sunny\ Warm\ Normal\ Strong\ Warm\ Same \rangle \ ? \] \slidehead{An UNBiased Learner \red \ } \bk \bi \item[] Idea: Choose $H$ that expresses every teachable concept (i.e., $H$ is the power set of $X$) \item[] Consider $H' =$ disjunctions, conjunctions, negations over the previous $H$. E.g., \[ \langle Sunny\ Warm\ Normal\ ?\ ?\ ? \rangle \ \tor \ \neg \langle ?\ ?\ ?\ ?\ ?\ Change \rangle \] \ei What are $S$, $G$ in this case? \bi \item[] $S \la$ \item[] $G \la$ \ei \slidehead{Inductive Bias \red \ } \bk Consider \bi \item concept learning algorithm $L$ \item instances $X$, target concept $c$ \item training examples $D_c = \{\langle x, c(x) \rangle\}$ \item let $L(x_i, D_c)$ denote the classification assigned to the instance $x_i$ by $L$ after training on data $D_c$. \ei \vspace*{.2in} {\bf Definition}: \begin{quote}The {\bf inductive bias} of $L$ is any minimal set of assertions $B$ such that for any target concept $c$ and corresponding training examples $D_c$ \end{quote} \[ (\forall x_i \in X) [(B \tand D_c \tand x_i) \entails L(x_i,D_c)] \] where $A \entails B$ means $A$ logically entails $B$ \slidehead{Inductive Systems and Equivalent Deductive Systems \red \ } \bk \centerline{\hbox{ \psfig{file=./bookps/figure-vs-bias-new.ps,width=5.6in}}} \slidehead{Three Learners with Different Biases \red \ } \bk \be \item {\em Rote learner:} Store examples. Classify $x$ if and only if it matches a previously observed example.
\item {\em Version space candidate elimination algorithm} \item {\em Find-S} \ee \slidehead{Summary Points \red \ } \bk \be \item Concept learning as search through $H$ \item General-to-specific ordering over $H$ \item Version space candidate elimination algorithm \item $S$ and $G$ boundaries characterize the learner's uncertainty \item Learner can generate useful queries \item Inductive leaps possible only if the learner is biased \item Inductive learners can be modelled by equivalent deductive systems \ee
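The \pgm{Find-S} loop from the slides is simple enough to sketch in runnable form. Below is a minimal Python sketch run on the {\em EnjoySport} training table; the function name \texttt{find\_s}, the tuple encoding of examples, and the use of the string \texttt{"0"} for the empty ($\emptyset$) constraint are illustrative choices, not part of the slides.

```python
# Minimal sketch of the Find-S algorithm. "?" is the don't-care
# constraint; "0" stands in for the empty (no value allowed) constraint.

def find_s(examples):
    """Return the maximally specific hypothesis consistent with the
    positive examples (assumes noise-free data, as the slides do)."""
    n = len(examples[0][0])
    h = ["0"] * n  # most specific hypothesis: nothing allowed anywhere
    for x, label in examples:
        if label != "Yes":      # Find-S ignores negative examples
            continue
        for i, a in enumerate(x):
            if h[i] == "0":     # first positive example: adopt its value
                h[i] = a
            elif h[i] != a:     # constraint violated: generalize to "?"
                h[i] = "?"
    return h

# The EnjoySport training examples from the table earlier in the slides
D = [
    (("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"),   "Yes"),
    (("Sunny", "Warm", "High",   "Strong", "Warm", "Same"),   "Yes"),
    (("Rainy", "Cold", "High",   "Strong", "Warm", "Change"), "No"),
    (("Sunny", "Warm", "High",   "Strong", "Cool", "Change"), "Yes"),
]

print(find_s(D))  # -> ['Sunny', 'Warm', '?', 'Strong', '?', '?']
```

The printed hypothesis matches the $S$ boundary $\langle Sunny\ Warm\ ?\ Strong\ ?\ ?\rangle$ reached in the slides after the final positive example, which is one way to check that the generalization step behaves as described.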