Name: Student number:11s1: COMP9417 Machine Learning and Data MiningSAMPLE: Mid-session ExaminationYour Name and Student number must appear at the head of this page.Duration of the exam: 1 hour.This examination has five questions. Answer all questions.Total marks available in the exam: 50.Multiple-choice questions require only one answer.Show all working in your script book.Paper is NOT to be retained by the candidate.1 Please see overThis page intentionally left blank.2 Please see overQuestion 1 [Total marks: 5]Well-posed Machine Learning problems(a) [1 mark] What is required to define a well-posed learning problem ?(b) [3 marks] Here are two potential real-world application tasks for machine learning:1. a winery wishes to uncover relationships between records of the quantitative analyses ofits wines from the lab and some key subjective descriptions applied to its wine (e.g. dry,fruity, light, etc.)2. you want to predict studentsâ marks in the final exam of COMP9417 given their marks fromthe other assessable components in the course â you may assume that the correspondingdata from previous years is availablePick one of the tasks and state how you would define it as a well-posed machine learning problemin terms of the above requirements.(c) [1 mark] Suggest a learning algorithm for the problem you chose (give the name, and ina sentence explain why it would be a good choice).Question 2 [Total marks: 6]Concept Learning(a) [3 marks] Write an algorithm called âFind-Gâ to find a maximally-general consistenthypothesis. You can assume the data will be noise-free and that the target concept is in thehypothesis space.(b) [3 marks] Outline the steps in a proof that Find-G will never fail to cover a positiveexample in the training set.3 Please see overQuestion 3 [Total marks: 18]Decision Tree Learning(a) [3 marks] Describe the main steps in the basic decision tree learning algorithm.The table below contains a sample S of ten examples. Each example is described using twoBoolean attributes A and B. Each is labelled (classified) by the target Boolean function.Id A B Class1 1 0 +2 0 1 -3 1 1 -4 1 0 +5 1 1 -6 1 1 -7 0 0 +8 1 1 +9 0 0 +10 0 0 -(b) [2 marks] What is the entropy of thse examples with respect to the given classification ?[Note: you must show how you got your answer using the standard formula.]This table gives approximate values of entropy for frequencies of positive examples in a two-classsample.Frequency of class â+â in sample Entropy of sample0.0 0.000.1 0.470.2 0.720.3 0.880.4 0.970.5 1.000.6 0.970.7 0.880.8 0.720.9 0.471.0 0.004 Please see over(c) [4 marks] What is the information gain of attribute A on sample S above ?(d) [4 marks] What is the information gain of attribute B on sample S above ?(e) [2 marks] Which would be chosen as the âbestâ attribute by a decision tree learner usingthe information gain splitting criterion ? Why ?(f) [3 marks] Describe a method for overfitting-avoidance in decision tree learning.Question 4 [Total marks: 10]Learning for Numeric Prediction(a) Let the weights of a two-input perceptron be: w0 = 0.2, w1 = 0.5 and w2 = 0.5. Assumingthat x0 = 1, what is the output of the perceptron when:[i] [1 mark] x1 = ?1 and x2 = ?1 ?[ii] [1 mark] x1 = ?1 and x2 = 1 ?Letting w0 = ?0.2 and keeping x0 = 1, w1 = 0.5 and w2 = 0.5, what is the perceptron outputwhen:[iii] [1 mark] x1 = 1 and x2 = ?1 ?[iv] [1 mark] x1 = 1 and x2 = 1 ?(b) [6 marks] Here is a regression tree with leaf nodes denoted A, B and C:X <= 5 : AX > 5 😐 X <= 9: B| X > 9: CThis is the training set from which the regression tree was learned:5 Please see overX Class1 83 114 86 37 68 29 511 1212 1514 15Write down the output (class) values and number of instances that appear in each of the leafnodes A, B and C of the tree.6 Please see overQuestion 5 [Total marks: 11]Neural and Tree Learning on Continuous Attributes(a) [1 mark] In general, feedforward neural networks (multi-layer perceptrons) trained byerror back-propagation are:(i) fast to train, and fast to run on unseen examples(ii) slow to train, and fast to run on unseen examples(iii) fast to train, and slow to run on unseen examples(iv) slow to train, and slow to run on unseen examplesIn one sentence, explain your choice of answer.Suppose you have a decision tree (DT) and a multi-layer perceptron (MLP) that have beentrained on data sampled from a two-class target function, with all attributes numeric. You canthink of both models as graphs whose edges are labelled with numbers: weights in the MLP andthreshold constants for attribute tests in the DT.(b) [4 marks] Compare and contrast the roles of these numbers in the two models.(c) [6 marks] Compare and contrast the methods of learning these numbers in the two models.7 Please see over