MAT 3373 The SVM Classifier Discussion

MAT 3373: Short HW2

1. Do Question 8.11 from the textbook. Note: If you like, you need only do the last part. You can view parts a-f as merely suggestions on how to do the last part.

2. This question is about fitting the SVM classifier described in Equations (9.12-9.15) of the textbook, using 2-fold cross-validation. Sketch a large dataset in \(\mathbb{R}^{2}\), along with a generic-looking split into testing and training parts, that satisfy the following:

Any classifier that is linear in the two-dimensional observations will have very large training and testing error (e.g. more than 15 percent misclassified).

You can add a (small number of) nonlinear feature(s) so that the associated SVM classifier has 0 training error.

When you use the nonlinear features described in the previous part of the question and do cross-validation, the tuning parameter \(C\) that is chosen will be \(>10\) and furthermore the training error will be \(>0\), despite the fact that a 0-error split is possible.

Give a short informal explanation of why this dataset will have all these properties, and write down the function used in the nonlinear classifier.

Note: When I say that a split should be “generic,” I’m really telling you: don’t try to avoid the question by drawing testing and training sets that have a lot of structure/patterns (e.g. the training set is all the points on the left-hand-side of the picture, the testing set is all the points on the right-hand-side). If you would like a more formal description, here is a notion that people often use: the explanation you give should be “generic” in the sense that it applies to at least 75 percent of all possible test/train splits.
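To make the idea of a "generic" 2-fold split concrete, here is a minimal sketch of how such a split could be drawn at random. The function name `two_fold_splits` and the `seed` parameter are hypothetical, chosen only for illustration; the course may use a different procedure.

```python
import random

def two_fold_splits(n, seed=0):
    """Randomly split indices 0..n-1 into two folds and return both
    (train, test) pairings used by 2-fold cross-validation."""
    rng = random.Random(seed)
    idx = list(range(n))
    rng.shuffle(idx)          # a uniformly random order, so no spatial pattern
    half = n // 2
    fold_a, fold_b = idx[:half], idx[half:]
    # Each fold serves once as the training set and once as the testing set.
    return [(fold_a, fold_b), (fold_b, fold_a)]

if __name__ == "__main__":
    for train, test in two_fold_splits(10):
        print("train:", sorted(train), "test:", sorted(test))
```

Because the indices are shuffled uniformly, a split drawn this way has no left/right or other geometric structure, which is the sense of "generic" intended above.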

3. In class, we considered fitting the Gaussian mixture model (GMM) \(Y_{i} \sim \mathrm{Unif}(\{0,1\})\), \(X_{i} | Y_{i} \sim N(\mu_{Y_{i}}, \sigma_{Y_{i}}^{2})\) (Note: yes, the \(X_{i}\)’s are one-dimensional). We saw that, when \(\sigma_{0},\sigma_{1}\) are assumed known, we could estimate \(Y_{1},\ldots,Y_{n}\) and \(\mu_{0},\mu_{1}\) using an algorithm that was nearly identical to the k-means algorithm.
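As a reminder of the general shape of such an algorithm, here is a minimal sketch of a hard-assignment fit. It is an assumption about what the in-class algorithm looks like, not a transcription of it: it alternates between assigning each \(X_{i}\) to the component with the higher density and resetting each \(\mu_{k}\) to the mean of its assigned points. The names `log_density` and `fit_hard_gmm` are hypothetical.

```python
import math

def log_density(x, mu, sigma):
    """Log of the N(mu, sigma^2) density at x."""
    return -math.log(sigma) - 0.5 * ((x - mu) / sigma) ** 2 - 0.5 * math.log(2 * math.pi)

def fit_hard_gmm(xs, mu_init, sigmas, max_iter=100):
    """Alternate hard label assignment and mean updates (sigmas are known).

    The labels are uniform on {0, 1}, so the mixture weights are equal and
    the assignment step only needs to compare the two component densities.
    """
    mus = list(mu_init)
    labels = None
    for _ in range(max_iter):
        new_labels = [
            0 if log_density(x, mus[0], sigmas[0]) >= log_density(x, mus[1], sigmas[1]) else 1
            for x in xs
        ]
        if new_labels == labels:   # no label changed: a (possibly local) optimum
            break
        labels = new_labels
        for k in (0, 1):
            members = [x for x, y in zip(xs, labels) if y == k]
            if members:            # leave mu_k unchanged if its cluster is empty
                mus[k] = sum(members) / len(members)
    return labels, mus

if __name__ == "__main__":
    xs = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2]
    print(fit_hard_gmm(xs, mu_init=(1.0, 4.0), sigmas=(1.0, 1.0)))
```

The result depends on `mu_init`; trying several starting values on your own sketched datasets is one way to see where the procedure stops.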

Like the k-means algorithm, the algorithm for fitting the GMM does not always converge to the global optimum – it can get “stuck” at a local optimum.

Sketch a dataset and an estimate that is stuck at a local optimum.

Sketch a dataset for which there are two obvious clusters, but where \(2\)-means will not give the right answer. Sketch the solution given by 2-means, describe an algorithm that will give something quite close to the right answer, and sketch this algorithm’s decision boundary. Note: depending on the particular algorithm chosen, it might be easier to make this last sketch in several pieces. That is fine.

4. Do problem 10.4 from the textbook.

5. You generate \(n=100000\) data points \((x_{i},y_{i})\) as follows: \(x_{i} = \frac{i}{n}\), \(y_{i} \sim N(0,1)\) (so that \(y_{i}\) is actually independent of \(x_{i}\)). You are considering the following modelling strategy: you will fit the data to the \(k\)-nearest-neighbour model for various values of \(k \in \{1,2,\ldots,n-1\}\), and will choose the correct value of \(k\) via cross-validation. In this question, we’ll look at what happens in this “uninformative” regime.

Compute the expected mean-squared training error for these models, for all values of \(k\). Note: yes, this means that there will be a nice formula.

Compute the expected mean-squared leave-one-out cross-validation (LOOCV) error of these models, for all values of \(k\).

Compute the variance of the quantity computed in the previous question.

Which value of \(k\) should be chosen by this procedure? Describe which values of \(k\) are plausible results of this procedure. Note: The data is not deterministic, so you may not get your desired result!
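If you want to check your formulas numerically, a small simulation along the following lines may help. It is only a sketch under stated assumptions: it uses a much smaller \(n\) than the question, counts a point as one of its own \(k\) neighbours when computing training error, and breaks distance ties in favour of the smaller index. The function name `knn_noise_errors` is hypothetical.

```python
import random

def knn_noise_errors(ys, k):
    """Return (training MSE, LOOCV MSE) for k-NN regression on the grid x_i = i/n."""
    n = len(ys)
    if not 1 <= k <= n - 1:
        raise ValueError("k must lie in {1, ..., n-1}")
    train_sq = 0.0
    loo_sq = 0.0
    for i in range(n):
        # Since x_i = i/n, ordering by |j - i| matches ordering by |x_j - x_i|.
        # sorted() is stable, so ties go to the smaller index (an assumption).
        order = sorted(range(n), key=lambda j: abs(j - i))
        train_pred = sum(ys[j] for j in order[:k]) / k       # includes point i itself
        loo_pred = sum(ys[j] for j in order[1:k + 1]) / k    # point i held out
        train_sq += (ys[i] - train_pred) ** 2
        loo_sq += (ys[i] - loo_pred) ** 2
    return train_sq / n, loo_sq / n

if __name__ == "__main__":
    rng = random.Random(0)
    n = 300  # far smaller than the n = 100000 in the question, to keep this fast
    ys = [rng.gauss(0.0, 1.0) for _ in range(n)]
    for k in (1, 2, 5, 20, 100, n - 1):
        train_mse, loo_mse = knn_noise_errors(ys, k)
        print(k, round(train_mse, 3), round(loo_mse, 3))
```

Rerunning with different seeds gives a rough feel for how much the LOOCV error fluctuates from one dataset to the next, which is relevant to the last two parts.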
