Problem Set Supervised Learning
490 - Spring 2025
American University of Beirut
Important Notes
For any questions regarding the assignment, please contact the Teaching Assistants:
• Mohammad Zbeeb — mbz02@[Link]
• Mariam Salman — mcs12@[Link]
Note: Do not contact the professor regarding this assignment.
Overview
This assignment is divided into two parts:
1. Toolbox Tasks (Data Collection & Modeling): Use the Physics Toolbox Sensor
Suite application to collect sensor data, and design two machine learning tasks (one
regression and one classification).
2. Theoretical Questions: Answer theoretical questions related to supervised learning.
All questions in this section are required.
1 Part I: Toolbox Tasks for Regression and Classification
1.1 Overview
In this part, you will use real-world sensor data collected from your smartphone using the
Physics Toolbox Sensor Suite application. The aim is to design and implement two separate
machine learning tasks:
• Regression Task: Choose a physical phenomenon (e.g., harmonic motion, acceleration
changes, etc.) and analyze it using regression techniques.
• Classification Task: Collect labeled data for an activity recognition problem (e.g., detecting push-ups over time) and apply classification methods.
1.2 Task Requirements
1. Download the Physics Toolbox Sensor Suite application on your smartphone.
2. Collect appropriate sensor data for each task.
3. Preprocess and analyze the collected data using Python in a Google Colab notebook.
4. Implement the corresponding regression and classification models.
5. Evaluate and interpret the results.
6. Document your entire workflow in a well-organized Colab notebook.
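Hint (optional sketch): the snippet below illustrates steps 2–3 above on a small made-up excerpt of a Physics Toolbox g-force export, parsing the CSV and computing the total acceleration magnitude per sample. The column names (gFx, gFy, gFz) are an assumption; your export may use different names depending on the sensor and app version.

```python
import csv
import io
import math

# Hypothetical excerpt of a Physics Toolbox g-force CSV export.
# Real exports may have different column names and a longer recording.
raw = """time,gFx,gFy,gFz
0.00,0.01,-0.02,0.99
0.05,0.03,0.00,1.01
0.10,-0.01,0.04,0.98
"""

rows = list(csv.DictReader(io.StringIO(raw)))

# Preprocess: total acceleration magnitude per sample,
# a common feature for both regression and activity classification.
magnitudes = [
    math.sqrt(float(r["gFx"]) ** 2 + float(r["gFy"]) ** 2 + float(r["gFz"]) ** 2)
    for r in rows
]
print([round(m, 3) for m in magnitudes])
```

In your Colab notebook you would load the real CSV from Google Drive instead of the inline string and feed features like this into your models.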
1.3 Submission Guidelines
Submit the following:
• A publicly accessible link to your Google Colab notebook. Ensure that anyone with
the link can view it.
• A publicly accessible Google Drive link to your dataset. Ensure that the dataset is
shared with anyone with the link.
Submission Format: Provide both links in a plain text file or a PDF document in the
submission box on Moodle.
2 Part II: Theoretical Questions
In this part, you will answer theoretical questions related to supervised learning. All questions
in this section are required. Provide your answers in the same Colab Notebook as Part I.
Theoretical Question 1: Multivariate Least Squares
For a dataset where each target y^{(i)} is vector-valued with p outputs, the cost function is:

$$J(\Theta) = \frac{1}{2} \sum_{i=1}^{m} \sum_{j=1}^{p} \left( \left(\Theta^{T} x^{(i)}\right)_{j} - y^{(i)}_{j} \right)^{2}.$$
Tasks:
1. Express this cost function in matrix-vector notation.
2. Derive the normal equations for Θ.
3. Compare this solution to solving p independent least squares problems.
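Hint (one possible starting point, not a complete answer): under the common convention of stacking the inputs as rows of a design matrix $X \in \mathbb{R}^{m \times n}$ and the targets as rows of $Y \in \mathbb{R}^{m \times p}$, the double sum can be written with the Frobenius norm:

$$J(\Theta) = \frac{1}{2} \left\| X\Theta - Y \right\|_{F}^{2}.$$

Setting the gradient $\nabla_{\Theta} J = X^{T}(X\Theta - Y)$ to zero is one route to the normal equations; observing which columns of $Y$ each column of $\Theta$ actually depends on is a useful starting point for Task 3.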
Theoretical Question 3: Losses, Error Estimates, and k-NN Review
This exercise will help you review the notions of losses, prediction error, and estimates of error
(training error, error using an independent test set, cross-validation estimates of error), as well
as remind you of basic k-Nearest Neighbor (k-NN) ideas.
Setup. We consider a training set T = {(y_i, x_i)}_{i=1,...,n}, where for simplicity we assume the x_i are fixed and y_i ∈ {−1, 1} is a binary random variable with p_i = Pr(y_i = 1). Given a training set, we assume our estimate for each p_i will be

$$\hat{p}_i = \frac{1 + a\, y_i}{2},$$

where 0 ≤ a ≤ 1 is a parameter that controls the degree of fit to the training data. Larger values of a provide a closer fit.
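Hint (optional sketch): the two extremes of a can be checked directly. At a = 0 the estimate ignores the observed label entirely, while at a = 1 it fits it exactly.

```python
# Plug-in estimate p_hat = (1 + a*y) / 2 for a label y in {-1, +1}.
def p_hat(y, a):
    return (1 + a * y) / 2

# a = 0: no fit to the data, p_hat is always 1/2.
assert p_hat(+1, 0.0) == 0.5 and p_hat(-1, 0.0) == 0.5

# a = 1: exact fit, p_hat is 1 when y = +1 and 0 when y = -1.
assert p_hat(+1, 1.0) == 1.0 and p_hat(-1, 1.0) == 0.0

print("p_hat behaves as described")
```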
We want to compute training and test errors under different losses to parallel the zero-one
loss case discussed in class. The losses are:
• Exponential loss:
L(y, f ) = exp(−yf ),
• Squared error (L2) loss:
L(y, f ) = (y − f )2 ,
• Absolute error (L1) loss:
L(y, f ) = |y − f |,
• Logistic (likelihood) loss:
$$L(y, f) = \log\left(1 + e^{-2yf}\right).$$
Tasks (Choose 3 out of the 4 losses):
1. For each of your chosen losses, obtain f ∗ , the population minimizer of the corresponding
population risk:
$$f^{*} = \arg\min_{f} \; \mathbb{E}_{(X,Y)}\left[ L\left(Y, f(X)\right) \right],$$
where the expectation is with respect to the distribution of (X, Y ). In this simplified
scenario, X is fixed and the randomness is from Y only, so effectively pi is all we need.
2. Using p̂i in place of pi , obtain the corresponding estimate fˆ (i.e., show how f ∗ depends
on pi ; then plug in p̂i ).
3. Compute the training error R̂ (as a function of a) and the average test error R = Err of
the rule F̂ . Comment on how the errors vary with a. For R, find also (if it exists) an a∗
that minimizes R. To simplify your expression for R, recall the 0-1 loss error term:
$$e = \frac{2}{n} \sum_{i=1}^{n} p_i (1 - p_i),$$
as discussed in the class notes.
4. Compute the mean and variance of fˆ at a point X ∗ = x.
(Hint: Try to parallel the worked example in the lecture notes where the 0-1 loss was used.
Most of the reasoning remains similar, but the minimizers f ∗ differ by loss function.)
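Hint (optional numeric sketch for one loss only): for the squared-error loss with Y ∈ {−1, +1} and Pr(Y = 1) = p, the population minimizer is the conditional mean, f* = E[Y] = 2p − 1, and plugging in p̂ = (1 + ay)/2 gives the plug-in rule f̂ = ay. The check below verifies both facts numerically; it does not replace the derivations asked for above.

```python
# E[(Y - f)^2] for a binary Y in {-1, +1} with Pr(Y = 1) = p.
def expected_sq_loss(p, f):
    return p * (1 - f) ** 2 + (1 - p) * (-1 - f) ** 2

p = 0.7
f_star = 2 * p - 1  # candidate population minimizer E[Y]

# f* should beat nearby candidates under the population risk.
for f in (f_star - 0.1, f_star + 0.1):
    assert expected_sq_loss(p, f_star) < expected_sq_loss(p, f)

# With p_hat = (1 + a*y)/2, the plug-in rule 2*p_hat - 1 equals a*y.
a, y = 0.8, 1
p_hat = (1 + a * y) / 2
assert abs((2 * p_hat - 1) - a * y) < 1e-12

print("squared-loss sanity checks passed")
```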
Theoretical Question 4: A Simple Non-Convex Optimization
In this question, you will analyze a small non-convex function and explore its critical points.
Consider the function:
$$J(w_1, w_2) = w_1^2 + w_2^2 - \alpha\, w_1 w_2 - \log\left(1 + e^{w_1 + w_2}\right),$$
where α is a real constant (e.g., α > 0).
Tasks:
1. Take partial derivatives with respect to w1 and w2 ; set them equal to zero to find the
critical points.
2. Discuss the nature of the critical points (local minima, maxima, or saddle points).
You can use the Hessian or any other preferred method to analyze the curvature around
each critical point.
3. Implementation (Optional): Provide a small code snippet (e.g., in Python) using
symbolic differentiation to verify your analytical derivatives.
Example Python Code (Optional): Symbolic Derivatives
Listing 1: Symbolic Differentiation of a Non-Convex Function

import sympy

# Define the symbols
w1, w2, alpha = sympy.symbols('w1 w2 alpha', real=True)

# Define the function J(w1, w2)
J = w1**2 + w2**2 - alpha*w1*w2 - sympy.log(1 + sympy.exp(w1 + w2))

# Partial derivatives
dJ_w1 = sympy.diff(J, w1)
dJ_w2 = sympy.diff(J, w2)

# Display the derivatives
print("dJ/dw1 =", dJ_w1)
print("dJ/dw2 =", dJ_w2)
You can then set dJ_w1 = 0 and dJ_w2 = 0 numerically (e.g., with [Link]) or analyze
them by hand.
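As a hedged sketch of the numeric route, note that the gradient components are 2w_1 − αw_2 − σ(w_1 + w_2) and 2w_2 − αw_1 − σ(w_1 + w_2), where σ is the logistic sigmoid. For the illustrative choice α = 1, symmetry suggests looking for a critical point with w_1 = w_2 = w, which reduces the system to the single equation (2 − α)w = σ(2w); the snippet below solves it by fixed-point iteration using only the standard library.

```python
import math

def sigmoid(s):
    # Logistic sigmoid, the derivative of log(1 + e^s).
    return 1.0 / (1.0 + math.exp(-s))

alpha = 1.0  # illustrative choice; the question allows any real alpha

# Fixed-point iteration for (2 - alpha) * w = sigmoid(2 * w).
w = 0.5
for _ in range(200):
    w = sigmoid(2 * w) / (2 - alpha)

# Verify both partial derivatives vanish at the symmetric point (w, w).
g1 = 2 * w - alpha * w - sigmoid(w + w)
g2 = 2 * w - alpha * w - sigmoid(w + w)
assert abs(g1) < 1e-8 and abs(g2) < 1e-8

print(f"critical point near w1 = w2 = {w:.4f}")
```

Other values of α, or asymmetric critical points, would need a genuine 2-D solver or the Hessian analysis asked for in Task 2.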