University of Engineering and Technology
Department of Electrical Engineering
EE 439: Introduction to Machine Learning
Spring 2026
Problem Set 11
Issue Date: February 20, 2026    Due Date: February 27, 2026
1. Newton’s method for computing least squares.
In this problem, we will prove that if we use Newton’s method to solve the least squares optimization
problem, then we only need one iteration to converge to θ∗ .
(a) Find the Hessian of the cost function
J(θ) = (1/2) Σ_{i=1}^{m} (θ^T x^{(i)} − y^{(i)})^2 .
(b) Show that the first iteration of Newton’s method gives us
θ∗ = (X^T X)^{−1} X^T ⃗y ,
the solution to our least squares problem.
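The claim in (b) can be sanity-checked numerically: because J is quadratic, a single Newton step from an arbitrary starting point should land exactly on the normal-equations solution. A sketch on synthetic data (the data and starting point below are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))                     # synthetic m x n design matrix
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=50)

theta0 = rng.normal(size=3)                      # arbitrary starting point

# For J(theta) = (1/2)||X theta - y||^2 the gradient is X^T (X theta - y)
# and the Hessian X^T X is constant, so Newton's step is exact.
grad = X.T @ (X @ theta0 - y)
H = X.T @ X
theta1 = theta0 - np.linalg.solve(H, grad)       # one Newton step

theta_star = np.linalg.solve(X.T @ X, X.T @ y)   # normal-equations solution
print(np.allclose(theta1, theta_star))           # True
```

Algebraically the step collapses to θ₁ = (X^T X)^{−1} X^T ⃗y regardless of θ₀, which is exactly what part (b) asks you to show.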
2. Locally-weighted logistic regression
In this problem you will implement a locally-weighted version of logistic regression, where we
weight different training examples differently according to the query point. The locally-weighted
logistic regression problem is to maximize
ℓ(θ) = −(λ/2) θ^T θ + Σ_{i=1}^{m} w^{(i)} [ y^{(i)} log h_θ(x^{(i)}) + (1 − y^{(i)}) log(1 − h_θ(x^{(i)})) ] .
The −(λ/2) θ^T θ here is a regularization term (λ is the regularization parameter), which will be
discussed in a future lecture, but which we include here because it is needed for Newton’s method
to perform well on this task. For the entirety of this problem you can use the value λ = 0.0001.
Using this definition, the gradient of ℓ(θ) is given by
∇_θ ℓ(θ) = X^T z − λθ
where z ∈ R^m is defined by
z_i = w^{(i)} (y^{(i)} − h_θ(x^{(i)}))
and the Hessian is given by
H = X^T D X − λI
where D ∈ R^{m×m} is a diagonal matrix with
D_{ii} = −w^{(i)} h_θ(x^{(i)}) (1 − h_θ(x^{(i)})) .
For the sake of this problem you can just use the above formulas, but you should try to derive
these results for yourself as well.
Questions 1, 2, 3 are from Stanford’s CS 229
Given a query point x, we compute the weights
w^{(i)} = exp( −‖x − x^{(i)}‖^2 / (2τ^2) ) .
Much like the locally weighted linear regression that was discussed in class, this weighting scheme
gives more weight to the “nearby” points when predicting the class of a new example.
(a) Implement the Newton–Raphson algorithm for optimizing ℓ(θ) for a new query point x, and
use this to predict the class of x.
You should implement a function of the form
y = lwlr(X_train, y_train, x, tau)
in Python. This function takes as input the training set (the X_train and y_train matrices,
in the form described in the class notes), a new query point x, and the weight bandwidth
tau.
Given this input, the function should 1) compute weights w^{(i)} for each training example
using the formula above, 2) maximize ℓ(θ) using Newton’s method, and finally 3) output
y = 1{h_θ(x) > 0.5} as the prediction.
No starter code will be provided. You may structure your program as you wish, but your
implementation must explicitly compute the gradient and Hessian using the formulas given
in the problem statement and perform iterative Newton updates until convergence.
You may use numerical libraries such as numpy for linear algebra operations. However,
you may not use high-level machine learning libraries (e.g., scikit-learn) to perform the
optimization. You may use the loadtxt() function from numpy to load the given data.
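One possible shape for such a function, using the gradient and Hessian formulas given above, is sketched below. This is a shape reference, not starter code or a reference solution: the sigmoid helper, the convergence tolerance, and the iteration cap are our own choices, and it assumes x_train already includes any intercept column you want.

```python
import numpy as np

def lwlr(x_train, y_train, x, tau, lam=1e-4, tol=1e-6, max_iter=50):
    """Locally weighted logistic regression prediction at a query point x.

    Maximizes l(theta) by Newton's method, using the gradient
    X^T z - lam*theta and Hessian X^T D X - lam*I from the problem statement.
    """
    m, n = x_train.shape
    # Weights w^(i) = exp(-||x - x^(i)||^2 / (2 tau^2))
    w = np.exp(-np.sum((x_train - x) ** 2, axis=1) / (2 * tau ** 2))

    theta = np.zeros(n)
    for _ in range(max_iter):
        h = 1.0 / (1.0 + np.exp(-x_train @ theta))        # sigmoid h_theta
        z = w * (y_train - h)
        grad = x_train.T @ z - lam * theta
        D = np.diag(-w * h * (1 - h))
        H = x_train.T @ D @ x_train - lam * np.eye(n)
        step = np.linalg.solve(H, grad)
        theta = theta - step                              # Newton update
        if np.linalg.norm(step) < tol:
            break
    # Predict y = 1{h_theta(x) > 0.5}
    return int(1.0 / (1.0 + np.exp(-x @ theta)) > 0.5)
```

Note that the Hessian here is negative definite (D has non-positive diagonal and the −λI term is strict), so the Newton system is always solvable.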
To visualize the classifier, you may evaluate your lwlr function over a grid of points and
plot the resulting predictions, coloring regions according to whether the classifier predicts
y = 0 or y = 1. Depending on how efficient your implementation is, generating a fine grid
may take some time. We recommend debugging with a coarse resolution (e.g., 50), and later
increasing it to at least 200 to better visualize the decision boundary.
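The grid sweep itself is independent of the classifier; a minimal sketch with a trivial stand-in predictor in place of lwlr (the stand-in rule, range, and resolution below are illustrative choices only):

```python
import numpy as np

def predict(x):
    # Stand-in for lwlr(X_train, y_train, x, tau); any 0/1 classifier fits here.
    return int(x[0] + x[1] > 0)

res = 50                                  # coarse resolution for debugging
xs = np.linspace(-1.0, 1.0, res)
ys = np.linspace(-1.0, 1.0, res)
grid = np.zeros((res, res), dtype=int)
for i, yv in enumerate(ys):               # rows index the y-axis
    for j, xv in enumerate(xs):
        grid[i, j] = predict(np.array([xv, yv]))

# With matplotlib, the two regions could then be drawn with, e.g.,
# plt.pcolormesh(xs, ys, grid).
```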
(b) Evaluate the system with a variety of different bandwidth parameters τ . In particular, try
τ = 0.01, 0.05, 0.1, 0.5, 1.0, 5.0. How does the classification boundary change when varying
this parameter? Can you predict what the decision boundary of ordinary (unweighted)
logistic regression would look like?
3. Multivariate least squares
So far in class, we have only considered cases where our target variable y is a scalar value. Suppose
that instead of trying to predict a single output, we have a training set with multiple outputs for
each example:
{(x^{(i)} , y^{(i)} ), i = 1, . . . , m},   x^{(i)} ∈ R^n ,  y^{(i)} ∈ R^p .
Thus for each training example, y^{(i)} is vector-valued, with p entries. We wish to use a linear
model to predict the outputs, as in least squares, by specifying the parameter matrix Θ in
y = Θ^T x,
where Θ ∈ R^{n×p} .
(a) The cost function for this case is
J(Θ) = (1/2) Σ_{i=1}^{m} Σ_{j=1}^{p} ( (Θ^T x^{(i)})_j − y_j^{(i)} )^2 .
Write J(Θ) in matrix-vector notation (i.e., without using any summations). [Hint: Start
with the m × n design matrix X whose i-th row is (x^{(i)})^T , and the m × p target matrix
Y whose i-th row is (y^{(i)})^T , and then work out how to express J(Θ) in terms of these
matrices.]
(b) Find the closed form solution for Θ which minimizes J(Θ). This is the equivalent of the
normal equations for the multivariate case.
(c) Suppose instead of considering the multivariate vectors y^{(i)} all at once, we instead compute
each variable y_j^{(i)} separately for each j = 1, . . . , p. In this case, we have p individual linear
models, of the form
y_j^{(i)} = θ_j^T x^{(i)} ,  j = 1, . . . , p.
(So here, each θ_j ∈ R^n .) How do the parameters from these p independent least squares
problems compare to the multivariate solution?
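Whatever closed form you derive in (b), it can be sanity-checked numerically, and the same check lets you test your answer to (c): the sketch below (synthetic data) fits all p outputs at once with numpy's least-squares routine and compares against p separate single-output fits. This checks an answer; it does not replace the derivation.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, p = 100, 4, 3
X = rng.normal(size=(m, n))                  # design matrix
Y = rng.normal(size=(m, p))                  # one row of p targets per example

# Joint fit: minimize ||X Theta - Y||_F^2 over the n x p matrix Theta
Theta_joint, *_ = np.linalg.lstsq(X, Y, rcond=None)

# p separate scalar-output fits, one per column of Y
Theta_sep = np.column_stack(
    [np.linalg.lstsq(X, Y[:, j], rcond=None)[0] for j in range(p)]
)

print(np.allclose(Theta_joint, Theta_sep))   # True
```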
4. Modeling and Estimating Emergency Call Rates
The city of Metroville records the number of emergency calls received per hour in one district.
Let X denote the number of calls in a randomly selected hour. A statistician proposes the model
X ∼ Poisson(λ), λ > 0,
where λ is the average number of calls per hour. Recall that for a Poisson random variable,
E[X] = λ, Var(X) = λ.
The city collects data over n = 200 hours and reports:
X̄ = 2.4 ,  S^2 = 2.5 ,
where
X̄ = (1/n) Σ_{i=1}^{n} X_i ,   S^2 = (1/(n−1)) Σ_{i=1}^{n} (X_i − X̄)^2 .
Define the dispersion ratio
D = S^2 / X̄ .
For large n, under a Poisson model, D is approximately centered at 1 with standard deviation
√(2/n) .
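The reported summary statistics are all that is needed to evaluate D and the reference scale √(2/n); a quick numeric sketch:

```python
import math

n = 200
xbar = 2.4                   # reported sample mean
s2 = 2.5                     # reported sample variance

D = s2 / xbar                # dispersion ratio
ref_sd = math.sqrt(2 / n)    # approximate sd of D under the Poisson model

print(D, ref_sd)             # D is about 1.04; the reference scale is 0.1
```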
(a) Model Evaluation
i. What is the support of the Poisson distribution? Explain why this makes it a natural
(or unnatural) model for hourly call counts.
ii. Let X1 ∼ Poisson(λ1 ) and X2 ∼ Poisson(λ2 ) be independent. Prove that
X1 + X2 ∼ Poisson(λ1 + λ2 ).
iii. Suppose emergency calls arise independently from two different neighborhoods, or from
two independent sources (e.g., medical and fire calls), with hourly counts
X1 ∼ Poisson(λ1 ), X2 ∼ Poisson(λ2 ),
and total calls X = X1 +X2 . Using the result above, explain why the additivity property
is desirable in this context. Interpret λ1 + λ2 in terms of call intensities.
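A quick simulation can corroborate (though of course not prove) the additivity result in part ii; the rates below are arbitrary choices that happen to sum to 2.4:

```python
import numpy as np

rng = np.random.default_rng(0)
lam1, lam2, N = 1.5, 0.9, 200_000    # arbitrary rates; N simulated hours

total = rng.poisson(lam1, N) + rng.poisson(lam2, N)   # X1 + X2, hour by hour
direct = rng.poisson(lam1 + lam2, N)                  # Poisson(lam1 + lam2)

# Both samples should show mean and variance close to lam1 + lam2 = 2.4
print(total.mean(), total.var(), direct.mean(), direct.var())
```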
iv. Give two realistic situations in which the Poisson model might fail for emergency call
data. For each, clearly state which modeling assumption is violated.
v. Give two mathematical reasons why a normal distribution may be inappropriate for
modeling hourly call counts. Your answer must explicitly refer to (i) the support of the
normal distribution and (ii) at least one additional structural property of the normal
family.
vi. What additional assumptions about the data-generating process would be required to
justify a binomial model for hourly call counts? Would these assumptions be reasonable
in this setting? Briefly explain.
vii. Compute the dispersion ratio D. Using the approximation above, assess whether the
deviation of D from 1 is consistent with sampling variability when n = 200. Briefly
comment on whether the Poisson model appears reasonable.
(b) Maximum Likelihood Estimation
Assume for the remainder of the problem that the Poisson model is adopted.
i. Write down the likelihood function L(λ) and log-likelihood function ℓ(λ) for observations
X1 , . . . , Xn .
ii. Show that the maximum likelihood estimator is
λ̂MLE = X̄.
iii. Compute the numerical value of λ̂MLE .
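Since the raw counts are not provided, one can at least check on simulated Poisson data that maximizing the log-likelihood over a grid recovers the sample mean; the simulated rate and the grid below are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.poisson(2.4, size=200)        # simulated hourly counts

def loglik(lam):
    # Poisson log-likelihood up to the constant -sum(log(x_i!))
    return np.sum(x * np.log(lam) - lam)

lams = np.linspace(0.5, 5.0, 4501)    # grid with spacing 0.001
lam_hat = lams[np.argmax([loglik(l) for l in lams])]

print(lam_hat, x.mean())              # maximizer matches X-bar to grid precision
```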
(c) Maximum A Posteriori Estimation
City planners believe, based on historical data from similar districts, that the typical call
rate is around 2 calls per hour and that extremely large rates are unlikely. They model this
belief with a Gamma prior:
p(λ) = (β^α / Γ(α)) λ^{α−1} e^{−βλ} ,   λ > 0,
with α = 4 and β = 2.
i. Write down the posterior distribution p(λ | data) up to proportionality.
ii. Derive the MAP estimator λ̂MAP .
iii. Compare λ̂MLE and λ̂MAP . Explain why they differ and interpret the MAP estimator as
a form of shrinkage.
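Without giving away the derivation in part ii, the MAP estimate can be located numerically by maximizing the log posterior over a grid, and on simulated counts (the raw data are not provided) one can watch it being pulled toward the prior relative to the MLE:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.poisson(2.4, size=200)   # simulated hourly counts (rate is arbitrary)
alpha, beta = 4.0, 2.0           # Gamma prior hyperparameters from the problem

def log_post(lam):
    # log posterior up to an additive constant:
    # Poisson log-likelihood plus the log Gamma(alpha, beta) prior density
    return np.sum(x * np.log(lam) - lam) + (alpha - 1) * np.log(lam) - beta * lam

lams = np.linspace(0.5, 5.0, 4501)
lam_map = lams[np.argmax([log_post(l) for l in lams])]
lam_mle = x.mean()

print(lam_map, lam_mle)          # MAP sits slightly below the MLE, toward the prior
```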
(d) Prediction
i. Using the MLE plug-in estimate, write an expression for the probability that in the next
hour there will be at least 5 calls.
ii. Write the analogous expression using the MAP plug-in estimate.
iii. Explain conceptually why a full Bayesian predictive distribution would generally differ
from both plug-in approaches. No computation is required.
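For parts i and ii, whichever plug-in rate λ̂ is used, the quantity P(X ≥ 5) = 1 − Σ_{k=0}^{4} e^{−λ̂} λ̂^k / k! is straightforward to evaluate; a sketch with a generic rate:

```python
import math

def prob_at_least_5(lam):
    # P(X >= 5) for X ~ Poisson(lam): one minus the Poisson CDF at 4
    return 1.0 - sum(math.exp(-lam) * lam ** k / math.factorial(k)
                     for k in range(5))

# e.g., with a plug-in rate equal to the sample mean:
print(prob_at_least_5(2.4))
```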