Neural Network Functions and Techniques
The tanh activation function is a rescaled sigmoid that maps inputs symmetrically into the range (-1, 1). It computes the hyperbolic tangent of the input, producing a smooth curve that approaches -1 for large negative inputs and 1 for large positive inputs. Because the output is zero-centered, inputs near zero map to outputs near zero.
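The "rescaled sigmoid" relationship can be checked directly: tanh(x) = 2σ(2x) − 1. A minimal sketch comparing that identity against the library tanh:

```python
import math

def sigmoid(x):
    # Standard logistic sigmoid.
    return 1.0 / (1.0 + math.exp(-x))

def tanh_via_sigmoid(x):
    # tanh as a scaled, shifted sigmoid: tanh(x) = 2*sigmoid(2x) - 1.
    return 2.0 * sigmoid(2.0 * x) - 1.0

# The identity holds across the input range, and the output is zero-centered.
for x in (-3.0, -0.5, 0.0, 0.5, 3.0):
    assert abs(tanh_via_sigmoid(x) - math.tanh(x)) < 1e-12
assert tanh_via_sigmoid(0.0) == 0.0
```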
The chain rule of derivatives enables backpropagation by systematically differentiating composed functions. In a neural network, the loss is a composition of the layers' operations. Applying the chain rule, backpropagation computes the gradient of the loss with respect to each weight by breaking it into the gradient of the loss with respect to each layer's output, then reusing those intermediate gradients to compute gradients further back in the network.
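A concrete sketch of the chain rule at work, using a hypothetical one-hidden-unit "network" with loss L = (w2·σ(w1·x) − y)²; the analytic gradient from the chain rule is verified against a finite-difference estimate:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def loss(w1, w2, x, y):
    h = sigmoid(w1 * x)          # hidden activation
    return (w2 * h - y) ** 2

def grad_w1(w1, w2, x, y):
    # Chain rule: dL/dw1 = dL/dout * dout/dh * dh/dz * dz/dw1
    h = sigmoid(w1 * x)
    out = w2 * h
    dL_dout = 2.0 * (out - y)
    dout_dh = w2
    dh_dz = h * (1.0 - h)        # sigmoid'(z) = sigmoid(z) * (1 - sigmoid(z))
    dz_dw1 = x
    return dL_dout * dout_dh * dh_dz * dz_dw1

# Finite-difference check of the analytic gradient.
w1, w2, x, y, eps = 0.7, -1.3, 2.0, 0.5, 1e-6
numeric = (loss(w1 + eps, w2, x, y) - loss(w1 - eps, w2, x, y)) / (2 * eps)
assert abs(grad_w1(w1, w2, x, y) - numeric) < 1e-6
```

Backpropagation applies exactly this factoring, layer by layer, reusing each intermediate gradient rather than recomputing it.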
Regularization reduces overfitting by imposing a penalty on more complex models, which discourages the model from fitting noise in the training data. Overfitting is a major problem because it produces high-variance models that perform well on training data but poorly on unseen data. By enforcing simplicity, regularization improves a model's generalization, making it more robust in practice.
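The shrinkage effect of a penalty can be seen in the simplest case: one-dimensional least squares with an L2 (ridge) penalty, whose closed-form solution is w = Σxy / (Σx² + λ). This is an illustrative sketch with made-up data:

```python
# One-dimensional least squares with an L2 penalty (ridge regression).
# Closed form: w = sum(x*y) / (sum(x^2) + lam); lam = 0 recovers ordinary
# least squares, and larger lam pulls the fitted weight toward zero.
xs = [1.0, 2.0, 3.0]
ys = [2.1, 3.9, 6.2]

def fit(lam):
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)

w_plain = fit(0.0)   # unregularized fit
w_ridge = fit(5.0)   # penalized fit is shrunk toward zero
assert abs(w_ridge) < abs(w_plain)
```

The penalized weight is always smaller in magnitude, which is precisely the "enforced simplicity" the paragraph describes.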
The Swish activation function is defined as f(x) = x⋅σ(x), where σ(x) is the sigmoid function. Unlike ReLU, which completely shuts off neurons with inputs below zero, Swish is smooth and non-monotonic and allows small negative outputs. This helps neurons continue to propagate error signals even when they are not strongly activated, potentially improving training dynamics and performance in some scenarios.
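A minimal sketch of the definition, contrasting Swish with ReLU at a negative input:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def swish(x):
    # Swish: f(x) = x * sigmoid(x). Smooth, non-monotonic.
    return x * sigmoid(x)

def relu(x):
    return max(0.0, x)

# ReLU zeroes negative inputs entirely; Swish passes a small negative value,
# so a gradient signal can still flow through the unit.
assert relu(-1.0) == 0.0
assert -0.5 < swish(-1.0) < 0.0
```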
Hidden layers in an MLP allow the network to model and learn complex, nonlinear relationships within the data by applying multiple layers of nonlinear transformations to the input. Each layer learns an increasingly abstract representation of the data, enabling the network to capture intricate patterns and dependencies that simple linear models cannot.
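XOR is the classic example of a function no linear model can represent, yet a single ReLU hidden layer suffices. The sketch below uses hand-picked weights (illustrative, not learned) to show the representational power a hidden layer adds:

```python
def relu(x):
    return max(0.0, x)

def xor_mlp(x1, x2):
    # Hand-picked weights for a 2-unit hidden layer (not learned).
    h1 = relu(x1 + x2)          # counts how many inputs are active
    h2 = relu(x1 + x2 - 1.0)    # fires only when both inputs are on
    return h1 - 2.0 * h2        # linear output layer over the hidden units

# The network computes XOR exactly, which no purely linear model can.
for (a, b), target in [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]:
    assert xor_mlp(a, b) == target
```

The nonlinearity (ReLU) is what makes this possible: removing it collapses the two layers into a single linear map, which cannot fit XOR.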
Dropout regularization helps prevent overfitting by randomly deactivating a subset of neurons during training, which keeps the model from relying too heavily on any particular neuron. This stochasticity forces the network to learn more robust features, distributes representation across neurons, and acts as a form of model averaging, leading to better generalization on unseen data.
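A minimal sketch of "inverted" dropout, the common formulation in which surviving activations are rescaled at training time so the expected value of each unit is unchanged:

```python
import random

def inverted_dropout(values, drop_prob, rng):
    # Zero each activation with probability drop_prob; scale survivors
    # by 1/(1 - drop_prob) so the expected activation is preserved.
    keep = 1.0 - drop_prob
    return [v / keep if rng.random() < keep else 0.0 for v in values]

rng = random.Random(0)
acts = [0.5, 1.0, -0.3, 2.0, 0.8]
dropped = inverted_dropout(acts, drop_prob=0.5, rng=rng)

# With drop_prob = 0.5, every output is either 0 or the input scaled by 2.
for v, d in zip(acts, dropped):
    assert d == 0.0 or abs(d - 2.0 * v) < 1e-12
```

At inference time dropout is disabled and, thanks to the inverted scaling, no further correction is needed.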
The Perceptron learning algorithm is designed to find a hyperplane that separates data into two distinct classes. It adjusts weights whenever the current decision boundary misclassifies a data point. When the data is not linearly separable, however, no such hyperplane exists: the perceptron cannot classify every point correctly, and its updates cycle indefinitely without converging.
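The update rule can be sketched on a linearly separable problem (AND), where the convergence theorem guarantees the loop terminates:

```python
def predict(w, b, x):
    # Linear decision rule: 1 if w·x + b > 0, else 0.
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0

def train_perceptron(data, lr=0.1, epochs=100):
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        errors = 0
        for x, y in data:
            err = y - predict(w, b, x)
            if err != 0:
                # Perceptron update: nudge the hyperplane toward the mistake.
                w = [wi + lr * err * xi for wi, xi in zip(w, x)]
                b += lr * err
                errors += 1
        if errors == 0:   # converged: every point on the correct side
            break
    return w, b

# AND is linearly separable, so the algorithm converges to a separator.
data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w, b = train_perceptron(data)
assert all(predict(w, b, x) == y for x, y in data)
```

Replace the AND labels with XOR's and the inner loop never reaches zero errors, which is exactly the failure mode the paragraph describes.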
Backpropagation uses the chain rule of derivatives to compute the gradients needed to update weights. This iterative method applies the chain rule to propagate the error backward through the layers of the network. It is preferred because it computes all necessary gradients efficiently in a single backward pass, making the training of modern architectures with many layers and massive numbers of parameters feasible.
The sigmoid function outputs values between 0 and 1, and its gradient is small for large positive or negative inputs, where the function saturates. When weights are updated through backpropagation, these small gradients multiply together across layers, making the updates progressively smaller and slowing learning. In deep networks this can effectively halt learning in the early layers, a phenomenon known as the vanishing gradient problem.
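The multiplicative shrinkage is easy to demonstrate: the sigmoid's derivative, σ'(x) = σ(x)(1 − σ(x)), never exceeds 0.25, so chaining many sigmoid layers multiplies many factors below 0.25. A sketch for a chain of ten such layers:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1.0 - s)   # maximum value 0.25, attained at x = 0

# Feed a value through ten sigmoid "layers", multiplying the local
# derivatives as the chain rule does during backpropagation.
z, grad = 2.0, 1.0
for _ in range(10):
    grad *= sigmoid_grad(z)
    z = sigmoid(z)

assert 0.0 < grad < 1e-5   # the gradient has all but vanished
```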
L1 regularization adds the absolute values of the coefficients as a penalty to the loss function. This encourages sparsity in the weight vector, driving many weights exactly to zero. As a result, it can produce simpler, more interpretable models that use fewer features, but the penalty strength requires careful tuning so that predictive power is not lost by over-penalizing model complexity.
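The sparsity mechanism can be illustrated with the soft-thresholding operator, which is the proximal step used by coordinate-descent and proximal-gradient solvers for L1-penalized objectives: weights whose magnitude falls below the penalty λ are set exactly to zero, and the rest are shrunk by λ.

```python
def soft_threshold(w, lam):
    # Proximal operator of the L1 penalty: shrinks weights toward zero
    # and sets any weight with |w| <= lam exactly to zero.
    if w > lam:
        return w - lam
    if w < -lam:
        return w + lam
    return 0.0

weights = [2.0, 0.3, -0.1, -1.5]
sparse = [soft_threshold(w, 0.5) for w in weights]
assert sparse == [1.5, 0.0, 0.0, -1.0]   # small weights zeroed out
```

This exact zeroing is what distinguishes L1 from L2, which only shrinks weights without ever setting them precisely to zero.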