Understanding AI: Machine Learning & Generative AI

Artificial Intelligence 2025

ARTIFICIAL INTELLIGENCE
Introduction
Artificial intelligence (AI) is technology that enables computers and machines to simulate
human learning, comprehension, problem solving, decision making, creativity and autonomy.
Applications and devices equipped with AI can see and identify objects. They can understand and
respond to human language. They can learn from new information and experience. They can make
detailed recommendations to users and experts. They can act independently, replacing the need for
human intelligence or intervention (a classic example being a self-driving car). But in 2024, most AI researchers and practitioners, and most AI-related headlines, are focused on breakthroughs in generative AI (gen AI), a technology that can create original text, images, video and other content. To fully understand generative AI, it’s important to first understand the technologies on which generative AI tools are built: machine learning (ML) and deep learning.

MACHINE LEARNING
A simple way to think about AI is as a series of nested or derivative concepts that
have emerged over more than 70 years. The nesting reflects how artificial intelligence, machine learning, deep learning
and generative AI are related. Directly underneath AI, we have machine learning, which involves
creating models by training an algorithm to make predictions or decisions based on data. It
encompasses a broad range of techniques that enable computers to learn from and make inferences
based on data without being explicitly programmed for specific tasks. There are many types of
machine learning techniques or algorithms, including linear regression, logistic regression, decision
trees, random forest, support vector machines (SVMs), k-nearest neighbor (KNN), clustering and
more. Each of these approaches is suited to different kinds of problems and data. One of the most
popular types of machine learning algorithm is called a neural network (or artificial neural
network). Neural networks are modeled after the human brain's structure and function. A neural
network consists of interconnected layers of nodes (analogous to neurons) that work together to
process and analyze complex data. Neural networks are well suited to tasks that involve identifying
complex patterns and relationships in large amounts of data. The simplest form of machine
learning is called supervised learning, which involves the use of labeled data sets to train algorithms
to classify data or predict outcomes accurately. In supervised learning, humans pair each training
example with an output label. The goal is for the model to learn the mapping between inputs and
outputs in the training data, so it can predict the labels of new, unseen data.
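The input-to-output mapping can be sketched with a toy example. The data and the deliberately trivial "algorithm" below are hypothetical illustrations: learn a decision threshold from labeled points, then use it to label new, unseen inputs.

```python
# Toy supervised learning: learn a decision threshold from labeled data.
# Each training example pairs an input (a single number) with a label.
train = [(1.0, "small"), (2.0, "small"), (8.0, "large"), (9.0, "large")]

# "Training": place the threshold midway between the two label groups.
small_max = max(x for x, y in train if y == "small")
large_min = min(x for x, y in train if y == "large")
threshold = (small_max + large_min) / 2  # 5.0

def predict(x):
    """Map a new, unseen input to a label using the learned threshold."""
    return "small" if x < threshold else "large"

print(predict(3.0))  # small
print(predict(7.5))  # large
```

Real models learn far richer mappings, but the pattern is the same: fit parameters to labeled examples, then predict labels for new data.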

DEEP LEARNING
Deep learning is a subset of machine learning that uses multilayered neural networks,
called deep neural networks, that more closely simulate the complex decision-making power

of the human brain. Deep neural networks include an input layer, at least three but usually
hundreds of hidden layers, and an output layer, unlike neural networks used in classic machine
learning models, which usually have only one or two hidden layers. These multiple layers
enable unsupervised learning: they can automate the extraction of features from large, unlabeled
and unstructured data sets, and make their own predictions about what the data represents.
Because deep learning doesn’t require human intervention, it enables machine learning at a
tremendous scale. It is well suited to natural language processing (NLP), computer vision, and
other tasks that involve the fast, accurate identification of complex patterns and relationships in
large amounts of data. Some form of deep learning powers most of the artificial intelligence
(AI) applications in our lives today.

Deep learning also enables:


• Semi-supervised learning, which combines supervised and unsupervised learning by using both
labeled and unlabeled data to train AI models for classification and regression tasks.
• Self-supervised learning, which generates implicit labels from unstructured data,
rather than relying on labeled data sets for supervisory signals.
• Reinforcement learning, which learns by trial-and-error and reward functions rather than by
extracting information from hidden patterns.
• Transfer learning, in which knowledge gained through one task or data set is used to improve model
performance on another related task or different data set.

Generative AI
Generative AI, sometimes called "gen AI", refers to deep learning models that can create complex
original content such as long-form text, high-quality images, realistic video or audio and more in
response to a user’s prompt or request.


In a deep neural network, multiple layers of nodes can extract meaning and relationships from large
volumes of unstructured, unlabeled data. At a high level, generative models encode a simplified
representation of their training data, and then draw from that representation to create new work that’s
similar, but not identical, to the original data. Generative models have been used for years in
statistics to analyze numerical data. But over the last decade, they evolved to analyze and generate
more complex data types. This evolution coincided with the emergence of three sophisticated deep
learning model types:
• Variational autoencoders or VAEs, which were introduced in 2013, and enabled models that could
generate multiple variations of content in response to a prompt or instruction.
• Diffusion models, first seen in 2014, which add "noise" to images until they are unrecognizable, and then remove the noise to generate original images in response to prompts.

• Transformers (also called transformer models), which are trained on sequenced data to generate
extended sequences of content (such as words in sentences, shapes in an image, frames of a video or
commands in software code). Transformers are at the core of most of today’s headline-making
generative AI tools, including ChatGPT and GPT-4, Copilot, BERT, Bard and Midjourney.

How generative AI works


In general, generative AI operates in three phases:
1. Training, to create a foundation model.
2. Tuning, to adapt the model to a specific application.
3. Generation, evaluation and more tuning, to improve accuracy.

Training
Generative AI begins with a "foundation model": a deep learning model that serves as the basis
for multiple different types of generative AI applications. The most common foundation models
today are large language models (LLMs), created for text generation applications. But there are also
foundation models for image, video, sound or music generation, and multimodal foundation models
that support several kinds of content. To create a foundation model, practitioners train a deep
learning algorithm on huge volumes of relevant raw, unstructured, unlabeled data, such as
terabytes or petabytes of text, image or video data from the internet. The training yields a
neural network with billions of parameters: encoded representations of the entities, patterns and
relationships in the data, which can generate content autonomously in response to prompts. This is
the foundation model. This training process is compute-intensive, time-consuming and expensive. It
requires thousands of clustered graphics processing units (GPUs) and weeks of processing, all of
which typically costs millions of dollars. Open source foundation model projects, such as Meta's
Llama-2, enable gen AI developers to avoid this step and its costs.


Tuning
Next, the model must be tuned to a specific content generation task. This can be done in
various ways, including:
• Fine-tuning, which involves feeding the model application-specific labeled data, questions or prompts the
application is likely to receive, and corresponding correct answers in the wanted format.
• Reinforcement learning with human feedback (RLHF), in which human users evaluate the accuracy or
relevance of model outputs so that the model can improve itself. This can be as simple as having people type
or talk back corrections to a chatbot or virtual assistant.
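Conceptually, the labeled data fed in during fine-tuning is just prompt/answer pairs. A minimal sketch follows; the field names and example texts are illustrative, not any particular provider's format:

```python
import json

# Illustrative fine-tuning examples: prompts the application is likely to
# receive, paired with correct answers in the wanted format.
examples = [
    {"prompt": "Where is my order?", "answer": "Your order shipped on Monday."},
    {"prompt": "What is your return policy?", "answer": "Returns are accepted within 30 days."},
]

# Serialize one example per line, a common convention for training files.
lines = [json.dumps(ex) for ex in examples]
print(len(lines))  # 2
```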
Generation, evaluation and more tuning
Developers and users regularly assess the outputs of their generative AI apps, and further tune the
model, even as often as once a week, for greater accuracy or relevance. In contrast, the foundation
model itself is updated much less frequently, perhaps every year or 18 months. Another option for
improving a gen AI app's performance is retrieval augmented generation (RAG), a technique for
extending the foundation model to use relevant sources outside of the training data to refine the
parameters for greater accuracy or relevance.
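At its core, RAG is retrieve-then-prompt. A toy sketch, with hypothetical documents and simple word overlap standing in for real vector search:

```python
# Toy retrieval augmented generation: retrieve the most relevant document
# by word overlap, then prepend it to the user's prompt as context.
documents = [
    "The warranty covers parts and labor for two years.",
    "Shipping within the EU takes three to five business days.",
]

def retrieve(query):
    """Return the document sharing the most words with the query."""
    q = set(query.lower().split())
    return max(documents, key=lambda d: len(q & set(d.lower().split())))

def build_prompt(query):
    """Combine the retrieved context with the user's question."""
    return f"Context: {retrieve(query)}\nQuestion: {query}"

print(build_prompt("How long does EU shipping take?"))
```

A production system would embed documents as vectors and search by similarity, but the flow — retrieve relevant sources, then ground the model's answer in them — is the same.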

AI agents and agentic AI


An AI agent is an autonomous AI program: it can perform tasks and accomplish goals on
behalf of a user or another system without human intervention, by designing its own workflow and
using available tools (other applications or services). Agentic AI is a system of multiple AI agents
whose efforts are coordinated, or orchestrated, to accomplish a more complex task or a greater
goal than any single agent in the system could alone. Unlike chatbots and other AI models
which operate within predefined constraints and require human intervention, AI agents and agentic
AI exhibit autonomy, goal-driven behavior and adaptability to changing circumstances. The terms
"agent" and "agentic" refer to these models’ agency, or their capacity to act independently and
purposefully.

Benefits of AI
AI offers numerous benefits across various industries and applications. Some of the most
commonly cited benefits include:

• Automation of repetitive tasks.


• More and faster insight from data.
• Enhanced decision-making.
• Fewer human errors.
• 24x7 availability.
• Reduced physical risks.

Automation of repetitive tasks


AI can automate routine, repetitive and often tedious tasks, including digital tasks such as data
collection, entry and preprocessing, and physical tasks such as warehouse stock-picking and
manufacturing processes. This automation frees people to work on higher-value, more creative work.

Enhanced decision-making
Whether used for decision support or for fully automated decision-making, AI enables
faster, more accurate predictions and reliable, data-driven decisions. Combined with
automation, AI enables businesses to act on opportunities and respond to crises as they
emerge, in real time and without human intervention.

Fewer human errors


AI can reduce human errors in various ways, from guiding people through the proper steps of a
process, to flagging potential errors before they occur, and fully automating processes without
human intervention. This is especially important in industries such as healthcare where, for example,
AI-guided surgical robotics enable consistent precision. Machine learning algorithms can continually
improve their accuracy and further reduce errors as they're exposed to more data and "learn" from
experience.

Round-the-clock availability and consistency


AI is always on, available around the clock, and delivers consistent performance every time. Tools
such as AI chatbots or virtual assistants can lighten staffing demands for customer service or support. In
other applications such as materials processing or production lines, AI can help maintain consistent
work quality and output levels when used to complete repetitive or tedious tasks.
Reduced physical risk
By automating dangerous work such as animal control, handling explosives, performing tasks
in deep ocean water, high altitudes or in outer space, AI can eliminate the need to put human
workers at risk of injury or worse. While they have yet to be perfected, self-driving cars and other
vehicles offer the potential to reduce the risk of injury to passengers.

AI use cases
The real-world applications of AI are many. Here is just a small sampling of use cases across various
industries to illustrate its potential:

Customer experience, service and support


Companies can implement AI-powered chatbots and virtual assistants to handle customer
inquiries, support tickets and more. These tools use natural language processing (NLP) and generative
AI capabilities to understand and respond to customer questions about order status, product details and
return policies. Chatbots and virtual assistants enable always-on support, provide faster answers to
frequently asked questions (FAQs), free human agents to focus on higher-level tasks, and give
customers faster, more consistent service.

Fraud detection
Machine learning and deep learning algorithms can analyze transaction patterns and flag
anomalies, such as unusual spending or login locations, that indicate fraudulent transactions. This
enables organizations to respond more quickly to potential fraud and limit its impact, giving
themselves and customers greater peace of mind.
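A minimal sketch of the idea, with hypothetical amounts and a simple standard-deviation rule standing in for a real fraud model:

```python
import statistics

# Hypothetical transaction history for one customer, in dollars.
history = [20.0, 35.0, 25.0, 30.0, 22.0, 28.0]

mean = statistics.mean(history)
stdev = statistics.stdev(history)

def is_anomalous(amount):
    """Flag a transaction more than 3 standard deviations from typical spend."""
    return abs(amount - mean) / stdev > 3

print(is_anomalous(900.0))  # True: far outside this customer's pattern
print(is_anomalous(31.0))   # False: within the normal range
```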

Personalized marketing
Retailers, banks and other customer-facing companies can use AI to create personalized customer
experiences and marketing campaigns that delight customers, improve sales and prevent churn.
Based on data from customer purchase history and behaviors, deep learning algorithms can recommend
products and services customers are likely to want, and even generate personalized copy and special
offers for individual customers in real time.

Human resources and recruitment


AI-driven recruitment platforms can streamline hiring by screening resumes, matching
candidates with job descriptions, and even conducting preliminary interviews using video analysis.
These and other tools can dramatically reduce the mountain of administrative paperwork associated
with fielding a large volume of candidates. They can also reduce response times and time-to-hire,
improving the experience for candidates whether they get the job or not.

Application development and modernization


Generative AI code generation tools and automation tools can streamline repetitive coding tasks
associated with application development, and accelerate the migration and modernization
(reformatting and replatforming) of legacy applications at scale. These tools can speed up tasks, help
ensure code consistency and reduce errors.

Predictive maintenance
Machine learning models can analyze data from sensors, Internet of Things (IoT) devices and
operational technology (OT) to forecast when maintenance will be required and predict equipment
failures before they occur. AI-powered preventive maintenance helps prevent downtime and enables
you to stay ahead of supply chain issues before they affect the bottom line.

AI challenges and risks


Organizations are scrambling to take advantage of the latest AI technologies and capitalize on
AI's many benefits. This rapid adoption is necessary, but adopting and maintaining AI workflows
comes with challenges and risks.

Data risks
AI systems rely on data sets that might be vulnerable to data poisoning, data tampering, data
bias or cyberattacks that can lead to data breaches. Organizations can mitigate these risks by
protecting data integrity and implementing security and availability throughout the entire AI
lifecycle, from development to training and deployment and postdeployment.

Model risks
Threat actors can target AI models for theft, reverse engineering or unauthorized
manipulation. Attackers might compromise a model’s integrity by tampering with its architecture,
weights or parameters: the core components that determine a model’s behavior, accuracy and
performance.

Operational risks
Like all technologies, models are susceptible to operational risks such as model drift, bias and
breakdowns in the governance structure. Left unaddressed, these risks can lead to system failures and
cybersecurity vulnerabilities that threat actors can exploit.
Ethics and legal risks
If organizations don’t prioritize safety and ethics when developing and deploying AI systems,
they risk committing privacy violations and producing biased outcomes. For example, biased training
data used for hiring decisions might reinforce gender or racial stereotypes and create AI models that
favor certain demographic groups over others.

AI ethics and governance


AI ethics is a multidisciplinary field that studies how to optimize AI's beneficial impact while
reducing risks and adverse outcomes. Principles of AI ethics are applied through a system of AI
governance consisting of guardrails that help ensure that AI tools and systems remain safe and
ethical. AI governance encompasses oversight mechanisms that address risks. An ethical approach to
AI governance requires the involvement of a wide range of stakeholders, including developers, users,
policymakers and ethicists, helping to ensure that AI-related systems are developed and used in
alignment with society's values. Common values associated with AI ethics and responsible AI include:

• Explainability: As AI becomes more advanced, humans are challenged to comprehend and retrace how an algorithm came to a result. Explainable AI is a set of processes and methods that enables human users to interpret, comprehend and trust the results and output created by algorithms.
• Fairness: Although machine learning, by its very nature, is a form of statistical discrimination, the discrimination becomes objectionable when it places privileged groups at systematic advantage and certain unprivileged groups at systematic disadvantage, potentially causing varied harms. To encourage fairness, practitioners can try to minimize algorithmic bias across data collection and model design, and build more diverse and inclusive teams.
• Robustness: Robust AI effectively handles exceptional conditions, such as abnormalities in input or malicious attacks, without causing unintentional harm. It is also built to withstand intentional and unintentional interference by protecting against exposed vulnerabilities.
• Accountability: Organizations should implement clear responsibilities and governance structures for the development, deployment and outcomes of AI systems.
• Transparency: Users should be able to see how an AI service works, evaluate its functionality, and comprehend its strengths and limitations. Increased transparency provides information for AI consumers to better understand how the AI model or service was created.
• Privacy: Many regulatory frameworks, including GDPR, mandate that organizations abide by certain privacy principles when processing personal information. It is crucial to be able to protect AI models that might contain personal information, control what data goes into the model in the first place, and build adaptable systems that can adjust to changes in regulation and attitudes around AI ethics.

Weak AI vs. Strong AI


In order to contextualize the use of AI at various levels of complexity and sophistication,
researchers have defined several types of AI that refer to its level of sophistication:

• Weak AI: Also known as "narrow AI," weak AI defines AI systems designed to perform a specific task or a set of tasks. Examples might include "smart" voice assistant apps, such as Amazon's Alexa and Apple's Siri, a social media chatbot or the autonomous vehicles promised by Tesla.
• Strong AI: Also known as "artificial general intelligence" (AGI) or "general AI," strong AI possesses the ability to understand, learn and apply knowledge across a wide range of tasks at a level equal to or surpassing human intelligence. This level of AI is currently theoretical, and no known AI systems approach this level of sophistication. Researchers argue that if AGI is even possible, it would require major increases in computing power. Despite recent advances in AI development, the self-aware AI systems of science fiction remain firmly in that realm.

History of AI
The idea of "a machine that thinks" dates back to ancient Greece. But since the advent of electronic
computing (and relative to some of the topics discussed in this article), important events and milestones in
the evolution of AI include the following:

• 1950: Alan Turing publishes Computing Machinery and Intelligence. In this paper, Turing, famous for breaking the German ENIGMA code during WWII and often referred to as the "father of computer science," asks the following question: "Can machines think?" From there, he offers a test, now famously known as the "Turing Test," in which a human interrogator tries to distinguish between a computer text response and a human one. While this test has undergone much scrutiny since it was published, it remains an important part of the history of AI, and an ongoing concept within philosophy, as it draws on ideas around linguistics.
• 1956: John McCarthy coins the term "artificial intelligence" at the first-ever AI conference at Dartmouth College. (McCarthy went on to invent the Lisp language.) Later that year, Allen Newell, J.C. Shaw and Herbert Simon create the Logic Theorist, the first-ever running AI computer program.
• 1958: Frank Rosenblatt builds the Mark I Perceptron, the first computer based on a neural network that "learned" through trial and error. In 1969, Marvin Minsky and Seymour Papert publish a book titled Perceptrons, which becomes both the landmark work on neural networks and, at least for a while, an argument against future neural network research initiatives.
• 1980s: Neural networks that use a backpropagation algorithm to train themselves become widely used in AI applications.
• 1995: Stuart Russell and Peter Norvig publish Artificial Intelligence: A Modern Approach, which becomes one of the leading textbooks in the study of AI. In it, they delve into four potential goals or definitions of AI, which differentiate computer systems based on rationality and thinking versus acting.
• 1997: IBM's Deep Blue beats then world chess champion Garry Kasparov in a chess match (and rematch).
• 2004: John McCarthy writes a paper, What Is Artificial Intelligence?, and proposes an often-cited definition of AI. By this time, the era of big data and cloud computing is underway, enabling organizations to manage ever-larger data estates, which will one day be used to train AI models.
• 2011: IBM Watson beats champions Ken Jennings and Brad Rutter at Jeopardy! Also, around this time, data science begins to emerge as a popular discipline.
• 2015: Baidu's Minwa supercomputer uses a special deep neural network called a convolutional neural network to identify and categorize images with a higher rate of accuracy than the average human.
• 2016: DeepMind's AlphaGo program, powered by a deep neural network, beats Lee Sedol, the world champion Go player, in a five-game match. The victory is significant given the huge number of possible moves as the game progresses (over 14.5 trillion after just four moves). (Google had acquired DeepMind in 2014 for a reported USD 400 million.)
• 2022: A rise in large language models (LLMs), such as OpenAI's ChatGPT, creates an enormous change in the performance of AI and its potential to drive enterprise value. With these new generative AI practices, deep learning models can be pretrained on large amounts of data.
• 2024: The latest AI trends point to a continuing AI renaissance. Multimodal models that can take multiple types of data as input are providing richer, more robust experiences. These models bring together computer vision image recognition and NLP speech recognition capabilities. Smaller models are also making strides in an age of diminishing returns for massive models with large parameter counts.


SUPERVISED LEARNING

Supervised learning is a machine learning technique that uses human-labeled input and output
datasets to train artificial intelligence models. The trained model learns the underlying relationships
between inputs and outputs, enabling it to predict correct outputs based on new, unlabeled real-
world input data. Labeled data consists of example data points along with the correct outputs or
answers. As input data is fed into the machine learning algorithm, it adjusts its weights until the
model has been fitted appropriately. Labeled training data explicitly teaches the model to identify
the relationships between features and data labels. Supervised machine learning helps organizations
solve various real-world problems at scale, such as classifying spam or predicting stock prices, and
can be used to build highly accurate machine learning models.

Supervised learning uses a labeled training dataset to understand the relationships between inputs
and output data. Data scientists manually create training datasets containing input data along with
the corresponding labels. Supervised learning trains the model to apply the correct outputs to new
input data in real-world use cases. During training, the model’s algorithm processes large datasets
to explore potential correlations between inputs and outputs. Then, model performance is evaluated
with test data to find out whether it was trained successfully. Cross-validation is the process of
testing a model using a different portion of the dataset.

The gradient descent family of algorithms, including stochastic gradient descent (SGD), are the most
commonly used optimization algorithms, or learning algorithms, when training neural networks and
other machine learning models. The model’s optimization algorithm assesses accuracy through the
loss function: an equation that measures the discrepancy between the model’s predictions and actual
outputs. The loss function’s slope, or gradient, is the primary metric of model performance. The
optimization algorithm descends the gradient to minimize its value. Throughout training, the
optimization algorithm updates the model’s parameters (its operating rules or "settings") to
optimize the model.
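The descent described above can be shown concretely. A minimal sketch with toy data, a single parameter and plain Python rather than a real ML library:

```python
# Toy gradient descent: fit y = w * x to points generated with w = 2,
# by repeatedly stepping the parameter down the loss gradient.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]

w = 0.0    # initial parameter ("setting")
lr = 0.05  # learning rate: the size of each step down the gradient

for _ in range(200):
    # Mean squared error loss: L = mean((w*x - y)^2)
    # Its gradient with respect to w: dL/dw = mean(2 * x * (w*x - y))
    grad = sum(2 * x * (w * x - y) for x, y in data) / len(data)
    w -= lr * grad  # descend the gradient to reduce the loss

print(round(w, 3))  # converges to 2.0
```

Real training does exactly this, but over millions of parameters at once, with the gradients computed by backpropagation.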

The supervised learning workflow


A typical supervised learning process might look like this:

1. Identify the type of training data to be used for training the model. This data should be similar to the intended input data that the model will process when ready for use.
2. Assemble the training data and label it to create the labeled training dataset. The training data must be free of data bias to avoid resultant algorithmic bias and other performance flaws.
3. Create three groups of data: training data, validation data and test data. Validation assesses the training process for further tuning and adjustment, and testing evaluates the final model.
4. Choose a machine learning algorithm with which to create the model.
5. Feed the training dataset into the selected algorithm.
6. Validate and test the model accordingly.
7. Keep an eye on model performance and maintain accuracy with regular updates.
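Step 3 above, the three-way data split, can be sketched as follows (toy labeled data; the 70/15/15 proportions are illustrative):

```python
import random

# Toy labeled dataset: 100 examples, each an (input, label) pair.
labeled = [(i, "even" if i % 2 == 0 else "odd") for i in range(100)]

random.seed(0)          # fixed seed so the split is reproducible
random.shuffle(labeled)

# Split into training, validation and test sets.
train = labeled[:70]
val = labeled[70:85]
test = labeled[85:]

print(len(train), len(val), len(test))  # 70 15 15
```

Shuffling before splitting matters: it keeps each subset representative of the whole dataset rather than of whatever order the data arrived in.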

An example of supervised learning in action

As an example of supervised learning, consider an image classification model created to recognize
images of vehicles and determine which type of vehicle they are. Such a model can power the
CAPTCHA tests many websites use to detect spam bots. To train this model, data scientists prepare
a labeled training dataset containing numerous vehicle examples along with the corresponding
vehicle type: car, motorcycle, truck, bicycle and more. The model’s algorithm attempts to identify
the patterns in the training data that cause an input (vehicle images) to receive a designated output
(vehicle types). The model’s guesses are measured against actual data values in a test set to
determine whether it has made accurate predictions. If not, the training cycle continues until the
model’s performance has reached a satisfactory level of accuracy. The principle of generalization
refers to a model’s ability to make appropriate predictions on new data from the same distribution
as its training data.

Types of supervised learning


➢ Supervised learning tasks can be broadly divided into classification and regression problems:
➢ Classification in machine learning uses an algorithm to sort data into categories. It recognizes specific
entities within the dataset and attempts to determine how those entities should be labeled or defined.
Common classification algorithms are linear classifiers, support vector machines (SVM), decision
trees, k-nearest neighbor and random forest.
➢ Neural networks excel at handling complex classification problems. A neural network is a deep
learning architecture that processes training data with layers of nodes that mimic the human brain.
Each node is made up of inputs, weights, a bias (or threshold) and an output. If an output value
exceeds a preset threshold, the node "fires" or activates, passing data to the next layer in the
network.
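The node computation described above can be sketched directly; the input values, weights and bias here are arbitrary illustrations:

```python
# A single neural network node: inputs are weighted, summed with a bias,
# and the node "fires" only if the result exceeds the threshold of 0.
inputs = [1.0, 0.5]
weights = [0.6, -0.4]
bias = -0.3

weighted_sum = sum(w * x for w, x in zip(weights, inputs)) + bias
output = 1 if weighted_sum > 0 else 0
print(output)  # 1: the weighted sum (about 0.1) exceeds the threshold
```

A network stacks many such nodes in layers; training adjusts the weights and biases so the layers collectively compute a useful mapping.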


➢ Regression is used to understand the relationship between dependent and independent variables. In
regression problems, the output is a continuous value, and models attempt to predict the target
output. Regression tasks include projections for sales revenue or financial planning. Linear
regression, logistic regression and polynomial regression are three examples of regression
algorithms.
➢ Because large datasets typically contain many features, data scientists can simplify this complexity
through dimensionality reduction. This data science technique reduces the number of features to
those most crucial for predicting data labels, which preserves accuracy while increasing efficiency.
Supervised learning algorithms

Optimization algorithms such as gradient descent train a wide range of machine learning algorithms that excel in supervised learning tasks.
➢ Naive Bayes: Naive Bayes is a classification algorithm that adopts the principle of class conditional
independence from Bayes’ theorem. This means that the presence of one feature does not impact the
presence of another in the probability of an outcome, and each predictor has an equal effect on that
result.
➢ Naive Bayes classifiers include multinomial, Bernoulli and Gaussian variants. This technique is
often used in text classification, spam identification and recommendation systems.
➢ Linear regression: Linear regression is used to identify the relationship between a continuous dependent
variable and one or more independent variables. It is typically used to make predictions about
future outcomes.
➢ Linear regression expresses the relationship between variables as a straight line. When there is one
independent variable and one dependent variable, it is known as simple linear regression. As the
number of independent variables increases, the technique is referred to as multiple linear
regression.
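Simple linear regression can be computed in closed form with the least-squares formulas (covariance over variance). A sketch on toy points generated from y = 2x + 1:

```python
# Simple linear regression on toy data: fit the line y = slope*x + intercept
# using the least-squares formulas.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]  # generated by y = 2x + 1

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# slope = covariance(x, y) / variance(x)
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
        sum((x - mean_x) ** 2 for x in xs)
intercept = mean_y - slope * mean_x

print(slope, intercept)  # 2.0 1.0
```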
➢ Nonlinear regression: Sometimes, an output cannot be reproduced from linear inputs. In these cases,
outputs must be modeled with a nonlinear function. Nonlinear regression expresses a relationship
between variables through a nonlinear, or curved line. Nonlinear models can handle complex
relationships with many parameters.
➢ Logistic regression: Logistic regression handles categorical dependent variables with binary
outputs, such as true or false, or positive or negative. While linear and logistic regression models
both seek to understand relationships between data inputs, logistic regression mainly solves binary
classification problems, such as spam identification.
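A sketch of the idea: a linear score is squashed into a probability by the sigmoid function, then thresholded for a binary decision. The weights below are hypothetical; a real model would learn them from labeled data:

```python
import math

def sigmoid(z):
    """Squash a linear score into a probability between 0 and 1."""
    return 1 / (1 + math.exp(-z))

# Hypothetical learned weights for two features of an email.
weights = [1.5, -2.0]
bias = -0.5

def predict_spam(features):
    """Return True ("spam") when the predicted probability exceeds 0.5."""
    z = sum(w * f for w, f in zip(weights, features)) + bias
    return sigmoid(z) > 0.5

print(predict_spam([2.0, 0.0]))  # True  (score z = 2.5)
print(predict_spam([0.0, 1.0]))  # False (score z = -2.5)
```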
➢ Polynomial regression: Similar to other regression models, polynomial regression models a relationship
between variables on a graph. The functions used in polynomial regression express this relationship
through a polynomial of a given degree. Polynomial regression is a subset of nonlinear regression.
➢ Support vector machine (SVM): A support vector machine is used for both data classification and
regression. That said, it usually handles classification problems. Here, SVM separates the classes of
data points with a decision boundary or hyperplane. The goal of the SVM algorithm is to plot the
hyperplane that maximizes the distance between the groups of data points.
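The maximum-margin idea can be sketched with scikit-learn's SVC on two made-up, well-separated clusters:

```python
import numpy as np
from sklearn.svm import SVC

# Two linearly separable clusters of hypothetical 2-D points.
X = np.array([[0, 0], [1, 1], [0, 1], [5, 5], [6, 5], [5, 6]])
y = np.array([0, 0, 0, 1, 1, 1])

# A linear kernel places the separating hyperplane between the two groups.
clf = SVC(kernel="linear").fit(X, y)
print(clf.predict([[0.5, 0.5], [5.5, 5.5]]))  # [0 1]
```

For data that is not linearly separable, nonlinear kernels (e.g., RBF) let SVM learn curved decision boundaries.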
➢ K-nearest neighbor: K-nearest neighbor (KNN) is a nonparametric algorithm that classifies data points
based on their proximity and association to other available data. This algorithm assumes that
similar data points can be found near each other when plotted mathematically.
➢ Its ease of use and low calculation time make it efficient for recommendation engines and
image recognition. But as the dataset grows, the processing time lengthens, making it less
appealing for classification tasks.
➢ Random forest: Random forest is a flexible supervised machine learning algorithm used for both
classification and regression purposes. The "forest" references a collection of uncorrelated decision
trees, which are merged to reduce variance and increase accuracy.
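A brief scikit-learn sketch of a random forest; the dataset is synthetic and the hyperparameters are arbitrary illustrative choices:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic data; random_state fixes the sample so results are repeatable.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

# Each of the 100 trees is trained on a bootstrap sample with feature subsampling,
# decorrelating the trees; their votes are merged to reduce variance.
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
print(len(clf.estimators_))  # 100 trees in the "forest"
```

Averaging many decorrelated trees is what makes the ensemble more accurate than any single decision tree.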

Supervised learning versus other learning methods
Supervised learning is not the only learning method for training machine learning
models. Other types of machine learning include:
➢ Unsupervised learning
➢ Semisupervised learning
➢ Self-supervised learning
➢ Reinforcement learning
Supervised versus unsupervised learning
The difference between supervised learning and unsupervised learning is that unsupervised
machine learning uses unlabeled data. The model is left to discover patterns and
relationships in the data on its own. Many generative AI models are initially trained with
unsupervised learning and later with supervised learning to increase domain expertise.
Unsupervised learning can help solve for clustering or association problems in which common
properties within a dataset are uncertain. Common clustering algorithms are hierarchical, K-means
and Gaussian mixture models.

Supervised versus semi-supervised learning

Semi-supervised learning labels a portion of the input data. Because it can be time-
consuming and costly to rely on domain expertise to label data appropriately for supervised
learning, semi-supervised learning can be an appealing alternative.
Supervised versus self-supervised learning
Self-supervised learning (SSL) mimics supervised learning with unlabeled data. Rather than use
the manually created labels of supervised learning datasets, SSL tasks are configured so that the model
can generate implicit labels from unstructured data. Then, the model’s loss function uses those labels
in place of actual labels to assess model performance. Self-supervised learning sees widespread use in
computer vision and natural language processing (NLP) tasks requiring large datasets that are
prohibitively expensive and time-consuming to label.
Supervised versus reinforcement learning
Reinforcement learning trains autonomous agents, such as robots and self-driving cars, to
make decisions through environmental interactions. Reinforcement learning does not use labeled data
and also differs from unsupervised learning in that it teaches by trial-and-error and reward, not by
identifying underlying patterns within datasets.
Real-world supervised learning use cases
➢ Supervised learning models can build and advance business applications, including:
➢ Image- and object-recognition: Supervised learning algorithms can be used to locate, isolate and
categorize objects out of videos or images, making them useful with computer vision and image
analysis tasks.

➢ Predictive analytics: Supervised learning models create predictive analytics systems to provide insights.
This allows enterprises to anticipate results based on an output variable and make data-driven
decisions, in turn helping business leaders justify their choices or pivot for the benefit of the
organization.
➢ Regression also allows healthcare providers to predict outcomes based on patient criteria and
historical data. A predictive model might assess a patient’s risk for a specific disease or condition
based on their biological and lifestyle data.
➢ Customer sentiment analysis: Organizations can extract and classify important pieces of information
from large volumes of data—including context, emotion and intent—with minimal human
intervention. Sentiment analysis gives a better understanding of customer interactions and can be
used to improve brand engagement efforts.
➢ Customer segmentation: Regression models can predict customer behavior based on various traits and
historical trends. Businesses can use predictive models to segment their customer base and create
buyer personas to improve marketing efforts and product development.
➢ Spam detection: Spam detection is another example of a supervised learning model. Using supervised
classification algorithms, organizations can train databases to recognize patterns or anomalies in new
data to organize spam and non-spam-related correspondences effectively.
➢ Forecasting: Regression models excel at forecasting based on historical trends, making them suitable
for use in the financial industry. Enterprises can also use regression to predict inventory needs,
estimate employee salaries and avoid potential supply chain hiccups.
➢ Recommendation engines: With supervised learning models in play, content providers and online
marketplaces can analyze customer choices, preferences and purchases and build recommendation
engines that offer tailored recommendations more likely to convert.

Challenges of supervised learning


➢ Although supervised learning can offer businesses advantages such as deep data insights and
improved automation, it might not be the best choice for all situations.
➢ Personnel limitations: Supervised learning models can require certain levels of expertise to structure
accurately.
➢ Human involvement: Supervised learning models are incapable of self-learning. Data
scientists must validate the models' performance output.
➢ Time requirements: Training datasets are large and must be manually labeled, which
makes the supervised learning process time-intensive.
➢ Inflexibility: Supervised learning models struggle to label data outside the bounds of their
training datasets. An unsupervised learning model might be more capable of dealing with
new data.
➢ Bias: Datasets risk a higher likelihood of human error and bias, resulting in algorithms learning
incorrectly.
➢ Overfitting: Supervised learning can sometimes result in overfitting, where a model becomes
too closely tailored to its training dataset. High accuracy in training can indicate overfitting
rather than generally strong performance. Avoiding overfitting requires that models be
tested with data that is different from the training data.
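The overfitting point above can be demonstrated with a held-out test split; the synthetic dataset and the unconstrained decision tree here are illustrative choices:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=300, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# An unconstrained decision tree memorizes its training data.
tree = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)
print(tree.score(X_tr, y_tr))  # 1.0: perfect training accuracy
print(tree.score(X_te, y_te))  # a lower held-out score exposes the overfitting
```

The gap between training and test accuracy, not training accuracy alone, is what reveals overfitting.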


UNSUPERVISED LEARNING
Unsupervised learning, also known as unsupervised machine learning, uses machine learning
(ML) algorithms to analyze and cluster unlabeled data sets. These algorithms discover hidden patterns
or data groupings without the need for human intervention. Unsupervised learning's ability to discover
similarities and differences in information makes it the ideal solution for exploratory data analysis,
cross-selling strategies, customer segmentation and image recognition.

Common unsupervised learning approaches


Unsupervised learning models are utilized for three main tasks—clustering, association, and
dimensionality reduction. Below we’ll define each learning method and highlight common
algorithms and approaches to conduct them effectively.
Clustering
Clustering is a data mining technique which groups unlabeled data based on their similarities
or differences. Clustering algorithms are used to process raw, unclassified data objects into groups
represented by structures or patterns in the information. Clustering algorithms can be categorized
into a few types, specifically exclusive, overlapping, hierarchical, and probabilistic.
Exclusive and Overlapping Clustering

Exclusive clustering is a form of grouping that stipulates a data point can exist only in one
cluster. This can also be referred to as "hard" clustering. K-means clustering is a common example of
an exclusive clustering method, where data points are assigned into K groups based on their distance
from each group's centroid, with K representing the number of clusters. The data points closest to a
given centroid will be clustered under the same category. A larger K value yields smaller groupings
with more granularity, whereas a smaller K value yields larger groupings with less granularity.
K-means clustering is commonly used in market segmentation, document clustering, image
segmentation, and image compression. Overlapping clustering differs from exclusive clustering
in that it allows data points to belong to multiple clusters with separate degrees of membership.
"Soft" or fuzzy k-means clustering is an example of overlapping clustering.
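K-means clustering as described above can be sketched with scikit-learn; the six points are invented so that K=2 recovers two obvious groups:

```python
import numpy as np
from sklearn.cluster import KMeans

# Two tight, hypothetical groups of 2-D points.
X = np.array([[1, 1], [1.2, 0.8], [0.9, 1.1], [8, 8], [8.1, 7.9], [7.9, 8.2]])

# K=2: each point is assigned exclusively to the nearest centroid.
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(km.labels_)           # same label within each tight group
print(km.cluster_centers_)  # one centroid near (1, 1), one near (8, 8)
```

Note that the numeric label assigned to each cluster (0 or 1) is arbitrary; only the grouping itself is meaningful.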

Hierarchical clustering

Hierarchical clustering, also known as hierarchical cluster analysis (HCA), is an
unsupervised clustering algorithm that can be categorized in two ways: agglomerative or
divisive. Agglomerative clustering is considered a "bottom-up" approach: its data points are
isolated as separate groupings initially, and then they are merged together iteratively on the
basis of similarity until one cluster has been achieved. Four different methods are commonly
used to measure similarity:
1. Ward's linkage: This method states that the distance between two clusters is defined
by the increase in the sum of squared error after the clusters are merged.


2. Average linkage: This method is defined by the mean distance between two points in each cluster.
3. Complete (or maximum) linkage: This method is defined by the maximum distance between two points in
each cluster.
4. Single (or minimum) linkage: This method is defined by the minimum distance between two points
in each cluster.

Euclidean distance is the most common metric used to calculate these distances; however, other
metrics, such as Manhattan distance, are also cited in clustering literature. Divisive clustering can be
defined as the opposite of agglomerative clustering; instead, it takes a "top-down" approach. In this
case, a single data cluster is divided based on the differences between data points. Divisive
clustering is not commonly used, but it is still worth noting in the context of hierarchical clustering.
These clustering processes are usually visualized using a dendrogram, a tree-like diagram that
documents the merging or splitting of data points at each iteration.
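Agglomerative clustering with Ward's linkage can be sketched with SciPy; the four points are made up, and `scipy.cluster.hierarchy.dendrogram(Z)` would draw the tree-like diagram described above:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Hypothetical points forming two nearby pairs.
X = np.array([[0, 0], [0.1, 0.2], [5, 5], [5.1, 5.2]])

# Z records the full merge history (which clusters merged, at what distance).
Z = linkage(X, method="ward")

# Cut the tree into at most 2 flat clusters.
labels = fcluster(Z, t=2, criterion="maxclust")
print(labels)  # each nearby pair lands in the same cluster
```

Swapping `method="ward"` for `"average"`, `"complete"`, or `"single"` selects the other linkage criteria listed above.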

Unsupervised vs. supervised and semi-supervised learning


Unsupervised learning and supervised learning are frequently discussed together. Unlike
unsupervised learning algorithms, supervised learning algorithms use labeled data. From that data, it
either predicts future outcomes or assigns data to specific categories based on the regression or
classification problem that it is trying to solve. While supervised learning algorithms tend to be more
accurate than unsupervised learning models, they require upfront human intervention to label the
data appropriately. However, these labelled datasets allow supervised learning algorithms to avoid
computational complexity as they don’t need a large training set to produce intended outcomes.
Common regression and classification techniques are linear and logistic regression, Naive Bayes, the KNN
algorithm, and random forest. Semi-supervised learning occurs when only part of the given input data
has been labelled. Unsupervised and semi-supervised learning can be more appealing alternatives, as
it can be time-consuming and costly to rely on domain expertise to label data appropriately for
supervised learning.

Challenges of unsupervised learning

While unsupervised learning has many benefits, some challenges can occur when it allows machine
learning models to execute without any human intervention. Some of these challenges can include:
• Computational complexity due to a high volume of training data
• Longer training times
• Higher risk of inaccurate results
• Human intervention to validate output variables
• Lack of transparency into the basis on which data was clustered


Probabilistic clustering

A probabilistic model is an unsupervised technique that helps us solve density estimation or
"soft" clustering problems. In probabilistic clustering, data points are clustered based on the
likelihood that they belong to a particular distribution. The Gaussian Mixture Model (GMM) is
one of the most commonly used probabilistic clustering methods.
➢ Gaussian Mixture Models are classified as mixture models, which means that they are made up of
an unspecified number of probability distribution functions. GMMs are primarily leveraged to
determine which Gaussian, or normal, probability distribution a given data point belongs to. If the
mean or variance are known, then we can determine which distribution a given data point belongs
to. However, in GMMs, these variables are not known, so we assume that a latent, or hidden,
variable exists to cluster data points appropriately. While it is not required to use the
Expectation-Maximization (EM) algorithm, it is commonly used to estimate the assignment
probabilities for a given data point to a particular data cluster.
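A minimal GMM sketch with scikit-learn, which fits the mixture via EM; the two hypothetical Gaussian clusters are generated for demonstration:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Hypothetical sample: two Gaussian clusters centered at 0 and 10.
X = np.vstack([rng.normal(0, 1, (100, 1)), rng.normal(10, 1, (100, 1))])

# EM alternates between estimating soft assignments and updating the Gaussians.
gmm = GaussianMixture(n_components=2, random_state=0).fit(X)
print(sorted(gmm.means_.ravel()))  # component means recovered near 0 and 10
print(gmm.predict_proba([[5.0]]))  # "soft" membership probabilities
```

Unlike K-means, `predict_proba` returns a degree of membership for each component rather than a single hard assignment.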


Association Rules
An association rule is a rule-based method for finding relationships between variables
in a given dataset. These methods are frequently used for market basket analysis, allowing companies
to better understand relationships between different products. Understanding consumption habits of
customers enables businesses to develop better cross-selling strategies and recommendation engines.
Examples of this can be seen in Amazon's "Customers Who Bought This Item Also Bought" or
Spotify's "Discover Weekly" playlist. While there are a few different algorithms used to generate
association rules, such as Apriori, Eclat, and FP-Growth, the Apriori algorithm is most widely used.
Apriori algorithms

➢ Apriori algorithms have been popularized through market basket analyses, leading to
different recommendation engines for music platforms and online retailers. They are used within
transactional datasets to identify frequent itemsets, or collections of items, to identify the likelihood
of consuming a product given the consumption of another product. For example, if I play Black
Sabbath's radio on Spotify, starting with their song "Orchid", one of the other songs on this channel
will likely be a Led Zeppelin song, such as "Over the Hills and Far Away." This is based on my prior
listening habits as well as those of others. Apriori algorithms use a hash tree to count itemsets,
navigating through the dataset in a breadth-first manner.
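Libraries such as mlxtend implement Apriori in full; the pure-Python sketch below only illustrates the support-counting step on a made-up set of baskets:

```python
from itertools import combinations

# Hypothetical market baskets (transactions).
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

def support(itemset, baskets):
    """Fraction of baskets containing every item in the itemset."""
    hits = sum(1 for b in baskets if itemset <= b)
    return hits / len(baskets)

# One Apriori pass: keep 2-itemsets that meet a minimum support threshold.
items = sorted(set().union(*baskets))
frequent = {
    frozenset(pair): support(set(pair), baskets)
    for pair in combinations(items, 2)
    if support(set(pair), baskets) >= 0.5
}
print(frequent)  # {bread, milk} and {bread, butter} each appear in half the baskets
```

The full algorithm repeats this pruning at increasing itemset sizes, reusing the fact that any superset of an infrequent itemset must itself be infrequent.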
Dimensionality reduction
While more data generally yields more accurate results, it can also impact the performance of
machine learning algorithms (e.g., overfitting) and it can also make it difficult to visualize datasets.

Dimensionality reduction is a technique used when the number of features, or dimensions, in a given
dataset is too high. It reduces the number of data inputs to a manageable size while also preserving
the integrity of the dataset as much as possible. It is commonly used in the preprocessing data stage,
and there are a few different dimensionality reduction methods that can be used, such as:
Principal component analysis
Principal component analysis (PCA) is a type of dimensionality reduction algorithm which is
used to reduce redundancies and to compress datasets through feature extraction. This method uses a
linear transformation to create a new data representation, yielding a set of "principal components."
The first principal component is the direction which maximizes the variance of the dataset. While the
second principal component also finds the maximum variance in the data, it is completely
uncorrelated to the first principal component, yielding a direction that is perpendicular, or orthogonal,
to the first component. This process repeats based on the number of dimensions, where each subsequent
principal component is the direction with the most variance that is orthogonal to the prior components.
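A short PCA sketch with scikit-learn; the synthetic 3-D data varies mostly along one invented direction, so the first principal component captures nearly all the variance:

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
# Hypothetical 3-D data that really varies along one direction, plus small noise.
t = rng.normal(size=(200, 1))
X = np.hstack([t, 2 * t, 0.5 * t]) + rng.normal(scale=0.05, size=(200, 3))

pca = PCA(n_components=2).fit(X)
print(pca.explained_variance_ratio_)  # first component dominates
X_reduced = pca.transform(X)
print(X_reduced.shape)                # (200, 2)
```

Inspecting `explained_variance_ratio_` is the usual way to choose how many components to keep.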
Singular value decomposition
Singular value decomposition (SVD) is another dimensionality reduction approach which
factorizes a matrix, A, into three low-rank matrices. SVD is denoted by the formula A = USVᵀ,
where U and V are orthogonal matrices and S is a diagonal matrix whose values are the singular
values of matrix A. Similar to PCA, it is commonly used to reduce noise and compress data, such as
image files.
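The factorization can be verified directly with NumPy; the 2x2 matrix below is an arbitrary example:

```python
import numpy as np

A = np.array([[3.0, 1.0], [1.0, 3.0]])

# np.linalg.svd returns U, the singular values S, and V transpose.
U, S, Vt = np.linalg.svd(A)
print(S)  # singular values, in descending order: [4. 2.]
print(np.allclose(U @ np.diag(S) @ Vt, A))  # True: the product reconstructs A
```

Truncating S to its largest values gives the best low-rank approximation of A, which is the basis of SVD-based compression.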
Autoencoders

Autoencoders leverage neural networks to compress data and then recreate a new representation
of the original data's input. The hidden layer acts as a bottleneck that compresses the input layer
prior to reconstruction within the output layer. The stage from the input layer to the hidden layer is
referred to as "encoding" while the stage from the hidden layer to the output layer is known as
"decoding."
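Here is a shape-level NumPy sketch of the bottleneck idea; the weights are random and untrained, so this illustrates the architecture only, not learned compression:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(1, 8))      # an 8-dimensional input

# Randomly initialized weights; a real autoencoder would learn W_enc and W_dec
# by minimizing the reconstruction error between x and x_hat.
W_enc = rng.normal(size=(8, 3))  # encoder: 8 -> 3 bottleneck
W_dec = rng.normal(size=(3, 8))  # decoder: 3 -> 8 reconstruction

code = np.tanh(x @ W_enc)        # "encoding" stage (hidden bottleneck)
x_hat = code @ W_dec             # "decoding" stage
print(code.shape, x_hat.shape)   # (1, 3) (1, 8)
```

The 3-dimensional `code` is the compressed representation; training forces it to retain the information needed to reconstruct the 8-dimensional input.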

Applications of unsupervised learning


➢ Machine learning techniques have become a common method to improve a product user experience
and to test systems for quality assurance. Unsupervised learning provides an exploratory path to view
data, allowing businesses to identify patterns in large volumes of data more quickly when compared to
manual observation. Some of the most common real-world applications of unsupervised learning are:
➢ News sections: Google News uses unsupervised learning to categorize articles on
the same story from various online news outlets. For example, the results of a presidential election
could be categorized under the label for "US" news.


➢ Computer vision: Unsupervised learning algorithms are used for visual perception tasks, such as object
recognition.
➢ Medical imaging: Unsupervised machine learning provides essential features to medical imaging
devices, such as image detection, classification and segmentation, used in radiology and pathology to
diagnose patients quickly and accurately.
➢ Anomaly detection: Unsupervised learning models can comb through large amounts of data and
discover atypical data points within a dataset. These anomalies can raise awareness around faulty
equipment, human error, or breaches in security.
➢ Customer personas: Defining customer personas makes it easier to understand common traits and
purchasing habits of business clients. Unsupervised learning allows businesses to build better buyer
persona profiles, enabling organizations to align their product messaging more appropriately.
➢ Recommendation engines: Using past purchase behavior data, unsupervised learning can help to discover
data trends that can be used to develop more effective cross-selling strategies. This is used to make
relevant add-on recommendations to customers during the checkout process for online retailers.


SEMI-SUPERVISED LEARNING
Semi-supervised learning is a branch of machine learning that combines supervised and unsupervised
learning by using both labeled and unlabeled data to train artificial intelligence (AI) models for classification
and regression tasks. Though semi-supervised learning is generally employed for the same use cases in
which one might otherwise use supervised learning methods, it's distinguished by various techniques that
incorporate unlabeled data into model training, in addition to the labeled data required for conventional
supervised learning. Semi-supervised learning methods are especially relevant in situations where
obtaining a sufficient amount of labeled data is prohibitively difficult or expensive, but large amounts of
unlabeled data are relatively easy to acquire. In such scenarios, neither fully supervised nor unsupervised
learning methods will provide adequate solutions.

Labeled data and machine learning


Training AI models for prediction tasks like classification or regression typically requires
labeled data: annotated data points that provide necessary context and demonstrate the correct
predictions (output) for each sample input. During training, a loss function measures the difference
(loss) between the model's predictions for a given input and the "ground truth" provided by that input's
label. Models learn from these labeled examples by using techniques like gradient descent that update
model weights to minimize loss. Because this machine learning process actively involves humans, it is
called "supervised" learning. Properly labeling data becomes increasingly labor-intensive for complex
AI tasks. For example, to train an image classification model to differentiate between cars and
motorcycles, hundreds (if not thousands) of training images must be labeled "car" or "motorcycle"; for
a more detailed computer vision task, like object detection, humans must annotate not only the
object(s) each image contains, but also where each object is located; for even more detailed tasks, like
image segmentation, data labels must annotate specific pixel-by-pixel boundaries of different image
segments for each image. Labeling data can thus be particularly tedious for certain use cases. In more
specialized machine learning use cases, like drug discovery, genetic sequencing or protein
classification, data annotation is not only extremely time-consuming, but also requires very specific
domain expertise. Semi-supervised learning offers a way to extract maximum benefit from a scarce
amount of labeled data while also making use of relatively abundant unlabeled data.
Semi-supervised learning vs. supervised learning vs. unsupervised learning
Semi-supervised learning can be thought of as a hybrid of or middle ground between supervised
learning and unsupervised learning.
Semi-supervised learning vs supervised learning
The primary distinction between semi- and fully supervised machine learning is that the latter
can only be trained using fully labeled datasets, whereas the former uses both labeled and unlabeled
data samples in the training process. Semi-supervised learning techniques modify or supplement a
supervised algorithm, called the "base learner" in this context, to incorporate information from
unlabeled data. Labeled data points are used to ground the base learner's predictions and add
structure (like how many classes exist and the basic characteristics of each) to the learning problem.
The goal in training any classification model is for it to learn an accurate decision boundary: a line
(or, for data with more than two dimensions, a "surface" or hyperplane) that separates data points of one
classification category from data points belonging to a different classification category. Though a fully

supervised classification model can technically learn a decision boundary using only a few labeled
data points, it might not generalize well to real-world examples, making the model's predictions
unreliable. The classic "half-moons" dataset visualizes the shortcomings of supervised models relying
on too few labeled data points. Though the "correct" decision boundary would separate each of the
two half-moons, a supervised learning model is likely to overfit the few labeled data points available.
The unlabeled data points clearly convey helpful context, but a traditional supervised algorithm cannot
process unlabeled data. Using only the very limited labeled data points available, a supervised
model may learn a decision boundary that will generalize poorly and be prone to
misclassifying new examples.
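This scenario can be reproduced with scikit-learn's make_moons; the choice of four labeled points and a plain logistic regression base learner is an illustrative assumption:

```python
import numpy as np
from sklearn.datasets import make_moons
from sklearn.linear_model import LogisticRegression

# 200 points forming two interleaved half-moons.
X, y = make_moons(n_samples=200, noise=0.05, random_state=0)

# Pretend only two points per class are labeled.
labeled = np.concatenate([np.where(y == 0)[0][:2], np.where(y == 1)[0][:2]])
clf = LogisticRegression().fit(X[labeled], y[labeled])

# A linear boundary learned from 4 points cannot separate the interleaved moons.
print(clf.score(X, y))  # full-dataset accuracy stays below 1.0
```

A semi-supervised method that also exploited the 196 unlabeled points could instead trace a boundary between the two moon shapes.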
Semi-supervised learning vs unsupervised learning
Unlike semi-supervised (and fully supervised) learning, unsupervised learning algorithms use
neither labeled data nor loss functions. Unsupervised learning eschews any "ground truth" context
against which model accuracy can be measured and optimized. An increasingly common semi-
supervised approach, particularly for large language models, is to "pre-train" models via
unsupervised tasks that require the model to learn meaningful representations of unlabeled data sets.
When such tasks involve a "ground truth" and loss function (without manual data annotation), they're
called self-supervised learning. After subsequent "supervised fine-tuning" on a small amount of
labeled data, pre-trained models can often achieve performance comparable to fully supervised
models. While unsupervised learning methods can be useful in many scenarios, that lack of context
can make them ill-suited to classification on their own. Take, for example, how a typical clustering
algorithm, grouping data points into a pre-determined number of clusters based on their proximity to
one another, would treat the half-moon dataset. A typical unsupervised algorithm, k-means
clustering, might incorrectly group data points together based only on their relative closeness to
"average" data points (centroids).
Semi-supervised learning vs self-supervised learning
Both semi- and self-supervised learning aim to circumvent the need for large amounts of
labeled data, but whereas semi-supervised learning trains models directly on both labeled and
unlabeled data, self-supervised learning derives its supervisory signal from the unlabeled data
itself. When combined with supervised downstream tasks, self-supervised pretext tasks thus
comprise part of a semi-supervised learning process: a learning method using both labeled and
unlabeled data for model training.
How does semi-supervised learning work?
Semi-supervised learning relies on certain assumptions about the unlabeled data used to train
the model and the way data points from different classes relate to one another. A necessary condition of
semi-supervised learning (SSL) is that the unlabeled examples used in model training must be relevant
to the task the model is being trained to perform. In more formal terms, SSL requires that the
distribution p(x) of the input data must contain information about the posterior distribution
p(y|x), that is, the conditional probability of a given data point (x) belonging to a certain class (y). So,
for example, if one is using unlabeled data to help train an image classifier to differentiate between
pictures of cats and pictures of dogs, the training dataset should contain images of both cats and
dogs, and images of horses and motorcycles will not be helpful. Accordingly, while a 2018 study of
semi-supervised learning algorithms found that "increasing the amount of unlabeled data tends to
improve the performance of SSL techniques," it also found that "adding unlabeled data from a
mismatched set of classes can actually hurt performance compared to not using any unlabeled data at
all."1 The basic condition of p(x) having a meaningful relationship to p(y|x) gives rise to multiple
assumptions about the nature of that relationship. These assumptions are the driving force behind
most, if not all, SSL methods: generally speaking, any semi-supervised learning algorithm relies on
one or more of the following assumptions being explicitly or implicitly satisfied.
Cluster assumption
The cluster assumption states that data points belonging to the same cluster (a set of data points
more similar to each other than they are to other available data points) will also belong to the same
class. While sometimes considered to be its own independent assumption, the clustering assumption
has also been described by van Engelen and Hoos as "a generalization of the other assumptions."2 In
this view, the determination of data point clusters depends on which notion of similarity is being
used: the smoothness assumption, low-density assumption and manifold assumption each simply
leverage a different definition of what comprises a "similar" data point.
Smoothness assumption
The smoothness assumption states that if two data points, x and x', are close to each other in the
input space (the set of all possible values for x), then their labels, y and y', should be the same. This
assumption, also known as the continuity assumption, is common to most supervised learning: for
example, classifiers learn a meaningful approximation (or "representation") of each relevant class
during training; once trained, they determine the classification of new data points via which
representation they most closely resemble. In the context of SSL, the smoothness assumption has the
added benefit of being applied transitively to unlabeled data. Consider a scenario involving three data
points:
➢ a labeled data point, x1
➢ an unlabeled data point, x2, that's close to x1
➢ another unlabeled data point, x3, that's close to x2 but not close to x1

The smoothness assumption tells us that x2 should have the same label as x1, and that x3
should have the same label as x2. Therefore, we can assume that all three data points have the same
label: x1's label is transitively propagated to x3 via x3's proximity to x2.

Low-density assumption
The low-density assumption states that the decision boundary between classes should not pass
through high-density regions. Put another way, the decision boundary should lie in an area that
contains few data points. The low-density assumption could thus be thought of as an extension of the
cluster assumption (in that a high-density cluster of data points represents a class, rather than the
boundary between classes) and the smoothness assumption (in that if multiple data points are near each
other, they should share a label, and thus fall on the same side of the decision boundary). Together,
the smoothness and low-density assumptions can inform a far more intuitive decision boundary than
would be possible with supervised methods that can only consider the (very few) labeled data points.
(Source: van Engelen, et al., 2018)

Manifold assumption

The manifold assumption states that the higher-dimensional input space comprises
multiple lower-dimensional manifolds on which all data points lie, and that data points on the same
manifold share the same label. As an intuitive example, consider a piece of paper crumpled up into a
ball. The location of any point on the crumpled surface can only be mapped with three-dimensional
x, y, z coordinates. But if that crumpled-up ball is flattened back into a sheet of paper, those same
points can now be mapped with two-dimensional x, y coordinates. This is called dimensionality
reduction, and it can be achieved mathematically using methods like autoencoders or convolutions.
In machine learning, dimensions correspond not to the familiar physical dimensions, but to each
attribute or feature of the data. For example, in machine learning, a small RGB image measuring 32x32
pixels has 3,072 dimensions: 1,024 pixels, each of which has three values (for red, green and blue).
Comparing data points with so many dimensions is challenging, both because of the complexity and
computational resources required and because most of that high-dimensional space does not contain
information meaningful to the task at hand. The manifold assumption holds that when a model learns
the proper dimensionality reduction function to discard irrelevant information, disparate data points
converge to a more meaningful representation for which the other SSL assumptions are more
reliable. Mapping the data points to a lower-dimensional manifold can provide a more accurate
decision boundary, which can then be translated back to higher-dimensional space. (source: van
Engelen, et al., 2018)
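The dimension count in the image example can be verified in a few lines of NumPy (a sketch; the array here is just a blank placeholder image):

```python
import numpy as np

# A 32x32 RGB image: 1,024 pixels, each with three color values
image = np.zeros((32, 32, 3))

# Flattening the image treats every value as one feature dimension
features = image.reshape(-1)
print(features.shape)  # → (3072,)
```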
Transductive learning
Transductive learning methods use available labels to discern label predictions for a given set
of unlabeled data points, so that they can be used by a supervised base learner. Whereas inductive
methods aim to train a classifier that can model the entire (labeled and unlabeled) input space,
transductive methods aim only to yield label predictions for unlabeled data. The algorithms used for
transductive learning are largely unrelated to the algorithm(s) to be used by the supervised
classifier model to be trained using this newly labeled data.
Label propagation
Label propagation is a graph-based algorithm that computes label assignments for unlabeled
data points based on their relative proximity to labeled data points, using the smoothness assumption
and cluster assumption. The intuition behind the algorithm is that one can map a fully connected
graph in which the nodes are all available data points, both labeled and unlabeled. The closer two
nodes are based on some chosen measure of distance, like Euclidean distance, the more heavily the
edge between them is weighted in the algorithm. Starting from the labeled data points, labels then
iteratively propagate through neighboring unlabeled data points, using the smoothness and cluster
assumptions. (Figure: left, the original labeled and unlabeled data points; right, the unlabeled data
points after being assigned pseudo-labels via label propagation.)
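As an illustrative sketch, scikit-learn's LabelPropagation implements this graph-based approach; the toy data below (two well-separated clusters, with -1 marking unlabeled points) is purely hypothetical:

```python
import numpy as np
from sklearn.semi_supervised import LabelPropagation

# Two well-separated clusters; -1 marks unlabeled points
X = np.array([[0.0, 0.0], [0.2, 0.1], [0.1, 0.3],
              [5.0, 5.0], [5.2, 4.9], [4.9, 5.1]])
y = np.array([0, -1, -1, 1, -1, -1])  # one labeled point per cluster

# Edge weights come from an RBF kernel over pairwise distances
model = LabelPropagation(kernel="rbf", gamma=1.0)
model.fit(X, y)

# Labels have propagated from the labeled points to their neighbors
print(model.transduction_)  # → [0 0 0 1 1 1]
```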
Active learning
Active learning algorithms do not automate the labeling of data points: instead, they are used
in SSL to determine which unlabeled samples would provide the most helpful information if
manually labeled.3 The use of active learning in semi-supervised settings has achieved promising
results: for example, a recent study found that it more than halved the amount of labeled data
required to effectively train a model for semantic segmentation.4
Inductive learning
Inductive methods of semi-supervised learning aim to directly train a classification (or
regression) model, using both labeled and unlabeled data. Inductive SSL methods can generally be
differentiated by the way in which they incorporate unlabeled data: via a pseudo-labeling step, an
unsupervised pre-processing step, or by direct incorporation into the model’s objective function.
Wrapper methods
A relatively simple way to extend existing supervised algorithms to a semi-supervised setting
is to first train the model on the available labeled data—or simply use a suitable pre-existing
classifier—and then generate pseudo-label predictions for unlabeled data points. The model can then
be re-trained using both the originally labeled data and the pseudo-labeled data, not differentiating
between the two. The primary benefit of wrapper methods, beyond their simplicity, is that they are
compatible with nearly any type of supervised base learner. Most wrapper methods introduce some
regularization techniques to reduce the risk of reinforcing potentially inaccurate pseudo-label
predictions.
Self-training
Self-training is a basic wrapper method. It requires probabilistic, rather than
deterministic, pseudo-label predictions: for example, a model that outputs "85 percent dog, 15
percent cat" instead of simply outputting "dog." Probabilistic pseudo-label predictions allow
self-training algorithms to accept only predictions that exceed a certain confidence threshold,
in a process akin to entropy minimization.5 This process can be done iteratively, to either
optimize the pseudo-classification process or reach a certain number of pseudo-labeled
samples.
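A minimal sketch of self-training using scikit-learn's SelfTrainingClassifier wrapper; the one-dimensional toy data and the 0.75 confidence threshold are illustrative assumptions:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.semi_supervised import SelfTrainingClassifier

# -1 marks unlabeled samples; only one labeled point per class
X = np.array([[0.0], [0.1], [0.2], [3.0], [3.1], [3.2]])
y = np.array([0, -1, -1, 1, -1, -1])

# The wrapper accepts a pseudo-label only when the base learner's
# predicted probability exceeds the confidence threshold
clf = SelfTrainingClassifier(LogisticRegression(), threshold=0.75)
clf.fit(X, y)
print(clf.predict([[0.05], [3.05]]))  # → [0 1]
```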
Co-training
Co-training methods extend the self-training concept by training multiple supervised base
learners to assign pseudo-labels. This diversification is intended to reduce the tendency to reinforce
poor initial predictions. It’s therefore important that the predictions of each base learner not be
strongly correlated with one another. A typical approach is to use different algorithms for each
classifier. Another is for each classifier to focus on a different subset of the data: for example, in
video data, training one base learner on visual data and the other on audio data.
Unsupervised pre-processing
Unlike wrapper methods (and intrinsically semi-supervised algorithms), which use labeled
and unlabeled data simultaneously, some SSL methods use unlabeled and labeled data in separate
stages: an unsupervised pre-processing stage, followed by a supervised stage. Like wrapper methods,
such techniques can essentially be used for any supervised base learner. But in contrast to wrapper
methods, the "main" supervised model is ultimately trained only on originally (human-annotated)
labeled data points. Such pre-processing techniques range from extracting useful features from
unlabeled data to pre-clustering unlabeled data points to using "pre-training" to determine the initial
parameters of a supervised model (in a process akin to the pretext tasks performed in self-supervised
learning).
Cluster-then-label
One straightforward semi-supervised technique involves clustering all
data points (both labeled and unlabeled) using an unsupervised algorithm. Leveraging the
clustering assumption, those clusters can be used to help train an independent classifier model—or,
if the labeled data points in a given cluster are all of the same class, pseudo-labeling the unlabeled

data points and proceeding in a manner similar to wrapper methods. As demonstrated by the "half-
moons" example earlier in this article, simple methods (like k-nearest neighbors) may yield
inadequate predictions. More refined clustering algorithms, like DBSCAN (which implements the
low-density assumption),6 have achieved greater reliability.
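A cluster-then-label pass can be sketched with k-means (a toy example; each cluster inherits the class of its labeled members, per the clustering assumption):

```python
import numpy as np
from sklearn.cluster import KMeans

# Two compact clusters; -1 marks unlabeled points
X = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
              [4.0, 4.0], [4.1, 4.2], [4.2, 4.1]])
y = np.array([0, -1, -1, 1, -1, -1])

# Cluster ALL points (labeled and unlabeled) with an unsupervised algorithm
clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

# Pseudo-label each cluster with the class of its labeled member(s)
pseudo = y.copy()
for c in np.unique(clusters):
    members = clusters == c
    known = y[members & (y != -1)]
    if known.size:
        pseudo[members] = known[0]
print(pseudo)  # → [0 0 0 1 1 1]
```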
Pre-training and feature extraction
Unsupervised (or self-supervised) pre-training allows models to learn useful representations
of the input space, reducing the amount of labeled data needed to fine-tune a model with supervised
learning. A common approach is to employ a neural network, often an autoencoder, to learn an
embedding or feature representation of the input data, then use these learned features to train a
supervised base learner. This often entails dimensionality reduction, helping to make use of the
manifold assumption.
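The two-stage idea can be sketched with PCA standing in for the feature-learning step (synthetic data; in practice an autoencoder would often take PCA's place here):

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
labels = np.tile([0, 1], 50)             # 100 points, 2 classes
X = rng.normal(scale=0.1, size=(100, 20))
X[:, 0] += labels * 3.0                  # class signal lives on one axis

# Unsupervised stage: learn a low-dimensional representation from ALL data
pca = PCA(n_components=2).fit(X)

# Supervised stage: train only on the few "human-labeled" points
labeled = np.arange(10)
clf = LogisticRegression().fit(pca.transform(X[labeled]), labels[labeled])

# Near-perfect accuracy on this toy data despite only 10 labels
print(clf.score(pca.transform(X), labels))
```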
Intrinsically semi-supervised methods
Some SSL methods directly incorporate unlabeled data into the objective function of the base
learner, rather than processing unlabeled data in a separate pseudo-labeling or pre-processing step.
Semi-supervised support vector machines
When data points of different categories are not linearly separable—when no straight line can
neatly, accurately define the boundary between categories—support vector machine (SVM)
algorithms map data to a higher-dimensional feature space in which the categories can be separated
by a hyperplane. In determining this decision boundary, SVM algorithms maximize the margin
between the decision boundary and the data points closest to it. This, in practice, applies the
low-density assumption. In a supervised setting, a regularization term penalizes the algorithm when
labeled data points fall on the wrong side of the decision boundary. In semi-supervised SVMs
(S3VMs), this isn’t possible for unlabeled data points (whose classification is unknown)—thus,
S3VMs also penalize unlabeled data points that lie within the prescribed margin.
Intrinsically semi-supervised deep learning models
A variety of neural network architectures have been adapted for semi-supervised learning.
This is achieved by adding or modifying the loss terms typically used in these architectures, allowing
for the incorporation of unlabeled data points in training. Proposed semi-supervised deep learning
architectures include ladder networks, pseudo-ensembles, temporal ensembling, and select
modifications to generative adversarial networks (GANs).


REINFORCEMENT LEARNING
In reinforcement learning, an agent learns to make decisions by interacting with an
environment. It is used in robotics and other decision-making settings. Reinforcement learning (RL)
is a type of machine learning process that focuses on decision making by autonomous agents. An
autonomous agent is any system that can make decisions and act in response to its environment
independent of direct instruction by a human user. Robots and self-driving cars are examples of
autonomous agents. In reinforcement learning, an autonomous agent learns to perform a task by trial
and error in the absence of any guidance from a human user.1 It particularly addresses sequential
decision-making problems in uncertain environments, and shows promise in artificial intelligence
development.

Supervised and unsupervised learning


Literature often contrasts reinforcement learning with supervised and unsupervised
learning. Supervised learning uses manually labeled data to produce predictions or classifications.
Unsupervised learning aims to uncover and learn hidden patterns from unlabeled data. In contrast to
supervised learning, reinforcement learning does not use labeled examples of correct or incorrect
behavior. But reinforcement learning also differs from unsupervised learning in that reinforcement
learning learns by trial and error and a reward function rather than by extracting information from
hidden patterns.2 Supervised and unsupervised learning methods assume each record of input data is
independent of other records in the dataset but that each record actualizes a common underlying
data distribution model. These methods learn to predict, with model performance measured
according to prediction accuracy. In contrast, reinforcement learning learns to act. It
assumes input data to be interdependent tuples—i.e. an ordered sequence of data—organized as
state-action-reward. Many applications of reinforcement learning algorithms aim to mimic
real-world biological learning methods through positive reinforcement. Note that, although the two
are not often compared in literature, reinforcement learning is distinct from self-supervised learning
as well. The latter is a form of unsupervised learning that uses pseudo labels derived from unlabeled
training data as a ground truth to measure model accuracy. Reinforcement learning, however, does
not produce pseudo labels or measure against a ground truth—it is not a classification method but an
action learner. The two have, however, been combined with promising results.3

Reinforcement learning process


Reinforcement learning essentially consists of the relationship between an agent, an environment,
and a goal. Literature widely formulates this relationship in terms of the Markov decision process
(MDP).
Markov decision process
The reinforcement learning agent learns about a problem by interacting with its environment.
The environment provides information on its current state. The agent then uses that information to
determine which action(s) to take. If that action obtains a reward signal from the surrounding
environment, the agent is encouraged to take that action again when in a similar future state. This
process repeats for every new state thereafter. Over time, the agent learns from rewards and
punishments to take actions within the environment that meet a specified goal.


In Markov decision processes, state space refers to all of the information provided by an
environment’s state. Action space denotes all possible actions the agent may take within a state.
Exploration-exploitation trade-off
Because an RL agent has no manually labeled input data guiding its behavior, it must explore
its environment, attempting new actions to discover those that receive rewards. From these reward
signals, the agent learns to prefer actions for which it was rewarded in order to maximize its gain.
But the agent must continue exploring new states and actions as well. In doing so, it can then use
that experience to improve its decision-making. RL algorithms thus require an agent to both exploit
knowledge of previously rewarded state-actions and explore other state-actions. The agent cannot
exclusively pursue exploration or exploitation. It must continuously try new actions while also
preferring single (or chains of) actions that produce the largest cumulative reward.6
Components of reinforcement learning
Beyond the agent-environment-goal triumvirate, four principal sub-elements characterize
reinforcement learning problems.
Policy. This defines the RL agent’s behavior by mapping perceived environmental states to
specific actions the agent must take when in those states. It can take the form of a rudimentary
function or a more involved computational process. For instance, a policy guiding an autonomous
vehicle may map pedestrian detection to a stop action.
Reward signal. This designates the RL problem’s goal. Each of the RL agent’s actions either
receives a reward from the environment or not. The agent’s only objective is to maximize its
cumulative rewards from the environment. For self-driving vehicles, the reward signal can be
reduced travel time, decreased collisions, remaining on the road and in the proper lane, avoiding
extreme deceleration or acceleration, and so forth. This example shows that RL may incorporate
multiple reward signals to guide an agent.
Value function. Reward signal differs from value function in that the former denotes
immediate benefit while the latter specifies long-term benefit. Value refers to a state’s desirability
per all of the states (with their incumbent rewards) that are likely to follow. An autonomous vehicle
may be able to reduce travel time by exiting its lane, driving on the sidewalk, and accelerating
quickly, but these latter three actions may reduce its overall value function. Thus, the vehicle as an
RL agent may exchange marginally longer travel time to increase its reward in the latter three areas.
Model. This is an optional sub-element of reinforcement learning systems. Models allow
agents to predict environment behavior for possible actions. Agents then use model predictions to
determine possible courses of action based on potential outcomes. This can be the model guiding
the autonomous vehicle that helps it predict best routes, what to expect from surrounding vehicles
given their position and speed, and so forth.7 Some model-based approaches use direct human
feedback in initial learning and then shift to autonomous learning.
Online versus offline learning
There are two general methods by which an agent collects data for learning policies:
Online. Here, an agent collects data directly from interacting with its surrounding
environment. This data is processed and collected iteratively as the agent continues interacting with
that environment.
Offline. When an agent does not have direct access to an environment, it can learn through
logged data of that environment. This is offline learning. A large subset of research has turned to
offline learning given practical difficulties in training models through direct interaction with
environments.


Types of reinforcement learning


Reinforcement learning is a vibrant, ongoing area of research, and as such, developers have
produced a myriad of approaches to reinforcement learning. Nevertheless, three widely discussed
and foundational reinforcement learning methods are dynamic programming, Monte Carlo, and
temporal difference learning.
Dynamic programming
Dynamic programming breaks down larger tasks into smaller tasks. Thus, it models problems
as workflows of sequential decisions made at discrete time steps. Each decision is made in terms of
the resulting possible next state. An agent’s reward (r) for a given action is defined as a function of
that action (a), the current environmental state (s), and the potential next state (s’): r = r(s, a, s’).
This reward function can be used as (part of) the policy governing an agent’s actions. Determining
the optimal policy for agent behavior is a chief component of dynamic programming methods for
reinforcement learning. Enter the Bellman equation, which for a deterministic workflow can be
written as: vt(s) = max over actions a of [ rt(s, a) + vt+1(s’) ]. In short, this equation defines vt(s) as
the total expected reward starting at time t until the end of a decision workflow. It assumes that the
agent begins by occupying state s at time t. The equation ultimately divides the reward at time t into
the immediate reward rt(s, a) (i.e. the reward formula) and the agent’s total expected reward
thereafter, vt+1(s’). An agent thus maximizes its value function—being the total value of the
Bellman equation—by consistently choosing the action that receives the greatest reward signal in
each state.
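As a sketch, the Bellman equation can be applied iteratively (value iteration) on a hypothetical three-state chain, where moving right earns a reward of 1 only on entering the final state:

```python
import numpy as np

n_states, gamma = 3, 0.9
V = np.zeros(n_states)  # value estimates v(s)

def step(s, a):
    """Deterministic toy dynamics: action 1 moves right, action 0 stays."""
    s_next = min(s + 1, n_states - 1) if a == 1 else s
    r = 1.0 if (s_next == n_states - 1 and s_next != s) else 0.0
    return r, s_next

# Repeated Bellman backups: v(s) = max_a [ r(s, a) + gamma * v(s') ]
for _ in range(50):
    for s in range(n_states):
        candidates = []
        for a in (0, 1):
            r, s_next = step(s, a)
            candidates.append(r + gamma * V[s_next])
        V[s] = max(candidates)

print(np.round(V, 2))  # → [0.9 1.  0. ]
```

The values converge to the fixed point of the equation: one step from the goal is worth 1, two steps away is worth the discounted 0.9, and the terminal state itself earns nothing further.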

Monte Carlo method


Dynamic programming is model-based, meaning it constructs a model of its environment
to perceive rewards, identify patterns, and navigate the environment. Monte Carlo, however,
assumes a black-box environment, making it model-free. While dynamic programming predicts
potential future states and reward signals in making decisions, Monte Carlo methods are
exclusively experience-based, meaning they sample sequences of states, actions, and rewards
solely through interaction with the environment. Monte Carlo methods thus learn through trial
and error rather than probabilistic models. Monte Carlo further differs from dynamic
programming in value function determination. Dynamic programming seeks the largest
cumulative reward by consistently selecting rewarded actions in successive states. Monte Carlo,
by contrast, averages the returns for each state-action pair. This, in turn, means that the Monte
Carlo method must wait until all actions in a given episode (or planning horizon) have been
completed before calculating its value function, and then updating its policy.10
Temporal difference learning
Literature widely describes temporal difference (TD) learning as a
combination of dynamic programming and Monte Carlo. As in the former, TD updates its policy,
and so its estimates for future states, after each step without waiting for a final value. As in Monte
Carlo, however, TD learns through raw interaction with its environment rather than using a
model thereof.11 Per its name, the TD learning agent revises its policy according to the
difference between predicted and actual received rewards in each state. That is, while dynamic
programming and Monte Carlo only consider the reward received, TD further weighs the
difference between its expectation and the received reward. Using this difference, the agent updates
its estimates for the next step without waiting until the end of the planning horizon, contra Monte
Carlo.12 TD has many variations. Two prominent variations are State–action–reward–state–action
(SARSA) and Q-learning. SARSA is an on-policy TD method, meaning it evaluates and
attempts to improve its decision-governing policy. Q-learning is off-policy. Off-policy methods
are those that use two policies: one for exploitation (target policy) and one for exploration to
generate behavior (behavior policy).
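Off-policy TD learning can be sketched with tabular Q-learning on a hypothetical five-state corridor: an epsilon-greedy behavior policy explores, while the update bootstraps from the greedy target policy:

```python
import numpy as np

n_states, n_actions = 5, 2          # actions: 0 = left, 1 = right
Q = np.zeros((n_states, n_actions))
alpha, gamma, eps = 0.5, 0.9, 0.1   # step size, discount, exploration rate
rng = np.random.default_rng(0)

for _ in range(200):                # episodes
    s = 0
    while s != n_states - 1:
        # behavior policy: epsilon-greedy exploration
        a = rng.integers(n_actions) if rng.random() < eps else int(Q[s].argmax())
        s_next = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
        r = 1.0 if s_next == n_states - 1 else 0.0  # reward only at the goal
        # TD update toward the greedy (target-policy) estimate
        Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
        s = s_next

# The learned greedy policy moves right in every non-terminal state
print([int(Q[s].argmax()) for s in range(n_states - 1)])  # → [1, 1, 1, 1]
```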
Additional methods
There is a myriad of additional reinforcement learning methods. Dynamic programming is a
value-based method, meaning it selects actions based on their estimated values according to a policy
that aims to maximize its value function. By contrast, policy gradient methods learn a parameterized
policy that can select actions without consulting a value function. These are called policy-based
methods and are considered more effective in high-dimensional environments. Actor-critic methods
use both value-based and policy-based approaches. The so-called "actor" is a policy gradient
determining which actions to take, while the "critic" is a value function that evaluates actions.
Actor-critic methods are, essentially, a form of TD. More specifically, actor-critic evaluates a given
action’s value based not only on its own reward but also on the possible value of the following state,
which it adds to the action’s reward. Actor-critic’s advantage is that, due to its implementation of
both a value function and a policy in decision-making, it effectively requires less environment
interaction.15
Examples of reinforcement learning
Given that reinforcement learning is centrally concerned with decision-making in
unpredictable environments, it has been a core area of interest in robotics. For accomplishing simple
and repetitive tasks, decision-making may be straightforward. But more complicated tasks, such as
attempts to simulate human behavior or automate driving, involve interaction with highly variable
and mutable real-world environments. Research shows deep reinforcement learning with deep neural
networks aids such tasks, especially with respect to generalization and mapping high-dimensional
sensory input to controlled system outputs.16 Studies suggest that deep reinforcement learning with
robots relies heavily on collected datasets, and so recent work explores avenues for collecting
real-world data and repurposing prior data18 to improve reinforcement learning. Other research
suggests that leveraging natural language processing techniques and tools—e.g. large language
models (LLMs)—may improve generalization in reinforcement learning systems through textual
representation of real-world environments.19 Many studies show how interactive textual
environments provide cost-effective alternatives to three-dimensional environments when
instructing learning agents in successive decision-making tasks.20 Deep reinforcement learning also
undergirds textual decision-making in chatbots. In fact, reinforcement learning outperforms other
methods for improving chatbot dialogue response.21
