News Category Classifier with Gen AI
The project leverages Hugging Face's ecosystem to streamline model development: downloading and preparing models such as meta-llama/Llama-2-7b-chat-hf, converting structured data into instruction-format prompts, and building efficient inference pipelines. The Hugging Face datasets library provides fast I/O and in-memory caching, which speeds up training, and the tooling integrates cleanly with model training and evaluation. This ecosystem makes it straightforward to adapt to the diverse data requirements of news classification.
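The conversion of structured records into instruction-format prompts can be sketched as follows; the exact prompt template and field names are assumptions, not the project's actual format.

```python
# Hypothetical sketch: rendering one {headline, category} record as an
# instruction/response pair for supervised fine-tuning. The template
# below is an assumption, not the project's exact format.

def to_instruction_prompt(record: dict) -> str:
    """Convert a structured news record into an instruction-format prompt."""
    return (
        "### Instruction:\n"
        "Classify the following news headline into a category and "
        "briefly explain your reasoning.\n\n"
        f"### Input:\n{record['headline']}\n\n"
        f"### Response:\nCategory: {record['category']}"
    )

sample = {"headline": "Stocks rally as inflation cools", "category": "BUSINESS"}
print(to_instruction_prompt(sample))
```

A mapping function like this would typically be applied over the whole dataset (for example with `datasets.Dataset.map`) before tokenization.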
Using natural language explanations in news categorization has significant implications for transparency and user trust. The system can not only classify articles but also justify its decisions in a comprehensible way. This builds user confidence, since explanations offer insight into how the model operates, making the system more reliable for media houses and content aggregators. Natural language explanations also improve user interaction by translating complex model outputs into relatable information, making the AI system more approachable and human-like.
Integrating a large language model such as LLaMA addresses several challenges in news classification: providing the rationale behind classification decisions, enhancing transparency, and improving user trust. Traditional models often lack explanatory capability, making their classification logic hard to understand. A large language model bridges this gap by offering human-readable explanations that improve interpretability and user engagement. This integration is critical in real-world applications where trust and clarity are paramount.
LoRA makes fine-tuning large models like LLaMA feasible by reducing the number of trainable parameters, which sharply cuts memory requirements and training cost. Instead of updating the full weight matrices, it trains small low-rank adapter matrices, enabling fine-tuning on mid-tier GPUs rather than high-end hardware. This parameter-efficient approach makes training more accessible and efficient, letting developers exploit the full power of LLaMA without extensive computational resources.
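The parameter saving behind LoRA can be illustrated with plain NumPy (this is a sketch of the underlying idea, not the peft library's API; the dimensions and rank are illustrative):

```python
import numpy as np

# Illustrative sketch of the LoRA idea: a frozen weight matrix W is
# adapted by a trainable low-rank product B @ A, so only r*(d + k)
# parameters are trained instead of the full d*k.

d, k, r = 4096, 4096, 8                  # hidden sizes typical of LLaMA; small rank r
rng = np.random.default_rng(0)

W = rng.standard_normal((d, k))          # frozen pretrained weights
A = rng.standard_normal((r, k)) * 0.01   # trainable down-projection
B = np.zeros((d, r))                     # trainable up-projection, zero-initialized

alpha = 16                               # LoRA scaling factor
W_eff = W + (alpha / r) * (B @ A)        # effective weights at inference

full_params = d * k
lora_params = r * (d + k)
print(f"full fine-tune: {full_params:,} params, LoRA: {lora_params:,} params")
# Because B starts at zero, W_eff equals W before any training step.
```

At these sizes LoRA trains roughly 65K parameters per matrix instead of ~16.8M, which is why the adapters fit comfortably on mid-tier GPUs.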
The Ollama CLI simplifies deployment of the LLaMA model by enabling fast, containerized local inference, which is far less resource-intensive than traditional deployments on heavy GPU servers. Ollama's lightweight runtime streamlines model serving without sacrificing performance or accuracy, reducing infrastructure cost and complexity and enhancing the project's adaptability and practical usability.
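A typical local deployment with the Ollama CLI might look like the following; the model names and Modelfile details are assumptions, not the project's exact configuration.

```shell
# Hypothetical Ollama workflow; names are illustrative.

# Pull a base model into the local Ollama runtime
ollama pull llama2

# Package custom weights/settings via a Modelfile (FROM + parameters)
ollama create news-classifier -f ./Modelfile

# Serve and query the model locally -- no GPU server required
ollama run news-classifier "Classify: 'Stocks rally as inflation cools'"
```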
CUDA-enabled GPUs provide the hardware acceleration needed to fine-tune large models efficiently. Models like LLaMA demand high memory and compute capacity, which CPUs alone handle inefficiently; GPUs, with their massively parallel architecture, accelerate the matrix operations that dominate training, improving the overall efficiency and feasibility of handling large-scale neural networks in the project.
Integrating Generative AI with news category classifiers addresses the explainability gap inherent in traditional models. While traditional classifiers can label articles accurately, they rarely expose the rationale behind a decision, which reduces user trust. Generative AI, by contrast, produces natural language explanations for each classification, improving engagement and transparency. The ability to justify classifications in human-like language makes the system better suited to real-world deployment in journalism and content moderation.
DistilBERT offers a lightweight, efficient solution for text classification: it is roughly 40% smaller and substantially faster than BERT while retaining most of its language-understanding performance. Pairing it with a generative model yields not only accurate categorization of news articles but also natural language explanations, enhancing the classification system by improving transparency and offering deeper context.
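The two-stage design described above (a fast classifier followed by a generative explainer) can be sketched with stub models standing in for the real DistilBERT and LLaMA calls; the function names and stubs here are hypothetical.

```python
# Sketch of a classify-then-explain pipeline. The stubs below stand in
# for the real DistilBERT classifier and LLaMA explainer; all names
# here are hypothetical illustrations.

from typing import Callable

def classify_then_explain(
    text: str,
    classifier: Callable[[str], str],
    explainer: Callable[[str, str], str],
) -> dict:
    """Stage 1: a fast classifier assigns the label.
    Stage 2: a generative model justifies that label in natural language."""
    label = classifier(text)
    explanation = explainer(text, label)
    return {"text": text, "category": label, "explanation": explanation}

# Stub stand-ins for the fine-tuned models:
def stub_classifier(text: str) -> str:
    return "SPORTS" if "cup" in text.lower() else "BUSINESS"

def stub_explainer(text: str, label: str) -> str:
    return f"The article was labelled {label} based on its key terms."

result = classify_then_explain("Underdogs lift the cup", stub_classifier, stub_explainer)
print(result["category"], "-", result["explanation"])
```

Keeping the classifier and explainer behind simple callables makes it easy to swap the stubs for real model inference later.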
The Streamlit-based UI supports engagement and usability through a clean interface and interactive widgets: st.text_input() and st.text_area() let users enter headlines or full articles, while loading indicators and category-wise color tags keep the experience intuitive. Components such as st.slider() and st.selectbox() collect dynamic user input, and st.chat_message() simulates a ChatGPT-like conversation, creating an engaging interaction that facilitates feedback and information flow.
The project uses several strategies to optimize loading and execution performance: Streamlit's @st.cache_resource decorator minimizes model load times by keeping the loaded model in memory across reruns, and dynamic loading indicators manage user expectations during inference. A lightweight runtime via the Ollama CLI enables fast local inference without heavy cloud infrastructure. Together, these strategies ensure efficient resource usage and keep the application responsive in real time.
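The caching behaviour that @st.cache_resource provides (load once, reuse across reruns) can be illustrated with the standard library's functools.lru_cache; this is an analogy, not Streamlit's implementation, and load_model is a hypothetical stand-in for the real model-loading code.

```python
import functools
import time

# @st.cache_resource keeps a loaded model in memory across Streamlit
# reruns. This sketch mimics that behaviour with functools.lru_cache;
# load_model is a hypothetical stand-in for real model loading.

@functools.lru_cache(maxsize=1)
def load_model(name: str) -> dict:
    time.sleep(0.1)               # simulate an expensive model load
    return {"name": name, "ready": True}

t0 = time.perf_counter()
load_model("news-classifier")     # first call: pays the load cost
first = time.perf_counter() - t0

t0 = time.perf_counter()
load_model("news-classifier")     # second call: served from cache
second = time.perf_counter() - t0

print(f"first load: {first:.3f}s, cached call: {second:.6f}s")
```

In the Streamlit app, the same pattern means the model loads once per process rather than on every user interaction, which is what keeps inference responsive.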