BIGMART SALES PREDICTION USING DATA SCIENCE 2023-24
Chapter 1
INTRODUCTION
1.1 Python Programming Language
Python is a high-level, general-purpose programming language. Its design philosophy emphasizes
code readability with the use of significant indentation.
Python is dynamically-typed and garbage-collected. It supports multiple programming
paradigms, including structured (particularly procedural), object-oriented and functional
programming. It is often described as a "batteries included" language due to its comprehensive
standard library.
Guido van Rossum began working on Python in the late 1980s as a successor to the ABC
programming language and first released it in 1991 as Python 0.9.0. Python 2.0 was released in 2000
and introduced new features such as list comprehensions, cycle-detecting garbage collection
(supplementing reference counting), and Unicode support. Python 3.0, released in 2008, was a major
revision that is not completely backward-compatible with earlier versions. Python 2 was discontinued
with version 2.7.18 in 2020.
Python consistently ranks as one of the most popular programming languages.
Why Python?
Python has many libraries and frameworks; among the most popular are TensorFlow, Scikit-Learn,
NumPy, Keras, Theano, and Pandas. Python is widely used in big data and machine learning. It is also
used in web development by some of the most popular sites and products in the world, such as Spotify,
Instagram, Pinterest, Mozilla Firefox, and Yelp.
Python's popularity has risen dramatically in recent years and shows no signs of slowing. According
to popularity rankings published on the Web, Python ranks second among programming languages. It
reportedly grew by about 50 percent in a single year, and a further rise of about 50 percent was
anticipated for 2022.
1.2 Artificial Intelligence and ML
Artificial intelligence (AI) refers to the simulation of human intelligence in machines that are
programmed to think like humans and mimic their actions. The term may also be applied to
any machine that exhibits traits associated with a human mind such as learning and problem-solving.
The main characteristics of AI are autonomy and adaptivity.
Autonomy: The ability to perform tasks in complex environments without constant guidance by a
user.
Adaptivity: The ability to improve performance by learning from experience.
Examples: self-driving cars, content recommendation (Netflix), GPS navigation systems for
providing suggestions on the best route, and smart assistants like Siri and Alexa.
The ideal characteristic of artificial intelligence is its ability to rationalize and take actions that have
the best chance of achieving a specific goal. A subset of artificial intelligence is machine learning
(ML), which refers to the concept that computer programs can automatically learn from and adapt to
new data without being assisted by humans. Deep learning techniques enable this automatic learning
through the absorption of huge amounts of unstructured data such as text, images, or video.
1.3 Motivation
Competition among shopping malls and big marts is growing more intense by the day, driven by the
rapid growth of global malls and online shopping. Every mall or mart tries to attract more customers
with personalized, short-term offers that depend on the day, so the volume of sales for each item needs
to be predicted for the organization's inventory management, logistics, transport services, and so on.
Present-day machine learning algorithms are sophisticated and provide techniques to predict or
forecast an organization's future sales demand, helped by the cheap availability of computing and
storage systems.
Big Mart is a grocery supermarket brand. The brand started its journey with free home delivery of
food and groceries. Big Mart lets you walk away from the drudgery of grocery shopping and welcome an
easy, relaxed way of browsing and shopping for groceries. Discover new products and shop for all your
food and grocery needs from the comfort of your home or office. No more getting stuck in traffic jams,
paying for parking, standing in long queues, and carrying heavy bags: get everything you need, when
you need it, right at your doorstep.
1.4 Problem Statement and Objectives
Most business organizations depend heavily on a knowledge base and on demand prediction of
sales trends. Sales forecasting is the process of estimating future sales. Accurate sales forecasts enable
companies to make informed business decisions and predict short-term and long-term performance.
Companies can base their forecasts on past sales data, industry-wide comparisons, and economic
trends. Sales forecasts help sales teams achieve their goals by identifying early warning signals in
their sales pipeline so that they can correct course before it is too late.
➢ Objectives
The goal is to improve on the accuracy of the existing project so that companies can increase their
sales and profit. To improve prediction further, different algorithms are compared and the most
efficient one is chosen.
The objectives of this thesis are:
• Performing data pre-processing and exploratory data analysis on the data set.
• Performing data encoding to convert categorical features into numerical form.
• Splitting the data set into training and testing sets.
• Performing feature scaling to normalize the range of individual features.
• Building ML models using the training data.
• Evaluating the models on the test set using metrics such as accuracy.
• Comparing the performance of these models and identifying the most efficient algorithm.
Chapter 2
INTERNSHIP WORKFLOW
Day 1: Discussion on the area of internship work.
Day 2: Discussion on the OS platform to work on.
Day 3: Installation of Python IDE.
Day 4: Discussion on Python basics such as print and input.
Day 5: Discussion on Python data types such as list and tuple.
Day 6: Discussion on data types such as dictionaries.
Day 7: Discussion on the if and elif statements.
Day 8: Discussion on the for and while loops.
Day 9: Discussion on the functions and scopes.
Day 10: More Discussion on functions and recursion.
Day 11: Discussion on OOP basics such as classes and objects.
Day 12: Discussion on OOP concepts such as abstraction and polymorphism.
Day 13: Discussion on OOP concepts such as inheritance.
Day 14: Discussion on the matplotlib library.
Day 15: Discussion on the numpy library.
Day 16: Discussion on the pandas library.
Day 17: Discussion on the seaborn libraries.
Day 18: Implementation of seaborn on different plots.
Day 19: Discussion on different types of graphs in data visualization.
Day 20: Implementation of the numpy, pandas and matplotlib libraries on the IPL dataset.
Day 21: Implementation of libraries on tips dataset.
Day 22: Basics of data science like mean, median, mode and SD.
Day 23: Implementation of data science on probability and logistic regression.
Day 24: A data science project with the analysis of dataset.
Day 25: Discussion on machine learning basics.
Day 26: Discussion on machine learning algorithms.
Day 27: A final demo of the data science and machine learning project work.
Chapter 3
LITERATURE SURVEY
➢ Sunitha Cheriyan “Intelligent Sales Prediction Using Machine Learning Techniques.”
This research carries out a detailed study and analysis of comprehensible predictive models to
improve future sales predictions. Traditional forecast systems find it difficult to deal with big
data and with the accuracy of sales forecasting. The models implemented for prediction are Random
Forest, Gradient Boosting and Extremely Randomized Trees (Extra Trees) classifiers. Random
Trees was confirmed to be very effective.[1]
➢ Shuyun Ren “Forecasting the Retail Sales of China’s Catering Industry Using Support
Vector Machines.”
This paper studies the forecasting of China's catering retail sales, taking the seasonal impact
into account. The retail sales were predicted using the seasonal auto-regressive integrated moving
average (ARIMA) model and support vector machines (SVM). The SVM method was found to be clearly
superior to the seasonal ARIMA method for both long-term and short-term forecasting.[2]
➢ Avinash Kumar Sharma “An Intelligent Model For Predicting the Sales of a Product.”
The approach shown in this paper is a systematic, accurate and precise model-building process used
to compute the current scenario and predict the future projection of a product in the market. The
methods used are the random forest algorithm and a neural network, with the neural network listed as
the preferred method.[3]
➢ Renesa Ray “Sales Prediction Using Machine Learning Algorithms.”
The aim of this paper is to propose an approach for predicting the future sales of Big Mart
companies, keeping in view the sales of previous years. A comprehensive study of sales prediction
is done using machine learning models: Linear Regression, K-Neighbours Regressor, XGBoost
Regressor and Random Forest Regressor. The Random Forest algorithm is found to be the most
suitable.[4]
➢ Pratik Patil “Comparison of Different Machine Learning Algorithms for Multiple
Regression on Black Friday Sales Data.”
This study focuses on prediction models, aiming to develop an accurate and efficient algorithm
that analyzes customers' past spending and outputs the future spending of customers with the
same features. The methods used are regression, decision tree and XGBoost.[5]
➢ Auddie Heichmeir “Forecasting of Walmart Sales using Machine Learning Algorithm.”
The ability to predict data accurately is extremely valuable in a vast array of domains such as
stocks, sales, weather or even sports. The data consists of weekly retail sales numbers from
different departments in Walmart retail outlets all over the United States of America. The models
implemented for prediction are Random Forest, Gradient Boosting and Extremely Randomized
Trees (Extra Trees) classifiers. Random Trees was confirmed to be very effective.[6]
➢ Narayana R “Sales Prediction For Big Mart.”
A retail company wants a model that can predict sales accurately so that it can keep track of
customers' future demand and update the sales inventory in advance. This work proposes a
technique to optimize the parameters and select the best-tuned hyperparameters, further ensembled
with XGBoost techniques, for forecasting the future sales of a retail company such as Big Mart.
Experimental analysis found that the technique produces more accurate results.[8]
➢ Kaneko and Yada “A Deep Learning Approach for the Prediction of Retail Store Sales.”
The purpose of this research is to construct a sales prediction model for retail stores using the deep
learning approach, which has gained significant attention in the rapidly developing field of machine
learning in recent years. Using such a model for analysis, an approach to store management could
be formulated. The model is compared with a logistic regression model: the accuracy decreased by
around 13% when the logistic regression model was used.[7]
Chapter 4
SOFTWARE INSTALLATIONS
4.1 Installation of Python IDE
1) To download and install Python, visit the official Python website and choose your version.
2) Once the download is complete, run the .exe file to install Python, then click on “Install Now”.
3) You can see Python installing at this point.
4) When it finishes, you will see a screen that says the setup was successful. Now click on “Close”.
4.2 Installation of Jupyter Notebook
1) Go to the Anaconda website and download the Anaconda distribution for your operating system
(Windows, macOS, or Linux).
2) Follow the installation instructions for your platform, and make sure to add Anaconda to your
system's PATH during installation.
3) After installing Anaconda, you can open Anaconda Navigator, which is a graphical user interface
for managing your Anaconda environment and packages.
4) Install Jupyter Notebook:
➢ In Anaconda Navigator, switch to the "Home" tab.
➢ Select your desired environment (or the base environment if you have not created one).
➢ In the "Applications on" dropdown, select "Not Installed."
➢ Search for "Jupyter Notebook" and click the "Install" button next to it.
5) Once Jupyter Notebook is installed, you can launch it from Anaconda Navigator. In the "Home"
tab, select your environment (or the base environment), and click the "Launch" button next to
"Jupyter Notebook."
6) After launching Jupyter Notebook, your web browser should open to the Jupyter Notebook
dashboard. From there, you can create a new Jupyter Notebook by clicking on the "New" button
and selecting a Python environment (e.g., Python 3) for your new notebook.
4.3 Installation of python-based AI libraries
1) Start by opening Jupyter Notebook.
2) In the Jupyter Notebook dashboard, create a new Jupyter Notebook by clicking on the "New"
button and selecting a Python environment (e.g., Python 3) for your new notebook.
3) You can install Python-based AI libraries directly within a Jupyter Notebook cell by using the !
(exclamation mark) to run terminal commands. For example, to install NumPy, pandas, Matplotlib,
and scikit-learn, create a new cell and run:
!pip install numpy pandas matplotlib scikit-learn
4) After the installation is complete, import the libraries into your Jupyter Notebook and verify
that they work.
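The verification in the last step can be done with a short cell like the one below. This is a minimal sketch: the helper name check_libraries is our own, and the only detail worth noting is that scikit-learn is imported under the name sklearn.

```python
import importlib

# pip name -> import name (scikit-learn is imported as "sklearn").
LIBRARIES = {
    "numpy": "numpy",
    "pandas": "pandas",
    "matplotlib": "matplotlib",
    "scikit-learn": "sklearn",
}

def check_libraries(libraries):
    """Return {pip_name: version string, or None if the import fails}."""
    report = {}
    for pip_name, import_name in libraries.items():
        try:
            module = importlib.import_module(import_name)
            report[pip_name] = getattr(module, "__version__", "unknown")
        except ImportError:
            report[pip_name] = None
    return report

report = check_libraries(LIBRARIES)
for name, version in report.items():
    status = f"version {version}" if version else "NOT installed"
    print(f"{name}: {status}")
```

Any library reported as "NOT installed" can be installed again with the !pip command shown above.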
Chapter 5
ALGORITHMS AND METHODOLOGY
5.1 Algorithm:
In this research, various supervised machine learning algorithms are described, demonstrated and
assessed in their ability to predict sales. This section provides a general overview of the theory
behind these algorithms.
1. Decision Tree (DT)
A decision tree is a supervised method that builds classification or regression models in a tree-like
structure. It is an established method that was first published in 1963 by Morgan and Sonquist.
The decision tree method is:
(1) conceptually easy yet powerful;
(2) intuitive to interpret;
(3) capable of handling missing values and mixed features; and
(4) able to select variables automatically.
However, its predictive power is not highly competitive. A decision tree is usually unstable, with high
model variance: small variations in the input data can result in a large change in the tree
structure.
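As a rough illustration of how a regression tree picks a split, and in doing so selects variables automatically, the sketch below searches every feature and threshold for the binary split with the lowest sum of squared errors. The function names (sse, best_split) and the toy data are assumptions made for illustration; they are not taken from the project code.

```python
def sse(values):
    """Sum of squared deviations from the mean (0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def best_split(rows, targets):
    """Return (feature_index, threshold, error) of the best binary split."""
    best = (None, None, sse(targets))  # start from the error of "no split"
    n_features = len(rows[0])
    for j in range(n_features):
        for threshold in sorted({row[j] for row in rows}):
            left = [t for row, t in zip(rows, targets) if row[j] <= threshold]
            right = [t for row, t in zip(rows, targets) if row[j] > threshold]
            if not left or not right:
                continue  # a split must put data on both sides
            error = sse(left) + sse(right)
            if error < best[2]:
                best = (j, threshold, error)
    return best

# Hypothetical toy data: [item price, shelf visibility] -> sales
rows = [[10, 0.1], [12, 0.3], [30, 0.2], [35, 0.4]]
targets = [100.0, 110.0, 300.0, 310.0]
feature, threshold, error = best_split(rows, targets)
print(feature, threshold, error)
```

On this toy data the search settles on the price feature, because splitting on price separates the low-sales items from the high-sales items far better than splitting on visibility. A full tree repeats the same search recursively inside each branch.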
2. Random Forests (RF)
Random forests take an ensemble approach that improves on the basic decision
tree structure by combining a group of weak learners to form a stronger learner (see the paper by
Breiman). Ensemble methods utilize a divide-and-conquer approach to improve algorithm
performance. In random forests, a number of decision trees, i.e., weak learners, are built on
bootstrapped training sets, and a random sample of m predictors is chosen as split candidates from
the full set of P predictors for each decision tree. As m ≪ P, the majority of the predictors are not
considered. In this case, the individual trees are unlikely all to be dominated by a few influential
predictors. By taking the average of these uncorrelated trees, a reduction in variance can be attained,
making the final result less variable and more reliable.
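The idea can be sketched in a few lines of plain Python. To keep the example short, each "tree" here is a one-split stump, so choosing m predictors per tree and per split is the same thing; the function names, the toy data and the settings (25 trees, m = 1) are assumptions for illustration only.

```python
import random

def fit_stump(rows, targets, feature_ids):
    """Fit a one-split regression tree using only the given candidate features."""
    overall = sum(targets) / len(targets)
    best = None  # (error, feature, threshold, left_mean, right_mean)
    for j in feature_ids:
        for threshold in sorted({row[j] for row in rows}):
            left = [t for row, t in zip(rows, targets) if row[j] <= threshold]
            right = [t for row, t in zip(rows, targets) if row[j] > threshold]
            if not left or not right:
                continue
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            error = sum((t - lm) ** 2 for t in left) + sum((t - rm) ** 2 for t in right)
            if best is None or error < best[0]:
                best = (error, j, threshold, lm, rm)
    if best is None:  # no valid split in this sample: predict the mean
        return lambda row: overall
    _, j, threshold, lm, rm = best
    return lambda row: lm if row[j] <= threshold else rm

def fit_forest(rows, targets, n_trees=25, m=1, seed=0):
    """Train n_trees stumps, each on a bootstrap sample and m random predictors."""
    rng = random.Random(seed)
    n, p = len(rows), len(rows[0])
    trees = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]   # bootstrap sample (with replacement)
        feature_ids = rng.sample(range(p), m)        # m of the P predictors
        trees.append(fit_stump([rows[i] for i in idx],
                               [targets[i] for i in idx], feature_ids))
    return trees

def predict_forest(trees, row):
    """Average the predictions of all trees."""
    return sum(tree(row) for tree in trees) / len(trees)

# Hypothetical toy data: [price, visibility, promotion flag] -> sales
rows = [[10, 0.1, 1], [12, 0.3, 0], [30, 0.2, 1], [35, 0.4, 0], [11, 0.2, 0], [33, 0.1, 1]]
targets = [100.0, 110.0, 300.0, 310.0, 105.0, 305.0]
forest = fit_forest(rows, targets, n_trees=25, m=1, seed=0)
prediction = predict_forest(forest, [11, 0.2, 1])
print(round(prediction, 1))
```

In practice a library implementation such as scikit-learn's RandomForestRegressor would be used; the sketch only makes the bootstrap, the random predictor subset and the averaging step visible.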
3. Bagging and Boosting Classifier
Bagging and boosting are two types of ensemble learning. Both decrease the variance
of a single estimate because they combine several estimates from different models, so the result may
be a model with higher stability. The two terms are summarized below.
• Bagging
Bootstrap Aggregating, also known as bagging, is a machine learning ensemble meta-algorithm
designed to improve the stability and accuracy of machine learning algorithms used in statistical
classification and regression. It decreases the variance and helps to avoid overfitting. It is
usually applied to decision tree methods. Bagging is a special case of the model averaging
approach.
▪ Implementation Steps of Bagging
Step 1: Multiple subsets, each with the same number of tuples, are created from the original data set by
selecting observations with replacement.
Step 2: A base model is created on each of these subsets.
Step 3: Each model is trained in parallel on its own training subset, independently of the others.
Step 4: The final predictions are determined by combining the predictions from all the models.
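The four steps above can be sketched as follows. The base model here is a simple least-squares line on one feature, chosen only to keep the example short (bagging is more commonly applied to decision trees); the function names and the toy data are assumptions for illustration.

```python
import random

def fit_line(xs, ys):
    """Base model: ordinary least-squares line y = a + b*x for one feature."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:                 # every x in this sample is identical
        return my, 0.0
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b

def bagging_fit(xs, ys, n_models=20, seed=0):
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_models):
        # Step 1: bootstrap subset of the same size, drawn with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        # Steps 2-3: fit one base model per subset, independently of the others.
        models.append(fit_line([xs[i] for i in idx], [ys[i] for i in idx]))
    return models

def bagging_predict(models, x):
    # Step 4: combine the individual predictions by averaging.
    return sum(a + b * x for a, b in models) / len(models)

# Hypothetical toy data, roughly y = 2x + 1 with a little noise
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [3.1, 4.9, 7.2, 8.8, 11.1, 12.9]
models = bagging_fit(xs, ys)
prediction = bagging_predict(models, 3.5)
print(round(prediction, 2))
```

Because the models do not depend on each other, the loop in bagging_fit could be run in parallel, which is the point of Step 3.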
• Boosting Algorithms
There are several boosting algorithms. The original ones, proposed by Robert
Schapire and Yoav Freund were not adaptive and could not take full advantage of the weak
learners. Schapire and Freund then developed AdaBoost, an adaptive boosting algorithm that won
the prestigious Gödel Prize. AdaBoost was the first really successful boosting algorithm
developed for the purpose of binary classification. AdaBoost is short for Adaptive Boosting and
is a very popular boosting technique that combines multiple “weak classifiers” into a single
“strong classifier”.
▪ Algorithm:
1. Initialise the dataset and assign an equal weight to each data point.
2. Provide this as input to the model and identify the wrongly classified data points.
3. Increase the weights of the wrongly classified data points and decrease the weights of the correctly
classified data points, then normalize the weights of all data points.
4. If the required results have been obtained, go to step 5; otherwise, go to step 2.
5. End.
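A minimal sketch of these steps is given below for a one-feature, two-class problem with labels +1 and -1. The weak learner is a threshold rule ("stump"), and the model weight alpha = 0.5 * ln((1 - err) / err) is the usual textbook AdaBoost choice, which the steps above do not spell out; the function names and the toy data are assumptions for illustration.

```python
import math

def best_stump(xs, ys, weights):
    """Threshold classifier with the lowest weighted error on one feature."""
    best = None  # (weighted_error, threshold, polarity)
    for threshold in sorted(set(xs)):
        for polarity in (1, -1):
            preds = [polarity if x <= threshold else -polarity for x in xs]
            err = sum(w for w, p, y in zip(weights, preds, ys) if p != y)
            if best is None or err < best[0]:
                best = (err, threshold, polarity)
    return best

def predict(ensemble, x):
    """Weighted vote of all stumps collected so far."""
    score = sum(alpha * (polarity if x <= threshold else -polarity)
                for alpha, threshold, polarity in ensemble)
    return 1 if score >= 0 else -1

def adaboost(xs, ys, n_rounds=10):
    n = len(xs)
    weights = [1.0 / n] * n                       # step 1: equal weights
    ensemble = []                                 # (alpha, threshold, polarity)
    for _ in range(n_rounds):
        err, threshold, polarity = best_stump(xs, ys, weights)   # step 2
        err = min(max(err, 1e-10), 1 - 1e-10)     # avoid log(0) and division by zero
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, threshold, polarity))
        preds = [polarity if x <= threshold else -polarity for x in xs]
        # step 3: raise weights of misclassified points, lower the rest, normalize
        weights = [w * math.exp(-alpha * y * p) for w, y, p in zip(weights, ys, preds)]
        total = sum(weights)
        weights = [w / total for w in weights]
        # step 4: stop once every training point is classified correctly
        if all(predict(ensemble, x) == y for x, y in zip(xs, ys)):
            break
    return ensemble

# Hypothetical toy data that no single threshold can separate
xs = [1, 2, 3, 4, 5, 6]
ys = [1, 1, -1, -1, 1, 1]
ensemble = adaboost(xs, ys)
print([predict(ensemble, x) for x in xs])
```

No single stump classifies this data correctly, but the weighted vote of a few stumps does, which is what combining "weak classifiers" into a "strong classifier" means.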
4. Linear Regression (LR)
Linear regression is one of the most common ML and data analysis techniques. The algorithm is useful
for forecasting based on a linear regression equation. Linear regression combines a set of
independent features (x) to predict the output value (y), also called the dependent variable. The
linear equation assigns a factor to each independent variable; these factors are called coefficients
and are represented by β.
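As a small illustration, the sketch below recovers the coefficients β by least squares with NumPy. The data is synthetic and built from known coefficients (an intercept of 50 and slopes of 2 and -100, all chosen arbitrarily), so that the fitted values can be compared against them; the feature meanings in the comment are hypothetical.

```python
import numpy as np

# Hypothetical toy features: columns are [item price, item visibility].
X = np.array([[100.0, 0.05],
              [150.0, 0.10],
              [200.0, 0.02],
              [250.0, 0.08],
              [300.0, 0.04]])
beta_true = np.array([50.0, 2.0, -100.0])     # intercept, then one coefficient per feature

# Prepend a column of ones so that the first coefficient acts as the intercept.
X1 = np.column_stack([np.ones(len(X)), X])
y = X1 @ beta_true                            # y = b0 + b1*x1 + b2*x2 (noise-free)

# Least-squares estimate of the coefficients.
beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
y_hat = X1 @ beta
print(np.round(beta, 3))
```

With real, noisy data the fitted coefficients would only approximate the underlying relationship, and scikit-learn's LinearRegression would typically be used instead of calling lstsq directly.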
5. Ridge Method (RM)
Ridge regression is a method of estimating the coefficients of multiple-regression models in scenarios
where the independent variables are highly correlated. It has been used in many fields including
econometrics, chemistry, and engineering. Also known as Tikhonov regularization, named
for Andrey Tikhonov, it is a method of regularization of ill-posed problems. It is particularly useful to
mitigate the problem of multicollinearity in linear regression, which commonly occurs in models with
large numbers of parameters. In general, the method provides improved efficiency in parameter
estimation problems in exchange for a tolerable amount of bias (see bias–variance tradeoff).
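The effect on correlated features can be seen in a small NumPy sketch using the standard closed-form ridge estimate, beta = (XᵀX + λI)⁻¹ Xᵀy. The data is synthetic, the penalty strength λ = 1 is an arbitrary choice, and the intercept is omitted for brevity; none of this is taken from the project code.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge estimate: solve (X^T X + lam * I) beta = X^T y."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)

rng = np.random.default_rng(0)
x1 = rng.normal(size=50)
x2 = x1 + rng.normal(scale=0.01, size=50)     # x2 is almost a copy of x1
X = np.column_stack([x1, x2])
y = 3.0 * x1 + rng.normal(scale=0.1, size=50)

beta_ols = ridge_fit(X, y, lam=0.0)           # lam = 0 reduces to ordinary least squares
beta_ridge = ridge_fit(X, y, lam=1.0)
print("OLS:  ", np.round(beta_ols, 2))
print("Ridge:", np.round(beta_ridge, 2))
```

Because the two features are nearly identical, ordinary least squares can split the total effect between them in an unstable way, while the ridge penalty shrinks the coefficients towards a more balanced solution at the cost of a small bias.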
6. Stochastic Gradient Descent (SGD)
Stochastic Gradient Descent (SGD) is a variant of the Gradient Descent algorithm that is used for
optimizing machine learning models. It addresses the computational inefficiency of traditional
Gradient Descent methods when dealing with large datasets in machine learning projects. In SGD,
instead of using the entire dataset for each iteration, only a single random training example (or a
small batch) is selected to calculate the gradient and update the model parameters. This random
selection introduces randomness into the optimization process, hence the term “stochastic” in
Stochastic Gradient Descent.
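A minimal sketch of SGD for a one-feature linear model with squared-error loss is shown below. The learning rate, the number of epochs and the toy data are assumptions chosen so that the example converges quickly.

```python
import random

def sgd_linear_regression(xs, ys, lr=0.01, epochs=500, seed=0):
    """Fit y = w*x + b by updating on one randomly ordered example at a time."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    indices = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(indices)                  # visit the examples in random order
        for i in indices:
            error = (w * xs[i] + b) - ys[i]   # prediction error on a single example
            w -= lr * 2 * error * xs[i]       # gradient of error**2 with respect to w
            b -= lr * 2 * error               # gradient of error**2 with respect to b
    return w, b

# Hypothetical toy data generated from y = 2x + 1
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = sgd_linear_regression(xs, ys)
print(round(w, 3), round(b, 3))
```

Each update uses the gradient from a single example rather than from the whole dataset, which is what makes every step cheap on large datasets.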
Fig 5: BigMart flow chart
7. K-Nearest Neighbors (KNN)
K-nearest neighbors is a non-parametric algorithm used for classification and regression problems.
For classification problems, the idea is to identify the K data points in the training data that are closest
to the new instance and classify this new instance by a majority vote of its K neighbors. In practice,
the popular distance measures include the Euclidean distance, the Manhattan distance as well as the
Minkowski distance. For regression problems, the idea is to calculate the new instance value by
taking the average of its K neighbors. KNN could work well with a small number of features, but it
struggles when the feature dimensions increase drastically. See the book by Friedman, Hastie and
Tibshirani and the book by Murphy for further information.
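For the regression case, which is the one relevant to sales prediction, the idea can be sketched as follows using the Euclidean distance. The function name knn_predict and the toy data are assumptions for illustration.

```python
import math

def knn_predict(train_rows, train_targets, query, k=3):
    """Predict by averaging the targets of the k nearest training points."""
    distances = sorted(
        (math.dist(row, query), target)       # Euclidean distance to the query
        for row, target in zip(train_rows, train_targets)
    )
    nearest = [target for _, target in distances[:k]]
    return sum(nearest) / len(nearest)

# Hypothetical toy data: two features -> sales
train_rows = [[1, 1], [2, 1], [3, 2], [8, 8], [9, 9]]
train_targets = [100.0, 110.0, 120.0, 300.0, 320.0]
prediction = knn_predict(train_rows, train_targets, query=[2, 2], k=3)
print(prediction)
```

The three closest points to the query are the first three rows, so the prediction is the average of their targets. For classification, the average would be replaced by a majority vote over the labels of the K neighbors.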
5.2 Methodology:
Sales prediction is better treated as a regression problem than as a time series problem. Practice
shows that regression procedures can often give better results than time series techniques.
Machine learning algorithms make it possible to find patterns in the time series. The BigMart sales
dataset consists of 2013 sales data for 1559 products across 10 different stores in different cities.
We have two datasets: the train dataset, which has 8523 rows and 12 columns, and the test dataset, which
has 5681 rows and 11 columns. The train dataset has one extra column, which is the target variable. We
will predict this target variable for the test dataset. Calculations are done in the Python environment
using the main packages pandas, sklearn, numpy, matplotlib, seaborn, etc. To conduct the analysis, we
use Jupyter Notebook.
The goal of the BigMart sales prediction ML challenge is to build a regression model for predicting the
sales of each of the 1559 products for the following year in each of the 10 BigMart stores. The
BigMart sales dataset also includes certain attributes for each product and store. This model
helps BigMart understand the properties of products and stores that play an essential role in increasing
overall sales. We divided the entire analysis process into the following five stages:
1. Exploratory data analysis (EDA)
2. Data Cleaning
3. Feature Engineering & Feature Transformation
4. Model Building
5. Hyperparameter tuning and Evaluation
Each step is explained below in detail.
1. Exploratory data analysis (EDA)
In this phase, useful information is extracted from the dataset by comparing our hypotheses against
the available data. The analysis shows that the attributes Outlet Size and Item Weight have missing
values, and that the minimum value of Item Visibility is zero, which is not possible in practice. The
establishment year of the outlets varies from 1985 to 2009. These values may not be appropriate in
this form, so we need to convert them into how old a particular outlet is. There are 1559 unique
products and 10 unique outlets in the dataset. The attribute Item Type contains 16 unique values.
Item Fat Content has two types, but some entries are misspelled: regular instead of Regular, and low
fat or LF instead of Low Fat.
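Checks of this kind can be done with a few pandas calls. The sketch below uses a hypothetical four-row stand-in for the train dataset, and the underscore-style column names (Item_Weight, Outlet_Size and so on) are assumed spellings of the attributes named above.

```python
import pandas as pd

# Hypothetical miniature of the train dataset; the real one has 8523 rows.
train = pd.DataFrame({
    "Item_Identifier": ["FDA15", "DRC01", "NCD19", "FDA15"],
    "Item_Weight": [9.3, None, 8.93, 9.3],
    "Item_Fat_Content": ["Low Fat", "regular", "LF", "Regular"],
    "Item_Visibility": [0.016, 0.019, 0.0, 0.017],
    "Outlet_Size": ["Medium", None, "High", "Medium"],
    "Outlet_Establishment_Year": [1999, 2009, 1987, 1985],
})

missing = train.isnull().sum()                              # missing values per column
zero_visibility = int((train["Item_Visibility"] == 0).sum())
fat_labels = sorted(train["Item_Fat_Content"].unique())     # reveals inconsistent spellings
year_range = (int(train["Outlet_Establishment_Year"].min()),
              int(train["Outlet_Establishment_Year"].max()))

print(missing[missing > 0])
print("Zero visibility rows:", zero_visibility)
print("Fat content labels:", fat_labels)
print("Establishment years:", year_range)
```

On the real data, the same calls would surface the missing Outlet Size and Item Weight values, the zero visibilities and the inconsistent fat content labels described above.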
2. Data Cleaning
It was observed in the previous section that the attributes Outlet Size and Item Weight have
missing values. In our work, missing Outlet Size values are replaced by the mode of that
attribute, and missing Item Weight values are replaced by the mean of that attribute.
For these attributes, replacement by the mean and mode diminishes the correlation among the imputed
attributes. For our model, we assume that there is no relationship between the measured attribute
and the imputed attribute.
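The two replacements can be sketched with pandas as follows. The small data frame is hypothetical, and the column names Item_Weight and Outlet_Size are assumed spellings of the attributes named above.

```python
import pandas as pd

# Hypothetical sample with one missing value in each attribute.
df = pd.DataFrame({
    "Item_Weight": [9.0, None, 12.0, 15.0],
    "Outlet_Size": ["Medium", None, "Medium", "Small"],
})

# Item Weight: replace missing values by the mean of the attribute.
df["Item_Weight"] = df["Item_Weight"].fillna(df["Item_Weight"].mean())

# Outlet Size: replace missing values by the mode (most frequent value).
df["Outlet_Size"] = df["Outlet_Size"].fillna(df["Outlet_Size"].mode()[0])

print(df)
```

The mean suits the numeric weight, while the mode is used for Outlet Size because its values are categories rather than numbers.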
3. Feature Engineering & Feature Transformation
Some nuances were observed in the dataset during the data exploration phase. This phase resolves
all the nuances found in the dataset and makes it ready for building the appropriate
model. It was noticed that the Item Visibility attribute had zero values, which makes no practical
sense, so zero values are replaced by the mean Item Visibility of that product.
This makes all products likely to sell. Discrepancies in the categorical attributes are resolved
by mapping their values to the appropriate categories. In some cases, it was noticed that the fat
content property does not apply to non-consumables. To handle this, we create a third
category of Item Fat Content, i.e. none. In the Item Identifier attribute, it was found that the unique
ID starts with either DR, FD or NC. So, we create a new attribute Item Type New with three
categories: Foods, Drinks and Non-consumables. Finally, to capture how old a particular
outlet is, we add an additional attribute Year to the dataset.
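These transformations can be sketched with pandas as shown below. The data frame is a hypothetical four-row sample, the underscore-style column names are assumed spellings, the product mean is taken over the non-zero visibilities only, and 2013 is assumed as the reference year for the outlet age because the dataset contains 2013 sales.

```python
import pandas as pd

df = pd.DataFrame({
    "Item_Identifier": ["FDA15", "FDA15", "DRC01", "NCD19"],
    "Item_Visibility": [0.02, 0.0, 0.05, 0.01],
    "Item_Fat_Content": ["LF", "Low Fat", "regular", "low fat"],
    "Outlet_Establishment_Year": [1999, 1985, 2009, 1987],
})

# Zero visibility -> mean visibility of the same product (zeros excluded from the mean).
visibility = df["Item_Visibility"].mask(df["Item_Visibility"] == 0)
product_mean = visibility.groupby(df["Item_Identifier"]).transform("mean")
df["Item_Visibility"] = visibility.fillna(product_mean)

# Unify the spellings of the fat content categories.
df["Item_Fat_Content"] = df["Item_Fat_Content"].replace(
    {"LF": "Low Fat", "low fat": "Low Fat", "regular": "Regular"}
)

# New attribute from the first two letters of the identifier.
df["Item_Type_New"] = df["Item_Identifier"].str[:2].map(
    {"FD": "Foods", "DR": "Drinks", "NC": "Non-consumables"}
)

# Fat content does not apply to non-consumables: use a third category.
df.loc[df["Item_Type_New"] == "Non-consumables", "Item_Fat_Content"] = "None"

# Outlet age in years, stored in the additional attribute Year.
df["Year"] = 2013 - df["Outlet_Establishment_Year"]

print(df)
```

If a product has no non-zero visibility at all, its value would stay missing under this scheme and would need a fallback such as the overall mean.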
4. Model Building
After completing the previous phases, the dataset is ready for building the proposed model. Once the
model is built, it is used as a predictive model to forecast the sales of Big Mart. In our work, we build
models based on different algorithms such as Random Forest, Linear Regression, Lasso
Regression, Ridge Regression and Decision Tree, and compare them with other machine learning
techniques. All models receive the features as input, which are segregated into training and test
sets. The test dataset is used for sales prediction.
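A comparison of this kind can be sketched with scikit-learn as follows. The data here is synthetic (generated with make_regression) rather than the Big Mart data, the 80/20 split and all model settings are assumptions, and RMSE is used as the comparison metric, so the numbers say nothing about the project's actual results.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import Lasso, LinearRegression, Ridge
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for the prepared feature matrix and the sales target.
X, y = make_regression(n_samples=300, n_features=8, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

models = {
    "Linear Regression": LinearRegression(),
    "Ridge": Ridge(alpha=1.0),
    "Lasso": Lasso(alpha=1.0),
    "Decision Tree": DecisionTreeRegressor(random_state=0),
    "Random Forest": RandomForestRegressor(n_estimators=50, random_state=0),
}

rmse = {}
for name, model in models.items():
    model.fit(X_train, y_train)                       # train on the training set
    predictions = model.predict(X_test)               # predict on the held-out test set
    rmse[name] = float(np.sqrt(mean_squared_error(y_test, predictions)))

for name, value in sorted(rmse.items(), key=lambda item: item[1]):
    print(f"{name}: RMSE = {value:.2f}")
```

Since the synthetic target is an exactly linear function of the features plus noise, the linear models come out ahead here; on the real dataset the ranking can be quite different.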
5. Hyperparameter tuning and Evaluation
The next and final step in our project is tuning the parameters of every model and observing the
improvement in model performance. While this is an important step in modeling, it is by no means
the only way to improve performance.
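One common way to carry out such tuning is a cross-validated grid search. The sketch below applies scikit-learn's GridSearchCV to a random forest on synthetic data; the parameter grid, the three folds and the scoring choice are assumptions for illustration, not the settings used in this project.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in for the prepared training data.
X, y = make_regression(n_samples=200, n_features=6, noise=10.0, random_state=0)

# Candidate hyperparameter values to try (every combination is evaluated).
param_grid = {
    "n_estimators": [25, 50],
    "max_depth": [3, None],
}

search = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid,
    cv=3,                                      # 3-fold cross-validation
    scoring="neg_root_mean_squared_error",     # higher (closer to 0) is better
)
search.fit(X, y)

print("Best parameters:", search.best_params_)
print("Best cross-validated RMSE:", round(-search.best_score_, 2))
```

After fitting, search.best_estimator_ holds the model refitted on all the data with the best combination, ready to predict on the test set.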
Chapter 6
TESTING
Algorithm                           AUC          AUC         Run-time          Maximum Memory Utilization
                                    (Training)   (Holdout)   (Training)        (of 16 GB)
XGBoost                             0.88         0.86        16 min 12 sec     12%
Logistic Regression                 0.66         0.50        52 sec            20%
Naïve Bayesian                      0.64         0.59        59 sec            20%
Random Forest (Depth controlled)    0.79         0.51        23 min 10 sec     29%
SVM (RBF kernel)                    0.68         0.52        105 min 30 sec    21%
LDA                                 0.74         0.52        6 min 51 sec      35%
KNN (Euclidean distance)            0.52         0.5         180 min 12 sec    35%
TABLE 6: MODEL TESTING
Chapter 7
RESULTS
Fig 7.1: Head of Dataset
Fig 7.2: Item Outlet Sales Distribution
Fig 7.3: Correlation heat map
Fig 7.4: Barplot between outlet and item_type
Fig 7.5: Barplot between count and outlet size
Fig 7.6: Distribution of Variable Outlet location type.
Fig 7.7: Barplot between count and Outlet type.
Fig 7.8: Impact of outlet identifier on outlet sale.
Chapter 8
CONCLUSION
In the present era of a digitally connected world, every shop needs to know the demand for its products
and the needs of its customers. Extensive research in this area is under way at the enterprise level
for accurate sales prediction. As the profit made by a company is directly proportional to the accuracy
of its sales predictions, big marts want more accurate prediction algorithms so that the company does
not suffer any losses. In this work, we have designed a predictive model by modifying the Random Forest
technique and experimented with it on the 2013 Big Mart dataset to predict the sales of a product from a
particular outlet. Experiments support that our technique produces more accurate predictions than
other available techniques such as decision trees and ridge regression.
KEY TAKEAWAYS FROM THE INTERNSHIP
▪ Through this internship, we learned about the importance of teamwork.
▪ It helped us to put our knowledge and skills in the respective fields into practice.
▪ It helped us improve our communication skills and learn about the importance of good
communication.
▪ It helped us understand the importance of feedback and work on the same.
▪ It helped us improve debugging skills in the written lines of code.
▪ It helped us to discover different software and packages available.
REFERENCES
[1] Sunitha Cheriyan, Shaniba Ibrahim, Saju Mohanan & Susan Treesa (2018). Intelligent Sales Prediction
Using Machine Learning Techniques.
[2] Xiangsheng Xie & Gang Hu (2008). Forecasting the Retail Sales of China’s Catering Industry.
[3] Avinash Kumar, Neha Gopal & Jatin Rajput (2020). An Intelligent Model For Predicting the Sales of
Product.
[4] Purvika Bajaj, Renesa Ray, Shivani Shedge & Shravani Vidhate (2020). Sales Prediction
Using Machine Learning Algorithms.
[5] Ching-Seh (Mike) Wu, Pratik Patil & Saravana Gunaseelan (2018). Comparison of Different Machine
Learning Algorithms for Multiple Regression on Black Friday Sales Data.
[6] Nikhil Sunil Elias & Seema Singh (2019). Forecasting of Walmart Sales Using Machine
Learning Algorithms.
[7] Yuta Kaneko & Katsutoshi Yada (2016). A Deep Learning Approach for the Prediction of Retail Store
Sales.
[8] Gopal Behera & Neeta Nain (2019). Sales Prediction For Big Mart.
Sales Analysis: Analyze sales data for a small business, identify trends and making sales predictions 23
Dept. of CSE 2023-24