Missing at Random

Missing at Random (MAR) refers to a situation where the likelihood of data being missing is related to other observed variables but not to the missing value itself. It is important for maintaining the reliability of research, as it allows for the use of advanced statistical techniques to handle missing data without bias. Techniques such as Multiple Imputation and Maximum Likelihood Estimation are commonly used to address MAR in various fields, including finance and healthcare.

Uploaded by

vibrantvibes2003

MISSING AT RANDOM (MAR)

Introduction to Missing Data


Missing data refers to situations where some values are not
available in a dataset. This can happen for many reasons such
as mistakes during data entry, technical problems in systems,
non-response in surveys, or equipment failure.
When data is missing, it can affect the quality and reliability
of research. If important values are absent, the results may not
properly represent reality. Therefore, identifying the reason
behind missing data is very important before conducting
analysis.
Example:
Suppose a financial analyst is examining the daily closing
prices of a company listed on the stock market. Due to a
technical issue in the trading platform, the closing prices for
two specific days were not recorded. As a result, those days
show missing values in the dataset.

Impact of Missing Data


Missing data can create several problems in research and
analysis:
• It may introduce bias in results.
• It reduces statistical power because the effective sample
size becomes smaller.
• It can weaken the overall reliability and validity of the
research findings.

Types of Missing Data


There are three main categories of missing data:
1. Missing Completely at Random (MCAR)
2. Missing at Random (MAR)
3. Missing Not at Random (MNAR)
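
The difference between the three categories can be seen in a small simulation. The sketch below is illustrative only: the variables `x` and `y`, the coefficients, and the missingness probabilities are all assumptions invented for the demonstration, not values from any real dataset. It deletes values of `y` under each mechanism and compares the mean of the remaining cases with the true mean of 0.

```python
import math
import random
import statistics

random.seed(0)

# Hypothetical data: x is always observed (e.g. trading volume),
# y is the variable that may go missing (e.g. transaction value).
n = 20000
x = [random.gauss(0, 1) for _ in range(n)]
y = [0.8 * xi + random.gauss(0, 0.6) for xi in x]  # true mean of y is 0


def sigmoid(t):
    return 1 / (1 + math.exp(-t))


def observed_mean(p_missing):
    """Mean of y over the cases that remain observed."""
    kept = [yi for yi, p in zip(y, p_missing) if random.random() >= p]
    return statistics.mean(kept)


# MCAR: every value has the same chance of being missing.
mcar = observed_mean([0.3] * n)
# MAR: the chance of being missing depends only on the observed x.
mar = observed_mean([sigmoid(2 * xi) for xi in x])
# MNAR: the chance of being missing depends on the value of y itself.
mnar = observed_mean([sigmoid(2 * yi) for yi in y])

print(f"full-data mean:          {statistics.mean(y):.3f}")
print(f"MCAR complete-case mean: {mcar:.3f}")
print(f"MAR complete-case mean:  {mar:.3f}")
print(f"MNAR complete-case mean: {mnar:.3f}")
```

In this setup, simply dropping incomplete cases stays close to the true mean only under MCAR. Under MAR the bias can be corrected because `x` is available for every case, whereas under MNAR the observed data alone do not contain enough information to correct it.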

Meaning of Missing at Random (MAR)


Missing at Random (MAR) occurs when the likelihood of
data being missing is related to other observed variables in the
dataset, but not related to the actual missing value itself.
In simple words, the reason why the data is missing can be
explained by information that is already available in the
dataset. Once these observed factors are taken into account,
whether a value is missing no longer depends on what that
value would have been.
MAR is widely used in statistical analysis because many
advanced methods for handling missing data are based on this
assumption.
Example:
During peak trading hours in the stock market, some
transaction records may not be saved because the system
becomes overloaded. The missingness is related to trading
volume (which is observed), not to the actual transaction
value.
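
The definition can also be written formally. The notation below is the conventional one and is an assumption added here, not part of the original notes: R is the indicator of which values are missing, Y_obs is the observed part of the data, and Y_mis is the missing part. In the trading example, volume belongs to Y_obs and the lost transaction values belong to Y_mis.

```latex
\begin{align*}
\text{MCAR:}\quad & P(R \mid Y_{\text{obs}}, Y_{\text{mis}}) = P(R) \\
\text{MAR:}\quad  & P(R \mid Y_{\text{obs}}, Y_{\text{mis}}) = P(R \mid Y_{\text{obs}}) \\
\text{MNAR:}\quad & P(R \mid Y_{\text{obs}}, Y_{\text{mis}}) \text{ still depends on } Y_{\text{mis}}
                    \text{ after conditioning on } Y_{\text{obs}}
\end{align*}
```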

Characteristics of Missing at Random


1. Missingness Depends on Observed Variables
In MAR, the probability that a value is missing is connected
to other recorded variables. For example, in financial
research, missing transaction data may depend on the time of
day or trading volume.
2. Not Dependent on the Missing Value Itself
After accounting for the observed variables, the probability
that a value is missing does not depend on whether that value
is high or low.
3. Detectable Patterns
Researchers can study how missingness relates to the observed
variables, for example by comparing missing rates across
groups or by regressing a missingness indicator on the
observed variables (logistic regression).
4. Assumption-Based Concept
MAR cannot be directly tested with certainty. It is assumed
based on logical reasoning, subject knowledge, and
examination of data patterns.
5. Suitable for Advanced Techniques
When the MAR assumption holds, researchers can use
methods such as Multiple Imputation and Maximum
Likelihood Estimation to manage missing data effectively.
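
The pattern check described in point 3 can be sketched as follows. The data are simulated, and the names `volume`, `value`, and the missingness model are hypothetical choices made for the demonstration. The code builds a missingness indicator and compares the missing rate between low-volume and high-volume records using a two-proportion z statistic.

```python
import math
import random
import statistics

random.seed(1)

# Hypothetical records: volume is always observed; value is lost more often
# when volume is high (a MAR mechanism built in for the demonstration).
n = 5000
volume = [random.gauss(0, 1) for _ in range(n)]
value = [10 + 2 * v + random.gauss(0, 1) for v in volume]
p_miss = [1 / (1 + math.exp(-(-1.5 + 1.5 * v))) for v in volume]
value_obs = [None if random.random() < p else val
             for val, p in zip(value, p_miss)]

# Missingness indicator: 1 if the value is missing, else 0.
r = [1 if v is None else 0 for v in value_obs]

# Compare the missing rate in the lower and upper halves of observed volume.
cut = statistics.median(volume)
low = [ri for ri, v in zip(r, volume) if v <= cut]
high = [ri for ri, v in zip(r, volume) if v > cut]
rate_low, rate_high = statistics.mean(low), statistics.mean(high)

# Two-proportion z statistic for the difference in missing rates.
pooled = statistics.mean(r)
se = math.sqrt(pooled * (1 - pooled) * (1 / len(low) + 1 / len(high)))
z = (rate_high - rate_low) / se

print(f"missing rate, low volume:  {rate_low:.3f}")
print(f"missing rate, high volume: {rate_high:.3f}")
print(f"z statistic: {z:.1f}")
```

A large difference like this is evidence against MCAR, since missingness clearly depends on an observed variable. It does not distinguish MAR from MNAR, which is why point 4 describes MAR as an assumption rather than a tested fact.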

Importance of MAR
1. Reduces Bias
When the data are MAR and an appropriate statistical technique
is used, estimates can be obtained without the bias that
missingness would otherwise introduce.
2. Practical and Realistic
In real-world situations, data is often missing due to known
and recorded factors. MAR reflects this practical condition
better than assuming the data is missing completely at random.
3. Supports Modern Statistical Methods
Many advanced analytical techniques are designed under the
MAR assumption, leading to better estimation and
interpretation.
4. Avoids Unnecessary Data Deletion
Instead of removing incomplete records, MAR allows
researchers to estimate missing values, which preserves
valuable information.
5. Maintains Statistical Power
By keeping more observations in the dataset, the precision of
the estimates and the statistical power of the study remain
higher.
6. Useful in Business and Finance
Missing data frequently occurs in financial research and stock
market studies. Treating such data under MAR helps maintain
the accuracy of financial models and improves decision-
making.

Methods to Handle Missing at Random (MAR)


1. Multiple Imputation
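This technique replaces each missing value with several
plausible values drawn from a model of the observed data,
creating several complete datasets. Each dataset is analyzed
separately, and the results are pooled so that the final
standard errors reflect the uncertainty about the imputed
values. It reduces bias and improves accuracy under MAR.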
2. Maximum Likelihood Estimation (MLE)
MLE uses the available observed data to estimate parameters
directly without filling in missing values. It provides
consistent and efficient estimates when MAR holds and the
model is correctly specified.
3. Expectation–Maximization (EM) Algorithm
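The EM algorithm works in two repeating steps. First, it
computes the expected contribution of the missing values given
the observed data and the current parameter estimates
(Expectation step). Then, it updates the model parameters
using those expectations (Maximization step). This process
continues until the estimates converge.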
4. Regression Imputation
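In this method, missing values are predicted using a
regression equation based on other observed variables. It is
simple, but because every imputed value falls exactly on the
regression line, it understates the variability of the data
and the standard errors unless random residual noise is added
(stochastic regression imputation).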
5. Full Information Maximum Likelihood (FIML)
FIML uses all available information in the dataset without
deleting incomplete cases. It directly estimates model
parameters and works well under MAR.
6. Weighted Estimation
This method weights each observed case by the inverse of its
estimated probability of being observed, so that cases similar
to the missing ones count more. It helps in adjusting bias,
especially in survey research.
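
To make the first method concrete, here is a minimal sketch of multiple imputation for estimating a mean, under stated assumptions. The data are simulated, the imputation model is a simple linear regression of `y` on one observed variable `x`, and parameter uncertainty is approximated by bootstrapping the observed cases. These are simplifying choices for illustration; production software typically uses more elaborate imputation models. The pooling step follows Rubin's rules.

```python
import math
import random
import statistics

random.seed(2)

# Hypothetical data: x fully observed, y missing more often for large x (MAR).
n = 2000
x = [random.gauss(0, 1) for _ in range(n)]
y_true = [5 + 2 * xi + random.gauss(0, 1) for xi in x]
y = [None if random.random() < 1 / (1 + math.exp(-1.5 * xi)) else yi
     for xi, yi in zip(x, y_true)]


def fit_line(pairs):
    """Least-squares intercept, slope and residual SD for y on x."""
    xs, ys = zip(*pairs)
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sxx = sum((a - mx) ** 2 for a in xs)
    slope = sum((a - mx) * (b - my) for a, b in pairs) / sxx
    intercept = my - slope * mx
    resid = [b - (intercept + slope * a) for a, b in pairs]
    sd = math.sqrt(sum(e * e for e in resid) / (len(pairs) - 2))
    return intercept, slope, sd


observed = [(xi, yi) for xi, yi in zip(x, y) if yi is not None]
M = 20  # number of imputed datasets
estimates, variances = [], []
for _ in range(M):
    # Bootstrap the observed cases so each imputation model reflects
    # uncertainty in the regression parameters.
    boot = random.choices(observed, k=len(observed))
    b0, b1, sd = fit_line(boot)
    # Fill each missing y with a prediction plus random residual noise.
    completed = [yi if yi is not None else b0 + b1 * xi + random.gauss(0, sd)
                 for xi, yi in zip(x, y)]
    estimates.append(statistics.mean(completed))
    variances.append(statistics.variance(completed) / n)

# Rubin's rules: pool the M estimates and their variances.
q_bar = statistics.mean(estimates)
within = statistics.mean(variances)
between = statistics.variance(estimates)
total_var = within + (1 + 1 / M) * between

cc_mean = statistics.mean(yi for _, yi in observed)
print(f"complete-case mean: {cc_mean:.3f}")
print(f"pooled MI mean:     {q_bar:.3f} (SE {math.sqrt(total_var):.3f})")
print(f"full-data mean:     {statistics.mean(y_true):.3f}")
```

The between-imputation term is what separates this from a single regression imputation: it adds the extra uncertainty caused by not knowing the missing values, so the pooled standard error is larger than the one a single completed dataset would report.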

Case Study: Missing at Random (MAR) in Mutual Fund Investment Research

Title: Missing Investment Amount in Mutual Fund Study

Background
A financial research company conducted a study to examine
the relationship between investor age, investment experience,
type of mutual fund, and amount invested. While most
variables were recorded, some investors did not disclose their
investment amount.
Problem
Approximately 20% of the investment amount data was
missing.
After analyzing the dataset, researchers noticed that missing
values were more common among new investors. Since
investment experience was already recorded, the missingness
appeared to be linked to this observed variable.
Why It Is MAR
The probability of missing investment amounts depended on
investment experience, which was available in the dataset.
However, within each experience group, the missingness was
assumed not to depend on the actual investment amount.
Therefore, the data was treated as Missing at Random (MAR).
Method Used
Researchers applied Regression Imputation using age and
investment experience to estimate the missing investment
amounts instead of deleting incomplete records.
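
A simplified version of this approach can be sketched in code. Everything in the example is hypothetical: the investor records are simulated, the amounts and non-disclosure rates are invented, and age is left out so that the regression has a single predictor, the experience group. With one categorical predictor, the regression prediction is simply the observed mean of each group.

```python
import random
import statistics

random.seed(3)

# Hypothetical investors: (experience group, invested amount).
# All figures are invented for illustration; they are not from the study.
n = 6000
investors = []
for _ in range(n):
    if random.random() < 1 / 3:
        investors.append(("new", random.gauss(20000, 5000)))
    else:
        investors.append(("experienced", random.gauss(50000, 10000)))

# MAR mechanism: non-disclosure depends only on the observed experience group.
p_missing = {"new": 0.4, "experienced": 0.1}
reported = [(g, None if random.random() < p_missing[g] else amt)
            for g, amt in investors]

# Regression on a single group indicator reduces to the observed group means.
group_mean = {
    g: statistics.mean(a for grp, a in reported if grp == g and a is not None)
    for g in p_missing
}
imputed = [a if a is not None else group_mean[g] for g, a in reported]

missing_share = sum(a is None for _, a in reported) / n
true_mean = statistics.mean(a for _, a in investors)
cc_mean = statistics.mean(a for _, a in reported if a is not None)
imp_mean = statistics.mean(imputed)

print(f"share missing:        {missing_share:.2f}")
print(f"true mean amount:     {true_mean:,.0f}")
print(f"complete-case mean:   {cc_mean:,.0f}")
print(f"after imputation:     {imp_mean:,.0f}")
```

Deleting incomplete records overstates the average amount here, because new investors, who invest less in this simulation, are the ones most likely to be dropped. Imputing within experience groups removes most of that bias for the mean, although the imputed values have no spread of their own, so the variance of the completed data is understated.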

Conclusion
To conclude, Missing at Random (MAR) is a key concept in
data analysis and research methodology. It describes
situations where missing data depends on observed variables
but not on the missing value itself.
MAR is more realistic than assuming the data is missing
completely at random and is frequently assumed in business,
finance, healthcare, and social science research. By applying
appropriate techniques such as Multiple Imputation or Maximum
Likelihood methods, researchers can minimize bias and
maintain accuracy.
Careful justification and treatment of MAR support valid,
reliable, and trustworthy research outcomes.
