Data Analysis: Variable Transformation Techniques
Data Analysis – 1
Transformation of variables
• The process of changing data from original form to a form that is more suitable to perform data
analysis.
• This approach is restricted to recoding variables where the categories or values have a
natural order from low to high.
• Often the meaning of a particular response to a question is best interpreted in relative
rather than in absolute terms.
• For example, how are we to regard the income level of a person who earns Rs. 30,000 a
month: is this low, medium or high?
• It depends on the other incomes with which it is compared. If most people earn less than
30,000, then it is relatively high; if most earn more, then it may be relatively low.
• Classify a particular value of a variable as high or low depending on the values of other people in
the sample.
• This approach to collapsing categories has the advantage of letting the data define what is low,
medium or high rather than us imposing some external, unrealistic definition.
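The idea of letting the data define what counts as low, medium or high can be sketched in a few lines of Python. This is only an illustration: the incomes are hypothetical, the helper name recode_relative is invented here, and splitting the sample into three equal-sized groups is one possible choice of cut points, not something the notes prescribe.

```python
from statistics import quantiles

def recode_relative(values, labels=("low", "medium", "high")):
    """Recode numeric values into ordered groups using the sample's own cut points."""
    # With three labels, quantiles() returns the two cut points that split
    # the sample into thirds (an assumed rule, chosen for illustration).
    cuts = quantiles(values, n=len(labels), method="inclusive")
    recoded = []
    for v in values:
        # The number of cut points a value exceeds is its group index.
        idx = sum(v > c for c in cuts)
        recoded.append(labels[idx])
    return recoded

# Hypothetical monthly incomes in rupees (not taken from the notes)
incomes = [12000, 18000, 22000, 30000, 31000, 45000, 60000, 75000, 90000]
groups = recode_relative(incomes)
for income, group in zip(incomes, groups):
    print(income, group)
```

In this sample an income of 30,000 falls in the middle group; in a sample where most people earn less, the same income would be recoded as high.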
Univariate Analysis
• Descriptive analysis
• 1. Tables,
• 2. Graphs, and
• 3. Summary statistics.
1. Frequency table
• Required information:
• Avoid unnecessary clutter. Normally the following information should be provided:
• 1 Number the table and give title;
• 2 Give labels for the categories of the variable;
• 3 Give column headings to indicate what the numbers in the column represent;
• 4 Give the total number on which the percentages are based;
• 5 Give the number of missing cases, if any;
• 6 Give the source of the data.
Let us take job satisfaction
• Assumption
• The variable was measured with multiple dimensions, elements, and statements (questions).
• It is an ordinal level variable.
• The data were converted into a score index which was further divided into three categories high
job satisfaction , medium job satisfaction, and low job satisfaction.
• The SPSS program has given a frequency table.
• Table 1. Level of job satisfaction of the employees
Level of job satisfaction    Frequency    Percent    Valid percent    Cumulative %
• Different ways:
• A scan of the valid percentages can quickly give a sense of the heterogeneity (diversity) and
homogeneity (similarity) in the sample.
• Cases divided into three categories of high level of motivation, medium level of motivation, and
low level of motivation.
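The columns of such a frequency table can be computed as in the sketch below. It borrows the job satisfaction totals and the number of missing cases from Table 3 later in these notes; the function name frequency_table is invented here, and the layout only approximates what a package such as SPSS prints.

```python
def frequency_table(counts, missing):
    """Return rows of (label, frequency, percent, valid percent, cumulative percent)."""
    valid_total = sum(counts.values())
    grand_total = valid_total + missing
    rows, cumulative = [], 0.0
    for label, freq in counts.items():
        percent = 100 * freq / grand_total   # based on all cases, including missing
        valid = 100 * freq / valid_total     # based on valid cases only
        cumulative += valid                  # running total of valid percent
        rows.append((label, freq, round(percent, 1), round(valid, 1), round(cumulative, 1)))
    return rows, valid_total, grand_total

# Column totals and missing cases as reported in Table 3
counts = {"Low": 475, "Medium": 503, "High": 439}
rows, valid_total, grand_total = frequency_table(counts, missing=279)

print("Level     Frequency  Percent  Valid %  Cumulative %")
for label, freq, pct, valid, cum in rows:
    print(f"{label:<9} {freq:>9} {pct:>8} {valid:>8} {cum:>13}")
print(f"Valid cases = {valid_total}, missing cases = {grand_total - valid_total}")
```

The percent column is based on all cases while the valid percent column leaves out the missing cases, which is why the two differ whenever there are missing data.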
2. Graphic analysis
• Univariate distributions can often be displayed effectively with graphs.
• Range of graphs that can be used to display the distributions.
• Bar chart, line graphs, area graphs, histogram, pie chart.
3. Summary of descriptive statistics
• Characteristics can be summarized with simple and concise statistical measures
• Range of univariate statistics designed for this purpose.
• The choice of statistic depends on the level of measurement of the variable and the aspect of
the distribution to be summarized.
• Central tendency: mean, median, mode.
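How the choice of statistic follows the level of measurement can be illustrated with Python's statistics module. All the data below are hypothetical, and the pairing of mode, median and mean with nominal, ordinal and interval variables is the usual textbook convention rather than a rule stated in these notes.

```python
from statistics import mean, median, mode

# Nominal or ordinal categories: the mode is always available
satisfaction = ["low", "medium", "medium", "high", "medium", "low", "high"]  # hypothetical
most_common = mode(satisfaction)

# Ordinal categories: the median needs the categories placed in rank order
rank = {"low": 1, "medium": 2, "high": 3}
median_rank = median(rank[s] for s in satisfaction)

# Interval-level values: the mean is meaningful, but one extreme value pulls it up
incomes = [20000, 25000, 30000, 35000, 90000]  # hypothetical
mean_income = mean(incomes)
median_income = median(incomes)

print(most_common, median_rank, mean_income, median_income)
```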
Inferential Statistics
• Does the pattern in the sample reflect the pattern in the population from which the sample was
drawn?
• Answering this type of question is the purpose of inferential analysis.
• There are two main approaches to inferential analysis: significance testing and making interval
estimates.
• The particular methods of doing inferential analysis depend on the level of measurement of the
variables.
• Significance tests for nominal and ordinal variables
Logic of significance testing
• It is standard to begin analysis by assuming a particular pattern in the population.
• For example, assume that the distribution of cases is even across the categories of the variable
(e.g. 50 per cent in both categories of a two-category variable). This assumption about the
population is called a null hypothesis.
• Examine the actual pattern in the sample. Is there the same percentage of cases in each
category in the sample? It came out 56/44.
• It is unlikely that the distribution of cases in the sample will exactly match the assumption made
about the population.
• The sample observation of 56/44 deviates from the 50/50 assumption for the population.
• There are two ways of interpreting the discrepancy between our assumption and the sample
observation.
• 1 The sample is unrepresentative. Despite random sampling techniques we can still obtain poor
samples. This is called sampling error.
• 2 The assumption of equal percentages in the population is incorrect. The difference between
the pattern in the sample and the assumption for the population is much greater than can be
accounted for by sampling error. If this is so then we would reject the null hypothesis of equal
percentages in both categories in the population.
Tests of statistical significance
• When we have two alternative ways of interpreting results (sampling error vs. real differences)
we have to have a way of working out which interpretation is correct. We do this with tests of
statistical significance.
• If there is no difference in the percentage of people in each category of the variable in the
population, how likely is it that we would obtain a random sample in which sampling error
produced a difference between categories as big as we have observed?
• By convention, we treat a difference as real if not more than five out of 100 random samples would
produce a difference that large due simply to sampling error.
• Our particular sample could have been one of those five.
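The question asked above can be answered exactly for the 56/44 example with a binomial calculation. The notes do not give the sample size, so the sizes of 100 and 1,000 used below are assumptions made only for illustration, and the function name is invented here.

```python
from math import comb

def binomial_two_sided_p(k, n):
    """Exact two-sided p-value for k of n cases in one category when the
    null hypothesis says the two categories are equally likely."""
    deviation = abs(k - n / 2)
    # Add up the number of outcomes at least as far from an even split as k is
    extreme = sum(comb(n, i) for i in range(n + 1) if abs(i - n / 2) >= deviation)
    return extreme / 2 ** n

# Assumed sample sizes; only the 56/44 split comes from the notes
p_small = binomial_two_sided_p(56, 100)     # 56/44 in a sample of 100
p_large = binomial_two_sided_p(560, 1000)   # 56/44 in a sample of 1,000

for label, p in (("n = 100", p_small), ("n = 1000", p_large)):
    verdict = "reject" if p <= 0.05 else "do not reject"
    print(f"{label}: p = {p:.4f}, {verdict} the null hypothesis at the 0.05 level")
```

The same 56/44 split is easily produced by sampling error in a small sample but very unlikely in a large one, so the decision depends on the sample size as well as on the size of the difference.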
Level of significance
• Level of confidence in the findings. 0.05 means 95% confidence.
• If we have a random sample then probability theory again provides the answer. If we took a
large number of random samples most will come up with percentage estimates close to that
which actually exists in the population. In only a few samples will the sample estimates be way
off the mark. In fact the sample estimates would approximate a ‘normal’ distribution.
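The claim that most random samples land close to the population figure can be checked with a small simulation. The population share of 50 per cent, the sample size of 100 and the 2,000 repetitions are all assumed values chosen for illustration.

```python
import random

def simulate_sample_percentages(true_share, sample_size, n_samples, seed=0):
    """Draw repeated random samples and return the share observed in each one."""
    rng = random.Random(seed)  # fixed seed so the run can be repeated
    estimates = []
    for _ in range(n_samples):
        hits = sum(rng.random() < true_share for _ in range(sample_size))
        estimates.append(hits / sample_size)
    return estimates

estimates = simulate_sample_percentages(true_share=0.5, sample_size=100, n_samples=2000)
average = sum(estimates) / len(estimates)
# Share of samples whose estimate is within 10 percentage points of the true value
close = sum(abs(e - 0.5) <= 0.10 for e in estimates) / len(estimates)

print(f"average estimate = {average:.3f}")
print(f"share of samples within 10 points of the truth = {close:.3f}")
```

A histogram of the estimates would show the bell shape the notes describe, with only a few samples far from the true value.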
One sample chi-square test
• Where the variable has three or more categories, test whether the differences between the
percentages across the categories are due to chance or are likely to reflect real percentage
differences in the population.
• The null hypothesis assumes that the percentages in the population are the same in all
categories of the variable.
• Test to see if the sample fits this assumption.
• The one sample chi-square test is used to assess whether any misfit between the sample
patterns and population assumptions is likely to be due to sampling error.
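A minimal sketch of the one sample chi-square test follows, using the three job satisfaction totals reported in Table 3 later in these notes. The function name is invented here. The p-value line relies on the fact that, with exactly two degrees of freedom, the chi-square p-value equals exp(-chi2 / 2); with other numbers of categories a statistical table or package would be needed.

```python
from math import exp

def one_sample_chi_square(observed):
    """Chi-square statistic against the null hypothesis of equal shares in all categories."""
    total = sum(observed)
    expected = total / len(observed)   # the same expected count in every category
    return sum((o - expected) ** 2 / expected for o in observed)

observed = [475, 503, 439]             # Low, Medium, High job satisfaction (Table 3 totals)
chi2 = one_sample_chi_square(observed)
df = len(observed) - 1                 # degrees of freedom = categories - 1
p_value = exp(-chi2 / 2)               # closed form valid only when df == 2
critical_value = 5.991                 # chi-square critical value for df = 2 at the 0.05 level

print(f"chi-square = {chi2:.2f}, df = {df}, p = {p_value:.3f}")
if chi2 > critical_value:
    print("Reject the null hypothesis of equal percentages.")
else:
    print("The misfit could easily be due to sampling error.")
```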
Bivariate analysis
Relationship between the variables
• Two variables are associated or related when the distribution of values on one variable differs
for different values of the other.
• When subgroups (defined by belonging to one category or another of a given variable) differ
systematically on another variable the variables are associated.
• Bivariate analysis provides a systematic way of measuring whether two variables are related and
if so how strongly they are related.
Explanatory research
• Univariate analysis describes variation.
• Bivariate analysis provides explanation for the said variation in the variable.
• Variety of ways of establishing whether two variables are related.
• Which methods are used depends on:
• the level of measurement of the variables,
• the number of categories of each variable, and
• the audience to which the analysis is directed.
Tabular method
• Cross-tabulations: displaying data for detecting an association between two variables.
• A set of frequency tables set side by side in one table.
• Let us take the two variables portrayed in the two univariate tables. Cross-tabulate.
• Six categories of each variable in the tables.
• Collapse the categories. Recode.
Elements of the crosstable
• A cross-tabulation consists of:
• 1 Labels and title: the title indicates the two variables being cross-tabulated (dependent variable
by independent variable), labels are provided for variables and for each category of both
variables.
• 2 Rows and columns: one column is allocated for each category of one variable and one row for
each category of the 2nd variable.
• 3 Cells and cell contents: cells represent cases who have both the characteristics indicated by the
column and the characteristic indicated by the row. The contents of each cell may be the
number of cases that have the two characteristics or the percentage with those characteristics.
• 4. Marginals
Marginals
• Marginals: these represent the total number or percentage of cases in a particular category of a
variable. These numbers will be very similar to the numbers in a frequency table for the same
variable.
• Crosstabulation puts the data of two univariate (frequency) tables in one.
• One variable on one axis and the other on the other.
• Where the X and Y axes meet, the values of both variables are the lowest. Therefore the categories with
the lowest levels of the two variables should be placed there.
Marginals
Table 3. Employees’ level of motivation by their level of job satisfaction
Level of          Level of job satisfaction
motivation        Low      Medium   High     Total
                  F        F        F        F
High                                         380
Medium                                       572
Low                                          465
Total             475      503      439      1417
Missing cases = 279
Source: Field data.
Complete this table
• Keeping in view the research hypothesis, complete this table in such a way that the hypothesis is
validated.
• Hypothesis: There is a positive association between the employees’ level of motivation and their
level of job satisfaction. Prove that:
• Variables are associated.
• The association is significant.
• Association is in the proposed direction.
• Is it linear?
Table 3. Employees’ level of motivation by their level of job satisfaction
Level of          Level of job satisfaction
motivation        Low      Medium   High     Total
                  F        F        F        F
High              25       30       325      380
Medium            150      338      84       572
Low               300      135      30       465
Total             475      503      439      1417
Missing cases = 279
Source: Field data.
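Whether the association in the completed Table 3 is significant can be checked with a chi-square test of independence. The notes do not name the test to use, so this is one common choice for two categorical variables rather than the prescribed method, and the function name is invented here.

```python
def chi_square_independence(table):
    """Chi-square statistic and degrees of freedom for a two-way table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if the two variables were unrelated
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df

# Rows: High, Medium, Low motivation; columns: Low, Medium, High job satisfaction
table3 = [
    [25, 30, 325],
    [150, 338, 84],
    [300, 135, 30],
]
chi2, df = chi_square_independence(table3)
critical_value = 9.488   # chi-square critical value for df = 4 at the 0.05 level

print(f"chi-square = {chi2:.1f}, df = {df}")
print("significant at 0.05" if chi2 > critical_value else "not significant at 0.05")
```

The test only says whether an association exists; its direction and nature still have to be read from the percentages.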
Percentaging a cross-table
• Easier to interpret percentages than raw numbers when trying to detect association in a table.
• Convert cell frequencies into percentages.
• Each cell frequency can be converted into three different percentages (row, column, and total percentages),
each having an entirely different meaning.
Steps in detecting relationship
• 1 Determine which variable is to be treated as X.
• 2 Choose appropriate cell percentages:
• 3 Compare the percentages for each subgroup of the independent variable within one category
of the dependent variable at a time.
• 4 If the independent variable is across the top, use column percentages and compare these
across the table. Any difference between these reflects some association.
• 5 If the independent variable is on the side use row percentages and compare these down the
table.
• Any difference between the percentages reflects some association. The size of the difference
indicates the strength of the association; whether it is significant has to be checked with a test of significance.
Table 4. Employees’ level of motivation by their level of job satisfaction
Level of          Level of job satisfaction
motivation        Low        Medium     High       Total
                  %          %          %          %
High              5.3        6.0        74.1       26.8
Medium            31.6       67.2       19.1       40.4
Low               63.1       26.8       6.8        32.8
Total             100.0      100.0      100.0      100.0
                  (N= 475)   (N= 503)   (N= 439)   (N=1417)
Missing cases = 279
Source: Field data.
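The column percentages in Table 4 can be reproduced from the frequencies in Table 3 as sketched below, treating job satisfaction (across the top) as the independent variable. The function name is invented here. Plain rounding gives 74.0 and 63.2 in two cells where the printed table shows 74.1 and 63.1, presumably because the printed figures were adjusted so that each column adds up to exactly 100.0.

```python
def column_percentages(table):
    """Convert each cell count into a percentage of its column total."""
    col_totals = [sum(col) for col in zip(*table)]
    return [
        [round(100 * cell / col_totals[j], 1) for j, cell in enumerate(row)]
        for row in table
    ]

# Rows: High, Medium, Low motivation; columns: Low, Medium, High job satisfaction
table3 = [
    [25, 30, 325],
    [150, 338, 84],
    [300, 135, 30],
]
pct = column_percentages(table3)
for label, row in zip(("High", "Medium", "Low"), pct):
    print(f"{label:<8}", row)

# Compare across the table: rising percentages in the first row (High motivation)
# and falling percentages in the bottom row (Low motivation) suggest a positive relationship.
first_row_rising = pct[0] == sorted(pct[0])
bottom_row_falling = pct[-1] == sorted(pct[-1], reverse=True)
print("pattern consistent with a positive relationship:", first_row_rising and bottom_row_falling)
```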
The character of relationship
• Once the relationship is determined from a table, describe its character.
• There are three aspects to look at:
• 1 strength
• 2 direction
• 3 nature.
Strength of relationship
• A strong relationship is one where the category of the independent variable to which a person
belongs makes a very substantial difference to their characteristics on the dependent variable.
• If there are large differences between subgroups (as defined by the categories of the
independent variable) there is a strong relationship.
• The cut-off for calling a relationship strong is, to some extent, an arbitrary decision.
Direction of relationship
• Relationship can be positive or negative.
• A positive relationship is one in which people who score high on one variable are more likely
than others to score high on the other variable; those who score low on one variable are more
likely than others to score low on the other variable.
• The simplest way of detecting a positive relationship between variables is to examine the first row of
the cross-tabulation (assuming the side variable is the dependent variable and is at least
measured at the ordinal level). Simply compare the column percentages across this first row. If
the percentages become larger as you move left to right across the row the relationship is
probably positive. Also check the bottom row. If the percentages decrease as you move left to
right across the bottom row the relationship probably is positive.
Nature of relationship
• Association of ordinal or interval variables can be linear, curvilinear or non-linear.
• A linear relationship means a ‘straight line’ relationship.
When to use tables
• Although tables provide maximum information they are often inappropriate, especially when
dealing with variables with a large number of categories.
• As a rule of thumb do not use tables when variables have more than six or seven categories and
even then only do so with relatively large samples so that there are sufficient numbers in the
categories.
Graphic presentation
• Outlines graphic methods and descriptive statistics suitable for relevant variables;
Using Summary Statistics
• While tables and graphs can provide detailed information about the way in which two variables
are associated, summary statistics can provide a very concise index of the extent to which two
variables are related.
• The main way in which to summarize the extent to which two variables are related is to use
correlation coefficients (also called measures of association).
• Appropriate test.
Bivariate analysis for interval level variables
• Interval-level variables frequently have a large number of different values.
• Explores techniques suitable to analyzing interval variables without having to collapse these
variables into a small number of categories;
• Describes methods of analysis when the dependent variable is interval-level and the
independent variable is categorical (comparison of means);
• Describes the use of correlation coefficients when both the independent and dependent
variables are interval-level (Pearson’s correlation and rank-order correlation);
• Introduces the analysis technique called regression analysis;
• Describes the use of tests of significance and interval estimates suitable for interval-level
variables.
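Pearson’s correlation and a simple regression line for two interval-level variables can be computed directly from their definitions, as in the sketch below. The scores are hypothetical and the function names are invented here; a statistics package would normally be used for real data.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson's correlation coefficient between two equally long lists of numbers."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def least_squares_line(x, y):
    """Slope and intercept of the regression line that predicts y from x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return slope, my - slope * mx

# Hypothetical scores for five employees
motivation = [1, 2, 3, 4, 5]
satisfaction = [2, 4, 5, 4, 5]

r = pearson_r(motivation, satisfaction)
slope, intercept = least_squares_line(motivation, satisfaction)
print(f"r = {r:.2f}; predicted satisfaction = {intercept:.1f} + {slope:.1f} * motivation")
```

The sign of r gives the direction of the relationship and its size the strength, while the regression line describes how much the dependent variable changes per unit of the independent variable.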
Ethics and data analysis
• Data analysis is not just a technical matter. Social scientists have ethical responsibilities to
analyze data properly and report it fairly.
• Selective reporting and selective, distorted analysis can readily paint a highly misleading picture.
• Huff’s book How to Lie with Statistics (1954) provides plenty of lighthearted examples of how
this can be done.
• Plenty of examples in scientific literature where people have either fabricated results entirely or
changed figures to make them appear more impressive.
Unethical practices in data analysis
• Experiments are easier to replicate than sample surveys, because for a survey the time and place become different.
• Results can be misrepresented without fabricating them. Inappropriate analysis of data.
• Inappropriate analysis may not be deliberate but may be due to lack of necessary skills to
analyze data.
• Inappropriate analysis can be just as misleading as deliberate falsification of data. Both are unethical.
• Instead of allowing the facts to speak for themselves, we make the facts speak for us.
Lecture # 11
Data Analysis-2
(Qualitative)
Data management
Interviews, field notes, texts, visual data, transcripts.
Some similarities between quantitative and qualitative research
Differences between the two methods of data collection and analysis.
Critical of each other.
But there are points of similarity in data collection, management, and analysis.
1. Both are concerned with data reduction
Both collect large amounts of data.
Both distill and make it manageable.
In quantitative research data reduction takes the form of statistical analysis. Frequency,
percentage, mean.
In qualitative research, develop concepts out of their often rich data.
2. Both are concerned with answering research questions
• Fundamentally concerned with answering research questions about the nature of social reality.
• More specific research questions in quantitative research.
• More open-ended research questions in qualitative research.
3. Both are concerned with relating data analysis to research literature
Both relate findings to points thrown up by the literature relevant to the topic.
Researcher’s findings take significance when related to a body of literature.
4. Both are concerned with variation
Seek to uncover variation.
How does the phenomenon under study differ?
What are the determinants of variation?
5. Both treat frequency as springboard for analysis
Frequency with which the phenomenon under study occurs.
In quantitative research it is in numbers (%).
In qualitative research it is reported like “often”, or “mostly”.
6. Both ensure that deliberate distortion does not occur
“Willful bias” or “ consciously motivated misrepresentation” does not occur.
No personal bias.
Quantitative analysts talk of objectivity; qualitative analysts aim to reach the subjects’ own versions.
Directly get the subject’s interpretation of reality.
7. Both argue for the importance of transparency
Both seek to be clear about their research procedures and how their findings were arrived at.
Lay down clear research design.
8. Both have research methods appropriate to the research questions
Both seek to ensure that, when they specify research questions, they select research methods
appropriate to address those questions.
Both at data collection and analysis stages.
Managing, analyzing and interpreting qualitative data
• Research design should include the appropriate plan.
• Some decision rules laid down.
• Mass of data. Like searching for noodles in a soup.
Qualitative data management
• Unlike quantitative research, there are no clear-cut rules about how qualitative data analysis should
be carried out.
• Qualitative research generates a large, cumbersome database because of its reliance on prose in the form of such
media as field notes, interview transcripts, or documents.
• Qualitative data have been called an “attractive nuisance” because of their richness but the difficulty of finding analytic
paths through them.
General strategy of data analysis
• Quantitative data analysis starts after data collection. Linear process
• Qualitative data analysis is iterative i.e. repetitive interplay between the collection and analysis
of data. Interwoven procedure.
• Nevertheless, some strategy needs to be spelled out in the research proposal.
Analytic induction strategy:
• A question is a prerequisite for every research.
• Strategy begins with a rough definition of a research question (problem),
• proceeds to hypothetical explanation of that problem, and
• then continues on to the collection of data (cases). Cases that support the explanation.
• In case of encountering an inconsistent case, the researcher either redefines the hypothesis or
excludes the deviant case. Process continues.
• Specifies the conditions that are sufficient for the occurrence of a phenomenon, but rarely
specifies the necessary conditions.
• Tells why some people have adopted some behavior (become drug addicts) but does not tell
why others have not done so.
• Does not tell how many cases to be studied for the confirmation of the validity of the
hypothetical explanation. No guidelines provided.
Grounded theory
• Strauss – name associated with grounded theory.
• A theory that is derived from data, systematically gathered and analyzed through the research
process.
• Data collection, analysis, and eventual theory stand in close relationship to one another.
• GT developed out of data and the approach is iterative –Data collection and analysis proceed in
tandem, repeatedly referring back to each other.
Take grounded theory as a framework for analysis of data
• GT is not a theory but an approach to the generation of theory out of data. A framework for
data analysis.
• In fact this approach generates concepts rather than a theory as such.
• Therefore:
• Grounded theory synonymous with inductive approach.
• It is a set of procedures (tools)
Procedures (tools) of grounded theory (4 Tools)
1. Theoretical sampling
• Theoretical sampling: process of data collection for generating theory whereby:
• the analyst jointly collects, codes, and analyzes his data and decides:
• - what data to collect next, and
• - where to find it, in order to develop his theory as it emerges. All based on researcher’s theory
(logic)
• The process of data collection is controlled by the emerging theory.
• On-going process
• Making comparisons to find out the variations
2. Theoretical coding
Theoretical coding is the procedure for analyzing data, which have been collected in order
to develop a grounded theory.
Coding is the key process in grounded theory.
Data are broken down into component parts which are given names.
Code the emerging data as it is collected. Based on researcher’s interpretation of data.
Can be different levels of coding.
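The mechanics of coding, breaking data into segments, naming them and counting how often each name occurs, can be sketched as follows. The interview segments and code names are entirely hypothetical, and deciding which code fits which segment remains the researcher’s interpretive work; the program only keeps the bookkeeping.

```python
from collections import Counter, defaultdict

# Hypothetical interview segments, each tagged by the researcher with one or more codes
coded_segments = [
    ("I stay late because my team depends on me", ["commitment", "peer pressure"]),
    ("Nobody notices when I do extra work", ["lack of recognition"]),
    ("I would help a colleague even if it delays my own tasks", ["commitment"]),
    ("The manager only talks to us when something goes wrong", ["lack of recognition"]),
]

# Index the segments under each code so they can be retrieved and compared later
segments_by_code = defaultdict(list)
for text, codes in coded_segments:
    for code in codes:
        segments_by_code[code].append(text)

# Frequency of each code, a springboard for judging what occurs often or rarely
code_counts = Counter(code for _, codes in coded_segments for code in codes)

for code, count in code_counts.most_common():
    print(f"{code}: {count} segment(s)")
```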
3. Theoretical saturation
Saturation is a process that relates to two phases in grounded theory:
- coding of data: you reach a point in reviewing your data to see how well they fit in with
concepts or categories.
- Collection of data: once a category has been developed, you may wish to continue
collecting data to determine its nature and operation but then reach a point where new
data are no longer illuminating the concept. Saturation point.
4. Constant comparison between data and conceptualization
Maintaining a close connection between data and conceptualization, so that
correspondence between concepts and categories with their indicators is not lost. Study of
feminism.
Compare phenomenon being coded under certain category so that theoretical elaboration
of that category begins to emerge.
Advice: write a memo on the category after a few phenomena had been coded.
Outcomes of grounded theory
• Concept(s): Labels given to discrete phenomena; concepts are referred to as building blocks of
theory; concepts are produced through open coding.
• Category, categories: a concept that has been elaborated so that it is regarded as representing
real world phenomena. Category may subsume 2+ concepts
• Categories are at higher level of abstraction around which other categories pivot.
• Properties: attributes or aspects of a category
• Hypotheses: initial hunches about relationships between concepts.
• Theory: A set of well developed categories … that are systematically related through statements
of relationship to form a theoretical framework that explains some relevant social … or other
phenomena.
• Substantive theory: empirical evidence of a substantive area (occupational socialization).
• Formal theory: Higher level of abstraction and wider range of applicability. Requires data
collection in contrasting setting.
Memos
• Memos are notes that researchers might write for themselves and for those with whom they
work.
• Serve as reminders about what is meant by the terms being used.
• Help in crystallizing ideas and in not losing track of various topics.
Limitations of grounded theory
1. Theory neutral observations are not possible
Researchers are aware of the existing concepts and theories. Can they suspend their
awareness until quite late in the process of analysis? No, because:
a)-- Conceptual armory of the discipline is already there. Researchers are aware and are
sensitive to it.
b)-- Observations are conditioned by many factors like what we already know about the
social world.
c)-- Researchers should be sensitive to the existing conceptualizations, and build on them.
d)-- Researchers are required to spell out how the present study will contribute to the body
of knowledge. They start with theories.
2. Doubtful whether the grounded theory really results in theory
-- Provides a rigorous approach to the generation of concepts, but it is often difficult to see
what theory, in the sense of an explanation of something, is being put forward.
-- Most grounded theories are substantive in character i.e. specific phenomenon.
3. Grounded theory still vague on certain points
What is the difference between concepts and categories?
4. GT is very much associated with an approach to data analysis
Instead of being constructionist, grounded theory is mostly objectivist.
Aims to uncover a reality that is external to social actors.
Concepts and categories are labeled by the researcher using his own conceptual armory.
Do not emerge out of the interaction of the researcher with the actors.
Lecture # 12
Data Analysis-2
(Qualitative)
Basic operations in qualitative data analysis
A continuum of analysis strategies
• Ideal types
• Prefigured technical (one extreme) Objectivist end. Categories stipulated in advance.
• Emergent intuitive (other extreme) Immersion/crystallization style. Templates. Developing
codes/names. Depends upon researcher’s intuitive capacities.
• Balance has to be struck. Problem with complexity of qualitative data.
• Nevertheless, analysis strategies have to be explained in the research proposal/research design.
Generic Data Analysis Strategies
• Not a linear way. Not neat.
• Qualitative data analysis is a search for general statements about relationships and underlying
themes; building of grounded theory.
• Generic term ‘analysis’ includes three activities:
Three major activities
• 1. Description
• 2. Analysis
• 3. Interpretation
• The three activities are not mutually exclusive. Overlap. Each category shows varying emphasis.
• Typical researcher starts analyzing early. Needs to analyze as he goes along (to adjust his
observation strategies). Nevertheless, some steps.
Analysis is data reduction
• Reams of collected data brought into manageable chunks, and interpretations made.
• Raw data have no inherent meaning; the interpretive act brings meaning to those data.
• Interpretation is the process of bringing meaning to raw, inexpressive data.
• Qualitative analysis transforms data into findings. No formula exists for that transformation.
Only some guidance. No recipe.
ANALYTICAL PROCEDURES
• Typical analytical procedures fall into seven phases:
• 1. Organizing the data;
• 2. Immersion in the data;
• 3. Generating categories and themes;
• 4. Coding the data;
• 5. Offering interpretations;
• 6. Searching for alternative understandings; and
• 7. Writing the report.
1. Organizing the data
Revisiting the “huge piles” of data.
List on note cards the data that have been collected, perform the minor editing necessary to
make field notes retrievable.
Log the types of data according to dates, names, times, and places where, when, and with
whom they were gathered.
Log of data gathering activities example
Date      Place      Activity       Who                   What
------    XYZ        Focus group    6 teachers (names)    Strategies for doing research
------    XYZ        Observation    Aisha’s classroom     Seeing how is her teaching?
-----     Aisha’s    Interview      Aisha’s spouse        Challenges, supports
The researcher could also enter the data in computer software program
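As a minimal sketch of what entering the log in a program might look like, the entries from the example log can be held as records and then filtered. The class and function names are invented here, no particular qualitative analysis package is implied, and the dates are left as the placeholders used in the example.

```python
from dataclasses import dataclass

@dataclass
class LogEntry:
    date: str       # left as a placeholder, as in the example log
    place: str
    activity: str
    who: str
    what: str

log = [
    LogEntry("------", "XYZ", "Focus group", "6 teachers (names)", "Strategies for doing research"),
    LogEntry("------", "XYZ", "Observation", "Aisha's classroom", "Seeing how is her teaching?"),
    LogEntry("-----", "Aisha's", "Interview", "Aisha's spouse", "Challenges, supports"),
]

def entries_by_activity(entries, activity):
    """Retrieve all logged data-gathering events of one type."""
    return [e for e in entries if e.activity == activity]

for entry in entries_by_activity(log, "Interview"):
    print(entry.place, "|", entry.who, "|", entry.what)
```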
Call it: Data Preparation phase
• What data to be analyzed? Make a transcript of in-depth interviews or FGD
• Transcribing the data: translate from oral to written language. It is truth.
• Several key issues arise about how the data will be collected.
Key issues:
• Will you videotape or audiotape the session?
• Will you transcribe the entire data session? Summarize key passages.
• Will you transcribe all types of data you collect (laughter, pauses, emotions, non-verbal data –
hand gestures)?
• Who will transcribe your data?
• What format? How will you represent the respondent’s voice, nonverbal information, and so
on?
2. Immersion in the data
Reading, re-reading the data forces the researcher to be intimately familiar with people,
events, and quotations.
Description is of course part of it. Prepare an appropriate schema for it (e.g. data recording
charts). These help streamline data management and ensure reliability across several
researchers.
Guard against losing serendipitous findings.
3. Generating categories and themes
This phase is the most difficult, complex, ambiguous, creative, and fun.
Category generation involves noting patterns evident in the setting.
Look for categories whose meanings are internally convergent (consistent) and externally
divergent (distinct). The search is for exhaustive and mutually exclusive categories, similar to what is part of
positivism.
Categories become buckets or boxes into which segments of text are placed.
Since it is an inductive analysis, therefore:
• Discover patterns, themes, and categories in the data.
• Look for “indigenous typologies” created or expressed by the participants.
• Finally, these are analyst-constructed typologies grounded in the data (not explicitly used by
people). Observer’s world rather than the world under study.
• Call it a terminology development process. Indexing
• Through logical reasoning the categories could be cross-classified. Matrix.
• An empirical typology of teacher roles in dealing with high school dropouts
Teachers’ beliefs about     Behaviors towards dropouts
how to intervene            Taking responsibility            Shifting responsibility
Rehabilitation              Counselor/friend:                Referral agent:
                            help kids directly               refer them to other agents
Maintenance (caretaking)    Traffic cop:                     Ostrich:
                            just keep them through           ignore the situation and hope
                            the system                       someone else does something
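The cross-classification behind this typology can be written as a small lookup table keyed by the two dimensions. The sketch below also checks that every combination of belief and behavior has exactly one role, which is what exhaustive and mutually exclusive categories require. The variable and function names are invented here.

```python
from itertools import product

beliefs = ("Rehabilitation", "Maintenance (caretaking)")
behaviors = ("Taking responsibility", "Shifting responsibility")

# Each cell of the matrix is one teacher role
typology = {
    ("Rehabilitation", "Taking responsibility"): "Counselor/friend",
    ("Rehabilitation", "Shifting responsibility"): "Referral agent",
    ("Maintenance (caretaking)", "Taking responsibility"): "Traffic cop",
    ("Maintenance (caretaking)", "Shifting responsibility"): "Ostrich",
}

def classify(belief, behavior):
    """Return the role for one combination of belief and behavior."""
    return typology[(belief, behavior)]

# Exhaustive: every combination has a role. Mutually exclusive: no role is used twice.
is_exhaustive = all(pair in typology for pair in product(beliefs, behaviors))
is_exclusive = len(set(typology.values())) == len(typology)

print(classify("Maintenance (caretaking)", "Shifting responsibility"))
print("exhaustive:", is_exhaustive, "| mutually exclusive:", is_exclusive)
```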
5. Offering interpretations
Interpretation of the salient findings, making sense of the findings. Locating hidden
realities.
Findings as offering explanations, drawing conclusions, extrapolating lessons, and making
inferences.
Provide integrative interpretations of what has been learnt. “Telling the story.” Story of OCB
i.e. what has been found? Ideal type. Model of OCB. Compare and contrast with ideal type.
Bring meaning and coherence to themes, patterns, and categories. Also develop linkages
and story line that makes sense.
Interpreting the reality
Compare and contrast with ideal types
Lesson 14
Thesis Writing
Every thesis is custom-made, yet some conventions of format apply
• Many universities have in-house, suggested formats or writing guides that researchers should be
aware of.
Make an original contribution to knowledge
• The distinguishing mark of research at your level is an original contribution to knowledge.
• The thesis is a formal document whose sole purpose is to prove that you have made an original
contribution to knowledge.
• Failure to prove that you have made such a contribution generally leads to failure.
• Therefore:
Thesis must show two important things:
• 1. you have identified a worthwhile problem or question which has not been previously
answered, and
• 2. you have solved the problem or answered the question.
• Your contribution to knowledge generally lies in your solution or answer.
Make a clear statement of the question
• A very clear statement of the question is essential.
• To prove the originality and value of your contribution, you must present a thorough review of
the existing literature on the subject, and on closely related subjects.
• Then, by making direct reference to your literature review, you must demonstrate that your
question
• (a) has not been previously answered, and
• (b) is worth answering.
• Describing how you answered the question becomes easier.
A Generic Thesis Format:
• The general plan of organization for the parts.
• Tailoring the format to the research will help:
• To obtain the proper level of formality, and
• To decrease the complexity of the report.
• Formally a thesis/dissertation submitted to the university.
• Usually bound with a permanent cover.
• Consultants write long reports for organizations.
Nearly all theses/dissertations begin with four elements
• A title
• An abstract
• A table of contents
• An introduction
• These are all routine matters.
• The impression created at the start of thesis/dissertation is very important.
• Therefore the writing of the first few pages should never be regarded as a triviality. Some
suggestions:
The Title
• Give your research a short, effective title. It is subject to change, so keep notes for
possible changes.
• Titles should catch the readers’ attention and inform them about the main focus of the study.
• A two-part title is also possible: when you have too many words, break the title into a snappy
main title and a subtitle. Example: Policing the lying patient: Surveillance and
self-regulation on consultations with adolescent diabetics
• Have the most important words appear toward the beginning of the title.
• Avoid ambiguous or confusing words.
• Include key words that will help researchers in the future to find your work.
• Follow a marketing approach.
• The title should be short, catchy, and meaningful. Do not use more than 15 words.
The Abstract
• Should cover the following:
• Your research problem.
• Why that problem is important and worth studying.
• Your data and methods.
• Your main findings.
• Their implications in the light of other research.
• The word limit is usually 100-150 words. Say as much as possible in as few words as possible.
• Make your abstract lively and informative.
• Emphasize your problem and content, not the fieldwork techniques.
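The length rules for the title and the abstract can be checked mechanically. The following is a minimal sketch in Python, assuming the limits quoted in these notes (no more than 15 words for a title, 100-150 words for an abstract); the function names are hypothetical, and your university's guide may set different limits.

```python
def word_count(text: str) -> int:
    """Count whitespace-separated words in a piece of text."""
    return len(text.split())


def title_ok(title: str, max_words: int = 15) -> bool:
    """True if the title is non-empty and has no more than max_words words."""
    return 0 < word_count(title) <= max_words


def abstract_ok(abstract: str, low: int = 100, high: int = 150) -> bool:
    """True if the abstract length falls inside the assumed low-high window."""
    return low <= word_count(abstract) <= high


# The two-part example title from these notes.
title = ("Policing the lying patient: Surveillance and self-regulation "
         "on consultations with adolescent diabetics")
print(word_count(title), title_ok(title))
```

A hyphenated term such as "self-regulation" counts as one word here; if your guide counts differently, adjust `word_count` accordingly.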
The Table of Contents
• Not a trivial matter. A scrappy or uninformative table of contents will create a terrible
impression.
• It serves two ends:
• 1. To demonstrate that you are a logical thinker, able to write a thesis/dissertation with a
transparently clear organization.
• 2. To allow your readers to see this at once, to find their way easily between different parts of
the thesis/dissertation, and to pinpoint matters in which they have most interest.
• Use a double numbering system (for example, 2.1 and 2.2 for sections within Chapter 2).
Table of Contents
Chapter   Title                      Page
I         Introduction               1
          Background                 1
          Objectives                 6
          Significance               9
II        Review of Literature       10
III       Theoretical Framework      18
When do you prepare the table of contents?
• It is an outline of the structure of the thesis.
• Headings of chapters may remain the same. Subheadings are likely to change as you proceed.
• It is finalized towards the end, especially with respect to pagination.
List of Tables
Number    Title                          Page
1         Demographic Characteristics    35
1: Introduction
• The introduction answers the question: What is this thesis about? That is:
• Why have you chosen this topic?
• Why does this topic interest you?
• The kind of research approach or academic discipline you will utilize.
• State your research questions or problems.
• Its role is to orient your readers. Do it clearly and succinctly. Do not overstretch, and do not
encroach upon other chapters (e.g. methodology).
• It is not just a description of the contents of each chapter.
• Chapter 1 is the last chapter to be completed, though not in the literal sense: it should be
rewritten and finalized at the end.
2: The Literature Review
• What should the literature review contain?
• What do you already know about the topic?
• What do you have to say critically about what is already known?
• Has anyone else ever done anything exactly the same?
• Has anyone else done anything that is related?
• Where does your work fit in with what has gone before?
• Why is your research worth doing in the light of what has already been done?
• It displays your scholarly skills and credentials.
• You organize this section by idea, and not by author or by publication.
Do you need a literature review chapter?
• Know the relevant literature but just don’t lump (dump) it into a chapter that remains
unconnected to the rest of the study.
• Draw upon the literature selectively and appropriately as needed in telling the story of your
research.
• So bring in appropriate literature as you need it, not in a separate chapter but in the course of
your data analysis and/or any other discussion.
• Such a decision may be too radical. In that case, write a conventional literature review chapter,
and also cite literature in order to connect your
narrow research topic to the directly relevant concerns of the broader research community.
3. Theoretical Framework
Theory helps in understanding, explaining, and predicting the phenomenon.
The theoretical framework (TF) presents the theory which explains why the problem under study exists.
It portrays how a particular theory explains the problem, and provides the structure
that holds or supports the logic of the research work.
This theory serves as a basis or foundation for conducting research on the issue.
The research needs a strong foundation.
4. Hypothesis and operationalization
Hypothesis(es) or research questions.
Show how the hypothesis(es) has (have) been drawn from the theoretical framework.
Operationalization of the variables.
5. Research Design
Technical procedures must be explained.
Supplement the material in this section with more details in the appendix. This part should
address six topics:
1. Purpose of the study: exploratory, descriptive, or explanatory. Why is the specific research
design suited to the study?
2. Data collection methods: primary or secondary data used. How the primary data were collected
(survey, experiment, observation). Whether multiple techniques were used (triangulation).
3. Sample design: What was the target population? Probability or non-probability sample. Sampling
frame. Type of sample. Selection process.
4. Instrument of data collection: What instrument and why? Put a copy in the appendix.
5. Fieldwork/data collection: How many and what type of field workers were used? Training and
supervision. How was quality control assured?
6. Analysis strategy: How was the analysis carried out (score index applied, statistics used)?
Limitations:
• No report is perfect, so indicate its limitations. For example problems with:
• Sampling procedures.
• Non-response.
• Avoid overemphasizing the weaknesses.
• Questions for qualitative methods:
• How did you go about your research?
• What overall strategy did you adopt and why?
• What design and techniques did you use?
• Why these and not others?
Research design (RD) for qualitative methods
To answer these questions, describe the following:
• The data you have collected.
• How you obtained that data (e.g. issues of access and consent)
• What claims you are making about the data (representativeness of some population, or a single case)
• The methods you have used to gather the data
• Why have you chosen these methods?
• How have you analyzed the data?
• The advantages and limitations of using your methods of data analysis.
• Spell out your theoretical assumptions.
• Explain how you can generalize from your analysis. [combining qualitative with quantitative,
purposive sampling guided by time and resources, theoretical sampling]
• Avoid over-defensiveness, but self-confidence should not mean a lack of self-criticism. Document
the rationale for your research design and data analysis. Ask colleagues for their critique.
Spencer et al. gave the following guidelines:
• Give an honest account of the conduct of the research.
• Provide full descriptions of what was actually done in regard to choosing your case(s) to study,
choosing your method(s), collecting and analyzing data.
• Explain and justify each of your decisions.
• Discuss the strengths and weaknesses of what you did.
• Be open about what helped you and held you back.
6. Analysis of the data
Present the findings in line with the objectives.
Organize as a continuous narrative, designed to be convincing.
Summary tables and charts should be used.
Tables and charts may serve as points of reference to the data being discussed and free the
prose from an excess of figures.
Detailed charts may be reserved for appendix.
7. Summary, Conclusions, and Recommendations
Summary
The Summary of contributions will be much sought and carefully read by the examiners.
List the contributions of new knowledge that your thesis makes.
The thesis itself must substantiate any claims made here. There is often some overlap with the
Conclusions, but that's okay.
Concise numbered paragraphs are again the best.
Organize from most to least important.
Conclusions
• Conclusions are based on results.
• Conclusions are not a rambling summary of the thesis: they are short, concise statements of the
inferences that you have made because of your work.
• Do not restate the research findings. Don't waste the reader's time. You got the results. So
what?
• Organize conclusions as short numbered paragraphs, ordered from most to least important.
• All conclusions should be directly related to the research question.
Recommendations
• The biggest problem with this section is that the suggestions are often ones that could have
been made prior to you conducting your research.
• Suggestions should emanate from experiences of conducting the research and the findings that
have evolved.
• Make sure that your suggestions for further research serve to link your project with other
projects in the future and provide a further opportunity for the reader to better understand
what you have done.
References
• A bibliography is a listing of the works relevant to the topic of research interest,
arranged in alphabetical order of the authors' last names.
• A reference list is a subset of the bibliography. It includes details of all the citations used in
the literature survey and elsewhere in the report, again arranged in alphabetical order of
the authors' last names.
Goals of referencing
• Crediting the author(s).
• Enabling the reader to find the works cited.
Give reference by following a style
• All references given must be referred to in the main body of the thesis.
• Organize the list of references alphabetically by author surname, or as required by the style you
are using.
Different modes are followed
• APA, ASA, or any other.
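Two of the rules above lend themselves to a quick mechanical check: alphabetical ordering by surname, and every listed reference being cited in the body. The sketch below is in Python and is only illustrative. The entries are placeholders, not real references, and the function names and the dictionary layout are assumptions made for the example; a real style such as APA has further ordering rules that are not modelled here.

```python
# Placeholder entries for illustration only; these are not real references.
references = [
    {"surname": "Zaman", "initials": "A.", "year": 2010},
    {"surname": "ahmed", "initials": "S.", "year": 2012},
    {"surname": "Khan", "initials": "M.", "year": 2011},
    {"surname": "Khan", "initials": "M.", "year": 2008},
]


def sort_references(refs):
    """Order by surname (case-insensitive), then by year for the same surname."""
    return sorted(refs, key=lambda r: (r["surname"].casefold(), r["year"]))


def uncited(refs, cited_surnames):
    """Return surnames that are listed but never cited in the main body."""
    cited = {name.casefold() for name in cited_surnames}
    return sorted({r["surname"] for r in refs if r["surname"].casefold() not in cited})


for ref in sort_references(references):
    print(f'{ref["surname"]}, {ref["initials"]} ({ref["year"]})')

# Surnames found in the body text would normally be collected by hand or by a search.
print(uncited(references, ["Khan", "Ahmed"]))
```

Any surname returned by `uncited` points to an entry that should either be cited in the text or moved out of the reference list.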
Appendix:
• Appendix presents the “too …” material.
• Any material that is too technical or too detailed should be in appendix. Material of interest only
to some readers. Subsidiary materials.
• Any material which impedes the smooth development of presentation, but which is important
to justify the results of a thesis.
Some tips
Review two or three well organized and presented dissertations
• Examine their use of headings, overall style, typeface and organization.
• Use them as a model for the preparation of your own thesis/dissertation.
• In this way you will have an idea at the beginning of your writing what your finished
thesis/dissertation will look like.
• A most helpful perspective!
You don’t have to write chapters sequentially
• The major myth in writing a thesis/dissertation is that you start writing at Chapter One and then
proceed sequentially. This is seldom the case.
• The most productive approach in writing the thesis/dissertation is to begin writing those parts of
the thesis/dissertation that you are most comfortable with.
Go with what interests you, start your writing there, and then keep building!
Always keep the readers' backgrounds in mind.
• Who is your audience? How much can you reasonably expect them to know about the subject
before picking up your thesis?
• Usually your audience is pretty knowledgeable about the general problem, but they haven't
been intimately involved with the details over the last couple of years like you have:
• Spell difficult new concepts out clearly.
Don't make the readers work too hard!
• The harder the examiners have to work to ferret out your problem, your defense of the
problem, your answer to the problem, your conclusions and contributions, the worse mood they
will be in, and the more likely that your thesis will need major revisions.
• Spell things out carefully, highlight important parts by appropriate titles etc.
• There's a huge amount of information in a thesis: make sure you direct the readers to the
answers to the important questions.
Clear and unambiguous use of key concepts
• Prepare a list of key words that are important to your research and then your writing should use
this set of key words throughout.
• There is nothing so frustrating to a reader as a manuscript that keeps using alternate words to
mean the same thing.
• If you've decided that a key phrase for your research is "educational workshop", then do not try
substituting other phrases like "in-service program", "learning workshop", "educational
institute", or "educational program."
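A draft can be scanned for alternate phrases that have crept in for a key term. This is a minimal sketch in Python, assuming you maintain the list of discouraged alternates yourself; the phrases are the ones from the example above, while the function name and the sample draft text are hypothetical.

```python
import re

# Alternate phrases that should not stand in for the key phrase "educational workshop".
ALTERNATES = [
    "in-service program",
    "learning workshop",
    "educational institute",
    "educational program",
]


def find_alternates(text, alternates=ALTERNATES):
    """Return {phrase: count} for each alternate phrase that occurs in text."""
    lowered = text.lower()
    counts = {}
    for phrase in alternates:
        # \b keeps the match on whole words; re.escape guards the hyphen.
        pattern = r"\b" + re.escape(phrase) + r"\b"
        n = len(re.findall(pattern, lowered))
        if n:
            counts[phrase] = n
    return counts


# Hypothetical draft sentence used only to exercise the check.
draft = ("The educational workshop ran for two days. "
         "Participants rated the learning workshop highly, "
         "and the in-service program will be repeated.")
print(find_alternates(draft))
```

The check only finds exact phrases, so plural or variant spellings would need to be added to the list by hand.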
Table/figure presentation – a simple rule
• If you are presenting information in the form of a table or figure (graph, chart) make sure:
• You introduce the table or figure in your text.
• Insert the table/figure
• Following the insertion of the table/figure, make sure you discuss it.
• If there is nothing to discuss then you may want to question even inserting it.
Use the Table of Contents to help improve manuscript
• Use it to see:
• if you've left something out,
• if you are presenting your sections in the most logical order, or
• if you need to make your wording a bit more clear.
• See if the Table of Contents is clear and will make good sense to the reader. You will be amazed
at how easy it is to see areas that may need some more attention.
• Don't wait until the end to do your Table of Contents.
Avoid
• Avoid using phrases like "Clearly, this is the case..." or "Obviously, it follows that ..."; these imply
that, if the readers don't understand, then they must be stupid.
• Readers might not have understood because you explained it poorly.
• Avoid red-flag claims (like “Corporate social responsibility is the most important part of a
management system”). Such claims have to be demonstrated by evidence.
Lesson 15
Thesis Writing in Qualitative Research
No absolutes in this area.
• Some types of qualitative research will call for a different sort of report
Nearly all theses/dissertations begin with four elements
• A title
• An abstract
• A table of contents
• An Introduction
• These are all routine matters, but very important ones.
• The impression created at the start of thesis/dissertation is very important.
• Therefore the writing of first few pages should never be regarded as a triviality. Some
suggestions:
The title
• Give your research a short, effective title. It is subject to change, so keep notes for
possible changes.
• Titles should catch the readers’ attention and inform them about the main focus of the study.
• Have the most important words appear toward the beginning of the title.
• Avoid ambiguous or confusing words.
• Include key words that will help researchers in the future to find your work.
• A two-part title is also possible: when you have too many words, break the title into a snappy
main title and a subtitle. Example: Building the brand by aligning employees:
Contribution of Internal Branding in the shaping of Brand citizenship behavior
• Follow a marketing approach.
• The title should be short, catchy, and meaningful. Do not use more than 15 words.
Abstract
• Should cover the following:
• Your research problem.
• Why that problem is important and worth studying.
• Your data and methods.
• Your main findings.
• Their implication in the light of other research.
• The word limit is usually 100-150 words. Say as much as possible in as few words as possible.
• Make your abstract lively and informative.
• Emphasize your problem and content, not the fieldwork techniques.
• Written towards the end of the study.
Table of contents
• Not a trivial matter. A scrappy or uninformative table of contents (or none) will create a terrible
impression.
• Provides the macrostructure: logical progression.
• It serves two ends:
• 1. To demonstrate that you are a logical thinker, able to write a thesis with a transparently clear
organization, not a confused thesis.
• 2. To allow your readers to see this at once, to find their way easily between different parts of
the thesis, and to pinpoint matters in which they have most interest.
• Use a double numbering system (for example, 2.1 and 2.2 for sections within Chapter 2).
Table of contents
Chapter   Title                                Page
I         Introduction: aims of the study      1
II        Historical perspective               6
III       Review of Literature                 10
IV        Research Design                      15
Finalize towards the end of study
• It is an outline of the structure of the thesis.
• Headings of chapters may remain the same. Subheadings are likely to change as you proceed.
• Finalized towards the end especially with respect to pagination.
1: Introduction
• The introduction answers the question: What is this thesis about? That is:
• Why have you chosen this topic?
• Why does this topic interest you?
• The kind of research approach or academic discipline you will utilize.
• Your research questions or problems.
• Its role is to orient your readers. Do it clearly and succinctly. Do not overstretch, and do not
encroach upon other chapters (e.g. methodology).
• It is not just a description of the contents of each chapter.
• Introduction sets the scene and puts the research in context.
• If the research was about, for example, coping with stress by house doctors, the reader needs to
know why the study was done and how it broadly relates to other research.
• It is useful to start with a sentence that describes exactly what this research is about.
• This is an account of a descriptive study of coping with stress in three groups of 10 house
doctors working in three hospitals of Lahore.
• The study was a qualitative one involving interviews with a convenience sample of house
doctors. Although there is a considerable amount of research carried out into whether or not
internship is stressful, there is little known about the stress experienced and the coping
strategies adopted by the house doctors.
• Here, the research question or the aim of the study is described.
• The introduction can also start directly with the aim, for example:
• The aim of this study was to address the question: ‘What coping strategies for stress are
adopted by the house doctors in their clinical and educational work settings’?
• At the end of the study, you are able to reflect back on the degree to which the aim was or was
not achieved.
2: Review of Literature
• Misconceptions about the literature review chapter:
• It is done just to display knowledge that ‘you know the area’.
• It is easier to do than your data analysis chapters.
• It is boring to read (and to write).
• It is best ‘to get out of the way’ at the start of research work.
• Begin by describing the literature that was searched.
• This involves describing the computer search engines used and the keywords entered into those
engines.
• Needs thorough coverage and systematic review.
• Was the ‘grey’ literature reviewed?
• Grey literature is defined as:
• that which is produced on all levels of government, academics, business and industry in print
and electronic formats, but which is not controlled by commercial publishers.
• No formula for reviewing. Yet the reader needs to know who did the research and when. What
was done and what was found? Thus an example of such reporting might be given:
• In a small scale study of 12 student nurses in a School of Nursing at the University of Health
Sciences, Lahore, Saadia (2013) undertook two rounds of interviews to establish the factors that
those students felt contributed to their ability to cope with stress. She found that most students
relied on family or friends for support. Some used stress reduction methods including breathing
exercises, physical activities and diary keeping. Few expressed the view that they were unable to
cope with stress. Ages and sex of the respondents were not quoted in the account of the study.
• Key research reports should be cited in this way. Make comparisons with other studies and give a
critical evaluation.
• Others can be grouped together. For example, if a number of studies have been carried out
using similar methods, with similar findings, these can be quoted thus:
• A number of studies, using the Pakistan Personal Stress Instrument (2001) – a free form
reporting instrument – reported high levels of stress amongst younger students (Give multiple
references).
• Finalize this chapter towards the end of your study, in order to cover the latest developments on
the subject area.
3: Research Design
• David Silverman (2005) called this chapter the natural history of the research conducted. It is
all based on field notes and diaries. Therefore:
• Be open and clearly state what actually happened during the research.
• Demonstrate that you have the makings of a competent researcher.
• Describe the strategies, the difficulties, and the way these were handled, like a detective story.
• Document the rationale to back up your RD and data analysis.
(a) Questions to be looked at in RD
The natural history of the research project should look at questions like:
How did you go about your research?
What overall strategy did you adopt and why?
What design and techniques did you use?
Why these and not others?
Context of the study.
Why go for qualitative research?
Why a particular population?
(b) Sample
Sample: It is probably the case that convenience sampling is the most frequently used in
qualitative studies. Applicable in case study of an organization.
The reader needs to know the size and type of sample used in the reported study.
If an unusual variant of sampling is used, it is useful to acknowledge the nature of it.
Other comments about the sampling process may be helpful.
A sample of 10 house doctors from each hospital was invited to take part in the study. The
sample was a convenience one and the snowball approach to sampling was adopted (Ref).
Each house doctor was asked to recommend to the researcher another doctor who might be
able to articulate views about his/her stress.
There appears to be no general agreement about sample size in qualitative studies. Reports
describe single-person studies (Refs). Other commentators suggest sample sizes ranging
from 6 (refs) to 30 (refs). It was felt that 30 respondents (10 × 3) should be able to supply
varied and detailed accounts for the purposes of this study.
(c) Data collection/field operations
• Describe what the researcher was aiming to find out,
• How the field operations were carried out.
• Problems and solutions (entry, rapport, logistics).
• Observations – participant, non participant.
• In-depth interviews (give all details: where, how long, one or two sessions, recording procedure,
permission).
• Focus-group discussions (same details as for interviews).
• Surveys.
• Use of documents.
• Triangulation
• Tell the story, not a critique: entry, rapport, ethical issues.
(d) Data processing method
• Part of RD or part of data analysis chapter?
• A variation is to be found in the amount of detail of reporting in this section.
• It is possible to describe, in full, how the researcher handled the data (transcription, transfer of
data to computer, destroying the recordings on the completion of study).
• Or it is possible to write only that ‘The interviews were recorded and transcribed. The researcher
then sorted those data into a range of categories and these are reported below’. This does not tell
the reader much.
• A comfortable compromise between these two extremes is probably achieved by reporting a
little of what happened.
• Care should be taken with very general terms such as ‘content analysis’, when reporting data
analysis. The term is probably so broad as to have little meaning. An example of how part of this
section might be written is like this:
• All of the interview transcripts were read by the researcher and coded in the style of a grounded
theory approach to data analysis (refs).
• Eight category headings were generated from the data and under these all of the data were
accounted for.
• Two independent researchers were asked to verify the seeming accuracy of the category system
and after discussion with them, minor modifications were made to it. In the grounded theory
literature, a good category system is said to have ‘emerged’ from the data (refs).
• Other commentators have noted that, in the end, it is always the researcher who finds and
generates that system (refs).
4: Data Analysis
• In empirical studies the data analysis chapter is the key basis for the evaluation of your thesis.
• That data collection might have been highly laborious, even hazardous, is neither here nor there.
The final assessment rests on what you do with your data: the actual writing up of the data analysis.
• Analysis may have begun earlier as part of data collection which helps you in theoretical
sampling, yet there has to be a separate chapter in the thesis.
• You need to develop the skills to present your analysis clearly and cogently to your readers.
• A decision needs to be made, here, about whether or not (a) the researcher presents the
findings on their own, without supporting discussion or (b) if he or she links the findings with the
work of other researchers.
• The main section: work through your data in terms of what you have already said.
• A conclusion: you summarize what you have shown and connect to next chapter.