
ELSEVIER Fuzzy Sets and Systems 84 (1996) 33-47

Induction of fuzzy rules and membership functions


from training examples¹
Tzung-Pei Hong*, Chai-Ying Lee

Department of Information Management, Kaohsiung Polytechnic Institute, Kaohsiung 84008, Taiwan, ROC
Institute of Electrical Engineering, Chung-Hua Polytechnic Institute, Hsinchu 30067, Taiwan, ROC

Received January 1995; revised 1 August 1995

Abstract

Most fuzzy controllers and fuzzy expert systems must predefine membership functions and fuzzy inference rules to map
numeric data into linguistic variable terms and to make fuzzy reasoning work. In this paper, we propose a general
learning method as a framework for automatically deriving membership functions and fuzzy if-then rules from a set of
given training examples to rapidly build a prototype fuzzy expert system. Based on the membership functions and the
fuzzy rules derived, a corresponding fuzzy inference procedure to process inputs is also developed.

Keywords: Expert systems; Fuzzy clustering; Fuzzy decision rules; Fuzzy machine learning; Knowledge acquisition; Membership functions

1. Introduction

Decision-making is one of the most important activities in the real world. With specific domain knowledge as a background, the task of decision-making is to get an optimal or a nearly optimal solution from input information using an inference procedure. Generally, there are three ways to make a decision in a complex environment:
(1) by building a mathematical model;
(2) by seeking human experts' advice;
(3) by building an expert system or controller.
Among them, building an accurate mathematical model to describe the complex environment is a good way. However, accurate mathematical models neither always exist nor can they be derived for all complex environments, because the domain may not be thoroughly understood. The first method is then limited, and when it fails, an alternative for making a good decision is to seek human experts' help. However, the cost of querying an expert may be high, and there may be no human experts available when the decision must be made.
Recently, expert systems have been widely used in domains for which the first two methods are not suitable [5, 19]. The knowledge base in an expert system can grow incrementally and can be updated dynamically, so that the performance of an expert system becomes better and better as it develops. Also, the expert system approach can integrate expertise from many fields, reduce the cost of querying, lower the probability of danger occurring, and provide fast response [19].

To apply expert systems in decision-making, the capacity to manage uncertainty and noise is quite important. Several theories, such as fuzzy set theory [27, 29], probability theory, D-S theory [21], and approaches based on certainty factors [2], have been developed to manage uncertainty and noise. Fuzzy set theory is more and more frequently used in expert systems and controllers because of its simplicity and similarity to human reasoning. The theory has been applied to many fields such as manufacturing, engineering, diagnosis, and economics [5, 9, 28]. However, current fuzzy systems have the following general limitations:
(1) they have no common framework for dealing with different kinds of problems; in other words, they are problem-dependent;
(2) human experts play a very important role in developing fuzzy expert systems and fuzzy controllers.

Most fuzzy controllers and fuzzy expert systems can be seen as special rule-based systems that use fuzzy logic. A fuzzy rule-based expert system contains fuzzy rules in its knowledge base and derives conclusions from the user inputs and the fuzzy reasoning process [9, 28]. A fuzzy controller is a knowledge-based control scheme in which scaling functions of physical variables are used to cope with uncertainty in process dynamics or the control environment [7]. Both must usually predefine membership functions and fuzzy inference rules to map numeric data into linguistic variable terms (e.g. very high, young, ...) and to make fuzzy reasoning work. The linguistic variables are usually defined as fuzzy sets with appropriate membership functions. Recently, many fuzzy systems that automatically derive fuzzy if-then rules from numeric data have been developed [3, 13, 18, 22, 23]. In these systems, prototypes of fuzzy rule bases can be built quickly without the help of human experts, thus avoiding a development bottleneck. Membership functions still need to be predefined, however, and thus are usually built by human experts or experienced users. The same problem as before then arises: if the experts are not available, then the membership functions cannot be accurately defined, and the fuzzy systems developed may not perform well.

In this paper, we propose a general learning method as a framework to derive membership functions and fuzzy if-then rules automatically from a set of given training examples and to build a prototype expert system quickly. Based on the membership functions and the fuzzy rules derived, a corresponding fuzzy inference procedure to process the input in question is developed.

The remaining parts of this paper are organized as follows. In Section 2, the development of an expert system is introduced. In Section 3, the concepts and terminology of fuzzy sets are briefly reviewed. In Section 4, the basic architecture of fuzzy expert systems is provided. In Section 5, a new learning method is given for automatically deriving membership functions and fuzzy if-then rules from a set of training instances. In Section 6, an inference procedure for processing inputs based on the derived rules is suggested. In Section 7, the application to Fisher's iris data is presented. Conclusions are given in Section 8.

¹ This research was supported by the National Science Council of the Republic of China under contract NSC84-2213-E-214-013.
* Corresponding author. E-mail: tphong@[Link]
0165-0114/96/$15.00 Copyright © 1996 Elsevier Science B.V. All rights reserved
SSDI 0165-0114(95)00305-3

2. Development of an expert system

Development of a classical expert system is illustrated in Fig. 1 [19]. A knowledge engineer first establishes a dialog with a human expert in order to elicit the expert's knowledge. The knowledge engineer then encodes the knowledge for entry into the knowledge base. The expert then evaluates the expert system and gives a critique to the knowledge engineer. This process continues until the system's performance is judged to be satisfactory by the expert. The user then supplies facts or other information to the expert system and receives expert advice in response [19].

Although a wide variety of expert systems have been built, a development bottleneck occurs in knowledge acquisition. Building a large-scale expert system involves creating and extending a large
T-P. Hong, C.-Y Lee / FUZZVSets and $stems 84 (1996) 33-47 35

knowledge base over the course of many months or years. For instance, the knowledge base of the XCON (R1) expert system has grown over the past 10 years from 300 component descriptions and 750 rules to 31 000 component descriptions and 10 000 rules [20]. Shortening the development time is then the most important factor for the success of an expert system. Recently, machine-learning techniques have been developed to ease the knowledge-acquisition bottleneck. Among machine-learning approaches, deriving inference rules from training examples is the most common [12, 16, 17]. Given a set of examples and counterexamples of a concept, the learning program tries to induce general rules that describe all of the positive training instances and none of the counterexamples. If the training instances belong to more than two classes, the learning program tries to induce general rules that describe each class. In addition to classical machine-learning methods, fuzzy learning methods (such as fuzzy ID3) [8, 24, 25] for inducing fuzzy knowledge have also emerged recently. Machine learning thus provides a feasible way to build a prototype (fuzzy) expert system.

(Fig. 1. The development of a classical expert system.)

3. Review of fuzzy set theory

Fuzzy set theory was first proposed by Zadeh in 1965 [26], and was first used in control by Mamdani [15]. Fuzzy set theory is primarily concerned with quantifying and reasoning using natural language, in which many words have ambiguous meanings. It can also be thought of as an extension of the traditional crisp set, in which each element is either in the set or not in the set.

Formally, the process by which individuals from a universal set X are determined to be either members or non-members of a crisp set can be defined by a characteristic or discrimination function [11]. For a given crisp set A, this function assigns a value mu_A(x) to every x in X such that

mu_A(x) = 1 if and only if x ∈ A, and mu_A(x) = 0 if and only if x ∉ A. (1)

Thus, the function maps elements of the universal set to the set containing 0 and 1. This can be indicated by

mu_A: X → {0, 1}. (2)

This kind of function can be generalized such that the values assigned to the elements of the universal set fall within a specified range; these values are referred to as the membership grades of the elements in the set. Larger values denote higher degrees of set membership. Such a function is called a membership function mu_A, by which a fuzzy set A is usually defined. It can be indicated by

mu_A: X → [0, 1], (3)

where X refers to the universal set defined in a specific problem, and [0, 1] denotes the interval of real numbers from 0 to 1, inclusive.

Assume that A and B are two fuzzy sets with membership functions mu_A and mu_B. The most commonly used primitives for fuzzy union and fuzzy intersection are as follows [23]:

mu_{A∪B}(x) = max(mu_A(x), mu_B(x)),  mu_{A∩B}(x) = min(mu_A(x), mu_B(x)). (4)

Although these two operations may cause the problems of partially single-operand dependency and negative compensation [10], they are the most

commonly used because of their simplicity. These two operators are also used in this paper in deriving the fuzzy if-then rules and membership functions.

4. Architecture of a fuzzy expert system

Fig. 2 shows the basic architecture of a fuzzy expert system. The individual components are as follows.
User interface: for communication between users and the fuzzy expert system. The interface should be as friendly as possible.
Membership function base: a mechanism that presents the membership functions of the different linguistic terms.
Fuzzy rule base: a mechanism for storing fuzzy rules as expert knowledge.
Fuzzy inference engine: a program that executes the inference cycle of fuzzy matching, fuzzy conflict resolution, and fuzzy rule-firing according to given facts.
Explanation mechanism: a mechanism that explains the inference process to users.
Working memory: a storage facility that saves user inputs and temporary results.
Knowledge-acquisition facility: an effective knowledge-acquisition tool for conventional interviewing or automatically acquiring the expert's knowledge, or an effective machine-learning approach to deriving rules and membership functions automatically from training instances, or both.

Here the membership functions are stored in a knowledge base (instead of being put in the interface) since, by our method, decision rules and membership functions are acquired by a learning method. When users input facts through the user interface, the fuzzy inference engine automatically reasons using the fuzzy rules and the membership functions, and sends fuzzy or crisp results through the user interface to the users as outputs.

In the next section, we propose a general learning method as a knowledge-acquisition facility for automatically deriving membership functions and

fuzzy rules from a given set of training instances. Based on the membership functions and the fuzzy rules derived, a corresponding fuzzy inference procedure to process user inputs is developed.

5. The knowledge acquisition facility

A new learning method for automatically deriving fuzzy rules and membership functions from a given set of training instances is proposed here as the knowledge-acquisition facility. Notation and definitions are introduced below.

(Fig. 3. Learning activity.)

5.1. Notation and definitions


In a training instance, both the input and the desired output are known. For an m-dimensional input space, the ith training example can be described as

(x_i1, x_i2, ..., x_im; y_i), (5)

where x_ir (1 <= r <= m) is the rth attribute value of the ith training example and y_i is the output value of the ith training example.

For example, assume an insurance company decides insurance fees according to two attributes: age and property. If the insurance company evaluates and decides that the insurance fee for a person of age 20 possessing property worth $30 000 should be $1000, then the example is represented as (age = 20, property = $30 000; insurance fee = $1000).

5.2. The algorithm

The learning activity is shown in Fig. 3 [23]. A set of training instances is collected from the environment. Our task here is to generate reasonable membership functions and appropriate decision rules automatically from these training data, so that they represent important features of the data set. The proposed learning algorithm can be divided into six main steps.
Step 1: cluster and fuzzify the output data;
Step 2: construct initial membership functions for input attributes;
Step 3: construct the initial decision table;
Step 4: simplify the initial decision table;
Step 5: rebuild membership functions in the simplification process;
Step 6: derive decision rules from the decision table.
Details are illustrated by an example in the next section.

5.3. Example

As before, assume an insurance company decides insurance fees according to age and property. Each training instance then consists of two attributes, age and property (in ten thousands), and one output, the insurance fee. The goal of the learning process is to construct a membership function for each attribute (i.e. age and property), and to derive fuzzy decision rules for deciding on reasonable insurance fees. Assume the following eight training examples are available:

Age  Property  Insurance fee
( 20,  30;  2000 )
( 25,  30;  2100 )
( 30,  10;  2200 )
( 45,  50;  2500 )
( 50,  30;  2600 )
( 60,  10;  2700 )
( 80,  30;  3200 )
( 80,  40;  3300 )

The learning algorithm proceeds as follows.

Step 1: Cluster and fuzzify the output data. In this step, the output values of all training instances

are appropriately grouped by applying the clustering procedure below, and appropriate membership functions for the output values are derived. Our clustering procedure considers training instances with close output values as belonging to the same class with high membership values. Six substeps are included in Step 1; the flow chart is shown in Fig. 4. Details are as follows.

(Fig. 4. The flow chart of Step 1.)

Substep (1a): Sort the output values of the training instances in ascending order. Sorting the output values of the training instances reveals the relationship between adjacent data. The modified order after sorting is

y'_1, y'_2, ..., y'_n, (6)

where y'_i <= y'_{i+1} (for i = 1, ..., n - 1).

Example 1. For the training instances given in the example, Substep (1a) proceeds as follows.
Original order: 2000, 2100, 2200, 2500, 2600, 2700, 3200, 3300.
Modified order (after sorting): 2000, 2100, 2200, 2500, 2600, 2700, 3200, 3300.

Substep (1b): Find the difference between adjacent data. The difference between adjacent data provides information about the similarity between them. For each pair y'_i and y'_{i+1} (i = 1, 2, ..., n - 1), the difference is dif_i = y'_{i+1} - y'_i.

Example 2. From the result of Substep (1a), the difference sequence is calculated as follows.
Modified order: 2000, 2100, 2200, 2500, 2600, 2700, 3200, 3300.
Difference sequence: 100, 100, 300, 100, 100, 500, 100.

Substep (1c): Find the value of similarity between adjacent data. To obtain the value of similarity between adjacent data, we convert each difference dif_i to a real number s_i between 0 and 1 according to the following formula [18]:

s_i = 1 - dif_i / (C * sigma_s)  if dif_i <= C * sigma_s,
s_i = 0                          otherwise, (7)

where s_i represents the similarity between y'_i and y'_{i+1}, dif_i is the distance between y'_i and y'_{i+1}, sigma_s is the standard deviation of the dif_i's, and C is a control parameter deciding the shape of the membership functions of similarity. A larger C causes a greater similarity.

Example 3. Assume the control parameter C = 4. The standard deviation sigma_s is first calculated as 145.69. Each membership value of similarity is then as follows:

s_1 = 1 - 100/(145.69 * 4) = 0.83,
s_2 = 1 - 100/(145.69 * 4) = 0.83,
s_3 = 1 - 300/(145.69 * 4) = 0.49,
s_4 = 1 - 100/(145.69 * 4) = 0.83,
s_5 = 1 - 100/(145.69 * 4) = 0.83,
s_6 = 1 - 500/(145.69 * 4) = 0.14,
s_7 = 1 - 100/(145.69 * 4) = 0.83.

Substep (1d): Cluster the training instances according to similarity. Here we use the alpha-cut of similarity to cluster the instances. The value of alpha determines the threshold for two adjacent data to be thought of as belonging to the same class; a larger alpha yields a larger number of groups. The procedure proceeds as follows:

If s_i < alpha, then divide the two adjacent data into different groups;
else put them into the same group.

After the above operation, we obtain results of the form (y'_i, R_j), meaning that the ith output datum is clustered into R_j, where R_j denotes the jth produced fuzzy region.

Example 4. Assume alpha is set at 0.8. Since s_3 and s_6 fall below 0.8, the training examples are grouped as R_1 = {2000, 2100, 2200}, R_2 = {2500, 2600, 2700} and R_3 = {3200, 3300}.

(Fig. 5. A triangle membership function defined by a triad (a, b, c).)
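Substeps (1a)-(1d) can be sketched in Python as follows. This is an illustrative rendering, not the authors' code: the function name is invented, and the standard deviation is taken over the adjacent differences, which for the example data gives sigma_s = 145.69 as in Example 3.

```python
import statistics

def cluster_outputs(values, C=4.0, alpha=0.8):
    """Substeps (1a)-(1d): sort the outputs, compute adjacent similarities
    with eq. (7), and alpha-cut them into groups."""
    ys = sorted(values)                                  # Substep (1a)
    diffs = [b - a for a, b in zip(ys, ys[1:])]          # Substep (1b)
    sigma = statistics.pstdev(diffs)                     # std of the dif_i's
    # Substep (1c): similarity s_i of each adjacent pair, eq. (7)
    sims = [1 - d / (C * sigma) if d <= C * sigma else 0.0 for d in diffs]
    # Substep (1d): start a new group wherever similarity falls below alpha
    groups, current = [], [ys[0]]
    for y, s in zip(ys[1:], sims):
        if s < alpha:
            groups.append(current)
            current = [y]
        else:
            current.append(y)
    groups.append(current)
    return groups, sims

fees = [2000, 2100, 2200, 2500, 2600, 2700, 3200, 3300]
groups, sims = cluster_outputs(fees)
# groups == [[2000, 2100, 2200], [2500, 2600, 2700], [3200, 3300]];
# sims rounds to [0.83, 0.83, 0.49, 0.83, 0.83, 0.14, 0.83], as in Examples 3 and 4.
```

With alpha = 0.8, the two similarities below the threshold (0.49 and 0.14) split the sorted fees into the three groups R_1, R_2 and R_3 of Example 4.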

Substep (1e): Determine membership functions of the output space. For simplicity, triangle membership functions are used here for each linguistic variable. A triangle membership function can be defined by a triad (a, b, c), as Fig. 5 shows (not necessarily symmetric). In this substep, we propose a heuristic method for determining these three parameters. First, we assume that the center point b lies at the center-of-gravity of the group. Next, we try to find the membership values of the two boundary training outputs in the group, where "boundary training outputs" means the minimum and maximum outputs in that group. The two end points a and c of the output membership function can then be found through extrapolation from b and the two boundary training outputs. The following four procedures are used to achieve this purpose.

Procedure 1: Find the central-vertex-point b_j. If y'_i, y'_{i+1}, ..., y'_k belong to the jth group, then the central-vertex-point b_j of this group is defined as

b_j = (y'_i * s_i + y'_{i+1} * (s_i + s_{i+1})/2 + y'_{i+2} * (s_{i+1} + s_{i+2})/2 + ... + y'_{k-1} * (s_{k-2} + s_{k-1})/2 + y'_k * s_{k-1}) / (s_i + (s_i + s_{i+1})/2 + (s_{i+1} + s_{i+2})/2 + ... + (s_{k-2} + s_{k-1})/2 + s_{k-1}). (8)

Procedure 2: Determine the memberships of y'_i and y'_k. The minimum similarity in the group is chosen as the membership value of the two boundary points y'_i and y'_k. Restated, the following formula is used to calculate mu_j(y'_i) and mu_j(y'_k), where mu_j represents the membership of belonging to the jth group:

mu_j(y'_i) = mu_j(y'_k) = min(s_i, s_{i+1}, ..., s_{k-1}). (9)

This heuristic is quite rational, since the two boundary instances should be clustered into the group with a lower probability than the other instances in the same group.

Procedure 3: Determine the vertex-point a. From the two points (b_j, 1) and (y'_i, mu_j(y'_i)), we can find the point (a, 0) by extrapolation as follows:

a = b_j - (b_j - y'_i) / (1 - mu_j(y'_i)). (10)

Procedure 4: Determine the vertex-point c. From the two points (b_j, 1) and (y'_k, mu_j(y'_k)), we can find the point (c, 0) by extrapolation as follows:

c = b_j + (y'_k - b_j) / (1 - mu_j(y'_k)). (11)

Example 5. After the operation of Substep (1e), the membership functions of the three output groups are derived as shown in Fig. 6.

(Fig. 6. Insurance fee membership functions: R_1 = (1517.25, 2100, 2682.75), R_2 = (2017.25, 2600, 3182.75), R_3 = (2958.63, 3250, 3541.37).)

Note that the membership functions may not necessarily overlap each other. The shape of the membership functions is affected by the choice of C. As mentioned above, a larger C will cause a greater similarity; the two boundary instances in a group then have larger similarities, causing the membership function formed to be flatter.

Substep (1f): Find the membership value of belonging to the desired group for each instance. From the above membership functions we can get the fuzzy value of each output datum in the form (y'_i, R_j, mu_ij), meaning that the ith output datum has the fuzzy value mu_ij in the cluster R_j. Each training instance is then transformed as

(x_1, x_2, ..., x_m; (R_1, mu_1), (R_2, mu_2), ..., (R_k, mu_k)). (12)

Example 6. Each output is transformed as follows:

(y'_1, R_1, 0.83), (y'_2, R_1, 1), (y'_3, R_1, 0.83), (y'_4, R_2, 0.83), (y'_5, R_2, 1), (y'_6, R_2, 0.83), (y'_7, R_3, 0.83), (y'_8, R_3, 0.83).

Combining with the input data, we obtain

(20, 30; (R_1, 0.83)),
(25, 30; (R_1, 1)),
(30, 10; (R_1, 0.83)),
(45, 50; (R_2, 0.83)),
(50, 30; (R_2, 1)),
(60, 10; (R_2, 0.83)),
(80, 30; (R_3, 0.83)),
(80, 40; (R_3, 0.83)).

Step 2: Construct initial membership functions for input attributes. We assign each input attribute an initial membership function, which is assumed
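Substep (1e) can be sketched as follows, again as an illustrative rendering of eqs. (8)-(11), not the authors' code; applied to group R_1 with the similarities of Example 3, it reproduces the triangle (about 1517.25, 2100, about 2682.75) of Fig. 6.

```python
def output_triangle(ys, sims):
    """Build one output triangle (a, b, c) for a group: ys are the group's
    sorted outputs, sims the similarities between its adjacent members."""
    if len(ys) == 1:
        return (ys[0], ys[0], ys[0])  # single-member group: a spike (an assumption)
    # Procedure 1, eq. (8): weighted center-of-gravity vertex b
    w = ([sims[0]]
         + [(sims[i] + sims[i + 1]) / 2 for i in range(len(sims) - 1)]
         + [sims[-1]])
    b = sum(y * wi for y, wi in zip(ys, w)) / sum(w)
    # Procedure 2, eq. (9): boundary membership = minimum similarity in group
    mu = min(sims)
    # Procedures 3 and 4, eqs. (10) and (11): extrapolate to the zero crossings
    a = b - (b - ys[0]) / (1 - mu)
    c = b + (ys[-1] - b) / (1 - mu)
    return (a, b, c)

s = 1 - 100 / (4 * 145.69)                   # ~0.83, as in Example 3
a, b, c = output_triangle([2000, 2100, 2200], [s, s])
# b == 2100; a and c come out near 1517.25 and 2682.75 (cf. Fig. 6)
```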
to be a triangle (a, b, c) with b - a = c - b = the smallest predefined unit. For example, if three values of an attribute are 10, 15 and 20, then the smallest unit is chosen to be 5. Here we let a_0 be the smallest value for the attribute and a_n the biggest value for the attribute. Initial membership functions for the attribute are viewed as in Fig. 7, where a_i - a_{i-1} = a_{i+1} - a_i = the smallest predefined unit (i = 2, 3, ..., n - 1), and R_x means the xth initial region (x = 1, 2, ..., n).

At first sight, this definition seems unsuitable for problems with small units over big ranges (e.g. 1, 1.1, 1000), since many initial membership functions may exist. But in practical implementation, only the membership functions corresponding to existing attribute values are kept and considered (through an appropriate data structure [14]). The membership functions corresponding to no attribute values are not kept, since they will be merged in later steps.

Example 7. Let 5 be the smallest predefined unit of age and property. The initial membership functions for age are shown in Fig. 8, and for property in Fig. 9.

(Fig. 7. Initial membership functions.)
(Fig. 8. Initial membership functions of age.)
(Fig. 9. Initial membership functions of property.)

Step 3: Construct the initial decision table. In this step we build a multi-dimensional decision table (each dimension represents a corresponding attribute) according to the initial membership functions. Let a cell be defined as the contents of a position in the decision table. Cell_(d_1, d_2, ..., d_m) then represents the contents of the position (d_1, d_2, ..., d_m) in the decision table, where m is the dimension of the decision table and d_i is the position value at the ith dimension. Each cell in the table may be empty, or may contain a fuzzy region (with maximum membership value) of the output data. Again, in practical implementation, only the non-null cells are kept (through an appropriate data structure [14]).

Example 8. The initial decision table for the insurance fee problem is shown in Fig. 10.

(Fig. 10. Initial decision table for the insurance problem.)

Step 4: Simplify the initial decision table. In this step, we simplify the initial decision table to eliminate redundant and unnecessary cells. The five merging operations defined here achieve this purpose.
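Before turning to the merging operations, Steps 2 and 3 can be sketched together. With unit-spaced triangles, the maximal-membership region of an attribute value is the one whose vertex is nearest, so a sparse decision table can be filled by indexing each instance by its nearest vertices. The dictionary representation and 0-based region indices below are assumptions of this sketch, not the paper's notation.

```python
def region_index(value, a0, unit):
    """Step 2: with triangle vertices at a0, a0 + unit, ..., the region of
    maximum membership for `value` is the one with the nearest vertex."""
    return round((value - a0) / unit)

def build_decision_table(instances, a0s, units):
    """Step 3: sparse decision table; only non-null cells are stored,
    mapping a cell position to the output region of maximum membership."""
    table = {}
    for inputs, region in instances:
        cell = tuple(region_index(v, a0, u)
                     for v, a0, u in zip(inputs, a0s, units))
        table[cell] = region
    return table

# Insurance example: smallest unit 5 for both attributes (Example 7);
# output region per instance taken from Example 6.
data = [((20, 30), 'R1'), ((25, 30), 'R1'), ((30, 10), 'R1'),
        ((45, 50), 'R2'), ((50, 30), 'R2'), ((60, 10), 'R2'),
        ((80, 30), 'R3'), ((80, 40), 'R3')]
table = build_decision_table(data, a0s=(20, 10), units=(5, 5))
# e.g. the instance (age=20, property=30) fills cell (0, 4) with 'R1'
```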

(Fig. 11. An example of Operation 1.)
(Fig. 12. The results after Operation 1.)
(Fig. 13. An example of Operation 2.)
(Fig. 14. The results after Operation 2.)

Operation 1: If the cells in two adjacent columns (or rows) are the same, then merge these two columns (or rows) into one (see Fig. 11). For example, in Fig. 10, all the cells in the neighboring columns age = 1 and age = 2 are the same, so the two columns age = 1 and age = 2 are merged into one.

Example 9. The decision table after merge operation 1 is shown in Fig. 12.

Operation 2: If two cells in two adjacent columns (or rows) are the same or either of them is empty, and at least one cell in both of the columns (or rows) is not empty, then merge these two columns (or rows) into one (see Fig. 13). For example, in Fig. 12, the two columns age = 6 and age = 7 are merged into one.

Example 10. The decision table after merge operation 2 is shown in Fig. 14.

Operation 3: If all cells in a column (or row) are empty and the cells in its two adjacent columns (or rows) are the same, then merge these three columns (or rows) into one (see Fig. 15).

(Fig. 15. An example of Operation 3.)

Operation 4: If all cells in a column (or row) are empty and the cells in its two adjacent columns (or rows) are the same or either of them is empty, then merge these three columns (or rows) into one (see Fig. 16). For example, in Fig. 14, the three columns age = 7, age = 8 and age = 9 are merged into one.

Example 11. The decision table after merge operations 3 and 4 is shown in Fig. 17.

Operation 5: If all cells in a column (or row) are empty, all the non-empty cells in the column (or row) to its left have the same region, and all the

non-empty cells in the column (or row) to its right have the same region, but one different from the previously mentioned region, then merge these three columns into two parts (see Fig. 18). For example, in Fig. 17, the three columns age = 1, age = 4 and age = 6 are merged into two columns.

Example 12. The decision table after merge operation 5 is shown in Fig. 19.

Step 5: Rebuild membership functions. Whenever a merging operation is executed, a corresponding rebuilding of the membership functions for the dimension being processed is performed. For Operations 1 and 2, if (a_i, b_i, c_i) and (a_j, b_j, c_j) are the membership functions at positions d_i and d_i + 1 of the ith attribute, then the new membership function is (a_i, (b_i + b_j)/2, c_j). For Operations 3 and 4, if (a_i, b_i, c_i), (a_j, b_j, c_j), and (a_k, b_k, c_k) are the membership functions at positions d_i - 1, d_i and d_i + 1, then the new membership function is (a_i, (b_i + b_j + b_k)/3, c_k). For Operation 5, if (a_i, b_i, c_i), (a_j, b_j, c_j), and (a_k, b_k, c_k) are the membership functions at positions d_i - 1, d_i and d_i + 1, then the new membership functions are (a_i, b_i, c_j) and (a_j, b_k, c_k).

(Fig. 16. An example of Operation 4.)
(Fig. 17. The results after Operations 3 and 4.)
(Fig. 18. An example of Operation 5.)
(Fig. 19. The results after Operation 5.)
(Fig. 20. Final membership functions for age.)
(Fig. 21. Final membership functions for property.)

Example 13. The membership functions as rebuilt after Step 5 are shown in Figs. 20 and 21, respectively.

Step 6: Derive decision rules from the decision table. In this step, fuzzy if-then rules are derived from the decision table. Each cell cell_(d_1, d_2, ..., d_m) = R_i in the decision table is used to derive a rule:

If input_1 = d_1, input_2 = d_2, ..., and input_m = d_m
Then output = R_i. (13)

Example 14. After Step 6 we obtain the following rules:

If age is A_1 and property is P_1
Then insurance fee is R_1 => Rule 1,

If age is A_2 and property is P_1
Then insurance fee is R_2 => Rule 2,

If age is A_3 and property is P_1
Then insurance fee is R_3 => Rule 3.

From the example, it is easily seen that the attribute property is unimportant in determining the insurance fee. The proposed method thus also possesses the capability of conventional machine-learning approaches.

6. The fuzzy inference process

In the fuzzy inference process, data are collected from the complex environment and are used to derive decisions through the fuzzy inference rules and the membership functions. One advantage of the proposed learning method is that the format of the knowledge derived from the learning process is consistent with that of commonly used fuzzy knowledge; thus the ordinary Mamdani type of inference process [15] can be applied. The inference process required to obtain a conclusion from the input space is as follows:
Step 1: convert numeric input values to linguistic terms according to the membership functions derived;
Step 2: match the linguistic terms with the decision rules to find the output groups;
Step 3: defuzzify the output groups to form the final decision.
Details are explained below.

Step 1: Convert numeric input values to linguistic terms according to the membership functions derived. We map a numeric input (I_1, I_2, ..., I_m) to its corresponding fuzzy linguistic terms with membership values.

Example 15. Assume a new input datum I = (37, 40) (i.e. age = 37, property = 40) is fed into the fuzzy inference process. It is first converted into the following fuzzy terms through the membership functions derived:

mu_A1(I) = 0.55,
mu_A2(I) = 0.3,
mu_P1(I) = 1.

Step 2: Match the linguistic terms with the decision rules to find the output groups. In this step, we match the fuzzy inputs with the fuzzy rules in order to obtain the output regions.

Example 16. The fuzzy input in Example 15 matches the following two rules:

If age is A_1 and property is P_1
Then insurance fee is R_1 => Rule 1,

If age is A_2 and property is P_1
Then insurance fee is R_2 => Rule 2.

The membership values of the match to Rule 1 and Rule 2 are, respectively, calculated as follows:

mu_R1(I) = min(mu_A1(I), mu_P1(I)) = 0.55,
mu_R2(I) = min(mu_A2(I), mu_P1(I)) = 0.3.

Step 3: Defuzzify the output groups to form the final output values. The final output value is obtained by averaging over the output groups [29]. Let the membership function for the output group R_i be (a_i, b_i, c_i). The output value is calculated by the following formula:

y = (sum_{i=1}^{K} mu_Ri(I) * b_i) / (sum_{i=1}^{K} mu_Ri(I)), (14)

where K is the number of possible output groups.
T.-P, Hong, C.-Y Lee / Fuzzy Se& and Systems 84 (1996) 33-47 45

Example 17. The output value for Example 15 is calculated as follows:

y = (0.55 * 2100 + 0.3 * 2600) / (0.55 + 0.3) = 2276.47.

If we use the original training examples as the testing data, then we can derive the results shown in Table 1. From Table 1 we can see that the testing results are quite close to the outputs of the training data; the average error rate is only 1.94%. This is further evidence that our membership functions are rational.

Table 1
Testing the training examples

Training example   Testing result   Error rate (%)   Average error rate (%)
(20, 30; 2000)     2100             5
(25, 30; 2100)     2100             0
(30, 10; 2200)     2185.704         0.65
(45, 50; 2500)     2442.818         2.28
(50, 30; 2600)     2528.522         2.75             1.94
(60, 10; 2700)     2746.709         1.73
(80, 30; 3200)     3250             1.56
(80, 40; 3300)     3250             1.52

7. Experimental results

To demonstrate the effectiveness of the proposed fuzzy learning algorithm, we used it to classify Fisher's iris data containing 150 training instances [4]. There are three species of iris flowers to be distinguished: setosa, versicolor, and virginica. There are 50 training instances for each class. Each training instance is described by four attributes: sepal width (SW), sepal length (SL), petal width (PW), and petal length (PL). The unit for all four attributes is centimeters, measured to the nearest millimeter.

The data set was first split into a training set and a test set, and the fuzzy learning algorithm

[Figure: fuzzy membership value vs. attribute value for (a) sepal length (SL0), (b) sepal width (SW0), (c) petal length (PL0-PL2), (d) petal width (PW0-PW3).]

Fig. 22. The derived membership functions of the four attributes.



was run on the training set to induce fuzzy classification rules and membership functions. The rules and membership functions derived were then tested on the test set to measure the percentage of correct predictions. In each run, 50% of the Iris Data were selected at random for training, and the remaining 50% were used for testing.

In the original data order, the derived membership functions of the four attributes are shown in Fig. 22 and the eight derived fuzzy inference rules are shown in Table 2.

Table 2
The eight derived fuzzy inference rules

Sepal length    Sepal width    Petal length    Petal width    Class
SL0             SW0            PL0             PW0            Setosa
SL0             SW0            PL0             PW1            Versicolor
SL0             SW0            PL1             PW1            Versicolor
SL0             SW0            PL0             PW3            Versicolor
SL0             SW0            PL2             PW1            Virginica
SL0             SW0            PL0             PW2            Virginica
SL0             SW0            PL1             PW3            Virginica
SL0             SW0            PL2             PW3            Virginica

From Fig. 22, it is easily seen that the numbers of membership functions for the attributes sepal length and sepal width are one, showing that these two attributes are useless in classifying the Iris Data. Also, the initial membership functions of the attribute petal length were finally merged into only three ranges, and the initial membership functions of the attribute petal width were finally merged into only four ranges, showing that a small number of membership functions is enough for a good reasoning result.

Experiments were then made to verify the accuracy of the fuzzy learning algorithm. For each kind of flower, the correct classification ratio was measured by averaging 200 runs. The number of derived rules was also recorded. Experimental results are shown in Table 3.

Table 3
The average accuracy of the fuzzy learning algorithm for the Iris Problem

Setosa    Versicolor    Virginica    Average    Number of rules
100%      94%           92.72%       95.57%     6.21

From Table 3, it can be seen that high accuracy was obtained from the fuzzy learning algorithm: 100% for Setosa, 94% for Versicolor, and 92.72% for Virginica. Also, the average number of rules in these experiments is 6.21, a small number compared to the number of instances. The fuzzy learning algorithm can therefore be said to work well in automatically deriving membership functions and inference rules.

8. Conclusions

In this paper, we have proposed a general learning method for automatically deriving membership functions and fuzzy if-then rules from a set of given training examples. The proposed approach can significantly reduce the time and effort needed to develop a fuzzy expert system. Based on the membership functions and the fuzzy rules derived, a corresponding fuzzy inference procedure to process inputs was also applied. Using the Iris Data, we found that our model gives a rational result, few rules, and high performance.

Acknowledgements

The authors would like to thank the anonymous referees for their very constructive comments.

References

[1] H.R. Berenji, Fuzzy logic controllers, in: R.R. Yager and L.A. Zadeh, Eds., An Introduction to Fuzzy Logic Applications in Intelligent Systems (Kluwer Academic Publishers, Dordrecht, 1992) 45-96.
[2] B.G. Buchanan and E.H. Shortliffe, Rule-Based Expert Systems: The MYCIN Experiments of the Stanford Heuristic Programming Project (Addison-Wesley, Reading, MA, 1984).
[3] D.G. Burkhardt and P.P. Bonissone, Automated fuzzy knowledge base generation and tuning, IEEE Internat. Conf. on Fuzzy Systems (San Diego, 1992) 179-188.
[4] R. Fisher, The use of multiple measurements in taxonomic problems, Ann. Eugenics 7 (1936) 179-188.

[5] I. Graham and P.L. Jones, Expert Systems: Knowledge, Uncertainty and Decision (Chapman and Hall Computing, Boston, 1988) 117-158.
[6] K. Hattori and Y. Torii, Effective algorithms for the nearest neighbor method in the clustering problem, Pattern Recognition 26 (1993) 741-746.
[7] S. Isaka and A.V. Sebald, An optimization approach for fuzzy controller design, IEEE Trans. Systems Man Cybernet. 22 (1992) 1469-1473.
[8] O. Itoh, H. Migita and A. Miyamoto, A method of design and adjustment of fuzzy control rules based on operation know-how, Proc. 3rd IEEE Conf. on Fuzzy Systems (Orlando, FL, 1994) 492-497.
[9] A. Kandel, Fuzzy Expert Systems (CRC Press, Boca Raton, FL, 1992) 8-19.
[10] M.H. Kim, J.H. Lee and Y.J. Lee, Analysis of fuzzy operators for high quality information retrieval, Inform. Processing Lett. 46 (1993) 251-256.
[11] G.J. Klir and T.A. Folger, Fuzzy Sets, Uncertainty, and Information (Prentice Hall, Englewood Cliffs, NJ, 1992) 4-14.
[12] Y. Kodratoff and R.S. Michalski, Machine Learning: An Artificial Intelligence Approach, Vol. 3 (Morgan Kaufmann, San Mateo, CA, 1990).
[13] C.C. Lee, Fuzzy logic in control systems: fuzzy logic controller, Part I and Part II, IEEE Trans. Systems Man Cybernet. 20 (1990) 404-435.
[14] C.Y. Lee, Automatic acquisition of fuzzy knowledge, Master Thesis (Chung-Hua Polytechnic Institute, Hsinchu, Taiwan, 1995).
[15] E.H. Mamdani, Applications of fuzzy algorithms for control of simple dynamic plant, IEE Proc. 121 (1974) 1585-1588.
[16] R.S. Michalski, J.G. Carbonell and T.M. Mitchell, Machine Learning: An Artificial Intelligence Approach, Vol. 1 (Morgan Kaufmann, Los Altos, CA, 1983).
[17] R.S. Michalski, J.G. Carbonell and T.M. Mitchell, Machine Learning: An Artificial Intelligence Approach, Vol. 2 (Morgan Kaufmann, Los Altos, CA, 1984).
[18] H. Nomura, I. Hayashi and N. Wakami, A learning method of fuzzy inference rules by descent method, IEEE Internat. Conf. on Fuzzy Systems (San Diego, 1992) 203-210.
[19] G. Riley, Expert Systems: Principles and Programming (PWS-Kent, Boston, 1989) 1-59.
[20] J.C. Schlimmer, Database consistency via inductive learning, Proc. 8th Internat. Workshop on Machine Learning (Morgan Kaufmann, San Mateo, CA, 1991).
[21] G. Shafer and R. Logan, Implementing Dempster's rule for hierarchical evidence, Artificial Intelligence 33 (1987) 271-298.
[22] T. Takagi and M. Sugeno, Fuzzy identification of systems and its applications to modeling and control, IEEE Trans. Systems Man Cybernet. 15 (1985) 116-132.
[23] L.X. Wang and J.M. Mendel, Generating fuzzy rules by learning from examples, IEEE Trans. Systems Man Cybernet. 22 (1992) 1414-1427.
[24] R. Weber, Fuzzy-ID3: a class of methods for automatic knowledge acquisition, Proc. 2nd Internat. Conf. on Fuzzy Logic and Neural Networks (Iizuka, Japan, 1992) 265-268.
[25] Y. Yuan and M.J. Shaw, Induction of fuzzy decision trees, Fuzzy Sets and Systems 69 (1995) 125-139.
[26] L.A. Zadeh, Fuzzy sets, Inform. and Control 8 (1965) 338-353.
[27] L.A. Zadeh, Fuzzy logic, IEEE Computer (1988) 83-93.
[28] H.J. Zimmermann, Fuzzy Sets, Decision Making and Expert Systems (Kluwer Academic Publishers, Boston, 1987).
[29] H.J. Zimmermann, Fuzzy Set Theory and its Applications (Kluwer Academic Publishers, Boston, 1991).
