Keywords: Diabetes Mellitus, Supervised Learning, Performance Analysis, Clinical
Decision Support Systems.
I. INTRODUCTION
Computer-aided diagnosis plays an important role in the medical field. With populations growing and medical institutions becoming larger, the door is open for useful decision support systems that can analyze large amounts of information. It has been shown that introducing machine learning into medical analysis increases diagnostic accuracy, reduces costs, and reduces the demand on human resources. In this study, a comparative performance analysis of three supervised machine learning algorithms, i.e., decision trees, logistic regression, and artificial neural networks, is carried out using the Waikato Environment for Knowledge Analysis machine learning toolbox [3] for educational purposes. The algorithms are tested on the Pima Indian diabetes dataset, which has been examined with several different machine learning methods in the past [5-9]. Diabetes mellitus refers to the metabolic disorder that results from malfunction in insulin secretion and action; it is characterized by hyperglycemia. There are two types of the diabetes disorder, but their symptoms and laboratory results are generally the same, which makes the diagnosis of diabetes with various techniques very important nowadays.
II. MATERIALS AND METHODS
A. Data Collection
The data set was obtained from the UCI Repository of Machine Learning Databases [3]. It was selected from a larger data set held by the National Institute of Diabetes and Digestive and Kidney Diseases. The patients in the Pima Indian dataset are women at least 21 years old living near Phoenix, Arizona, USA. The dichotomous outcome attribute takes the value '0' or '1', where '1' means a positive test for diabetes and '0' a negative one. There are 500 (65.1%) cases in class '0' and 268 (34.9%) cases in class '1'. The dataset contains eight clinical findings:
1. Number of times pregnant
2. Plasma glucose concentration at 2 hours in an oral glucose tolerance test
3. Diastolic blood pressure (mm Hg)
4. Triceps skin fold thickness (mm)
5. 2-hour serum insulin (μU/ml)
6. Body mass index
7. Diabetes pedigree function
8. Age (years)
B. Decision Trees
A decision tree (DT) is a graph that uses a branching method to illustrate every possible outcome of a decision for a particular input. The goal in building a tree is to identify the best splitting attribute at each node, which is found using entropy and information gain. A detailed theoretical background on decision trees can be found in [2].
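As an illustration, a minimal sketch of the entropy and information-gain computation that drives the choice of splitting attribute might look as follows (the function names and the toy split are ours, not taken from the paper):

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())

def information_gain(parent_labels, split_groups):
    """Entropy reduction achieved by splitting `parent_labels` into `split_groups`."""
    total = len(parent_labels)
    child_entropy = sum(len(g) / total * entropy(g) for g in split_groups)
    return entropy(parent_labels) - child_entropy

# Toy example: diabetes outcomes before and after splitting on some attribute.
parent = [1, 1, 0, 0, 0, 0, 1, 0]
left, right = [1, 1, 1, 0], [0, 0, 0, 0]
print(information_gain(parent, [left, right]))  # higher gain = better split
```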
C. Logistic Regression
Logistic regression (LRM) has a wide range of applications in medical research. The LRM model is used for classification based on the attributes, which might help to predict the outcome. The distinctive feature of the model is that the outcome variable is dichotomous, and the result is not bounded to a linear form. As a result, the created model can be used to classify newly provided data by placing them in the model and computing the probability P; detailed information is provided in [1].
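For illustration, a minimal sketch of how a fitted logistic model turns a weighted sum of clinical attributes into a class probability (the weights and bias below are invented for the example, not estimated from the Pima data):

```python
import math

def logistic_probability(features, weights, bias):
    """P(outcome = 1 | features) under a fitted logistic regression model."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical coefficients for two attributes, e.g. glucose and BMI.
weights, bias = [0.035, 0.09], -7.0
p = logistic_probability([148.0, 33.6], weights, bias)
print(round(p, 3), "-> class", 1 if p >= 0.5 else 0)
```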
D. Multi-Layer Perceptron Neural Networks
Multilayer perceptron neural networks (MLP) are processing devices that closely resemble a model of the neuronal structure of the mammalian cerebral cortex. Large MLPs may have hundreds or thousands of processor units, whereas a mammalian brain has billions of neurons, with a corresponding increase in the magnitude of their overall interaction and emergent behavior. Generally, a neural network has three components or layers. The first is the input layer, through which data, in our case the disease-related attributes, enter the network. The second is the hidden layer, where all operations are performed. The last is the output layer, where the network makes the final decision regarding the patient's condition. Detailed information on neural networks is provided in [1, 2].
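A minimal sketch of a forward pass through such a three-layer network with sigmoid activations (the random weights and the NumPy implementation are our own illustration; the hidden layer size of 5 matches the configuration simulated later):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def mlp_forward(x, w_hidden, b_hidden, w_out, b_out):
    """Forward pass: input layer -> hidden layer -> single output neuron."""
    hidden = sigmoid(w_hidden @ x + b_hidden)
    return sigmoid(w_out @ hidden + b_out)

rng = np.random.default_rng(0)
x = rng.random(8)                                # the eight clinical attributes, scaled
w_h, b_h = rng.normal(size=(5, 8)), np.zeros(5)  # hidden layer with 5 nodes
w_o, b_o = rng.normal(size=(1, 5)), np.zeros(1)  # output: probability of diabetes
print(mlp_forward(x, w_h, b_h, w_o, b_o))
```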
E. Simulated Program
The Waikato Environment for Knowledge Analysis (WEKA) is one of the best tools for teaching machine learning without going into details first. The tool is based on the Java platform and contains a large number of algorithms for data preprocessing, feature selection, classification, clustering, and association rule mining [4]. WEKA uses a common data representation format, making comparisons easy. It has three operation modes, i.e., GUI, command line, and Java API.
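As a sketch of scripted use, the 10-fold cross-validation run in this study could be reproduced as below; this assumes the third-party python-weka-wrapper3 package, which drives WEKA's Java classes from Python and is not something the paper itself uses:

```python
import weka.core.jvm as jvm
from weka.core.converters import Loader
from weka.core.classes import Random
from weka.classifiers import Classifier, Evaluation

jvm.start()
data = Loader(classname="weka.core.converters.ArffLoader").load_file("diabetes.arff")
data.class_is_last()                       # the outcome attribute is the last column
j48 = Classifier(classname="weka.classifiers.trees.J48")   # WEKA's C4.5 decision tree
evaluation = Evaluation(data)
evaluation.crossvalidate_model(j48, data, 10, Random(1))   # 10-fold cross-validation
print(evaluation.summary())
jvm.stop()
```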
F. Performance Measures
The quality of a classifier is commonly evaluated based on the data in the confusion matrix. Several standard measures have been defined over the correct and incorrect classification counts in the matrix. The most common practical measure of performance is accuracy, defined as the proportion of the total number of instances that were classified correctly.
Recall is the proportion of actual positives that are correctly identified. Precision is the proportion of predicted positives that are actually positive. The F-measure is the harmonic mean of recall and precision.
These performance metrics are calculated from the confusion matrices obtained with the WEKA tool.
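A minimal sketch of how these four measures follow from the confusion-matrix counts (the counts below are hypothetical, chosen only so that the totals match the 768 instances of the dataset):

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, recall, precision and F-measure from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)
    f_measure = 2 * precision * recall / (precision + recall)
    return accuracy, recall, precision, f_measure

# Hypothetical counts: 268 actual positives, 500 actual negatives, 768 in total.
print(classification_metrics(tp=160, fp=67, fn=108, tn=433))
```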
III. SIMULATION RESULTS
In this study, the Pima dataset of patients with the diabetes disorder, which contains the eight clinical features plus the outcome attribute, was classified using three machine learning methods, i.e., DT (C4.5), LRM, and MLP. The performance metrics (accuracy, recall, precision, and F-measure), along with the error metrics, were computed over all features using 10-fold cross-validation. The simulations were performed with the WEKA 3.8 machine learning tool.
Algorithm | Correct instances (%) | Incorrect instances (%) | Build time
----------|-----------------------|-------------------------|-----------
DT        | 567 (74%)             | 201 (26%)               | 0.12 sec
LRM       | 593 (77%)             | 175 (23%)               | 0.15 sec
MLP-NN    | 583 (76%)             | 185 (24%)               | 1.45 sec

Table 1: Accuracy metrics of the ML algorithms
According to the results provided in Table 1, the LRM model outperforms the remaining methods with an overall accuracy of 77%. Even though the MLP is considered one of the top classification methods, it came second in this comparison; during the analysis we identified that this was due to overfitting.
Figure 1: Performance measure metrics of ML algorithms
Figure 1 contains the five performance measure results for the three machine learning algorithms. Based on the results, the LRM method outperforms MLP by 1% and DT by 3% overall. Even though the MLP is considered a complex method for classification and prediction, its configuration, which contained one hidden layer with 5 nodes, ran into the problem of overfitting. In the literature survey we found that the accuracy metric of the three algorithms varies from partition to partition. For example, the accuracy of the C4.5 model is 77.08% for a 75-25% training-testing partition, 76.72% for an 85-15% partition, and 75.32% for a 90-10% partition.
Figure 2: Error metrics of ML algorithms
Figure 2 shows three error metric results, i.e., the Kappa statistic, the Mean Absolute Error (MAE), and the Root Mean Square Error (RMSE), for the simulated machine learning algorithms. The Kappa statistic measures the agreement of the prediction with the true class; a value greater than zero indicates that the algorithm is not performing by chance but is taking a more principled approach. The MAE measures the average magnitude of the errors in a set of forecasts without considering their direction, and values closer to zero are better; it measures accuracy for continuous variables. The RMSE likewise measures the average magnitude of the error, and values closer to zero are better. Based on the simulation results shown in Figure 2, the LRM outperforms all other methods.
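A minimal sketch of how these three metrics can be computed from true labels and predicted probabilities (our own NumPy illustration; WEKA reports the values directly):

```python
import numpy as np

def error_metrics(y_true, p_pred):
    """Kappa statistic, MAE and RMSE for binary labels and predicted probabilities."""
    y_pred = (p_pred >= 0.5).astype(int)
    p_observed = np.mean(y_pred == y_true)                    # observed agreement
    p_chance = (np.mean(y_true) * np.mean(y_pred)             # agreement expected
                + np.mean(1 - y_true) * np.mean(1 - y_pred))  # by chance alone
    kappa = (p_observed - p_chance) / (1 - p_chance)
    mae = np.mean(np.abs(y_true - p_pred))
    rmse = np.sqrt(np.mean((y_true - p_pred) ** 2))
    return kappa, mae, rmse

y = np.array([1, 0, 0, 1, 0, 1, 0, 0])
p = np.array([0.8, 0.2, 0.4, 0.6, 0.1, 0.3, 0.2, 0.5])
print(error_metrics(y, p))
```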
IV. CONCLUSION
In this research study, three supervised machine learning algorithms were applied to the Pima Indians Diabetes (PID) medical dataset. The performance of LRM was the best on all performance and error metrics: the LRM method outperforms MLP by 1% and DT by 3% overall. Although MLP is considered one of the top classification methods, it came second; during the analysis we identified that this was due to overfitting. The results show that the LRM method can be a good and practical choice for classifying medical data.
References:
1. Russell S., Norvig P. Artificial Intelligence: A Modern Approach. Prentice Hall, 2002.
2. Dunham M.H. Data Mining: Introductory and Advanced Topics. Pearson Education, 2006.
3. UCI Repository of Machine Learning Databases, University of California at Irvine, Department of Computer Science. Available: https://archive.ics.uci.edu/ml/datasets/Diabetes (Accessed: 7 Sept. 2016).
4. Hall M., Frank E., Holmes G., Pfahringer B., Reutemann P., Witten I.H. "The WEKA data mining software: an update," ACM SIGKDD Explorations Newsletter, 2009. Vol. 11, pp. 10-18.
5. Anand R., Kirar V.P.S., Burse K. "K-Fold Cross Validation and Classification Accuracy of PIMA Indian Diabetes Data Set Using Higher Order Neural Network and PCA," IJSCE, 2013. Vol. 2, Issue 6, pp. 436-438.
6. Christobel Y.A., Sivaprakasam P. "A New Classwise k Nearest Neighbor (CKNN) Method for the Classification of Diabetes Dataset," IJEAT, 2013. Vol. 2, Issue 3, pp. 396-400.
7. Kumari V.A., Chitra R. "Classification of Diabetes Disease Using Support Vector Machine," International Journal of Engineering Research and Applications, 2013. Vol. 3, pp. 1797-1801.
8. Carpenter G.A., Markuzon N. "ARTMAP-IC and medical diagnosis: Instance counting and inconsistent cases," Neural Networks, 1998. pp. 323-336.
9. Deng D., Kasabov N. "On-line pattern analysis by evolving self-organizing maps," Proc. of the 5th Biannual Conference on Artificial Neural Networks and Expert Systems (ANNES), Dunedin, November 2001. pp. 46-51.
10. Farahmandian M., Lotfi Y., Maleki I. "Data Mining Algorithms Application in Diabetes Diseases Diagnosis: A Case Study," MAGNT Research Report, 2015. Vol. 3, pp. 989-997.
UDC 004.8
Tolebi G.A.
M.Sc., Kazakh-British Technical University,
Almaty, Kazakhstan, e-mail: tolebi.glr@gmail.com
DESCRIPTION OF THE COMPUTATIONAL INTELLIGENCE TECHNIQUES FOR
ADAPTIVE TRAFFIC SIGNAL CONTROLLER: REINFORCEMENT LEARNING
Abstract. This work is devoted to describing the computational intelligence methods used to build an adaptive traffic light system. In particular, ways of applying the varieties of learning-through-evaluation (reinforcement learning) technology are presented. A comparative analysis is given of the algorithms applied to a controller whose behavior changes depending on the situation on the road.
Keywords: traffic signal controller, reinforcement learning, artificial neural network, actor-critic method, Q-learning.
I. INTRODUCTION
Traffic congestion is one of the big problems of large cities, and it becomes a more serious issue as the number of vehicles on arterial roads grows. Extending existing roads or building new ones can be considered one solution to the problem; however, it requires great expense and many human resources. Nowadays, Computational Intelligence (CI) is widely used to solve applied problems. The rapid development of CI techniques provides powerful tools for solving problems in nonlinear stochastic systems such as traffic flows. Proper modeling of traffic flows is a complex task; therefore, many researchers consider ways of solving the problem without mathematically modeling vehicle movement. In the traffic control problem, traditional supervised learning techniques, such as Support Vector Machines, random decision trees, and feed-forward artificial neural networks, cannot be used, because the environment in the given problem is dynamic and there are no labeled targets for learning. The traffic control model requires an unsupervised learning technology. In the current work, an evaluation of existing CI methods for traffic signal control is proposed, and methods for traffic signal management applying the Reinforcement Learning (RL) technique are described. RL is the more promising CI technique and has been successfully applied to control problems.
In this paper, Q-table, ANN, and Actor-Critic implementations of RL are adopted to develop an adaptive TSC for an intersection. The remaining part of this paper is organized as follows: Section II is the literature review; Section III describes RL and the Q-table, ANN, and Actor-Critic implementations and their evaluation; Section IV is the conclusion.
II. BACKGROUND
Management of traffic flows started in the XIX century in London [1], with semaphores that were controlled manually. Over time, the controller has changed and acquired an automated control system. According to their working principle, all TSCs are divided into three groups: pre-timed, actuated, and adaptive (intelligent). Pre-timed signal controllers have a fixed time plan: the duration of each phase, the whole cycle time, and the phase sequence are fixed. The optimal time for each phase is usually calculated with the Webster formula and historical data [2, 3]. This is the most widely used type of TSC; its advantages are a simple algorithm that is easily implemented and behavior that is predictable for drivers. An actuated signal controller is a strategy between the pre-timed and adaptive ones: the green phase time is not fixed, there is a predefined set of time plans for different states, and the phase time changes according to changes in the environment. An intelligent TSC has a dynamic management strategy: the signal plan depends entirely on the current state of the roads, and it may take into consideration many additional parameters, such as the weather, the time, the day of the week, etc. This last type of control is very promising: according to T. Ardana, after the installation of an ITSC in Astana, the average flow speed and throughput capacity of the roads increased by 20% [4].
The development of adaptive control systems based on RL for traffic flows is a widely researched topic. Dai et al. used a backpropagation neural network trained by RL for TSC at an isolated intersection [5]; the proposed method combines the features of the actuated and fixed-time strategies. Mannion et al. presented parallel RL for TSC [6]; the main idea is to have several agents, a master and slaves, collecting information from one intersection, so that more knowledge about the environment can be obtained. However, since the task is online, it is problematic to combine all the agents' data quickly. Cai et al. proposed a TSC based on RL and Adaptive Dynamic Programming (ADP) that was compared with the best pre-timed controller and showed high efficiency [7].
III. REINFORCEMENT LEARNING TECHNIQUES
Reinforcement Learning is a training algorithm for a system that has no prior knowledge about its environment. The learning process proceeds through the replies of the environment: the environment gives some reward to the system based on its actions. For a good action the system is encouraged; for a bad action it is punished. The system is considered an agent, and the agent must find the optimal strategy for interacting with the environment. One of the popular RL algorithms is Q-learning.
Q-learning seeks an optimal mapping from states to actions such that the reward is maximized [5]. The basic Q-learning sub-elements are a policy, a reward function, and a value function. The policy is the strategy of the agent's behavior; the reward maps each state-action pair to a real number; and the value function (Q) determines what is good in the long term [8].
Figure 1: The basic reinforcement learning scenario
Problem statement:
S — the set of all possible environment states;
A — the set of possible actions;
P_a(r) — the distribution of the reward r ∈ R, for all a ∈ A;
π_t(a) — the policy of the agent at moment t, a distribution over A.
Q-learning algorithm:
1: initialize the policy π_1(a|s) and the state of the environment s_1;
2: for all t = 1, ..., T, ...:
3: the agent chooses an action a_t ~ π_t(a|s_t);
4: the environment generates the reward r_t ~ P(r|a_t, s_t) and the new state s_{t+1} ~ P(s|a_t, s_t);
5: the agent updates its strategy π_{t+1}(a|s).
The update rule for the Q-value is represented as follows:

Q(s, a) := Q(s, a) + α [r + γ · max_{a'} Q(s', a') − Q(s, a)],    (1)

where s' is the next state, a' is the next optimal action in state s', and the learning rate α and the discount factor γ are free parameters [9], with α, γ ∈ (0, 1). The higher γ is, the more far-sighted the agent.
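A minimal sketch of update rule (1) in code, reusing the workload states and signal actions introduced in the Q-table discussion below (the dictionary storage and parameter values are our own illustration):

```python
from collections import defaultdict

ALPHA, GAMMA = 0.1, 0.9            # learning rate and discount factor, both in (0, 1)
ACTIONS = ["extend", "no_change", "decrease"]
q = defaultdict(float)             # Q(s, a), implicitly initialized to zero

def q_update(state, action, reward, next_state):
    """One application of update rule (1)."""
    best_next = max(q[(next_state, a)] for a in ACTIONS)
    q[(state, action)] += ALPHA * (reward + GAMMA * best_next - q[(state, action)])

q_update(state="H", action="extend", reward=1.0, next_state="M")
print(q[("H", "extend")])          # 0.1 after the first update
```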
Two types of Q-learning implementation for the management of a TSC are considered in the given work: the Q-table and an artificial neural network. In addition, the Actor-Critic method of Reinforcement Learning is also described in this paper.
Q-table. This is the simplest implementation of Q-learning, where Q-values are stored in a table called the Q-table. Initially, the values of the action-state table are filled randomly. The Q-table provides an optimal mapping from state to action such that the reward is maximized [5]. After each action, the Q-value of the particular pair is updated according to formula (1). This method is well known for problems with small sets of states and actions; to solve the transport problem, the environment states and actions can be discretized. The main sub-elements of Q-learning are represented as follows:
State space. The set of all possible environment states reflects the level of workload and can be represented as very low, low, medium, high, and extremely high: S = {VL, L, M, H, EH}. The two directions of one road are taken together. The workload can be calculated by different methods: counting the number of halting vehicles, the number of passing cars, the speed, etc.
Action space. The action space represents changing the duration of the green phases of the signal plan. It consists of three actions, extension, no change, and decrease: A = {a_1 = +Δt, a_2 = 0, a_3 = −Δt}, where Δt is the adjustment step.
Reward. The reward reflects the change in the intersection bandwidth after the green-light time is updated. If the workload decreases, the system is encouraged and a positive reward is given; otherwise, the system is punished and a negative reward is obtained.
The aim of adaptive traffic signal control is to choose the most effective signal control scheme for relieving traffic congestion. In the proposed method, the data acquired to determine congestion are the halting vehicles in one direction; the sum of two routes is considered one direction because they share the same signal plan. Data acquisition occurs once per cycle, and based on the obtained data the green phases for the next cycle are updated.
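A minimal sketch of the discretized controller described above (the workload thresholds and the epsilon-greedy exploration are our own assumptions; the update after each cycle is formula (1), as sketched earlier):

```python
import random

STATES = ["VL", "L", "M", "H", "EH"]           # discretized workload levels
ACTIONS = ["extend", "no_change", "decrease"]  # green-phase adjustments

# Q-table filled with small random values, as described above.
q_table = {(s, a): random.uniform(0.0, 0.01) for s in STATES for a in ACTIONS}

def discretize(halting):
    """Map a count of halting vehicles in one direction to a workload level
    (the numeric thresholds are assumed for illustration)."""
    for level, limit in zip(STATES, [5, 15, 30, 50]):
        if halting <= limit:
            return level
    return "EH"

def choose_action(state, epsilon=0.1):
    """Epsilon-greedy selection: mostly the best-known action, sometimes explore."""
    if random.random() < epsilon:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q_table[(state, a)])

def reward(halting_before, halting_after):
    """Encourage the agent when the workload decreases, punish it otherwise."""
    return 1.0 if halting_after < halting_before else -1.0

# Once per cycle: observe, act, then update q_table with formula (1).
state = discretize(42)
print(state, choose_action(state))
```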
Artificial Neural Network (ANN). For problems with uncountably many states, other methods of approximating the Q-value are used. One of them is an ANN, which is a powerful approximator of mappings from input to output values. In the given problem, when a large transport network is considered, the Q-table becomes large, and it is difficult to store and calculate the Q-values. Therefore, an ANN is used as the Q-value calculator for Q-learning. A simple sigmoid feed-forward ANN with three layers can be used: input, hidden, and output layers. The transfer function is the log-sigmoid function y = 1/(1 + e^(−x)), and the training algorithm is error backpropagation. The inputs are the state-action pair and additional information about the state of the intersection; the output layer has one neuron, the Q-value for the given state and action. The hidden layer size is determined by trial and error; there is no explicit method for determining the ANN structure, and most NN architectures are arrived at by experiment. Using an ANN in the Q-learning algorithm allows taking into account large amounts of data that affect the environment, and additional information can be supplied as needed, because an ANN is able to discover complex relationships between inputs and outputs.
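A minimal sketch of an ANN used as the Q-value calculator: a forward pass over a state-action input and the backpropagation target built from update rule (1) (the NumPy network, layer sizes, and feature encoding are our own illustration):

```python
import numpy as np

GAMMA = 0.9
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(10, 6)), np.zeros(10)  # hidden layer (size by trial and error)
W2, b2 = rng.normal(size=(1, 10)), np.zeros(1)   # single output neuron: Q(s, a)

def log_sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def q_value(state, action_one_hot):
    """Forward pass: (state features + action encoding) -> approximate Q-value."""
    x = np.concatenate([state, action_one_hot])
    return log_sigmoid(W2 @ log_sigmoid(W1 @ x + b1) + b2)[0]

# Backpropagation target for one observed step, following update rule (1).
state = np.array([0.4, 0.7, 0.1])              # e.g. normalized workload features
next_state = np.array([0.3, 0.6, 0.1])
actions = np.eye(3)                            # extend / no change / decrease
r = 1.0
target = r + GAMMA * max(q_value(next_state, a) for a in actions)
print(target)                                  # train the network toward this value
```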
Another way to apply an ANN to the ITSC problem is to use this technique to generate the action for the controller directly.
Figure 2: Three-layer simple ANN
The input neurons receive data about the current state: the green phase time, the number of halting cars, the state of neighboring intersections, information about the weather, etc. The output layer has one neuron: the extension time (an increase or decrease in seconds). The problem with such a realization is in training the ANN. As mentioned before, ITSC is an unsupervised learning problem: there are no labeled targets for learning. The proposed mixed method of RL and ANN makes it possible to train the NN in the right way, but it is a difficult task to determine the right way of punishment and encouragement. An important issue in this problem is assessing the decisions taken: how to determine the degree of rationality of a decision and how to give a reward. Possible options for the evaluation are the waiting time and the average flow rate by length of congestion. The system can take into account the ratio of the signal duration to the length of the congestion. Balancing the number of vehicles at the same crossroads is another suggested method; however, in the case where all roads of the intersection have long traffic jams with the same number of cars, the last method becomes invalid. Selection of the assessment parameter is an important part and therefore requires further refinement.
Actor-Critic Reinforcement Learning method (ACRL). This is one of the effective RL methods for solving control problems, with a more complicated algorithm than the previously proposed techniques. In Q-learning, the optimal strategy is stored in the action-value function [10]; in ACRL, the policy and the state-value function are stored separately. The Actor is the unit that stores the strategy, while the Critic stores the state value. The main advantage of this method is that separate techniques can be used for the Actor and the Critic, such as ANNs, fuzzy logic, swarm intelligence, etc. For the given system, the proposed strategy of the controller's behavior uses fuzzy logic: by setting the rules, the policy of the agent is determined. To generate the state value, the methods described above, i.e., the ANN and Q-table implementations, can be used.
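A minimal sketch of the Actor-Critic interaction with table-based units (the softmax actor and the TD-error updates are standard textbook choices, assumed here rather than taken from the paper):

```python
import math
import random

ALPHA_CRITIC, ALPHA_ACTOR, GAMMA = 0.1, 0.05, 0.9
STATES = ["VL", "L", "M", "H", "EH"]
ACTIONS = ["extend", "no_change", "decrease"]

value = {s: 0.0 for s in STATES}                             # Critic: state values
preference = {(s, a): 0.0 for s in STATES for a in ACTIONS}  # Actor: policy parameters

def policy(state):
    """Softmax over the Actor's preferences for the given state."""
    weights = [math.exp(preference[(state, a)]) for a in ACTIONS]
    total = sum(weights)
    return random.choices(ACTIONS, [w / total for w in weights])[0]

def step(state, action, reward, next_state):
    """The Critic computes the TD error; both units are adjusted by it."""
    td_error = reward + GAMMA * value[next_state] - value[state]
    value[state] += ALPHA_CRITIC * td_error                  # Critic update
    preference[(state, action)] += ALPHA_ACTOR * td_error    # Actor update

a = policy("H")
step("H", a, reward=1.0, next_state="M")
print(a, value["H"])
```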