Medicine: Cancer Research: Hybrid TiO2 Photocatalysts for Water Splitting & Degradation of VOCs
By Dr. Florin Gorunescu
and Dr. Marina Gorunescu

Abstract.
This paper focuses on a genetic algorithm (GA) approach to support the process of
cancer diagnosis and treatment. In both the diagnosis and treatment processes,
the physician usually compares numerical medical data against specific internal
parameters (thresholds) in order to determine an optimal decision. The goal of
this paper is to explore a GA-based approach to help the physician find the
optimum threshold values related to both the diagnosis and the treatment of
cancer.

1. Introduction

The occurrence of
particular types of cancer varies remarkably according to a wide range of
factors, including age, sex, calendar time, geography, etc. The oncologist
studies how the disease depends on a constellation of risk factors acting on
the population and uses this information to determine the best measures for
prevention and treatment [1]. The diagnosis of different types of cancer is
difficult, especially in the early stages, and most patients are diagnosed
in advanced stages. For instance, a complex analysis process involving
alpha-fetoprotein (AFP), the use of imaging modalities (e.g. power Doppler,
harmonic imaging, pulse inversion, etc.) combined with microbubble contrast
agents, and a better understanding of the importance of the main serum enzyme
values (e.g. ALT, AST, BRT, GGT, etc.) has significantly improved the rate of
detection for early (small) hepatocellular carcinoma (HCC) [2]. Irrespective of
the detection factor, frequently expressed as numerical data, the most
important step consists in accurately evaluating the specific internal
threshold values corresponding to a specific disease.

When the diagnosis problem has been successfully solved, attention turns to
designing a treatment procedure. The design of the optimal treatment formula
strongly depends on the patient-specific features and, consequently, requires
methods of associating
some quantitative and qualitative patient medical data to a certain treatment
procedure. Irrespective of the particular patient characteristic influencing
the therapy type, for each of these parameters, there are threshold values
implying a decision concerning the appropriate therapy [3].

We aim to introduce here a genetic algorithm-based approach in order to obtain
optimal (or near optimal) threshold values to support the cancer diagnosis and
treatment process.

2. Materials and methods

Genetic algorithms

Genetic algorithms (GAs) were developed by Holland (1975) [4] and extended by
Goldberg (1989) [5] to solve difficult optimization problems by intelligent
exploitation of a random search.
They are stochastic algorithms with the natural evolution metaphor behind their
building philosophy. Since the classical genetic algorithms, operating on
binary strings, require the modification of the original problem, we will use
here an evolution program [6], [7], which leaves the problem unchanged,
modifying the chromosome representation and applying appropriate genetic
operators.

We give here only a brief reminder necessary to describe the genetic algorithm
context.
Generally, a genetic algorithm may be considered to be composed of three
essential components:
• A set of potential solutions, called individuals or chromosomes, that
  will evolve during a number of iterations (generations). This set of
  solutions is also called the population;
• An evaluation mechanism (fitness function) that allows assessing the
  quality or fitness of each individual of the population;
• An evolution procedure that is based on some "genetic" operators such as
  selection, crossover and mutation.

Concretely, a genetic
algorithm (as well as any evolution program) solving a particular problem
consists of [8]:
• A genetic representation for potential solutions to the problem;
• A way to create an initial population of potential solutions;
• An evaluation function that plays the role of the environment, rating
  solutions in terms of their "fitness";
• Genetic operators that alter the composition of "parents", thus producing
  "children";
• Values for various parameters that the genetic algorithm uses (population
  size, probabilities of applying genetic operators, etc.).

In our GA approach, let us first consider a number of S individuals
either in healthy state or suffering from a certain type of cancer. To support
good decision making, we present here a genetic algorithm approach which
allows the classification of a certain individual into one of k classes
related, on the one hand, to (k – 1) types of cancer and, on the other
hand, to a class corresponding to the healthy state. This algorithm is
quite simple: it compares, the same way the physician does, the values of n
parameters (V_{ij}), i = 1, 2,…, n, j = 1, 2,…, S, against the corresponding
thresholds.

As concerns the therapy procedure, let us consider, as above, a number of S
individuals suffering from a certain type of cancer. Generally, for any type of
disease, a certain number k of treatment procedures might be considered. To
support good decision making concerning the appropriate therapy, taking into
account the specific characteristics of each individual (that is, specific data
with the corresponding thresholds), we consider the same genetic algorithm
approach as above, allowing the classification of the treatment formulas into
one of the k classes, depending on the specific patient features.

Irrespective of the
situation (i.e. diagnosis or treatment), it is easy to see that, for k different
classes, the number of thresholds is (k - 1) for each of the n
parameters (V_{ij}) and, consequently, there are, in total, n(k - 1)
thresholds, denoted by X_{i} and seen as chromosomes in our GA
approach. Next, selection is carried out by the Monte Carlo procedure, the
classical one-point crossover is used to generate new chromosomes and, for the
mutation, the simple translation (one step, randomly) technique is used.

To evaluate the
fitness of a chromosome, we run a simple classification algorithm, given by:

The classification algorithm

IF ∀ j = 1,…, S, ∀ i = 1,…, n, V_{ij} ≤ X_{i} THEN Class = C_{1}
ELSE IF ∀ j = 1,…, S, ∀ i = 1,…, n, V_{ij} ≤ X_{n+i} THEN Class = C_{2}
……
ELSE IF ∀ j = 1,…, S, ∀ i = 1,…, n, V_{ij} ≤ X_{n(k-2)+i} THEN Class = C_{k-1}
ELSE Class = C_{k}.

The cost function is given by the number of individuals that are classified
correctly, that is, those for which the class determined by the classification
algorithm is the same as the known class given by the physician. The aim is,
obviously, to maximize this function. The stop condition is reached when the
number of the current generation reaches the number of generations set at the
beginning of the algorithm.

Java implementation

What is important about the Java implementation of the program is that all data
about patients collected by physicians can, at any time, be added, modified or
deleted, with no change in the source code of the program whatsoever. This is
because we have used JDBC (Java Database Connectivity) for processing the data.
Let us also note that physicians can modify the structure of the database table
as well, adding new parameters that may prove to be important to the diagnosis,
and the program still remains functional.

3. Results and discussion

In order to check the
efficiency of this approach, we have tested it both in the diagnosis process
and in the treatment evaluation on a small learning data set. Firstly, we have
considered a number S = 15 subjects (7 in healthy state and 8 with HCC).
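
For a labelled learning set of this kind, the fitness of a chromosome is the number of subjects whose predicted class matches the class given by the physician. The following is a minimal Java sketch of the classification rule and cost function described in Section 2, not the authors' code. The class name ThresholdClassifier, the method names, the `<=` comparison and the reading of the rule as applied to one subject at a time are all assumptions made for illustration.

```java
public class ThresholdClassifier {

    /**
     * Classifies one subject. values[i] holds parameter i of the subject;
     * thresholds has length n*(k-1), and block c (0-based) holds the n
     * thresholds tested for class C_{c+1}. Returns a 0-based class index.
     */
    static int classify(double[] values, double[] thresholds, int k) {
        int n = values.length;
        for (int c = 0; c < k - 1; c++) {
            boolean allWithin = true;
            for (int i = 0; i < n; i++) {
                // Assumed comparison: the value must not exceed its threshold.
                if (values[i] > thresholds[c * n + i]) {
                    allWithin = false;
                    break;
                }
            }
            if (allWithin) {
                return c; // first block whose thresholds are all satisfied
            }
        }
        return k - 1; // no block matched: last class C_k
    }

    /**
     * Cost function: number of subjects whose predicted class equals the
     * known class. data[j] is the parameter vector of subject j.
     */
    static int fitness(double[][] data, int[] knownClass, double[] thresholds, int k) {
        int correct = 0;
        for (int j = 0; j < data.length; j++) {
            if (classify(data[j], thresholds, k) == knownClass[j]) {
                correct++;
            }
        }
        return correct;
    }
}
```

For example, with two parameters, k = 2 and thresholds {3.0, 3.0}, a subject with values {1.0, 2.0} is placed in C_{1}, while a subject with values {5.0, 1.0} is placed in C_{2}.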
For the diagnosis process we have considered 5 parameters consisting of 4 serum
enzymes (ALT - alanine aminotransferase, AST - aspartate
aminotransferase, BRT - total bilirubin and GGT - gamma-glutamyl
transpeptidase) plus the subject's age. As concerns the classification,
we have considered two classes: C_{1} = {individuals
without hepatic cancer} and C_{2} = {individuals with hepatic
cancer}. In this case, on average 72% of the individuals received the
right diagnosis.

Secondly, we have tested
our GA model on a number of S = 17 females with breast cancer. In this
case, we have considered the following two standard treatment procedures,
listed in order of increasing complexity:
• chemotherapy (CT);
• chemotherapy (CT) + hormone therapy (HT).

We obtained on average 67% accuracy for the treatment formula design. Let us
mention that a population of Y = 30 chromosomes was used in our study.

4. Conclusion

This experiment seems to
be satisfactory enough from a practical point of view. It shows that the
algorithm is able to find the (near) optimal solution with a good enough
accuracy. Since the number of subjects was small, the thresholds obtained using
the GA approach are strongly related to this particular database. Obviously,
with many more subjects in the study and a larger number of medical
characteristics, the classification problem will become more complicated. On the
other hand, we have to consider alternative crossovers, mutations and
corresponding probabilities in order to improve the classification accuracy.
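
As a baseline for such comparisons, the three operators named in Section 2 can be sketched in Java as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the class and method names are hypothetical, the "Monte Carlo procedure" is read here as roulette-wheel (fitness-proportionate) selection, and the step size and mutation probability are free parameters that the paper does not specify.

```java
import java.util.Random;

public class GaOperators {

    /** Roulette-wheel selection: returns the index of the chosen chromosome. */
    static int select(int[] fitness, Random rng) {
        int total = 0;
        for (int f : fitness) {
            total += f;
        }
        if (total == 0) {
            return rng.nextInt(fitness.length); // all fitness zero: pick uniformly
        }
        int r = rng.nextInt(total); // 0 <= r < total
        int cumulative = 0;
        for (int i = 0; i < fitness.length; i++) {
            cumulative += fitness[i];
            if (r < cumulative) {
                return i;
            }
        }
        return fitness.length - 1; // not reached when total > 0
    }

    /** One-point crossover: genes before 'point' come from a, the rest from b. */
    static double[] crossover(double[] a, double[] b, int point) {
        double[] child = new double[a.length];
        for (int i = 0; i < a.length; i++) {
            child[i] = (i < point) ? a[i] : b[i];
        }
        return child;
    }

    /**
     * Translation mutation: each threshold is shifted by one step, up or down
     * at random, with probability pm (assumed reading of "one step, randomly").
     */
    static double[] mutate(double[] chromosome, double step, double pm, Random rng) {
        double[] out = chromosome.clone();
        for (int i = 0; i < out.length; i++) {
            if (rng.nextDouble() < pm) {
                out[i] += rng.nextBoolean() ? step : -step;
            }
        }
        return out;
    }
}
```

Keeping each operator as a separate method with its probability passed in makes it straightforward to swap in an alternative crossover or mutation and rerun the same learning set.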
Clearly, much work still needs to be done before this method can be brought
into practice.

References
[1] M. Abeloff, J. Armitage, A. Lichter, J. Niederhuber (Eds.), Clinical Oncology, 2nd Edition. Churchill Livingstone, 2000.
[2] A. Saftoiu, T. Ciurea, F. Gorunescu, Hepatic arterial blood flow in large hepatocellular carcinoma with or without portal vein thrombosis: assessment by transcutaneous Duplex Doppler sonography. European Journal of Gastroenterology & Hepatology, Vol. 14(2), p. 167-176, 2002.
[3] F. Gorunescu, M. Gorunescu, F. Badulescu, A. Badulescu, Data mining techniques in uterus cancer. In: Proceedings of the 2nd Romanian Congress of Surgical Oncology, October 2002, Cluj, Romania (abstract), Romanian Journal of Surgical Oncology, 3, 1, p. 29, 2002.
[4] J.H. Holland, Adaptation in natural and artificial systems. The University of Michigan Press, Ann Arbor, 1975.
[5] D.E. Goldberg, Genetic Algorithms in Search, Optimization and Machine Learning. Addison-Wesley, Reading, MA, 1989.
[6] D.B. Fogel, Evolutionary Computation: Toward a New Philosophy of Machine Intelligence. IEEE Press, Piscataway, NJ, 1995.
[7] D. Dumitrescu, Genetic algorithms and evolution strategies - Applications in Artificial Intelligence and connex domains. Microinformatica, Cluj-Napoca, 2000.
[8] Z. Michalewicz, Genetic Algorithms + Data Structures = Evolution Programs, Second, Extended Edition. Springer-Verlag, 1994.

BWW Society member was born on February 5th,
1953 in Craiova, Romania. In 1976 he
graduated from the Faculty of Mathematics and Informatics of the University of
Craiova, and in 1979 he received his Ph.D. in Mathematics from the University
of Bucharest. Dr. Gorunescu is Professor of
Mathematics, Statistics and Informatics at the University of Medicine and
Pharmacy of Craiova. Professor Gorunescu presently serves as Deputy Dean of the
Faculty of Pharmacy, Department of Mathematics, Biostatistics and Informatics
at Romania's University of Medicine and Pharmacy of Craiova. Dr. Gorunescu has published
six books and more than 90 scientific papers, and serves as a reviewer for
Zentralblatt für Mathematik. He is a
Member of the French Society of Statistics, and has received academic
scholarships from both l'Université Libre de Bruxelles, Belgium and the
University of Ulster, in the United Kingdom.

© 2003 The BWW Society/The Institute for the Advancement of Positive Global Solutions