Prof. Gualberto Asencio Cortés, Ph.D., holds a degree in Computer Science Engineering (University of Seville, 2008), a Master in Software Engineering and Technology (University of Seville, 2010), a Ph.D. (University of Pablo de Olavide, 2013) and an Executive Master in Innovation (EOI, Spain, 2016). He is Associate Professor of Computer Science (Profesor Titular de Universidad) in the area of Languages and Information Systems at the University of Pablo de Olavide. He is the author of more than 28 publications in journals with impact factor according to JCR (20 of them in Q1 or Q2) and of more than 30 articles in international and national conferences, most of them published in LNCS and LNBI. He has participated in three projects of the National Plan and three more of the Andalusian Research Plan. He is an editor of PLOS ONE (IF: 2.806, Q1), a regular reviewer for journals indexed in JCR (PLOS ONE, Bioinformatics, Neurocomputing, Computers & Geosciences, etc.) and a member of the program committee of numerous international conferences. He has participated in more than 12 technology transfer contracts between the university and companies, including ISOTROL, Red Eléctrica Española and DETEA. He has completed 5 months of international research stays and 3 months of national ones.
The research lines of Prof. Gualberto Asencio Cortés, Ph.D., focus on data mining, machine learning, time series forecasting and bioinformatics, with different fields of application: prediction of natural series (seismic, air quality, meteorological, agronomic, …), prediction of electricity consumption and market prices, prediction of urban traffic, and prediction of biological structures in bioinformatics. He also served for more than 2 years (2015-2017) as data scientist and member of the steering committee responsible for artificial intelligence and data science technologies at the private company easytosee AgTech SL.
Publications
2017
N. Bokde, A. Troncoso, G. Asencio-Cortés, K. Kulat and F. Martínez-Álvarez. Pattern sequence similarity based techniques for wind speed forecasting. Conference: ITISE International Work-Conference on Time Series Analysis, 2017.
N. Bokde, G. Asencio-Cortés, F. Martínez-Álvarez and K. Kulat. PSF: Introduction to R Package for Pattern Sequence Based Forecasting Algorithm. Journal Article. In: R Journal, vol. 9, no. 1, pp. 324-333, 2017, ISSN: 2073-4859.
This paper discusses an R package that implements the Pattern Sequence based Forecasting (PSF) algorithm, which was developed for univariate time series forecasting. This algorithm has been successfully applied to many different fields. The PSF algorithm consists of two major parts: clustering and prediction. The clustering part includes selection of the optimum number of clusters. It labels time series data with reference to such clusters. The prediction part includes functions like optimum window size selection for specific patterns and prediction of future values with reference to past pattern sequences. The PSF package consists of various functions to implement the PSF algorithm. It also contains a function which automates all other functions to obtain optimized prediction results. The aim of this package is to promote the PSF algorithm and to ease its implementation with minimum effort. This paper describes all the functions in the PSF package with their syntax. It also provides a simple example of usage. Finally, the usefulness of this package is discussed by comparing it to auto.arima and ets, well-known time series forecasting functions available on the CRAN repository.
J. L. Amaro-Mellado, A. Morales-Esteban, G. Asencio-Cortés and F. Martínez-Álvarez. Comparing seismic parameters for different source zone models in the Iberian Peninsula. Journal Article. In: Tectonophysics, vol. 717, pp. 449-472, 2017, ISSN: 0040-1951.
Seismic parameters of five seismogenic zonings for the Iberian Peninsula have been determined in this work. For that purpose, this research has two key goals. The first is to generate a seismic catalog. The second is to calculate the seismic parameters of all the zones of the seismogenic zonings selected. The first key goal has been the creation of a catalog of earthquakes for the Iberian Peninsula and adjacent areas. First, the National Geographic Institute of Spain's catalog has been completed and reviewed with the information from other catalog reviews and specific studies. Second, all magnitude calculations have been homogenized. Third, all dependent data have been eliminated through declustering. Finally, the year of completeness for each magnitude has been considered. The Quaternary active faults database of Iberia has also been used as input data. All of this information has been integrated into a geographic information system. The second key aim is the calculation of the seismic parameters. The first parameter obtained has been the b-value. A method which considers different years of completeness in accordance with the magnitude has been used. Also, the annual rate of earthquakes per square kilometer has been calculated. Moreover, the maximum known magnitude that Quaternary active faults might generate and the maximum magnitude recorded in the catalog have been determined. Finally, based solely on the statistical parameters obtained, a critical discussion of the seismogenic zonings of the Iberian Peninsula has been conducted. The results show that some zonings possess insufficient data for a proper calculation of the seismic parameters, from a statistical point of view.
M. J. Fernández-Gómez, G. Asencio-Cortés, A. Troncoso and F. Martínez-Álvarez. Large earthquake magnitude prediction in Chile with imbalanced classifiers and ensemble learning. Journal Article. In: Applied Sciences, vol. 7, no. 6, pp. 625, 2017.
This work presents a novel methodology to predict large magnitude earthquakes with a prediction horizon of five days. For the first time, imbalanced classification techniques are applied in this field in an attempt to deal with the infrequent occurrence of such events. So far, classical classifiers have not been able to properly mine this kind of dataset and, for this reason, most of the methods reported in the literature have focused only on moderate magnitude prediction. As an additional step, outputs from different algorithms are combined by applying ensemble learning. Since false positives are quite undesirable in this field, due to the social impact that they might cause, ensembles have been designed in order to reduce these situations. The methodology has been tested on different cities of Chile, showing very promising results in terms of accuracy.
G. Asencio-Cortés, F. Martínez-Álvarez, A. Troncoso and A. Morales-Esteban. Medium-Large earthquake magnitude prediction in Tokyo with artificial neural networks. Journal Article. In: Neural Computing and Applications, vol. 28, no. 5, pp. 1043-1055, 2017.
This work evaluates the accuracy of artificial neural networks when used to predict earthquake magnitude in Tokyo. Several seismicity indicators have been retrieved from the literature and used as input for the networks. Some of them have been improved and parameterized in order to extract more valuable knowledge from datasets. The experimental set-up includes predictions for five consecutive datasets referring to the year 2015, earthquakes with magnitude larger than 5.0 and a temporal horizon of seven days. Results have been compared to four well-known machine learning algorithms, reporting very promising results in terms of all quality parameters evaluated. The statistical tests applied conclude that differences between the proposed artificial neural network and the other methods are significant.
G. Asencio-Cortés, S. Scitovski, R. Scitovski and F. Martínez-Álvarez. Temporal analysis of Croatian seismogenic zones to improve earthquake magnitude prediction. Journal Article. In: Earth Science Informatics, vol. 10, no. 3, pp. 303-320, 2017, ISSN: 1865-0481.
G. Asencio-Cortés, F. Martínez-Álvarez, A. Morales-Esteban, J. Reyes and A. Troncoso. Using principal component analysis to improve earthquake magnitude prediction in Japan. Journal Article. In: Logic Journal of the IGPL, vol. 25, no. 6, pp. 949-966, 2017.
Increasing attention has been paid to the prediction of earthquakes with data mining techniques during the last decade. Several works have already proposed the use of certain features serving as inputs for supervised classifiers. However, they have been successfully used without any further transformation so far. In this work, the use of principal component analysis (PCA) to reduce data dimensionality and generate new datasets is proposed. In particular, this step is inserted into a methodology that has already been used successfully to predict earthquakes. Tokyo, one of the cities in Japan most threatened by the occurrence of large earthquakes, is studied. Several well-known classifiers combined with PCA have been used. Noticeable improvement in the results is reported.
2016
G. Asencio-Cortés, E. Florido, A. Troncoso and F. Martínez-Álvarez. A novel methodology to predict urban traffic congestion with ensemble learning. Journal Article. In: Knowledge and Information Systems, vol. 20, pp. 4205-4216, 2016.
G. Asencio-Cortés and F. Martínez-Álvarez. Supervised learning applied to urban traffic congestion forecasting. Conference: KOI 16th International Conference on Operational Research, 2016, ISBN: 1849-5141.
G. Asencio-Cortés, E. Florido, A. Troncoso and F. Martínez-Álvarez. A novel methodology to predict urban traffic congestion with ensemble learning. Journal Article. In: Soft Computing, vol. 20, no. 11, pp. 4205-4216, 2016.
Urban traffic congestion prediction is a very hot topic due to the environmental and economic impacts that congestion currently entails. In this sense, being able to predict bottlenecks and to provide alternatives to the circulation of vehicles becomes an essential task for traffic management. A novel methodology, based on ensembles of machine learning algorithms, is proposed to predict traffic congestion in this paper. In particular, a set of seven machine learning algorithms has been selected to prove their effectiveness in traffic congestion prediction. Since all seven algorithms are able to address supervised classification, the methodology has been developed to be used as a binary classification problem. Thus, collected data from sensors located in the Spanish city of Seville are analyzed and models reaching up to 83 % are generated.
N. Bokde, K. Kulat, M. W. Beck and G. Asencio-Cortés. R package imputeTestbench to compare imputations methods for univariate time series. Journal Article. In: R Journal, 2016, ISSN: 2073-4859.
This paper describes the R package imputeTestbench that provides a testbench for comparing imputation methods for missing data in univariate time series. The imputeTestbench package can be used to simulate the amount and type of missing data in a complete dataset and compare filled data using different imputation methods. The user has the option to simulate missing data by removing observations completely at random or in blocks of different sizes. Several default imputation methods are included with the package, including historical means, linear interpolation, and last observation carried forward. The testbench is not limited to the default functions and users can add or remove methods using a simple two-step process. The testbench compares the actual missing and imputed data for each method with different error metrics, including RMSE, MAE, and MAPE. Alternative error metrics can also be supplied by the user. The simplicity of use and the significant reduction in the time needed to compare imputation methods for missing data in univariate time series are a significant advantage of the package. This paper provides an overview of the core functions, including a demonstration with examples.
G. Asencio-Cortés, F. Martínez-Álvarez, A. Morales-Esteban and J. Reyes. A sensitivity study of seismicity indicators in supervised learning to improve earthquake prediction. Journal Article. In: Knowledge-Based Systems, vol. 101, pp. 15-30, 2016, ISSN: 0950-7051.
The use of different seismicity indicators as input for systems to predict earthquakes is becoming increasingly popular. Nevertheless, the values of these indicators have not been systematically obtained so far. This is mainly due to the knowledge gap existing between seismologists and data mining experts. In this work, the effect of using different parameterizations for inputs in supervised learning algorithms has been thoroughly analyzed by means of a new methodology. Five different analyses have been conducted, mainly related to the shape of training and test sets, to the calculation of the b-value, and to the adjustment of most collected indicators. Output sensitivity has been determined when any of these factors is not properly taken into consideration. The methodology has been applied to four Chilean zones. Given its general-purpose design, it can be extended to any location. Similar conclusions have been drawn for all the cases: a proper selection of the length of the sets and a careful parameterization of certain indicators lead to significantly better results in terms of prediction accuracy.
2015
F. Martínez-Álvarez, A. Troncoso, G. Asencio-Cortés and J. C. Riquelme. A Survey on Data Mining Techniques Applied To Electricity-Related Time Series Forecasting. Journal Article. In: Energies, vol. 8, no. 11, pp. 13162-13193, 2015.
Data mining has become an essential tool during the last decade to analyze large sets of data. The variety of techniques it includes and the successful results obtained in many application fields make this family of approaches powerful and widely used. In particular, this work explores the application of these techniques to time series forecasting. Although classical statistical-based methods provide reasonably good results, the results of applying data mining outperform those of the classical ones. Hence, this work faces two main challenges: (i) to provide a compact mathematical formulation of the most commonly used techniques; (ii) to review the latest works on time series forecasting and, as a case study, those related to electricity price and demand markets.
A. E. Marquez-Chamorro, G. Asencio-Cortés, C. E. Santiesteban-Toca and J. S. Aguilar-Ruiz. Soft computing methods for the prediction of protein tertiary structures: A survey. Journal Article. In: Applied Soft Computing, vol. 35, pp. 398-410, 2015, ISSN: 1568-4946.
The problem of protein structure prediction (PSP) represents one of the most important challenges in computational biology. Determining the three-dimensional structure of proteins is necessary to understand their functions at the molecular level. The most representative soft computing approaches for solving the protein tertiary structure prediction problem are summarized in this paper. These approaches have been categorized by type of methodology. A total of 90 relevant works published in the last 15 years in the field of protein structure prediction have been reported, including the best competitors in the latest CASP editions. However, despite a large research effort in the last decades, considerable scope for further improvement still remains in this area.
G. Asencio-Cortés, J. S. Aguilar-Ruiz and A. E. Marquez-Chamorro. An Efficient Nearest Neighbor Method for Protein Contact Prediction. Conference: Hybrid Artificial Intelligent Systems, 2015, ISBN: 978-3-319-19644-2.
A variety of approaches for protein inter-residue contact prediction have been developed in recent years. However, this problem is still far from being solved. In this article, we present an efficient nearest neighbor (NN) approach, called PKK-PCP, and its application to protein inter-residue contact prediction. The great strength of this approach is its adaptability to that problem. Furthermore, our method considerably improves efficiency compared with other NN approaches. Our NN-based method combines parallel execution with a k-d tree as the search algorithm. The input data used by our algorithm are based on structural features and physico-chemical properties of amino acids as well as evolutionary information. The results obtained show better efficiency, in terms of time and memory consumption, than other similar approaches.