Prof. Gualberto Asencio Cortés, Ph.D., is a Computer Science Engineer (University of Seville, 2008) and holds a Master in Software Engineering and Technology (University of Seville, 2010), a Ph.D. (University of Pablo de Olavide, 2013) and an Executive Master in Innovation (EOI, Spain, 2016). He is Associate Professor of Computer Science (Profesor Titular de Universidad) in the area of Languages and Information Systems at the University of Pablo de Olavide. He is the author of more than 28 publications in impact journals according to JCR (20 of them in Q1 or Q2) and of more than 30 articles in international and national conferences, most of them published in LNCS and LNBI. He has participated in three projects of the National Plan and three more of the Andalusian Research Plan. He is an editor of PLOS ONE (IF: 2.806, Q1), a regular reviewer for journals indexed in JCR (PLOS ONE, Bioinformatics, Neurocomputing, Computers & Geosciences, etc.) and a member of the program committee of numerous international conferences. He has participated in more than 12 technology transfer contracts between the university and companies, including ISOTROL, Red Eléctrica Española and DETEA. He has completed 5 months of international research stays and 3 months of national ones.
Prof. Asencio Cortés's research focuses on data mining, machine learning, time series prediction and bioinformatics, with several fields of application: prediction of natural series (seismic, air quality, meteorological, agronomic, …), prediction of electricity consumption and market prices, prediction of urban traffic, and prediction of biological structures in bioinformatics. He was also a data scientist and a member of the steering committee responsible for artificial intelligence and data science technologies at the private company easytosee AgTech SL for more than 2 years (2015-2017).
Publications
2024 |
M. J. Jiménez-Navarro and M. Martínez-Ballesteros and I. S. Brito and F. Martínez-Álvarez and G. Asencio-Cortés Embedded feature selection for neural networks via learnable drop layer Journal Article In: Logic Journal of the IGPL, pp. jzae062, 2024. Feature selection is a widely studied technique whose goal is to reduce the dimensionality of the problem by removing irrelevant features. It has multiple benefits, such as improved efficacy, efficiency and interpretability of almost any type of machine learning model. Feature selection techniques may be divided into three main categories, known as Filter, Wrapper and Embedded, depending on the process used to remove the features. Embedded methods are usually preferred because they efficiently obtain a selection of the most relevant features of the model. However, not all models support embedded feature selection, which forces the use of a different method and reduces the efficiency and reliability of the selection. Neural networks are an example of a model that does not support embedded feature selection. As neural networks have been shown to provide remarkable results in multiple scenarios such as classification and regression, sometimes in an ensemble with a model that includes an embedded feature selection, we attempt to embed a feature selection process with a general-purpose methodology. In this work, we propose a novel general-purpose layer for neural networks that removes the influence of irrelevant features. The Feature-Aware Drop Layer is included at the top of the neural network and trained during the backpropagation process without any additional parameters. Our methodology is tested with 17 datasets for classification and regression tasks, including data from different fields such as Health, Economic and Environment, among others. 
The results show remarkable improvements compared to three different feature selection approaches, with reliable, efficient and effective results. |
M. J. Jiménez-Navarro and M. Martínez-Ballesteros and F. Martínez-Álvarez and G. Asencio-Cortés Explaining deep learning models for ozone pollution prediction via embedded feature selection Journal Article In: Applied Soft Computing, vol. 157, pp. 111504, 2024. Ambient air pollution is a pervasive global issue that poses significant health risks. Among pollutants, ozone (O3) is responsible for an estimated 1 to 1.2 million premature deaths yearly. Furthermore, O3 adversely affects climate warming, crop productivity, and more. Its formation occurs when nitrogen oxides and volatile organic compounds react with short-wavelength solar radiation. Consequently, urban areas with high traffic volume and elevated temperatures are particularly prone to elevated O3 levels, which pose a significant health risk to their inhabitants. In response to this problem, many countries have developed web and mobile applications that provide real-time air pollution information using sensor data. However, while these applications offer valuable insight into current pollution levels, predicting future pollutant behavior is crucial for effective planning and mitigation strategies. Therefore, our main objectives are to develop accurate and efficient prediction models and identify the key factors that influence O3 levels. We adopt a time series forecasting approach to address these objectives, which allows us to analyze and predict O3 future behavior. Additionally, we tackle the feature selection problem to identify the most relevant features and periods that contribute to prediction accuracy by introducing a novel method called the Time Selection Layer in Deep Learning models, which significantly improves model performance, reduces complexity, and enhances interpretability. Our study focuses on data collected from five representative areas in Seville, Cordova, and Jaen provinces in Spain, using multiple sensors to capture comprehensive pollution data. 
We compare the performance of three models: Lasso, Decision Tree, and Deep Learning with and without incorporating the Time Selection Layer. Our results demonstrate that including the Time Selection Layer significantly enhances the effectiveness and interpretability of Deep Learning models, achieving an average effectiveness improvement of 9% across all monitored areas. |
R. Pérez-Chacón and G. Asencio-Cortés and A. Troncoso and F. Martínez-Álvarez Pattern sequence-based algorithm for multivariate big data time series forecasting: Application to electricity consumption Journal Article In: Future Generation Computer Systems, vol. 154, pp. 397-412, 2024. Several interrelated variables typically characterize real-world processes, and a time series cannot be predicted without considering the influence that other time series might have on the target time series. This work proposes a novel algorithm to forecast multivariate big data time series. This new general-purpose approach first performs pattern recognition jointly over all the time series that form the multivariate time series and then predicts the target time series by searching for similarities between pattern sequences. The proposed algorithm is designed to tackle multivariate time series forecasting problems within the context of big data. In particular, the algorithm has been developed with a distributed nature to enhance its efficiency in analyzing and processing large volumes of data. Moreover, the algorithm is straightforward to use, with only two parameters needing adjustment. Another advantage of the MV-bigPSF algorithm is its ability to perform multi-step forecasting, which is particularly useful in many practical applications. To evaluate the algorithm’s performance, real-world data from Uruguay’s power consumption has been utilized. Specifically, MV-bigPSF has been compared with both univariate and multivariate methods. Regarding the univariate ones, MV-bigPSF improved 12.8% in MAPE compared to the second-best method. Regarding the multivariate comparison, MV-bigPSF improved 44.8% in MAPE with respect to the second most accurate method. Regarding efficiency, MV-bigPSF ran 1.83 times faster than the second-fastest multivariate method, both in a single-core environment. 
Therefore, the proposed algorithm can be a valuable tool for practitioners and researchers working in multivariate time series forecasting, particularly in big data applications. |
M. J. Jiménez-Navarro and M. Martínez-Ballesteros and F. Martínez-Álvarez and A. Troncoso and G. Asencio-Cortés From Simple to Complex: A Sequential Method for Enhancing Time Series Forecasting with Deep Learning Journal Article In: Logic Journal of the IGPL, pp. jzae030, 2024. Time series forecasting is a well-known deep learning application field in which previous data are used to predict the future behavior of the series. Recently, several deep learning approaches have been proposed in which several nonlinear functions are applied to the input to obtain the output. In this paper, we introduce a novel method to improve the performance of deep learning models in time series forecasting. This method divides the model into hierarchies or levels from simpler to more complex ones. Simpler levels handle smoothed versions of the input, whereas the most complex level processes the original time series. This method follows the human learning process where general/simpler tasks are performed first, and afterward, more precise/harder ones are accomplished. Our proposed methodology has been applied to the LSTM architecture, showing remarkable performance in various time series. In addition, a comparison is reported including a standard LSTM and novel methods such as DeepAR, Temporal Fusion Transformer (TFT), NBEATS and Echo State Network (ESN). |
2023 |
M. J. Jiménez-Navarro and M. Martínez-Ballesteros and F. Martínez-Álvarez and G. Asencio-Cortés Embedded Temporal Feature Selection for Time Series Forecasting Using Deep Learning Conference IWANN 17th International Work-Conference on Artificial Neural Networks, vol. 14135, Lecture Notes in Computer Science 2023. |
M. J. Jiménez-Navarro and M. Martínez-Ballesteros and F. Martínez-Álvarez and G. Asencio-Cortés A New Deep Learning Architecture with Inductive Bias Balance for Oil Temperature Forecasting Journal Article In: Journal of Big Data, vol. 10, pp. 80, 2023. Ensuring the optimal performance of power transformers is a laborious task in which the insulation system plays a vital role in decreasing their deterioration. The insulation system uses insulating oil to control temperature, as high temperatures can reduce the lifetime of the transformers and lead to expensive maintenance. Deep learning architectures have demonstrated remarkable results in various fields. However, this improvement often comes at the cost of increased computing resources, which, in turn, increases the carbon footprint and hinders the optimization of architectures. In this study, we introduce a novel deep learning architecture that achieves efficacy comparable to the best existing architectures in transformer oil temperature forecasting while improving efficiency. Effective forecasting can help prevent high temperatures and monitor the future condition of power transformers, thereby reducing unnecessary waste. To balance the inductive bias in our architecture, we propose the Smooth Residual Block, which divides the original problem into multiple subproblems to obtain different representations of the time series, collaboratively achieving the final forecasting. We applied our architecture to the Electricity Transformer datasets, which obtain transformer insulating oil temperature measures from two transformers in China. The results showed a 13% improvement in MSE and a 57% improvement in performance compared to the best current architectures, to the best of our knowledge. Moreover, we analyzed the architecture behavior to gain an intuitive understanding of the achieved solution. |
M. J. Jiménez-Navarro and M. Martínez-Ballesteros and F. Martínez-Álvarez and G. Asencio-Cortés PHILNet: A Novel Efficient Approach for Time Series Forecasting using Deep Learning Journal Article In: Information Sciences, vol. 632, pp. 815-832, 2023. Time series are among the most common data types in industry nowadays. Forecasting the future behavior of a time series can be useful in planning ahead, saving time and resources, and helping avoid undesired scenarios. To produce forecasts, historical data are used due to the causal nature of the time series. Several deep learning algorithms have been presented in this area, where the input is processed through a series of non-linear functions to produce the output. We present a novel strategy to improve the performance of deep learning models in time series forecasting in terms of efficiency while reaching similar effectiveness. This approach separates the model into levels, starting with the easiest and continuing to the most difficult. The simpler levels deal with smoothed versions of the input, whereas the most sophisticated level deals with the raw data. This strategy seeks to mimic the human learning process, in which basic tasks are completed initially, followed by more precise and sophisticated ones. Our method achieved promising results, obtaining a 35% improvement in mean squared error and a 2.6-fold decrease in training time compared with the best models found in a variety of time series. |
A. M. Chacón-Maldonado and G. Asencio-Cortés and F. Martínez-Álvarez and A. Troncoso FS-Studio: An extensive and efficient feature selection experimentation tool for Weka Explorer Journal Article In: SoftwareX, vol. 23, pp. 101401, 2023. |
P. Jiménez-Herrera and L. Melgar-García and G. Asencio-Cortés and A. Troncoso Streaming big time series forecasting based on nearest similar patterns with application to energy consumption Journal Article In: Logic Journal of the IGPL, vol. 31, no. 2, pp. 255-270, 2023. This work presents a novel approach to forecast streaming big time series based on nearest similar patterns. This approach combines a clustering algorithm with a classifier and the nearest neighbors algorithm. It presents two separate stages: offline and online. The offline phase is for training and finding the best models for clustering, classification and the nearest neighbors algorithm. The online phase is to predict big time series in real time. In the offline phase, data are divided into clusters and a forecasting model based on the nearest neighbors is trained for each cluster. In addition, a classifier is trained using the cluster assignments previously generated by the clustering algorithm. In the online phase, the classifier predicts the cluster label of an instance, and the proper nearest neighbors model according to the predicted cluster label is applied to obtain the final prediction using the similar patterns. The algorithm can be updated incrementally for online learning from data streams. Results are reported using electricity consumption with a granularity of 10 minutes for 4-hour-ahead forecasting and compared with well-known online benchmark learners, showing a remarkable improvement in prediction accuracy. |
M. J. Jiménez-Navarro and M. Martínez-Ballesteros and I. S. Brito and F. Martínez-Álvarez and G. Asencio-Cortés SAC 38th Annual ACM Symposium on Applied Computing, 2023. The year 2022 was the driest year in Portugal since 1931, with 97% of the territory in severe drought. Water is especially important for the agricultural sector in Portugal, as it represents 78% of total consumption according to the Water Footprint report published in 2010. Reference evapotranspiration is essential due to its importance in optimal irrigation planning that reduces water consumption. This study analyzes and proposes a framework to forecast daily reference evapotranspiration at eight stations in Portugal from 2012 to 2022 without relying on public meteorological forecasts. The data include meteorological data obtained from sensors included in the stations. The goal is to perform a multi-horizon forecasting of reference evapotranspiration using the multiple related covariates. The framework combines the data processing and the analysis of several state-of-the-art forecasting methods including classical, linear, tree-based, artificial neural network and ensembles. Then, an ensemble of all trained models is proposed using a recent bioinspired metaheuristic named Coronavirus Optimization Algorithm to weight the predictions. The results in terms of MAE and MSE are reported, indicating that our approach achieved an MAE of 0.658. |
2022 |
M. Á. Molina and M. J. Jiménez-Navarro and R. Arjona and F. Martínez-Álvarez and G. Asencio-Cortés DIAFAN-TL: An instance weighting-based transfer learning algorithm with application to phenology forecasting Journal Article In: Knowledge-Based Systems, vol. 254, pp. 109644, 2022. The agricultural sector has been, and still is, the most important economic sector in many countries. Due to advances in technology, the amount and variety of available data have been increasing over the years. However, compared to other economic sectors, there is not always enough quality data for one particular domain (crops, plantations, plots) to obtain acceptable forecasting results with machine learning algorithms. In this context, transfer learning can help extract knowledge from different but related domains with enough data to transfer it to a target domain with scarce data. This process can improve forecasting accuracy compared to training models solely with data from the target domain. In this work, a novel instance weighting-based transfer learning algorithm is proposed and applied to the phenology forecasting problem. A new metric named DIAFAN is proposed to weight samples from different source domains according to their relationship with the target domain, promoting the diversity of the information and avoiding inconsistent samples. Additionally, a set of validation schemes is specifically designed to ensure fair comparisons in terms of data volume with other benchmark transfer learning algorithms. The proposed algorithm, DIAFAN-TL, is tested with a proposed dataset of 16 plots of olive groves from different places, including information fusion from satellite images, meteorological stations and human field sampling of crop phenology. DIAFAN-TL achieves a remarkable improvement with respect to 15 other well-known transfer learning algorithms and three nontransfer learning scenarios. 
Finally, several performance analyses according to the different phenological states, prediction horizons and source domains are also performed. |
A. M. Chacón-Maldonado and M. A. Molina and A. Troncoso and F. Martínez-Álvarez and G. Asencio-Cortés HAIS 17th International Conference on Hybrid Artificial Intelligence Systems, Lecture Notes in Computer Science 2022. |
A. Gómez-Losada and G. Asencio-Cortés and N. Duch-Brown Automatic Eligibility of Sellers in an Online Marketplace: A Case Study of Amazon Algorithm Journal Article In: Information, vol. 13, no. 44, pp. 1–16, 2022. Purchase processes on Amazon Marketplace begin at the Buy Box, which represents the buy click process through which numerous sellers compete. This study aimed to estimate empirically the relevant seller characteristics that Amazon could consider featuring in the Buy Box. To that end, 22 product categories from Italy’s Amazon web page were studied over a ten-month period, and the sellers were analyzed through their products featured in the Buy Box. Two different experiments were proposed and the results were analyzed using four classification algorithms (a neural network, random forest, support vector machine, and C5.0 decision trees) and a rule-based classification. The first experiment aimed to characterize sellers unspecifically by predicting their change at the Buy Box. The second one aimed to predict which seller would be featured in it. Both experiments revealed that the customer experience and the dynamics of the sellers’ prices were important features of the Buy Box. Additionally, we proposed a set of default features that Amazon could consider when no information about sellers was available. We also proposed the possible existence of a relationship or composition among important features that could be used for sellers to be featured in the Buy Box. |
M.A. Castán-Lascorz and P. Jiménez-Herrera and A. Troncoso and G. Asencio-Cortés A new hybrid method for predicting univariate and multivariate time series based on pattern forecasting Journal Article In: Information Sciences, vol. 586, pp. 611–627, 2022. Time series forecasting has become indispensable for multiple applications and industrial processes. Currently, a large number of algorithms have been developed to forecast time series, all of which are suitable depending on the characteristics and patterns to be inferred in each case. In this work, a new algorithm is proposed to predict both univariate and multivariate time series based on a combination of clustering, classification and forecasting techniques. The main goal of the proposed algorithm is first to group windows of time series values with similar patterns by applying a clustering process. Then, a specific forecasting model for each pattern is built and training is only conducted with the time windows corresponding to that pattern. The new algorithm has been designed using a flexible framework that allows the model to be generated using any combination of approaches within multiple machine learning techniques. To evaluate the model, several experiments are carried out using different configurations of the clustering, classification and forecasting methods that the model consists of. The results are analyzed and compared to classical prediction models, such as autoregressive, integrated, moving average and Holt-Winters models, to very recent forecasting methods, including deep, long short-term memory neural networks, and to well-known methods in the literature, such as k nearest neighbors, classification and regression trees, as well as random forest. |
M. J. Jiménez-Navarro and M. Martínez-Ballesteros and I. S. Sousa Brito and F. Martínez-Álvarez and G. Asencio-Cortés Feature-Aware Drop Layer (FADL): A Nonparametric Neural Network Layer for Feature Selection Conference SOCO 17th International Conference on Soft Computing Models in Industrial and Environmental Applications, vol. 531, Lecture Notes in Networks and Systems 2022. |