Prof. Gualberto Asencio Cortés, Ph.D., is a Computer Science Engineer (University of Seville, 2008) and holds a Master in Software Engineering and Technology (University of Seville, 2010), a Ph.D. (University of Pablo de Olavide, 2013) and an Executive Master in Innovation (EOI, Spain, 2016). He is Associate Professor of Computer Science in the area of Languages and Information Systems at the University of Pablo de Olavide. He has authored 25 publications in JCR-indexed journals (19 of them in Q1 or Q2) and more than 25 papers at international and national conferences, most of them published in LNCS and LNBI. He has participated in three projects of the National Plan and three of the Andalusian Research Plan. He is an editor of PLOS ONE (IF: 2.806, Q1), a regular reviewer for JCR-indexed journals (PLOS ONE, Bioinformatics, Neurocomputing, Computers & Geosciences, etc.) and a program committee member for numerous international conferences. He has participated in more than 8 technology transfer contracts between the university and companies, including ISOTROL, Red Eléctrica Española and DETEA. He has completed 5 months of international research stays and 3 months of national research stays.
Prof. Gualberto Asencio Cortés's research focuses on data mining, machine learning, time series prediction and bioinformatics, with several fields of application: prediction of natural series (seismic, air quality, meteorological, agronomic, …), prediction of electricity consumption and market prices, prediction of urban traffic, and bioinformatics for the prediction of biological structures. He was also a data scientist and a member of the steering committee responsible for artificial intelligence and data science technologies at the private company easytosee AgTech SL for more than 2 years (2015-2017).
Publications
2023 |
M. J. Jiménez-Navarro and M. Martínez-Ballesteros and F. Martínez-Álvarez and A. Troncoso and G. Asencio-Cortés From Simple to Complex: A Sequential Method for Enhancing Time Series Forecasting with Deep Learning Journal Article Logic Journal of the IGPL, in press, 2023. @article{JIMENEZ-NAVARRO23a, title = {From Simple to Complex: A Sequential Method for Enhancing Time Series Forecasting with Deep Learning}, author = {M. J. Jiménez-Navarro and M. Martínez-Ballesteros and F. Martínez-Álvarez and A. Troncoso and G. Asencio-Cortés}, year = {2023}, date = {2023-01-20}, journal = {Logic Journal of the IGPL}, volume = {in press}, abstract = {Time series forecasting is a well-known deep learning application field in which previous data are used to predict the future behavior of the series. Recently, several deep learning approaches have been proposed in which several nonlinear functions are applied to the input to obtain the output. In this paper, we introduce a novel method to improve the performance of deep learning models in time series forecasting. This method divides the model into hierarchies or levels from simpler to more complex ones. Simpler levels handle smoothed versions of the input, whereas the most complex level processes the original time series. This method follows the human learning process where general/simpler tasks are performed first, and afterward, more precise/harder ones are accomplished. Our proposed methodology has been applied to the LSTM architecture, showing remarkable performance in various time series. In addition, a comparison is reported including a standard LSTM and novel methods such as DeepAR, Temporal Fusion Transformer (TFT), NBEATS and Echo State Network (ESN).}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
2022 |
M. Á. Molina and M. J. Jiménez-Navarro and R. Arjona and F. Martínez-Álvarez and G. Asencio-Cortés DIAFAN-TL: An instance weighting-based transfer learning algorithm with application to phenology forecasting Journal Article Knowledge-Based Systems, 254, pp. 109644, 2022. @article{MOLINA22, title = {DIAFAN-TL: An instance weighting-based transfer learning algorithm with application to phenology forecasting}, author = {M. Á. Molina and M. J. Jiménez-Navarro and R. Arjona and F. Martínez-Álvarez and G. Asencio-Cortés}, url = {https://www.sciencedirect.com/science/article/pii/S0950705122008322}, doi = {10.1016/j.knosys.2022.109644}, year = {2022}, date = {2022-10-22}, journal = {Knowledge-Based Systems}, volume = {254}, pages = {109644}, abstract = {The agricultural sector has been, and still is, the most important economic sector in many countries. Due to advances in technology, the amount and variety of available data have been increasing over the years. However, compared to other economic sectors, there is not always enough quality data for one particular domain (crops, plantations, plots) to obtain acceptable forecasting results with machine learning algorithms. In this context, transfer learning can help extract knowledge from different but related domains with enough data to transfer it to a target domain with scarce data. This process can overcome forecasting accuracy compared to training models uniquely with data from the target domain. In this work, a novel instance weighting-based transfer learning algorithm is proposed and applied to the phenology forecasting problem. A new metric named DIAFAN is proposed to weight samples from different source domains according to their relationship with the target domain, promoting the diversity of the information and avoiding inconsistent samples.
Additionally, a set of validation schemes is specifically designed to ensure fair comparisons in terms of data volume with other benchmark transfer learning algorithms. The proposed algorithm, DIAFAN-TL, is tested with a proposed dataset of 16 plots of olive groves from different places, including information fusion from satellite images, meteorological stations and human field sampling of crop phenology. DIAFAN-TL achieves a remarkable improvement with respect to 15 other well-known transfer learning algorithms and three nontransfer learning scenarios. Finally, several performance analyses according to the different phenological states, prediction horizons and source domains are also performed.}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
A. Gómez-Losada and G. Asencio-Cortés and N. Duch-Brown Automatic Eligibility of Sellers in an Online Marketplace: A Case Study of Amazon Algorithm Journal Article Information, 13 (44), pp. 1–16, 2022. @article{losada2022, title = {Automatic Eligibility of Sellers in an Online Marketplace: A Case Study of Amazon Algorithm}, author = {A. Gómez-Losada and G. Asencio-Cortés and N. Duch-Brown}, url = {https://www.mdpi.com/2078-2489/13/2/44}, doi = {10.3390/info13020044}, year = {2022}, date = {2022-01-01}, journal = {Information}, volume = {13}, number = {44}, pages = {1--16}, abstract = {Purchase processes on Amazon Marketplace begin at the Buy Box, which represents the buy click process through which numerous sellers compete. This study aimed to estimate empirically the relevant seller characteristics that Amazon could consider featuring in the Buy Box. To that end, 22 product categories from Italy’s Amazon web page were studied over a ten-month period, and the sellers were analyzed through their products featured in the Buy Box. Two different experiments were proposed and the results were analyzed using four classification algorithms (a neural network, random forest, support vector machine, and C5.0 decision trees) and a rule-based classification. The first experiment aimed to characterize sellers unspecifically by predicting their change at the Buy Box. The second one aimed to predict which seller would be featured in it. Both experiments revealed that the customer experience and the dynamics of the sellers’ prices were important features of the Buy Box. Additionally, we proposed a set of default features that Amazon could consider when no information about sellers was available.
We also proposed the possible existence of a relationship or composition among important features that could be used for sellers to be featured in the Buy Box.}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
M.A. Castán-Lascorz and P. Jiménez-Herrera and A. Troncoso and G. Asencio-Cortés A new hybrid method for predicting univariate and multivariate time series based on pattern forecasting Journal Article Information Sciences, 586, pp. 611–627, 2022. @article{castan2022, title = {A new hybrid method for predicting univariate and multivariate time series based on pattern forecasting}, author = {M.A. Castán-Lascorz and P. Jiménez-Herrera and A. Troncoso and G. Asencio-Cortés}, url = {https://www.sciencedirect.com/science/article/pii/S0020025521012226?via%3Dihub}, doi = {10.1016/j.ins.2021.12.001}, year = {2022}, date = {2022-01-01}, journal = {Information Sciences}, volume = {586}, pages = {611--627}, abstract = {Time series forecasting has become indispensable for multiple applications and industrial processes. Currently, a large number of algorithms have been developed to forecast time series, all of which are suitable depending on the characteristics and patterns to be inferred in each case. In this work, a new algorithm is proposed to predict both univariate and multivariate time series based on a combination of clustering, classification and forecasting techniques. The main goal of the proposed algorithm is first to group windows of time series values with similar patterns by applying a clustering process. Then, a specific forecasting model for each pattern is built and training is only conducted with the time windows corresponding to that pattern. The new algorithm has been designed using a flexible framework that allows the model to be generated using any combination of approaches within multiple machine learning techniques. To evaluate the model, several experiments are carried out using different configurations of the clustering, classification and forecasting methods that the model consists of.
The results are analyzed and compared to classical prediction models, such as autoregressive, integrated, moving average and Holt-Winters models, to very recent forecasting methods, including deep, long short-term memory neural networks, and to well-known methods in the literature, such as k nearest neighbors, classification and regression trees, as well as random forest.}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
J. Roiz-Pagador and A. Chacon-Maldonado and R. Ruiz and G. Asencio-Cortes Earthquake Prediction in California using Feature Selection techniques Conference 16th International Conference on Soft Computing Models in Industrial and Environmental Applications (SOCO 2021), Advances in Intelligent Systems and Computing 2022. @conference{roiz2022, title = {Earthquake Prediction in California using Feature Selection techniques}, author = {J. Roiz-Pagador and A. Chacon-Maldonado and R. Ruiz and G. Asencio-Cortes}, url = {https://link.springer.com/chapter/10.1007/978-3-030-87869-6_69}, year = {2022}, date = {2022-01-01}, booktitle = {16th International Conference on Soft Computing Models in Industrial and Environmental Applications (SOCO 2021)}, series = {Advances in Intelligent Systems and Computing}, keywords = {}, pubstate = {published}, tppubtype = {conference} } |
P. Jiménez-Herrera and L. Melgar-García and G. Asencio-Cortés and A. Troncoso Streaming big time series forecasting based on nearest similar patterns with application to energy consumption Journal Article Logic Journal of the IGPL, (in press), pp. 1–20, 2022. @article{jimenez2022, title = {Streaming big time series forecasting based on nearest similar patterns with application to energy consumption}, author = {P. Jiménez-Herrera and L. Melgar-García and G. Asencio-Cortés and A. Troncoso}, url = {https://academic.oup.com/jigpal/advance-article-abstract/doi/10.1093/jigpal/jzac017/6534493?redirectedFrom=fulltext}, doi = {10.1093/jigpal/jzac017}, year = {2022}, date = {2022-01-01}, journal = {Logic Journal of the IGPL}, volume = {(in press)}, pages = {1--20}, abstract = {This work presents a novel approach to forecast streaming big time series based on nearest similar patterns. This approach combines a clustering algorithm with a classifier and the nearest neighbors algorithm. It presents two separate stages: offline and online. The offline phase is for training and finding the best models for clustering, classification and the nearest neighbors algorithm. The online phase is to predict big time series in real time. In the offline phase, data are divided into clusters and a forecasting model based on the nearest neighbors is trained for each cluster. In addition, a classifier is trained using the cluster assignments previously generated by the clustering algorithm. In the online phase, the classifier predicts the cluster label of an instance, and the proper nearest neighbors model according to the predicted cluster label is applied to obtain the final prediction using the similar patterns. The algorithm is able to be updated incrementally for online learning from data streams.
Results are reported using electricity consumption with a granularity of 10 minutes for 4-hour-ahead forecasting and compared with well-known online benchmark learners, showing a remarkable improvement in prediction accuracy.}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
2021 |
M. J. Jiménez-Navarro and F. Martínez-Álvarez and A. Troncoso and G. Asencio-Cortés HLNet: A Novel Hierarchical Deep Neural Network for Time Series Forecasting Conference SOCO International Conference on Soft Computing Models in Industrial and Environmental Applications, 1401, Advances in Intelligent Systems and Computing 2021. @conference{JIMENEZ-NAVARRO21, title = {HLNet: A Novel Hierarchical Deep Neural Network for Time Series Forecasting}, author = {M. J. Jiménez-Navarro and F. Martínez-Álvarez and A. Troncoso and G. Asencio-Cortés}, doi = {10.1007/978-3-030-87869-6_68}, year = {2021}, date = {2021-09-01}, booktitle = {SOCO International Conference on Soft Computing Models in Industrial and Environmental Applications}, volume = {1401}, pages = {717-727}, series = {Advances in Intelligent Systems and Computing}, keywords = {}, pubstate = {published}, tppubtype = {conference} } |
M. A. Molina and M. J. Jiménez-Navarro and F. Martínez-Álvarez and G. Asencio-Cortés A Model-Based Deep Transfer Learning Algorithm for Phenology Forecasting Using Satellite Imagery Conference HAIS, 12886, Lecture Notes in Computer Science 2021. @conference{MOLINA21, title = {A Model-Based Deep Transfer Learning Algorithm for Phenology Forecasting Using Satellite Imagery}, author = {M. A. Molina and M. J. Jiménez-Navarro and F. Martínez-Álvarez and G. Asencio-Cortés}, url = {https://link.springer.com/chapter/10.1007/978-3-030-86271-8_43}, doi = {10.1007/978-3-030-86271-8_43}, year = {2021}, date = {2021-09-01}, booktitle = {HAIS}, volume = {12886}, pages = {511-523}, series = {Lecture Notes in Computer Science}, keywords = {}, pubstate = {published}, tppubtype = {conference} } |
M. A. Molina and G. Asencio-Cortés and J. C. Riquelme and F. Martínez-Álvarez A Preliminary Study on Deep Transfer Learning Applied to Image Classification for Small Datasets Conference 15th International Conference on Soft Computing Models in Industrial and Environmental Applications (SOCO 2020), Advances in Intelligent Systems and Computing 2021. @conference{molina2021, title = {A Preliminary Study on Deep Transfer Learning Applied to Image Classification for Small Datasets}, author = {M. A. Molina and G. Asencio-Cortés and J. C. Riquelme and F. Martínez-Álvarez}, url = {https://link.springer.com/chapter/10.1007/978-3-030-57802-2_71}, year = {2021}, date = {2021-01-01}, booktitle = {15th International Conference on Soft Computing Models in Industrial and Environmental Applications (SOCO 2020)}, series = {Advances in Intelligent Systems and Computing}, keywords = {}, pubstate = {published}, tppubtype = {conference} } |
A. J. Pérez-Pulido and G. Asencio-Cortés and A. M. Brokate-Llanos and G. Brea-Calvo and M. R. Rodríguez-Griñolo and A. Garzón and M. J. Muñoz Serial co-expression analysis of host factors from SARS-CoV viruses highly converges with former high-throughput screenings and proposes key regulators Journal Article Briefings in Bioinformatics, 22 (2), pp. 1038–1052, 2021. @article{pulido2021, title = {Serial co-expression analysis of host factors from SARS-CoV viruses highly converges with former high-throughput screenings and proposes key regulators}, author = {A. J. Pérez-Pulido and G. Asencio-Cortés and A. M. Brokate-Llanos and G. Brea-Calvo and M. R. Rodríguez-Griñolo and A. Garzón and M. J. Muñoz}, url = {https://academic.oup.com/bib/article/22/2/1038/6103172}, doi = {10.1093/bib/bbaa419}, year = {2021}, date = {2021-01-01}, journal = {Briefings in Bioinformatics}, volume = {22}, number = {2}, pages = {1038--1052}, abstract = {The current genomics era is bringing an unprecedented growth in the amount of gene expression data, only comparable to the exponential growth of sequences in databases during the last decades. This data allow the design of secondary analyses that take advantage of this information to create new knowledge. One of these feasible analyses is the evaluation of the expression level for a gene through a series of different conditions or cell types. Based on this idea, we have developed Automatic and Serial Analysis of CO-expression, which performs expression profiles for a given gene along hundreds of heterogeneous and normalized transcriptomics experiments and discover other genes that show either a similar or an inverse behavior. It might help to discover co-regulated genes, and common transcriptional regulators in any biological model. The present severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) pandemic is an opportunity to test this novel approach due to the wealth of data that are being generated, which could be used for validating results.
Thus, we have identified 35 host factors in the literature putatively involved in the infectious cycle of SARS-CoV viruses and searched for genes tightly co-expressed with them. We have found 1899 co-expressed genes whose assigned functions are strongly related to viral cycles. Moreover, this set of genes heavily overlaps with those identified by former laboratory.}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
2020 |
P. Jiménez-Herrera and L. Melgar-García and G. Asencio-Cortés and A. Troncoso A New Forecasting Algorithm Based on Neighbors for Streaming Electricity Time Series Conference HAIS 15th International Conference on Hybrid Artificial Intelligence Systems, Lecture Notes in Computer Science 2020. @conference{HAIS2020, title = {A New Forecasting Algorithm Based on Neighbors for Streaming Electricity Time Series}, author = {P. Jiménez-Herrera and L. Melgar-García and G. Asencio-Cortés and A. Troncoso}, url = {https://link.springer.com/chapter/10.1007/978-3-030-61705-9_43}, year = {2020}, date = {2020-11-04}, booktitle = {HAIS 15th International Conference on Hybrid Artificial Intelligence Systems}, pages = {522-533}, series = {Lecture Notes in Computer Science}, keywords = {}, pubstate = {published}, tppubtype = {conference} } |
F. Martínez-Álvarez and G. Asencio-Cortés and J. F. Torres and D. Gutiérrez-Avilés and L. Melgar-García and R. Pérez-Chacón and C. Rubio-Escudero and A. Troncoso and J. C. Riquelme Coronavirus Optimization Algorithm: A bioinspired metaheuristic based on the COVID-19 propagation model Journal Article Big Data, 8 (4), pp. 308-322, 2020. @article{MARTINEZ-ALVAREZ20, title = {Coronavirus Optimization Algorithm: A bioinspired metaheuristic based on the COVID-19 propagation model}, author = {F. Martínez-Álvarez and G. Asencio-Cortés and J. F. Torres and D. Gutiérrez-Avilés and L. Melgar-García and R. Pérez-Chacón and C. Rubio-Escudero and A. Troncoso and J. C. Riquelme}, url = {https://www.liebertpub.com/doi/full/10.1089/big.2020.0051}, doi = {10.1089/big.2020.0051}, year = {2020}, date = {2020-07-22}, journal = {Big Data}, volume = {8}, number = {4}, pages = {308-322}, abstract = {This work proposes a novel bioinspired metaheuristic, simulating how the coronavirus spreads and infects healthy people. From a primary infected individual (patient zero), the coronavirus rapidly infects new victims, creating large populations of infected people who will either die or spread infection. Relevant terms such as reinfection probability, super-spreading rate, social distancing measures or traveling rate are introduced into the model in order to simulate the coronavirus activity as accurately as possible. The infected population initially grows exponentially over time, but taking into consideration social isolation measures, the mortality rate and number of recoveries, the infected population gradually decreases. The Coronavirus Optimization Algorithm has two major advantages when compared to other similar strategies. Firstly, the input parameters are already set according to the disease statistics, preventing researchers from initializing them with arbitrary values. Secondly, the approach has the ability to end after several iterations, without setting this value either.
Furthermore, a parallel multi-virus version is proposed, where several coronavirus strains evolve over time and explore wider search space areas in less iterations. Finally, the metaheuristic has been combined with deep learning models, in order to find optimal hyperparameters during the training phase. As application case, the problem of electricity load time series forecasting has been addressed, showing quite remarkable performance.}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
R. Pérez-Chacón and G. Asencio-Cortés and F. Martínez-Álvarez and A. Troncoso Big data time series forecasting based on pattern sequence similarity and its application to the electricity demand Journal Article Information Sciences, 540, pp. 160-174, 2020. @article{PEREZ20, title = {Big data time series forecasting based on pattern sequence similarity and its application to the electricity demand}, author = {R. Pérez-Chacón and G. Asencio-Cortés and F. Martínez-Álvarez and A. Troncoso}, url = {https://www.sciencedirect.com/science/article/pii/S0020025520306010}, doi = {10.1016/j.ins.2020.06.014}, year = {2020}, date = {2020-06-06}, journal = {Information Sciences}, volume = {540}, pages = {160-174}, abstract = {This work proposes a novel algorithm to forecast big data time series. Based on the well-established Pattern Sequence Forecasting algorithm, this new approach has two major contributions to the literature. First, the improvement of the aforementioned algorithm with respect to the accuracy of predictions, and second, its transformation into the big data context, having reached meaningful results in terms of scalability. The algorithm uses the Apache Spark distributed computation framework and it is a ready-to-use application with few parameters to adjust. Physical and cloud clusters have been used to carry out the experimentation, which consisted in applying the algorithm to real-world data from Uruguay electricity demand.}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
2019 |
C. Gómez-Quiles and G. Asencio-Cortés and A. Gastalver-Rubio and F. Martínez-Álvarez and A. Troncoso and J. Manresa and J. C. Riquelme and J. M. Riquelme A novel ensemble method for electric vehicle power consumption forecasting: application to the Spanish system Journal Article IEEE Access, 7, pp. 120840-120856, 2019.
@article{GOMEZ19,
  title = {A novel ensemble method for electric vehicle power consumption forecasting: application to the Spanish system},
  author = {C. Gómez-Quiles and G. Asencio-Cortés and A. Gastalver-Rubio and F. Martínez-Álvarez and A. Troncoso and J. Manresa and J. C. Riquelme and J. M. Riquelme},
  url = {https://ieeexplore.ieee.org/document/8807120},
  doi = {10.1109/ACCESS.2019.2936478},
  year = {2019},
  date = {2019-08-01},
  journal = {IEEE Access},
  volume = {7},
  pages = {120840-120856},
  keywords = {},
  pubstate = {published},
  tppubtype = {article}
} |
F. Martinez-Alvarez and A. Schmutz and G. Asencio-Cortes and J. Jacques A Novel Hybrid Algorithm to Forecast Functional Time Series Based on Pattern Sequence Similarity with Application to Electricity Demand Journal Article Energies, 12 (94), pp. 1-18, 2019, ISSN: 1996-1073.
@article{en12010094b,
  title = {A Novel Hybrid Algorithm to Forecast Functional Time Series Based on Pattern Sequence Similarity with Application to Electricity Demand},
  author = {F. Martinez-Alvarez and A. Schmutz and G. Asencio-Cortes and J. Jacques},
  url = {http://www.mdpi.com/1996-1073/12/1/94},
  doi = {10.3390/en12010094},
  issn = {1996-1073},
  year = {2019},
  date = {2019-01-01},
  journal = {Energies},
  volume = {12},
  number = {94},
  pages = {1-18},
  abstract = {The forecasting of future values is a very challenging task. In almost all scientific disciplines, the analysis of time series provides useful information and even economic benefits. In this context, this paper proposes a novel hybrid algorithm to forecast functional time series with arbitrary prediction horizons. It integrates a well-known clustering functional data algorithm into a forecasting strategy based on pattern sequence similarity, which was originally developed for discrete time series. The new approach assumes that some patterns are repeated over time, and it attempts to discover them and evaluate their immediate future. Hence, the algorithm first applies a clustering functional time series algorithm, i.e., it assigns labels to every data unit (it may represent either one hour, or one day, or any arbitrary length). As a result, the time series is transformed into a sequence of labels. Later, it retrieves the sequence of labels occurring just after the sample that we want to be forecasted. This sequence is searched for within the historical data, and every time it is found, the sample immediately after is stored. Once the searching process is terminated, the output is generated by weighting all stored data. The performance of the approach has been tested on real-world datasets related to electricity demand and compared to other existing methods, reporting very promising results. Finally, a statistical significance test has been carried out to confirm the suitability of the election of the compared methods. In conclusion, a novel algorithm to forecast functional time series is proposed with very satisfactory results when assessed in the context of electricity demand.},
  keywords = {},
  pubstate = {published},
  tppubtype = {article}
} |
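The search-and-average step described in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the clustering has already been done by some algorithm, so labels are passed in directly. It assumes the pattern to search for is the run of labels immediately preceding the unit to forecast. It also replaces the weighting mentioned in the abstract with a plain unweighted mean. The names `psf_forecast`, `units`, `labels` and `window` are hypothetical and chosen only for this illustration.

```python
from statistics import fmean


def psf_forecast(units, labels, window):
    """Forecast the next unit by averaging the units that followed
    earlier occurrences of the most recent label pattern."""
    if len(units) != len(labels):
        raise ValueError("units and labels must have the same length")
    if not 0 < window < len(labels):
        raise ValueError("window must be between 1 and len(labels) - 1")

    # Assumption: the pattern is the last `window` labels, i.e. the ones
    # immediately preceding the unit we want to forecast.
    pattern = labels[-window:]

    followers = []
    # i is the index of the unit that comes right after a candidate window.
    for i in range(window, len(labels)):
        if labels[i - window:i] == pattern:
            followers.append(units[i])

    if not followers:
        # The pattern never occurred before; this sketch simply gives up.
        return None

    # Simplification: unweighted mean of the stored units, position by position.
    return [fmean(column) for column in zip(*followers)]


# Toy data: six units of three readings each, with made-up cluster labels.
units = [[1.0, 2.0, 3.0], [5.0, 6.0, 7.0], [1.2, 2.2, 3.2],
         [5.2, 6.2, 7.2], [0.8, 1.8, 2.8], [5.0, 6.0, 7.0]]
labels = [0, 1, 0, 1, 0, 1]
forecast = psf_forecast(units, labels, window=2)
```

In the toy data the pattern `[0, 1]` occurs twice earlier in the history, so the forecast is the position-wise mean of the two units that followed those occurrences.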