Federico Divina obtained his Ph.D. in Artificial Intelligence from the Vrije Universiteit of Amsterdam, after which he worked as a postdoc at the University of Tilburg within the European project NEWTIES. In 2006 he moved to the Pablo de Olavide University, where he is currently an Associate Professor.
He has been working on knowledge extraction since his Ph.D. thesis at the Vrije Universiteit of Amsterdam. He has extensive experience in the application of Machine Learning, especially techniques based on Soft Computing, for the extraction of knowledge from massive data.
His main research interests are:
- Bioinformatics
- Evolutionary Computation
- Machine Learning
- Big Data
Projects
Federico Divina has participated in various research projects, for instance:
- Differential: this project aims to develop new tools and methods to manage and analyse information coming from several sources with the final goal of better understanding how and when energy is consumed in distributed facilities. This project was developed as a coordinated project with three complementary research groups from three different universities (Universidad de Granada, Universidad Pablo de Olavide and Universidad de Castilla La Mancha).
- GALICIAME: project that aimed at applying machine learning tools in order to extract knowledge from genetic data related to spinal muscular atrophy (SMA), in collaboration with the “Centro Andaluz de Biología del Desarrollo” (CABD).
- NEWTIES: EU project that aimed at developing an artificial society. This project involved the Vrije Universiteit Amsterdam, the University of Tilburg, Napier University, the University of Surrey and Eötvös Loránd University.
Publications
For a complete list of publications, please visit his Google Scholar profile or his ORCID record.
2021
A. Lopez-Fernandez, D. Rodriguez-Baena, F. Gomez-Vela, F. Divina and M. Garcia-Torres. A multi-GPU biclustering algorithm for binary datasets. Journal Article. In: Journal of Parallel and Distributed Computing, vol. 147, pp. 209–219, 2021.
J. A. Gallardo, M. García-Torres, F. Gómez-Vela, F. Morales, F. Divina, D. Becerra-Alonso, G. Velázquez, F. Daumas-Ladouce, J. L. Vázquez Noguera and C. Ayala Sauer. Forecasting Electricity Consumption Data from Paraguay Using a Machine Learning Approach. Conference. SOCO 16th International Conference on Soft Computing Models in Industrial and Environmental Applications, vol. 1401, Advances in Intelligent Systems and Computing, 2021.
2020
F. Divina, J. F. Torres, M. García-Torres, F. Martínez-Álvarez and A. Troncoso. Hybridizing deep learning and neuroevolution: Application to the Spanish short-term electric energy consumption forecasting. Journal Article. In: Applied Sciences, vol. 10, no. 16, pp. 5487, 2020.
The electric energy production would be much more efficient if accurate estimations of the future demand were available, since these would allow allocating only the resources needed for the production of the right amount of energy required. With this motivation in mind, we propose a strategy, based on neuroevolution, that can be used to this aim. Our proposal uses a genetic algorithm in order to find a sub-optimal set of hyper-parameters for configuring a deep neural network, which can then be used for obtaining the forecasting. Such a strategy is justified by the observation that the performances achieved by deep neural networks are strongly dependent on the right setting of the hyper-parameters, and genetic algorithms have shown excellent search capabilities in huge search spaces. Moreover, we base our proposal on a distributed computing platform, which allows its use on a large time-series. In order to assess the performances of our approach, we have applied it to a large dataset, related to the electric energy consumption registered in Spain over almost 10 years. Experimental results confirm the validity of our proposal since it outperforms all other forecasting techniques to which it has been compared.
F. M. Delgado-Chaves, F. Gómez-Vela, F. Divina, M. García-Torres and D. S. Rodríguez-Baena. Computational Analysis of the Global Effects of Ly6E in the Immune Response to Coronavirus Infection Using Gene Networks. Journal Article. In: Genes, vol. 11, no. 7, pp. 831-864, 2020.
Gene networks have arisen as a promising tool in the comprehensive modeling and analysis of complex diseases. Particularly in viral infections, the understanding of the host-pathogen mechanisms, and the immune response to these, is considered a major goal for the rational design of appropriate therapies. For this reason, the use of gene networks may well encourage therapy-associated research in the context of the coronavirus pandemic, orchestrating experimental scrutiny and reducing costs. In this work, gene co-expression networks were reconstructed from RNA-Seq expression data with the aim of analyzing the time-resolved effects of gene Ly6E in the immune response against the coronavirus responsible for murine hepatitis (MHV). Through the integration of differential expression analyses and reconstructed networks exploration, significant differences in the immune response to virus were observed in Ly6E∆HSC compared to wild type animals. Results show that Ly6E ablation at hematopoietic stem cells (HSCs) leads to a progressive impaired immune response in both liver and spleen. Specifically, depletion of the normal leukocyte mediated immunity and chemokine signaling is observed in the liver of Ly6E∆HSC mice. On the other hand, the immune response in the spleen, which seemed to be mediated by an intense chromatin activity in the normal situation, is replaced by ECM remodeling in Ly6E∆HSC mice. These findings, which require further experimental characterization, could be extrapolated to other coronaviruses and motivate the efforts towards novel antiviral approaches.
D. S. Rodríguez-Baena, F. Gómez-Vela, M. García-Torres, F. Divina, C. D. Barranco, N. Díaz-Díaz, M. Jimenez and G. Montalvo. Identifying livestock behavior patterns based on accelerometer dataset. Journal Article. In: Journal of Computational Science, vol. 41, pp. 101076, 2020.
In large livestock farming it would be beneficial to be able to automatically detect behaviors in animals. In fact, this would allow estimating the health status of individuals, providing valuable insight to stock raisers. Traditionally this process has been carried out manually, relying only on the experience of the breeders. Such an approach is effective for a small number of individuals. However, in large breeding farms this may not represent the best approach, since, in this way, not all the animals can be effectively monitored all the time. Moreover, the traditional approach relies heavily on human experience, which cannot always be taken for granted. To this aim, in this paper, we propose a new method for automatically detecting activity and inactivity time periods of animals, as a behavior indicator of livestock. In order to do this, we collected data with sensors located on the body of the animals to be analyzed. In particular, the reliability of the method was tested with data collected on Iberian pigs and calves. Results confirm that the proposed method can help breeders in detecting activity and inactivity periods for large livestock farming.
T. Vanhaeren, F. Divina, M. García-Torres, F. Gómez-Vela, W. Vanhoof and P. M. Martínez-García. A Comparative Study of Supervised Machine Learning Algorithms for the Prediction of Long-Range Chromatin Interactions. Journal Article. In: Genes, vol. 11, no. 9, pp. 985, 2020.
The role of three-dimensional genome organization as a critical regulator of gene expression has become increasingly clear over the last decade. Most of our understanding of this association comes from the study of long range chromatin interaction maps provided by Chromatin Conformation Capture-based techniques, which have greatly improved in recent years. Since these procedures are experimentally laborious and expensive, in silico prediction has emerged as an alternative strategy to generate virtual maps in cell types and conditions for which experimental data of chromatin interactions is not available. Several methods have been based on predictive models trained on one-dimensional (1D) sequencing features, yielding promising results. However, different approaches vary both in the way they model chromatin interactions and in the machine learning-based strategy they rely on, making it challenging to carry out performance comparison of existing methods. In this study, we use publicly available 1D sequencing signals to model cohesin-mediated chromatin interactions in two human cell lines and evaluate the prediction performance of six popular machine learning algorithms: decision trees, random forests, gradient boosting, support vector machines, multi-layer perceptron and deep learning. Our approach accurately predicts long-range interactions and reveals that gradient boosting significantly outperforms the other five methods, yielding accuracies of about 95%. We show that chromatin features in close genomic proximity to the anchors cover most of the predictive information, as has been previously reported. Moreover, we demonstrate that gradient boosting models trained with different subsets of chromatin features, unlike the other methods tested, are able to produce accurate predictions. In this regard, and besides architectural proteins, transcription factors are shown to be highly informative. Our study provides a framework for the systematic prediction of long-range chromatin interactions, identifies gradient boosting as the best suited algorithm for this task and highlights cell-type specific binding of transcription factors at the anchors as important determinants of chromatin wiring mediated by cohesin.
2019
M. García-Torres, D. Becerra-Alonso, F. A. Gómez-Vela, F. Divina, I. López Cobo and F. Martínez-Álvarez. Analysis of Student Achievement Scores: A Machine Learning Approach. Conference. ICEUTE 10th International Conference on EUropean Transnational Education, Advances in Intelligent Systems and Computing, 2019.
F. Gómez-Vela, F. M. Delgado-Chaves, D. S. Rodríguez-Baena, M. García-Torres and F. Divina. Ensemble and Greedy Approach for the Reconstruction of Large Gene Co-Expression Networks. Journal Article. In: Entropy, vol. 21, no. 12, pp. 1139, 2019.
Gene networks have become a powerful tool in the comprehensive analysis of gene expression. Due to the increasing amount of available data, computational methods for networks generation must deal with the so-called curse of dimensionality in the quest for the reliability of the obtained results. In this context, ensemble strategies have significantly improved the precision of results by combining different measures or methods. On the other hand, structure optimization techniques are also important in the reduction of the size of the networks, not only improving their topology but also keeping a positive prediction ratio. In this work, we present Ensemble and Greedy networks (EnGNet), a novel two-step method for gene networks inference. First, EnGNet uses an ensemble strategy for co-expression networks generation. Second, a greedy algorithm optimizes both the size and the topological features of the network. Not only do achieved results show that this method is able to obtain reliable networks, but also that it significantly improves topological features. Moreover, the usefulness of the method is proven by an application to a human dataset on post-traumatic stress disorder, revealing an innate immunity-mediated response to this pathology. These results are indicative of the method’s potential in the field of biomarkers discovery and characterization.
E.L. Mangas, A. Rubio, R. Álvarez-Marín, G. Labrador-Herrera, J. Pachón, M. Eugenia Pachón-Ibáñez, F. Divina and A.J. Pérez-Pulido. In: Microbial Genomics, pp. mgen000309, 2019.
Acinetobacter baumannii is an opportunistic bacterium that causes hospital-acquired infections with a high mortality and morbidity, since there are strains resistant to virtually any kind of antibiotic. The chase to find novel strategies to fight against this microbe can be favoured by knowledge of the complete catalogue of genes of the species, and their relationship with the specific characteristics of different isolates. In this work, we performed a genomics analysis of almost 2500 strains. Two different groups of genomes were found based on the number of shared genes. One of these groups rarely has plasmids, and bears clustered regularly interspaced short palindromic repeat (CRISPR) sequences, in addition to CRISPR-associated genes (cas genes) or restriction-modification system genes. This fact strongly supports the lack of plasmids. Furthermore, the scarce plasmids in this group also bear CRISPR sequences, and specifically contain genes involved in prokaryotic toxin–antitoxin systems that could either act as the still little known CRISPR type IV system or be the precursors of other novel CRISPR/Cas systems. In addition, a limited set of strains present a new cas9-like gene, which may complement the other cas genes in inhibiting the entrance of new plasmids into the bacteria. Finally, this group has exclusive genes involved in biofilm formation, which would connect CRISPR systems to the biogenesis of these bacterial resistance structures.
F. Divina, M. García-Torres, F. Gómez-Vela and J.L. Vázquez Noguera. A Comparative Study of Time Series Forecasting Methods for Short Term Electric Energy Consumption Prediction in Smart Buildings. Journal Article. In: Energies, vol. 12, no. 10, pp. 1934, 2019.
Smart buildings are equipped with sensors that allow monitoring a range of building systems including heating and air conditioning, lighting and the general electric energy consumption. These data can then be stored and analyzed. The ability to use historical data regarding electric energy consumption could allow improving the energy efficiency of such buildings, as well as help to spot problems related to wasting of energy. This problem is even more important when considering that buildings are some of the largest consumers of energy. In this paper, we are interested in forecasting the energy consumption of smart buildings and, to this aim, we propose a comparative study of different forecasting strategies. To do this, we used the data regarding the electric consumption registered by thirteen buildings located in a university campus in the south of Spain. The empirical comparison of the selected methods on the different data showed that some methods are more suitable than others for this kind of problem. In particular, we show that strategies based on Machine Learning approaches seem to be more suitable for this task.
G. Sosa-Cabrera, M. García-Torres, S. Gómez-Guerrero, C.E. Schaerer and F. Divina. A multivariate approach to the symmetrical uncertainty measure: Application to feature selection problem. Journal Article. In: Information Sciences, vol. 494, pp. 1–20, 2019.
In this work we propose an extension of the Symmetrical Uncertainty (SU) measure in order to address the multivariate case, simultaneously acquiring the capability to detect possible correlations and interactions among features. This generalization, denoted Multivariate Symmetrical Uncertainty (MSU), is based on the concepts of Total Correlation (TC) and Mutual Information (MI) extended to the multivariate case. The generalized measure accounts for the total amount of dependency within a set of variables as a single monolithic quantity. Multivariate measures are usually biased due to several factors. To overcome this problem, a mathematical expression is proposed, based on the cardinality of all features, which can be used to calculate the number of samples needed to estimate the MSU without bias at a pre-specified significance level. Theoretical and experimental results on synthetic data show that the proposed sample size expression properly controls the bias. In addition, when the MSU is applied to feature selection on synthetic and real-world data, it has the advantage of adequately capturing linear and nonlinear correlations and interactions, and it can therefore be used as a new feature subset evaluation method.
F. M. Delgado-Chaves, F. Gómez-Vela, M. García-Torres, F. Divina and J.L. Vázquez Noguera. Computational Inference of Gene Co-Expression Networks for the identification of Lung Carcinoma Biomarkers: An Ensemble Approach. Journal Article. In: Genes, vol. 10, no. 12, pp. 962, 2019.
Gene Networks (GN) have emerged as a useful tool in recent years for the analysis of different diseases in the field of biomedicine. In particular, GNs have been widely applied for the study and analysis of different types of cancer. In this context, lung carcinoma is among the most common cancer types and its short life expectancy is partly due to late diagnosis. For this reason, lung cancer biomarkers that can be easily measured are highly demanded in biomedical research. In this work, we present an application of gene co-expression networks in the modelling of lung cancer gene regulatory networks, which ultimately served for the discovery of new biomarkers. For this, a robust GN inference was performed from microarray data concomitantly using three different co-expression measures. Results identified a major cluster of genes involved in SRP-dependent co-translational protein targeting to membrane, as well as a set of 28 genes that were exclusively found in networks generated from cancer samples. Amongst potential biomarkers, genes NCKAP1L and DMD are highlighted due to their implications in a considerable portion of lung and bronchus primary carcinomas. These findings demonstrate the potential of GN reconstruction in the rational prediction of biomarkers.
V.E. Jiménez Chaves, M. García-Torres, J.L. Vázquez Noguera, C.D. Cabrera Oviedo, A.P. Riego Esteche, F. Divina and M. Marrufo-Vázquez. ICEUTE 10th International Conference on EUropean Transnational Education, Advances in Intelligent Systems and Computing, 2019.
2018
P. Manuel Martínez-García, M. García-Torres, F. Divina, F. Gómez-Vela and F. Cortés-Ledesma. International Conference on the Applications of Evolutionary Computation, 2018.
G. Sosa-Cabrera, M. García-Torres, S. Gómez Guerrero, C.E. Schaerer and F. Divina. Understanding a multivariate semi-metric in the search strategies for attributes subset selection. Conference. Proceeding Series of the Brazilian Society of Computational and Applied Mathematics, 2018.