Prof. David Gutiérrez Avilés, Ph.D., holds a degree in Computer Science Engineering (University of Seville, 2010), a Master's degree in Software Engineering and Technology (University of Seville, 2013), and a Ph.D. (University of Seville, 2015). He is an Assistant Professor in the Languages and Systems Department of the University of Seville.
His main scientific achievement is the TrLab methodology for mining and evaluating behavior patterns from large time-dependent datasets. This method extracts patterns from large 3D datasets using triclustering and genetic algorithms. This research has produced five JCR-indexed papers, six conference contributions, a research stay at the University of Chile, membership in one R&D team, participation in one regional project and one national project, his Ph.D. thesis, and the intellectual property rights for the TrLab application.
His research lines focus on electricity fraud detection in big data environments, online machine learning from big data streams, analysis of Internet of Things protocols, and sensor data analysis.
Teaching
OFFICE HOURS:
Please request an appointment.
- Wednesday from 10:45h to 14:45h
- Friday from 10:45h to 12:45h
Publications
2023
A. M. Fernández and D. Gutiérrez-Avilés and A. Troncoso and F. Martínez-Álvarez A new Apache Spark-based framework for big data streaming forecasting in IoT networks Journal Article Journal of Supercomputing, in press, 2023. @article{FERNANDEZ23, title = {A new Apache Spark-based framework for big data streaming forecasting in IoT networks}, author = {A. M. Fernández and D. Gutiérrez-Avilés and A. Troncoso and F. Martínez-Álvarez}, year = {2023}, date = {2023-02-02}, journal = {Journal of Supercomputing}, volume = {in press}, abstract = {Analyzing time-dependent data acquired in a continuous flow is a major challenge for various fields, such as big data and machine learning. Being able to analyze a large volume of data from various sources, such as sensors, networks, and the internet, is essential for improving the efficiency of our society's production processes. Additionally, this vast amount of data is collected dynamically in a continuous stream. The goal of this research is to provide a comprehensive framework for forecasting big data streams from Internet of Things networks and serve as a guide for designing and deploying other third-party solutions. Hence, a new framework for time series forecasting in a big data streaming scenario, using data collected from Internet of Things networks, is presented. This framework comprises of five main modules: Internet of Things network design and deployment, big data streaming architecture, stream data modeling method, big data forecasting method, and a comprehensive real-world application scenario, consisting of a physical Internet of Things network feeding the big data streaming architecture, being the linear regression the algorithm used for illustrative purposes.
Comparison with other frameworks reveals that this is the first framework that incorporates and integrates all the aforementioned modules.}, keywords = {}, pubstate = {published}, tppubtype = {article} }
2022
L. Melgar-García and D. Gutiérrez-Avilés and M. T. Godinho and R. Espada and I. S. Brito and F. Martínez-Álvarez and A. Troncoso and C. Rubio-Escudero A new big data triclustering approach for extracting three-dimensional patterns in precision agriculture Journal Article Neurocomputing, 500, pp. 268-278, 2022. @article{MELGAR21_NEUCOMb, title = {A new big data triclustering approach for extracting three-dimensional patterns in precision agriculture}, author = {L. Melgar-García and D. Gutiérrez-Avilés and M. T. Godinho and R. Espada and I. S. Brito and F. Martínez-Álvarez and A. Troncoso and C. Rubio-Escudero}, url = {https://www.sciencedirect.com/science/article/abs/pii/S0925231222006415}, doi = {10.1016/j.neucom.2021.06.101}, year = {2022}, date = {2022-01-01}, journal = {Neurocomputing}, volume = {500}, pages = {268-278}, abstract = {Precision agriculture focuses on the development of site-specific harvest considering the variability of each crop area. Vegetation indices allow the study and delineation of different characteristics of each field zone, generally invisible to the naked-eye. This paper introduces a new big data triclustering approach based on evolutionary algorithms. The algorithm shows its capability to discover three-dimensional patterns on the basis of vegetation indices from vine crops. Different vegetation indices have been tested to find different patterns in the crops. The results reported using a vineyard crop located in Portugal depicts four areas with different moisture stress particularities that can lead to changes in the management of the vineyard. Furthermore, scalability studies have been performed, showing that the proposed algorithm is suitable for dealing with big datasets.}, keywords = {}, pubstate = {published}, tppubtype = {article} }
2021
K.-T. T. Bui and J. F. Torres and D. Gutiérrez-Avilés and V. H. Nhu and F. Martínez-Álvarez and D. T. Bui Deformation forecasting of a hydropower dam by hybridizing a Long Short-Term Memory deep learning network with the Coronavirus Optimization Algorithm Journal Article Computer-Aided Civil and Infrastructure Engineering, 37, pp. 1368-1386, 2021. @article{BUI22b, title = {Deformation forecasting of a hydropower dam by hybridizing a Long Short-Term Memory deep learning network with the Coronavirus Optimization Algorithm}, author = {K.-T. T. Bui and J. F. Torres and D. Gutiérrez-Avilés and V. H. Nhu and F. Martínez-Álvarez and D. T. Bui}, url = {https://onlinelibrary.wiley.com/doi/abs/10.1111/mice.12810}, doi = {10.1111/mice.12810}, year = {2021}, date = {2021-11-24}, journal = {Computer-Aided Civil and Infrastructure Engineering}, volume = {37}, pages = {1368-1386}, abstract = {The safety operation and management of hydropower dam play a critical role in social-economic development and ensure people's safety in many countries; therefore, modeling and forecasting the hydropower dam's deformations with high accuracy is crucial. This research aims to propose and validate a new model based on deep learning long short-term memory (LSTM) and the coronavirus optimization algorithm (CVOA), named CVOA-LSTM, for forecasting the deformations of the hydropower dam. The second-largest hydropower dam of Vietnam, located in the Hoa Binh province, is focused. Herein, we used the LSTM to establish the deformation model, whereas the CVOA was utilized to optimize the three parameters of the LSTM, the number of hidden layers, the learning rate, and the dropout. The efficacy of the proposed CVOA-LSTM model is assessed by comparing its forecasting performance with state-of-the-art benchmarks, sequential minimal optimization for support vector regression, Gaussian process, M5' model tree, multilayer perceptron neural network, reduced error pruning tree, random tree, random forest, and radial basis function neural network.
The result shows that the proposed CVOA-LSTM model has high forecasting capability (R2 = 0.874, root mean square error = 0.34, mean absolute error = 0.23) and outperforms the benchmarks. We conclude that CVOA-LSTM is a new tool that can be considered to forecast the hydropower dam's deformations.}, keywords = {}, pubstate = {published}, tppubtype = {article} }
L. Melgar-García and D. Gutiérrez-Avilés and C. Rubio-Escudero and A. Troncoso Discovering three-dimensional patterns in real-time from data streams: An online triclustering approach Journal Article Information Sciences, 558, pp. 174-193, 2021. @article{Melgar21_IS, title = {Discovering three-dimensional patterns in real-time from data streams: An online triclustering approach}, author = {L. Melgar-García and D. Gutiérrez-Avilés and C. Rubio-Escudero and A. Troncoso}, url = {https://www.sciencedirect.com/science/article/pii/S0020025521000220}, doi = {10.1016/j.ins.2020.12.089}, year = {2021}, date = {2021-01-01}, journal = {Information Sciences}, volume = {558}, pages = {174-193}, abstract = {Triclustering algorithms group sets of coordinates of 3-dimensional datasets. In this paper, a new triclustering approach for data streams is introduced. It follows a streaming scheme of learning in two steps: offline and online phases. First, the offline phase provides a summary model with the components of the triclusters. Then, the second stage is the online phase to deal with data in streaming. This online phase consists in using the summary model obtained in the offline stage to update the triclusters as fast as possible with genetic operators. Results using three types of synthetic datasets and a real-world environmental sensor dataset are reported. The performance of the proposed triclustering streaming algorithm is compared to a batch triclustering algorithm, showing an accurate performance both in terms of quality and running times.}, keywords = {}, pubstate = {published}, tppubtype = {article} }
L. Melgar-García and D. Gutiérrez-Avilés and C. Rubio-Escudero and A. Troncoso Nearest neighbours-based forecasting for electricity demand time series in streaming Conference Conference of the Spanish Association for Artificial Intelligence (CAEPIA'21), Lecture Notes in Artificial Intelligence 2021. @conference{CAEPIA21_Laura, title = {Nearest neighbours-based forecasting for electricity demand time series in streaming}, author = {L. Melgar-García and D. Gutiérrez-Avilés and C. Rubio-Escudero and A. Troncoso}, year = {2021}, date = {2021-01-01}, booktitle = {Conference of the Spanish Association for Artificial Intelligence (CAEPIA'21)}, series = {Lecture Notes in Artificial Intelligence}, abstract = {This paper presents a forecasting algorithm for time series in streaming. The methodology has two well-differentiated stages: the algorithm searches for the nearest neighbors to generate an initial prediction model in the batch phase. Then, an online phase is carried out when the time series arrives in streaming. In particular, the nearest neighbor of the streaming data from the training set is computed and the nearest neighbors, previously computed in the batch phase, of this nearest neighbor are used to obtain the predictions. Results using the electricity consumption time series are reported, showing a remarkable performance of the proposed algorithm in terms of forecasting errors when compared to a nearest neighbors-based benchmark algorithm. The running times for the predictions are also remarkable.}, keywords = {}, pubstate = {published}, tppubtype = {conference} }
2020
L. Melgar-García and M. T. Godinho and R. Espada and D. Gutiérrez-Avilés and I. S. Brito and F. Martínez-Álvarez and A. Troncoso and C. Rubio-Escudero Discovering Spatio-Temporal Patterns in Precision Agriculture Based on Triclustering Conference SOCO 15th International Conference on Soft Computing Models in Industrial and Environmental Applications, Advances in Intelligent Systems and Computing 2020. @conference{SOCO20, title = {Discovering Spatio-Temporal Patterns in Precision Agriculture Based on Triclustering}, author = {L. Melgar-García and M. T. Godinho and R. Espada and D. Gutiérrez-Avilés and I. S. Brito and F. Martínez-Álvarez and A. Troncoso and C. Rubio-Escudero}, url = {https://link.springer.com/chapter/10.1007/978-3-030-57802-2_22}, year = {2020}, date = {2020-08-29}, booktitle = {SOCO 15th International Conference on Soft Computing Models in Industrial and Environmental Applications}, pages = {226-236}, series = {Advances in Intelligent Systems and Computing}, keywords = {}, pubstate = {published}, tppubtype = {conference} }
F. Martínez-Álvarez and G. Asencio-Cortés and J. F. Torres and D. Gutiérrez-Avilés and L. Melgar-García and R. Pérez-Chacón and C. Rubio-Escudero and A. Troncoso and J. C. Riquelme Coronavirus Optimization Algorithm: A bioinspired metaheuristic based on the COVID-19 propagation model Journal Article Big Data, 8 (4), pp. 308-322, 2020. @article{MARTINEZ-ALVAREZ20, title = {Coronavirus Optimization Algorithm: A bioinspired metaheuristic based on the COVID-19 propagation model}, author = {F. Martínez-Álvarez and G. Asencio-Cortés and J. F. Torres and D. Gutiérrez-Avilés and L. Melgar-García and R. Pérez-Chacón and C. Rubio-Escudero and A. Troncoso and J. C. Riquelme}, url = {https://www.liebertpub.com/doi/full/10.1089/big.2020.0051}, doi = {10.1089/big.2020.0051}, year = {2020}, date = {2020-07-22}, journal = {Big Data}, volume = {8}, number = {4}, pages = {308-322}, abstract = {This work proposes a novel bioinspired metaheuristic, simulating how the coronavirus spreads and infects healthy people. From a primary infected individual (patient zero), the coronavirus rapidly infects new victims, creating large populations of infected people who will either die or spread infection. Relevant terms such as reinfection probability, super-spreading rate, social distancing measures or traveling rate are introduced into the model in order to simulate the coronavirus activity as accurately as possible. The infected population initially grows exponentially over time, but taking into consideration social isolation measures, the mortality rate and number of recoveries, the infected population gradually decreases. The Coronavirus Optimization Algorithm has two major advantages when compared to other similar strategies. Firstly, the input parameters are already set according to the disease statistics, preventing researchers from initializing them with arbitrary values. Secondly, the approach has the ability to end after several iterations, without setting this value either. 
Furthermore, a parallel multi-virus version is proposed, where several coronavirus strains evolve over time and explore wider search space areas in less iterations. Finally, the metaheuristic has been combined with deep learning models, in order to find optimal hyperparameters during the training phase. As application case, the problem of electricity load time series forecasting has been addressed, showing quite remarkable performance.}, keywords = {}, pubstate = {published}, tppubtype = {article} }
A. M. Fernández and D. Gutiérrez-Avilés and A. Troncoso and F. Martínez-Álvarez Automated Deployment of a Spark Cluster with Machine Learning Algorithm Integration Journal Article Big Data Research, 19-20 , pp. 100135, 2020. @article{FERNANDEZ20, title = {Automated Deployment of a Spark Cluster with Machine Learning Algorithm Integration}, author = {A. M. Fernández and D. Gutiérrez-Avilés and A. Troncoso and F. Martínez-Álvarez}, url = {https://www.sciencedirect.com/science/article/pii/S2214579620300034}, doi = {10.1016/j.bdr.2020.100135}, year = {2020}, date = {2020-05-12}, journal = {Big Data Research}, volume = {19-20}, pages = {100135}, abstract = {The vast amount of data stored nowadays has turned big data analytics into a very trendy research field. The Spark distributed computing platform has emerged as a dominant and widely used paradigm for cluster deployment and big data analytics. However, to get started up is still a task that may take much time when manually done, due to the requisites that all nodes must fulfill. This work introduces LadonSpark, an open-source and non-commercial solution to configure and deploy a Spark cluster automatically. It has been specially designed for easy and efficient management of a Spark cluster with a friendly graphical user interface to automate the deployment of a cluster and to start up the distributed file system of Hadoop quickly. Moreover, LadonSpark includes the functionality of integrating any algorithm into the system. That is, the user only needs to provide the executable file and the number of required inputs for proper parametrization. Source codes developed in Scala, R, Python, or Java can be supported on LadonSpark. 
Besides, clustering, regression, classification, and association rules algorithms are already integrated so that users can test its usability from its initial installation.}, keywords = {}, pubstate = {published}, tppubtype = {article} }
L. Melgar-García and D. Gutiérrez-Avilés and C. Rubio-Escudero and A. Troncoso High-content screening images streaming analysis using the STriGen methodology Conference SAC 35th Annual ACM Symposium on Applied Computing, 2020. @conference{Melgar20_SAC, title = {High-content screening images streaming analysis using the STriGen methodology}, author = {L. Melgar-García and D. Gutiérrez-Avilés and C. Rubio-Escudero and A. Troncoso}, doi = {10.1145/3341105.3374071}, year = {2020}, date = {2020-03-01}, booktitle = {SAC 35th Annual ACM Symposium on Applied Computing}, pages = {537-539}, keywords = {}, pubstate = {published}, tppubtype = {conference} }
2019
J. F. Torres and D. Gutiérrez-Avilés and A. Troncoso and F. Martínez-Álvarez Random Hyper-Parameter Search-Based Deep Neural Network for Power Consumption Forecasting Conference IWANN 15th International Work-Conference on Artificial Neural Networks, 11506, Lecture Notes in Computer Science 2019. @conference{TORRES19-2, title = {Random Hyper-Parameter Search-Based Deep Neural Network for Power Consumption Forecasting}, author = {J. F. Torres and D. Gutiérrez-Avilés and A. Troncoso and F. Martínez-Álvarez}, url = {https://link.springer.com/chapter/10.1007/978-3-030-20521-8_22}, doi = {10.1007/978-3-030-20521-8_22}, year = {2019}, date = {2019-05-16}, booktitle = {IWANN 15th International Work-Conference on Artificial Neural Networks}, volume = {11506}, pages = {259-269}, series = {Lecture Notes in Computer Science}, keywords = {}, pubstate = {published}, tppubtype = {conference} }
A. M. Fernández and D. Gutiérrez-Avilés and A. Troncoso and F. Martínez-Álvarez Real-Time Big Data Analytics in Smart Cities from LoRa-based IoT Networks Conference SOCO 14th International Conference on Soft Computing Models in Industrial and Environmental Applications, Advances in Intelligent Systems and Computing 2019. @conference{SOCO2019, title = {Real-Time Big Data Analytics in Smart Cities from LoRa-based IoT Networks}, author = {A. M. Fernández and D. Gutiérrez-Avilés and A. Troncoso and F. Martínez-Álvarez}, url = {https://link.springer.com/chapter/10.1007/978-3-030-20055-8_9}, year = {2019}, date = {2019-01-01}, booktitle = {SOCO 14th International Conference on Soft Computing Models in Industrial and Environmental Applications}, series = {Advances in Intelligent Systems and Computing}, keywords = {}, pubstate = {published}, tppubtype = {conference} }
2018
D. Gutiérrez-Avilés and R. Giráldez and F. J. Gil-Cumbreras and C. Rubio-Escudero TRIQ: a new method to evaluate triclusters Journal Article BioData Mining, 11 (1), pp. 15, 2018. @article{Gutierrez-Aviles2018, title = {TRIQ: a new method to evaluate triclusters}, author = {D. Gutiérrez-Avilés and R. Giráldez and F. J. Gil-Cumbreras and C. Rubio-Escudero}, url = {https://biodatamining.biomedcentral.com/articles/10.1186/s13040-018-0177-5}, doi = {10.1186/s13040-018-0177-5}, year = {2018}, date = {2018-01-01}, journal = {BioData Mining}, volume = {11}, number = {1}, pages = {15}, abstract = {Triclustering has shown to be a valuable tool for the analysis of microarray data since its appearance as an improvement of classical clustering and biclustering techniques. The standard for validation of triclustering is based on three different measures: correlation, graphic similarity of the patterns and functional annotations for the genes extracted from the Gene Ontology project (GO).}, keywords = {}, pubstate = {published}, tppubtype = {article} }
D. Gutiérrez-Avilés and J. A. Fábregas and J. Tejedor and F. Martínez-Álvarez and A. Troncoso and J. C. Riquelme SmartFD: A real big data application for electrical fraud detection Conference HAIS 13th International Conference on Hybrid Artificial Intelligence Systems, Lecture Notes in Computer Science 2018. @conference{HAIS2018, title = {SmartFD: A real big data application for electrical fraud detection}, author = {D. Gutiérrez-Avilés and J. A. Fábregas and J. Tejedor and F. Martínez-Álvarez and A. Troncoso and J. C. Riquelme}, url = {https://link.springer.com/chapter/10.1007/978-3-319-92639-1_11}, year = {2018}, date = {2018-01-01}, booktitle = {HAIS 13th International Conference on Hybrid Artificial Intelligence Systems}, series = {Lecture Notes in Computer Science}, keywords = {}, pubstate = {published}, tppubtype = {conference} }
2016
D. Gutiérrez-Avilés and C. Rubio-Escudero TRIQ: A Comprehensive Evaluation Measure for Triclustering Algorithms Conference Hybrid Artificial Intelligent Systems: 11th International Conference, HAIS 2016, Seville, Spain, April 18-20, 2016, Proceedings, Lecture Notes in Computer Science 2016. @conference{Gutiérrez-Avilés2016, title = {TRIQ: A Comprehensive Evaluation Measure for Triclustering Algorithms}, author = {D. Gutiérrez-Avilés and C. Rubio-Escudero}, url = {https://link.springer.com/chapter/10.1007/978-3-319-32034-2_56}, year = {2016}, date = {2016-01-01}, booktitle = {Hybrid Artificial Intelligent Systems: 11th International Conference, HAIS 2016, Seville, Spain, April 18-20, 2016, Proceedings}, series = {Lecture Notes in Computer Science}, keywords = {}, pubstate = {published}, tppubtype = {conference} }
2015
D. Gutiérrez-Avilés and C. Rubio-Escudero MSL: A Measure to Evaluate Three-dimensional Patterns in Gene Expression Data Journal Article Evolutionary Bioinformatics, 11, pp. 121-135, 2015. @article{Gutierrez-Aviles2015, title = {MSL: A Measure to Evaluate Three-dimensional Patterns in Gene Expression Data}, author = {D. Gutiérrez-Avilés and C. Rubio-Escudero}, url = {https://journals.sagepub.com/doi/10.4137/EBO.S25822}, doi = {10.4137/EBO.S25822}, year = {2015}, date = {2015-01-01}, journal = {Evolutionary Bioinformatics}, volume = {11}, pages = {121-135}, abstract = {Microarray technology is highly used in biological research environments due to its ability to monitor the RNA concentration levels. The analysis of the data generated represents a computational challenge due to the characteristics of these data. Clustering techniques are widely applied to create groups of genes that exhibit a similar behavior. Biclustering relaxes the constraints for grouping, allowing genes to be evaluated only under a subset of the conditions. Triclustering appears for the analysis of longitudinal experiments in which the genes are evaluated under certain conditions at several time points. These triclusters provide hidden information in the form of behavior patterns from temporal experiments with microarrays relating subsets of genes, experimental conditions, and time points. We present an evaluation measure for triclusters called Multi Slope Measure, based on the similarity among the angles of the slopes formed by each profile formed by the genes, conditions, and times of the tricluster.}, keywords = {}, pubstate = {published}, tppubtype = {article} }