2024
Wang, Zihui; Jiang, Yuan; Diallo, Abdoulaye Baniré; Kembel, Steven W.
Deep learning‐ and image processing‐based methods for automatic estimation of leaf herbivore damage Journal article
In: Methods in Ecology and Evolution, vol. 15, no. 4, pp. 732–743, 2024, ISSN: 2041-210X.
@article{wang_deep_2024,
title = {Deep learning‐ and image processing‐based methods for automatic estimation of leaf herbivore damage},
author = {Zihui Wang and Yuan Jiang and Abdoulaye Banir\'{e} Diallo and Steven W. Kembel},
url = {https://besjournals.onlinelibrary.wiley.com/doi/10.1111/2041-210X.14293},
doi = {10.1111/2041-210X.14293},
issn = {2041-210X},
year = {2024},
date = {2024-04-01},
urldate = {2024-04-30},
journal = {Methods in Ecology and Evolution},
volume = {15},
number = {4},
pages = {732\textendash743},
abstract = {Quantifying the intensity of leaf herbivory pressure is crucial for understanding the interaction between plants and herbivores in both applied and basic science. Visual estimates and digital analysis have been commonly used to estimate leaf herbivore damage but are time‐consuming which limits the amount of data that can be collected and prevent answering big picture questions that require large‐scale sampling of herbivory pressure. Recent developments in deep learning have provided a potential tool for automatic collection of ecological data from various sources. However, most applications have focused on identification and counting, and there is a lack of deep learning tools for quantitative estimation of leaf herbivore damage.
Here, we trained generative adversarial networks (GANs) to predict the intact status of damaged leaves and applied image processing technique to estimate the area and percentage of leaf damage. We first described procedures for collecting leaf images, training GAN models, predicting intact leaves and calculating leaf area, with a Python package provided to enable hands‐on application of these procedures. Then, we collected a large leaf data set to train a universal deep learning model and developed an online app HerbiEstim to allow direct use of pretrained models to estimate herbivory damage of leaves. We tested these methods using both simulated and real leaf damage data.
The procedures provided in our study greatly improved the efficiency of leaf herbivore damage estimation. Our test demonstrated that the reconstruction of damaged leaf image resembled the ground‐truth image with a similarity of 98.8%. The estimation of leaf herbivore damage exhibited a high accuracy with an averaged root mean square error of 1.6% and had a general applicability to different plant taxa and leaf shapes.
Overall, our work demonstrated the feasibility of applying deep learning techniques to quantify leaf herbivory intensity. The use of GANs allows automatic estimation of leaf damage, representing a major advantage of the method. The Python package and the online app with pre‐trained models will facilitate the use of our method for the analysis of large data sets of plant\textendashherbivore interactions.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
McClymont, E.; Larouche, A.; Cote, H. C.; Diallo, A. B.; Elwood, C.; Kakkar, F.; Money, D.; Sauve, L.; Soudeyns, H.; Gantt, S.; Boucoiran, I.
Maternal Cytomegalovirus Reinfection during Pregnancy among Women Living with HIV Journal article
In: American Journal of Obstetrics and Gynecology, vol. 230, no. 2, p. S641, 2024, ISSN: 00029378.
@article{e_maternal_2024,
title = {Maternal Cytomegalovirus Reinfection during Pregnancy among Women Living with HIV},
author = {McClymont, E. and Larouche, A. and Cote, H. C. and Diallo, A. B. and Elwood, C. and Kakkar, F. and Money, D. and Sauve, L. and Soudeyns, H. and Gantt, S. and Boucoiran, I.},
url = {https://linkinghub.elsevier.com/retrieve/pii/S0002937823006853},
doi = {10.1016/j.ajog.2023.09.061},
issn = {00029378},
year = {2024},
date = {2024-02-01},
urldate = {2024-04-30},
journal = {American Journal of Obstetrics and Gynecology},
volume = {230},
number = {2},
pages = {S641},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Lebatteux, Dylan; Soudeyns, Hugo; Boucoiran, Isabelle; Gantt, Soren; Diallo, Abdoulaye Baniré
Machine learning-based approach KEVOLVE efficiently identifies SARS-CoV-2 variant-specific genomic signatures Journal article
In: PLOS ONE, vol. 19, no. 1, p. e0296627, 2024, ISSN: 1932-6203.
@article{lebatteux_machine_2024,
title = {Machine learning-based approach KEVOLVE efficiently identifies SARS-CoV-2 variant-specific genomic signatures},
author = {Dylan Lebatteux and Hugo Soudeyns and Isabelle Boucoiran and Soren Gantt and Abdoulaye Banir\'{e} Diallo},
editor = {Nagarajan Raju},
url = {https://dx.plos.org/10.1371/journal.pone.0296627},
doi = {10.1371/journal.pone.0296627},
issn = {1932-6203},
year = {2024},
date = {2024-01-01},
urldate = {2024-04-30},
journal = {PLOS ONE},
volume = {19},
number = {1},
pages = {e0296627},
abstract = {Machine learning was shown to be effective at identifying distinctive genomic signatures among viral sequences. These signatures are defined as pervasive motifs in the viral genome that allow discrimination between species or variants. In the context of SARS-CoV-2, the identification of these signatures can assist in taxonomic and phylogenetic studies, improve in the recognition and definition of emerging variants, and aid in the characterization of functional properties of polymorphic gene products. In this paper, we assess KEVOLVE, an approach based on a genetic algorithm with a machine-learning kernel, to identify multiple genomic signatures based on minimal sets of k-mers. In a comparative study, in which we analyzed large SARS-CoV-2 genome dataset, KEVOLVE was more effective at identifying variant-discriminative signatures than several gold-standard statistical tools. Subsequently, these signatures were characterized using a new extension of KEVOLVE (KANALYZER) to highlight variations of the discriminative signatures among different classes of variants, their genomic location, and the mutations involved. The majority of identified signatures were associated with known mutations among the different variants, in terms of functional and pathological impact based on available literature. Here we showed that KEVOLVE is a robust machine learning approach to identify discriminative signatures among SARS-CoV-2 variants, which are frequently also biologically relevant, while bypassing multiple sequence alignments. The source code of the method and additional resources are available at: https://github.com/bioinfoUQAM/KEVOLVE.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
2023
Mrabah, Nairouz; Amar, Mohamed Mahmoud; Bouguessa, Mohamed; Diallo, Abdoulaye Banire
Toward Convex Manifolds: A Geometric Perspective for Deep Graph Clustering of Single-cell RNA-seq Data Proceedings article
In: Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, pp. 4855–4863, International Joint Conferences on Artificial Intelligence Organization, Macau, SAR China, 2023, ISBN: 9781956792034.
@inproceedings{mrabah_toward_2023,
title = {Toward Convex Manifolds: A Geometric Perspective for Deep Graph Clustering of Single-cell RNA-seq Data},
author = {Nairouz Mrabah and Mohamed Mahmoud Amar and Mohamed Bouguessa and Abdoulaye Banire Diallo},
url = {https://www.ijcai.org/proceedings/2023/540},
doi = {10.24963/ijcai.2023/540},
isbn = {9781956792034},
year = {2023},
date = {2023-08-01},
urldate = {2024-04-30},
booktitle = {Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence},
pages = {4855\textendash4863},
publisher = {International Joint Conferences on Artificial Intelligence Organization},
address = {Macau, SAR China},
abstract = {The deep clustering paradigm has shown great potential for discovering complex patterns that can reveal cell heterogeneity in single-cell RNA sequencing data. This paradigm involves two training phases: pretraining based on a pretext task and fine-tuning using pseudo-labels. Although current models yield promising results, they overlook the geometric distortions that regularly occur during the training process. More precisely, the transition between the two phases results in a coarse flattening of the latent structures, which can deteriorate the clustering performance. In this context, existing methods perform euclidean-based embedding clustering without ensuring the flatness and convexity of the latent manifolds. To address this problem, we incorporate two mechanisms. First, we introduce an overclustering loss to flatten the local curves. Second, we propose an adversarial mechanism to adjust the global geometric configuration. The second mechanism gradually transforms the latent structures into convex ones. Empirical results on a variety of gene expression datasets show that our model outperforms state-of-the-art methods.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Mrabah, Nairouz; Amar, Mohamed Mahmoud; Bouguessa, Mohamed; Diallo, Abdoulaye Banire
Exploring the Interaction between Local and Global Latent Configurations for Clustering Single-Cell RNA-Seq: A Unified Perspective Journal article
In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 37, no. 8, pp. 9235–9242, 2023, ISSN: 2374-3468, 2159-5399.
@article{mrabah_exploring_2023,
title = {Exploring the Interaction between Local and Global Latent Configurations for Clustering Single-Cell RNA-Seq: A Unified Perspective},
author = {Nairouz Mrabah and Mohamed Mahmoud Amar and Mohamed Bouguessa and Abdoulaye Banire Diallo},
url = {https://ojs.aaai.org/index.php/AAAI/article/view/26107},
doi = {10.1609/aaai.v37i8.26107},
issn = {2374-3468, 2159-5399},
year = {2023},
date = {2023-06-01},
urldate = {2024-04-30},
journal = {Proceedings of the AAAI Conference on Artificial Intelligence},
volume = {37},
number = {8},
pages = {9235\textendash9242},
abstract = {The most recent approaches for clustering single-cell RNA-sequencing data rely on deep auto-encoders. However, three major challenges remain unaddressed. First, current models overlook the impact of the cumulative errors induced by the pseudo-supervised embedding clustering task (Feature Randomness). Second, existing methods neglect the effect of the strong competition between embedding clustering and reconstruction (Feature Drift). Third, the previous deep clustering models regularly fail to consider the topological information of the latent data, even though the local and global latent configurations can bring complementary views to the clustering task. To address these challenges, we propose a novel approach that explores the interaction between local and global latent configurations to progressively adjust the reconstruction and embedding clustering tasks. We elaborate a topological and probabilistic filter to mitigate Feature Randomness and a cell-cell graph structure and content correction mechanism to counteract Feature Drift. The Zero-Inflated Negative Binomial model is also integrated to capture the characteristics of gene expression profiles. We conduct detailed experiments on real-world datasets from multiple representative genome sequencing platforms. Our approach outperforms the state-of-the-art clustering methods in various evaluation metrics.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
McClymont, E.; Albert, A.; Côté, H.; Diallo, A.; Elwood, C.; Kakkar, F.; Money, D.; Sauvé, L.; Soudeyns, H.; Gantt, S.; Boucoiran, I.
Maternal and infant cytomegalovirus detection among women living with HIV Journal article
In: American Journal of Obstetrics and Gynecology, vol. 228, no. 2, pp. S771–S772, 2023, ISSN: 00029378.
@article{e_maternal_2023,
title = {Maternal and infant cytomegalovirus detection among women living with HIV},
author = {McClymont, E. and Albert, A. and C\^{o}t\'{e}, H. and Diallo, A. and Elwood, C. and Kakkar, F. and Money, D. and Sauv\'{e}, L. and Soudeyns, H. and Gantt, S. and Boucoiran, I.},
url = {https://linkinghub.elsevier.com/retrieve/pii/S0002937822009929},
doi = {10.1016/j.ajog.2022.11.108},
issn = {00029378},
year = {2023},
date = {2023-02-01},
urldate = {2024-04-30},
journal = {American Journal of Obstetrics and Gynecology},
volume = {228},
number = {2},
pages = {S771\textendashS772},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Wu, Chao-Jung; Liu, Hui-Wen; Tereshko, Lauren; Lin, Dongdong; Bergelson, Svetlana; Zhang, Baohong; Diallo, Abdoulaye Baniré; Mason, Cullen
Assessing functional impurities in rAAV production platforms by long-read sequencing Journal article
In: 2023.
@article{chao-jung_wu_assessing_2023,
title = {Assessing functional impurities in rAAV production platforms by long-read sequencing},
author = {Chao-Jung Wu and Hui-Wen Liu and Lauren Tereshko and Dongdong Lin and Svetlana Bergelson and Baohong Zhang and Abdoulaye Banir\'{e} Diallo and Cullen Mason},
url = {https://rgdoi.net/10.13140/RG.2.2.17884.56962},
doi = {10.13140/RG.2.2.17884.56962},
year = {2023},
date = {2023-01-01},
urldate = {2024-04-30},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Ibnatta, Youssef; Khaldoun, Mohammed; Sadik, Mohammed; Essaid, Sabir; Diallo, Abdoulaye Baniré
Mobile Localization for Indoor Iot Services: From Proof of Concept to Real-Word Experimentation Miscellaneous
2023.
@misc{ibnatta_mobile_2023,
title = {Mobile Localization for Indoor Iot Services: From Proof of Concept to Real-Word Experimentation},
author = {Youssef Ibnatta and Mohammed Khaldoun and Mohammed Sadik and Sabir Essaid and Abdoulaye Banir\'{e} Diallo},
url = {https://www.ssrn.com/abstract=4493039},
doi = {10.2139/ssrn.4493039},
year = {2023},
date = {2023-01-01},
urldate = {2024-04-30},
keywords = {},
pubstate = {published},
tppubtype = {misc}
}
Ahmad, Syed Mudasir; Donato, Marcos De; Bhat, Basharat Ahmad; Diallo, Abdoulaye Banire; Peters, Sunday O.
Editorial: Omics technologies in livestock improvement: From selection to breeding decisions Journal article
In: Frontiers in Genetics, vol. 13, p. 1113417, 2023, ISSN: 1664-8021.
@article{ahmad_editorial_2023,
title = {Editorial: Omics technologies in livestock improvement: From selection to breeding decisions},
author = {Syed Mudasir Ahmad and Marcos De Donato and Basharat Ahmad Bhat and Abdoulaye Banire Diallo and Sunday O. Peters},
url = {https://www.frontiersin.org/articles/10.3389/fgene.2022.1113417/full},
doi = {10.3389/fgene.2022.1113417},
issn = {1664-8021},
year = {2023},
date = {2023-01-01},
urldate = {2024-04-30},
journal = {Frontiers in Genetics},
volume = {13},
pages = {1113417},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Naghashi, Vahid; Dallago, Gabriel; Diallo, Abdoulaye Banire; Boukadoum, Mounir
Univariate and multivariate time-series methods to forecast dairy income Proceedings article
In: 2023.
@inproceedings{naghashi_univariate_2023,
title = {Univariate and multivariate time-series methods to forecast dairy income},
author = {Vahid Naghashi and Gabriel Dallago and Abdoulaye Banire Diallo and Mounir Boukadoum},
url = {https://openreview.net/forum?id=0sGHJV7tRM},
year = {2023},
date = {2023-01-01},
urldate = {2024-04-30},
abstract = {Forecasting the income from milk sales can be addressed as a time-series problem since the sequence of multiple dairy attributes during lactation cycles are inter-related and temporally dependent. In this paper, we provide a framework to forecast the income from milk sales during the third lactation of the dairy cows based on dairy attributes recorded through the first and second lactation. We modeled the problem as univariate and multivariate time-series predictions. We propose several state-of-the-art implementations with ARIMA, N-BEATS, transformer and an original method, MuMu+attention, that combines Long-Short Term Memory neural network and attention mechanism to capture the temporal dependencies. To benchmark the implemented methods, we curated data from 147,749 dairy cows from 5,844 Canadian herds. The monthly income from milk sales ($CAD) measured at each cow during their third lactation was treated as the prediction target. The dataset was composed of dairy attributes of milk quality, production, season, year, and health, recorded over the first and second lactation of the dairy cows. The results highlighted that most of the methods can achieve relative good performance with the best prediction accuracy obtained by MuMu+attention. MuMu+attention results were 43% better over the classic ARIMA model. By forecasting the income from milk sales, our model could help farmers to early identify less profitable animals and better allocate resources.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Remita, Amine M.; Vitae, Golrokh; Diallo, Abdoulaye Baniré
Prior Density Learning in Variational Bayesian Phylogenetic Parameters Inference Proceedings article
In: Jahn, Katharina; Vinař, Tomáš (Eds.): Comparative Genomics, pp. 112–130, Springer Nature Switzerland, Cham, 2023, ISBN: 9783031369117.
@inproceedings{remita_prior_2023,
title = {Prior Density Learning in Variational Bayesian Phylogenetic Parameters Inference},
author = {Amine M. Remita and Golrokh Vitae and Abdoulaye Banir\'{e} Diallo},
editor = {Katharina Jahn and Tom\'{a}\v{s} Vina\v{r}},
doi = {10.1007/978-3-031-36911-7_8},
isbn = {9783031369117},
year = {2023},
date = {2023-01-01},
booktitle = {Comparative Genomics},
pages = {112\textendash130},
publisher = {Springer Nature Switzerland},
address = {Cham},
abstract = {The advances in variational inference are providing promising paths in Bayesian estimation problems. These advances make variational phylogenetic inference an alternative approach to Markov Chain Monte Carlo methods for approximating the phylogenetic posterior. However, one of the main drawbacks of such approaches is modelling the prior through fixed distributions, which could bias the posterior approximation if they are distant from the current data distribution. In this paper, we propose an approach and an implementation framework to relax the rigidity of the prior densities by learning their parameters using a gradient-based method and a neural network-based parameterization. We applied this approach for branch lengths and evolutionary parameters estimation under several Markov chain substitution models. The results of performed simulations show that the approach is powerful in estimating branch lengths and evolutionary model parameters. They also show that a flexible prior model could provide better results than a predefined prior model. Finally, the results highlight that using neural networks improves the initialization of the optimization of the prior density parameters.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Jacques, Amanda A. Boatswain; Diallo, Abdoulaye Baniré; Lord, Etienne
The Canadian Cropland Dataset: A New Land Cover Dataset for Multitemporal Deep Learning Classification in Agriculture Journal article
In: 2023, (arXiv:2306.00114 [cs]).
@article{jacques_canadian_2023,
title = {The Canadian Cropland Dataset: A New Land Cover Dataset for Multitemporal Deep Learning Classification in Agriculture},
author = {Amanda A. Boatswain Jacques and Abdoulaye Banir\'{e} Diallo and Etienne Lord},
url = {https://arxiv.org/abs/2306.00114},
doi = {10.48550/ARXIV.2306.00114},
year = {2023},
date = {2023-01-01},
urldate = {2024-04-30},
publisher = {arXiv},
abstract = {Monitoring land cover using remote sensing is vital for studying environmental changes and ensuring global food security through crop yield forecasting. Specifically, multitemporal remote sensing imagery provides relevant information about the dynamics of a scene, which has proven to lead to better land cover classification results. Nevertheless, few studies have benefited from high spatial and temporal resolution data due to the difficulty of accessing reliable, fine-grained and high-quality annotated samples to support their hypotheses. Therefore, we introduce a temporal patch-based dataset of Canadian croplands, enriched with labels retrieved from the Canadian Annual Crop Inventory. The dataset contains 78,536 manually verified high-resolution (10 m/pixel, 640 x 640 m) geo-referenced images from 10 crop classes collected over four crop production years (2017-2020) and five months (June-October). Each instance contains 12 spectral bands, an RGB image, and additional vegetation index bands. Individually, each category contains at least 4,800 images. Moreover, as a benchmark, we provide models and source code that allow a user to predict the crop class using a single image (ResNet, DenseNet, EfficientNet) or a sequence of images (LRCN, 3D-CNN) from the same location. In perspective, we expect this evolving dataset to propel the creation of robust agro-environmental models that can accelerate the comprehension of complex agricultural regions by providing accurate and continuous monitoring of land cover.},
note = {arXiv:2306.00114 [cs]},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Jacques, Amanda A. Boatswain; Diallo, Abdoulaye Baniré; Lord, Etienne
The Canadian Cropland Dataset: A New Land Cover Dataset for Multitemporal Deep Learning Classification in Agriculture Journal article
In: 2023.
@article{jacques_canadian_2023-1,
title = {The Canadian Cropland Dataset: A New Land Cover Dataset for Multitemporal Deep Learning Classification in Agriculture},
author = {Amanda A. Boatswain Jacques and Abdoulaye Banir\'{e} Diallo and Etienne Lord},
url = {https://arxiv.org/abs/2306.00114},
doi = {10.48550/ARXIV.2306.00114},
year = {2023},
date = {2023-01-01},
urldate = {2024-04-30},
abstract = {Monitoring land cover using remote sensing is vital for studying environmental changes and ensuring global food security through crop yield forecasting. Specifically, multitemporal remote sensing imagery provides relevant information about the dynamics of a scene, which has proven to lead to better land cover classification results. Nevertheless, few studies have benefited from high spatial and temporal resolution data due to the difficulty of accessing reliable, fine-grained and high-quality annotated samples to support their hypotheses. Therefore, we introduce a temporal patch-based dataset of Canadian croplands, enriched with labels retrieved from the Canadian Annual Crop Inventory. The dataset contains 78,536 manually verified high-resolution (10 m/pixel, 640 x 640 m) geo-referenced images from 10 crop classes collected over four crop production years (2017-2020) and five months (June-October). Each instance contains 12 spectral bands, an RGB image, and additional vegetation index bands. Individually, each category contains at least 4,800 images. Moreover, as a benchmark, we provide models and source code that allow a user to predict the crop class using a single image (ResNet, DenseNet, EfficientNet) or a sequence of images (LRCN, 3D-CNN) from the same location. In perspective, we expect this evolving dataset to propel the creation of robust agro-environmental models that can accelerate the comprehension of complex agricultural regions by providing accurate and continuous monitoring of land cover.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
2022
Lebatteux, Dylan; Soudeyns, Hugo; Boucoiran, Isabelle; Gantt, Soren; Diallo, Abdoulaye Banire
KANALYZER: a method to identify variations of discriminative k-mers in genomic sequences Proceedings article
In: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 757–762, IEEE, Las Vegas, NV, USA, 2022, ISBN: 9781665468190.
@inproceedings{lebatteux_kanalyzer_2022,
title = {KANALYZER: a method to identify variations of discriminative k-mers in genomic sequences},
author = {Dylan Lebatteux and Hugo Soudeyns and Isabelle Boucoiran and Soren Gantt and Abdoulaye Banire Diallo},
url = {https://ieeexplore.ieee.org/document/9995370/},
doi = {10.1109/BIBM55620.2022.9995370},
isbn = {9781665468190},
year = {2022},
date = {2022-12-01},
urldate = {2024-04-30},
booktitle = {2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)},
pages = {757\textendash762},
publisher = {IEEE},
address = {Las Vegas, NV, USA},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Shahzad, Kashif; Kausar, Ayesha; Manzoor, Saima; Rakha, Sobia A.; Uzair, Ambreen; Sajid, Muhammad; Arif, Afsheen; Khan, Abdul Faheem; Diallo, Abdoulaye; Ahmad, Ishaq
Views on Radiation Shielding Efficiency of Polymeric Composites/Nanocomposites and Multi-Layered Materials: Current State and Advancements Journal article
In: Radiation, vol. 3, no. 1, pp. 1–20, 2022, ISSN: 2673-592X.
@article{shahzad_views_2022,
title = {Views on Radiation Shielding Efficiency of Polymeric Composites/Nanocomposites and Multi-Layered Materials: Current State and Advancements},
author = {Kashif Shahzad and Ayesha Kausar and Saima Manzoor and Sobia A. Rakha and Ambreen Uzair and Muhammad Sajid and Afsheen Arif and Abdul Faheem Khan and Abdoulaye Diallo and Ishaq Ahmad},
url = {https://www.mdpi.com/2673-592X/3/1/1},
doi = {10.3390/radiation3010001},
issn = {2673-592X},
year = {2022},
date = {2022-12-01},
urldate = {2024-04-30},
journal = {Radiation},
volume = {3},
number = {1},
pages = {1\textendash20},
abstract = {This article highlights advancements in polymeric composite/nanocomposites processes and applications for improved radiation shielding and high-rate attenuation for the spacecraft. Energetic particles, mostly electrons and protons, can annihilate or cause space craft hardware failures. The standard practice in space electronics is the utilization of aluminum as radiation safeguard and structural enclosure. In space, the materials must be lightweight and capable of withstanding extreme temperature/mechanical loads under harsh environments, so the research has focused on advanced multi-functional materials. In this regard, low-Z materials have been found effective in shielding particle radiation, but their structural properties were not sufficient for the desired space applications. As a solution, polymeric composites or nanocomposites have been produced having enhanced material properties and enough radiation shielding (gamma, cosmic, X-rays, protons, neutrons, etc.) properties along with reduced weight. Advantageously, the polymeric composites or nanocomposites can be layered to form multi-layered shields. Hence, polymer composites/nanocomposites offer promising alternatives to developing materials for efficiently attenuating photon or particle radiation. The latest technology developments for micro/nano reinforced polymer composites/nanocomposites have also been surveyed here for the radiation shielding of space crafts and aerospace structures. Moreover, the motive behind this state-of-the-art overview is to put forward recommendations for high performance design/applications of reinforced nanocomposites towards future radiation shielding technology in the spacecraft.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Remita, Amine M.; Diallo, Abdoulaye Baniré
EvoVGM: a deep variational generative model for evolutionary parameter estimation Proceedings article
In: Proceedings of the 13th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics, pp. 1–10, Association for Computing Machinery, New York, NY, USA, 2022, ISBN: 9781450393867.
@inproceedings{remita_evovgm_2022,
title = {EvoVGM: a deep variational generative model for evolutionary parameter estimation},
author = {Amine M. Remita and Abdoulaye Banir\'{e} Diallo},
url = {https://doi.org/10.1145/3535508.3545563},
doi = {10.1145/3535508.3545563},
isbn = {9781450393867},
year = {2022},
date = {2022-08-01},
urldate = {2024-04-30},
booktitle = {Proceedings of the 13th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics},
pages = {1\textendash10},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
series = {BCB '22},
abstract = {Most evolutionary-oriented deep generative models do not explicitly consider the underlying evolutionary dynamics of biological sequences as it is performed within the Bayesian phylogenetic inference framework. In this study, we propose a method for a deep variational Bayesian generative model (EvoVGM) that jointly approximates the true posterior of local evolutionary parameters and generates sequence alignments. Moreover, it is instantiated and tuned for continuous-time Markov chain substitution models such as JC69, K80 and GTR. We train the model via a low-variance stochastic estimator and a gradient ascent algorithm. Here, we analyze the consistency and effectiveness of EvoVGM on synthetic sequence alignments simulated with several evolutionary scenarios and different sizes. Finally, we highlight the robustness of a fine-tuned EvoVGM model using a sequence alignment of gene S of coronaviruses.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Almeida, Hayda; Tsang, Adrian; Diallo, Abdoulaye Baniré
Improving candidate Biosynthetic Gene Clusters in fungi through reinforcement learning Journal article
In: Bioinformatics, vol. 38, no. 16, pp. 3984–3991, 2022, ISSN: 1367-4803, 1367-4811.
@article{almeida_improving_2022,
title = {Improving candidate Biosynthetic Gene Clusters in fungi through reinforcement learning},
author = {Hayda Almeida and Adrian Tsang and Abdoulaye Banir\'{e} Diallo},
editor = {Zhiyong Lu},
url = {https://academic.oup.com/bioinformatics/article/38/16/3984/6619162},
doi = {10.1093/bioinformatics/btac420},
issn = {1367-4803, 1367-4811},
year = {2022},
date = {2022-08-01},
urldate = {2024-04-30},
journal = {Bioinformatics},
volume = {38},
number = {16},
pages = {3984\textendash3991},
abstract = {
Motivation
Precise identification of Biosynthetic Gene Clusters (BGCs) is a challenging task. Performance of BGC discovery tools is limited by their capacity to accurately predict components belonging to candidate BGCs, often overestimating cluster boundaries. To support optimizing the composition and boundaries of candidate BGCs, we propose a reinforcement learning approach relying on protein domains and functional annotations from expert curated BGCs.
Results
The proposed reinforcement learning method aims to improve candidate BGCs obtained with state-of-the-art tools. It was evaluated on candidate BGCs obtained for two fungal genomes, Aspergillus niger and Aspergillus nidulans. The results highlight an improvement in gene precision of more than 15% for TOUCAN, fungiSMASH and DeepBGC, and in cluster precision of more than 25% for fungiSMASH and DeepBGC, allowing these tools to obtain almost perfect precision in cluster prediction. This can pave the way for optimizing current prediction of candidate BGCs in fungi, while minimizing the curation effort required by domain experts.
Availability and implementation
https://github.com/bioinfoUQAM/RL-bgc-components.
Supplementary information
Supplementary data are available at Bioinformatics online.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Yauy, Kevin; Lecoquierre, François; Baert-Desurmont, Stéphanie; Trost, Detlef; Boughalem, Aicha; Luscan, Armelle; Costa, Jean-Marc; Geromel, Vanna; Raymond, Laure; Richard, Pascale; Coutant, Sophie; Broutin, Mélanie; Lanos, Raphael; Fort, Quentin; Cackowski, Stenzel; Testard, Quentin; Diallo, Abdoulaye; Soirat, Nicolas; Holder, Jean-Marc; Duforet-Frebourg, Nicolas; Bouge, Anne-Laure; Beaumeunier, Sacha; Bertrand, Denis; Audoux, Jerome; Genevieve, David; Mesnard, Laurent; Nicolas, Gael; Thevenon, Julien; Philippe, Nicolas
Genome Alert!: A standardized procedure for genomic variant reinterpretation and automated gene–phenotype reassessment in clinical routine Journal article
In: Genetics in Medicine, vol. 24, no. 6, pp. 1316–1327, 2022, ISSN: 10983600.
@article{yauy_genome_2022,
title = {Genome Alert!: A standardized procedure for genomic variant reinterpretation and automated gene\textendashphenotype reassessment in clinical routine},
author = {Kevin Yauy and Fran\c{c}ois Lecoquierre and St\'{e}phanie Baert-Desurmont and Detlef Trost and Aicha Boughalem and Armelle Luscan and Jean-Marc Costa and Vanna Geromel and Laure Raymond and Pascale Richard and Sophie Coutant and M\'{e}lanie Broutin and Raphael Lanos and Quentin Fort and Stenzel Cackowski and Quentin Testard and Abdoulaye Diallo and Nicolas Soirat and Jean-Marc Holder and Nicolas Duforet-Frebourg and Anne-Laure Bouge and Sacha Beaumeunier and Denis Bertrand and Jerome Audoux and David Genevieve and Laurent Mesnard and Gael Nicolas and Julien Thevenon and Nicolas Philippe},
url = {https://linkinghub.elsevier.com/retrieve/pii/S1098360022006542},
doi = {10.1016/j.gim.2022.02.008},
issn = {10983600},
year = {2022},
date = {2022-06-01},
urldate = {2024-04-30},
journal = {Genetics in Medicine},
volume = {24},
number = {6},
pages = {1316\textendash1327},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Jenna, Sarah; Lemaçon, Audrey; Dehghani, Mehrnoush; Renaut, Sébastien; Diallo, Abdoulaye Baniré
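Inference of sample-specific genetic interactions to increase accuracy of indication prioritization in oncology clinical trials and facilitate exploration of combined therapy opportunities. Journal article
In: Journal of Clinical Oncology, vol. 40, no. 16_suppl, pp. 3123–3123, 2022, ISSN: 0732-183X, 1527-7755.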
@article{jenna_inference_2022,
title = {Inference of sample-specific genetic interactions to increase accuracy of indication prioritization in oncology clinical trials and facilitate exploration of combined therapy opportunities.},
author = {Sarah Jenna and Audrey Lema\c{c}on and Mehrnoush Dehghani and S\'{e}bastien Renaut and Abdoulaye Banir\'{e} Diallo},
url = {https://ascopubs.org/doi/10.1200/JCO.2022.40.16_suppl.3123},
doi = {10.1200/JCO.2022.40.16_suppl.3123},
issn = {0732-183X, 1527-7755},
year = {2022},
date = {2022-06-01},
urldate = {2024-04-30},
journal = {Journal of Clinical Oncology},
volume = {40},
number = {16_suppl},
pages = {3123\textendash3123},
abstract = {Background: Precision oncology is growing rapidly in parallel with advances in high throughput sequencing. Development of new anti-cancer therapies is, however, still associated with low efficacy issues, leading to phase II and III clinical trial failures. Improved methodologies are required to identify clinical and molecular patient profiles associated with good drug response to inform decisions on indication prioritization. Methods: We used a sample-specific Genetic Interaction Graph Inference (ssGI$^2$) algorithm, integrating bulk tumor transcriptomic data as well as data collected from 120 public databases and scientific literature in oncology, to infer genetic interactions (GI). More than 10,000 genes from 17,000 samples, covering 195 oncology ICD10 codes, were used to infer GIs for each individual sample. GIs involving a given drug target are selected from a compendium of 17,000 networks of 2M GIs each, and ranked based on their prevalence in the patient cohort and data-support. The mean Z-scored expression of genes from the top ranked GIs was subsequently used to predict drug response for each patient and to calculate the response rate for each indication. Detailed information on each drug target's genetic interactors was used to characterize the drug's mechanisms of action and explore opportunities for combined therapies. We investigated our method's ability to predict good responders using four FDA approved immune and targeted therapies (pembrolizumab, nivolumab, ipilimumab and sorafenib) across seven clinical studies. Importantly, this methodology is suitable for drugs with no clinical studies available. Results: Our results show that the prediction of good responders can be achieved with Precision-Recall AUC on average 13% higher than predictions based on drug target expression level solely, in five out of seven studies. Also, for each drug target, between 30 and 140 genetic interactors with good performance (Precision=0.92; Recall=0.61) were identified, suggesting potential synergistic effects of drugs, some of which have already been confirmed by clinical studies on combined therapies. Conclusions: Our ssGI$^2$-derived signatures are powerful predictors of good response to a drug even without available clinical data. Applying this methodology at a pre-clinical stage will significantly de-risk clinical trials, particularly for novel therapies, and could also support investigation of new combined therapies.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Jenna, Sarah; Boucher, Benjamin; Dudragne, Liebaut; Diallo, Abdoulaye Baniré
Abstract 6375: Augmented intelligence to define drug targets associated with triple negative breast cancer (TNBC) metabolic reprogramming Journal article
In: Cancer Research, vol. 82, no. 12_Supplement, pp. 6375–6375, 2022, ISSN: 1538-7445.
@article{jenna_abstract_2022,
title = {Abstract 6375: Augmented intelligence to define drug targets associated with triple negative breast cancer (TNBC) metabolic reprogramming},
author = {Sarah Jenna and Benjamin Boucher and Liebaut Dudragne and Abdoulaye Banir\'{e} Diallo},
url = {https://aacrjournals.org/cancerres/article/82/12_Supplement/6375/699488/Abstract-6375-Augmented-intelligence-to-define},
doi = {10.1158/1538-7445.AM2022-6375},
issn = {1538-7445},
year = {2022},
date = {2022-06-01},
urldate = {2024-04-30},
journal = {Cancer Research},
volume = {82},
number = {12_Supplement},
pages = {6375\textendash6375},
abstract = {
Introduction: Basal-like breast cancer (BLBC; TNBC) cells use aerobic glycolysis at a higher rate than Luminal A (LumA) cells. Metabolic reprogramming using aerobic glycolysis (Warburg effect), is correlated with increased aggressiveness of cancer cells and poor outcome in patients. Therefore, genes involved in this pathway are promising targets for developing cancer therapeutics.
Method: MIMs has developed a unique platform of augmented intelligence that combines bioinformatics, systems biology and artificial intelligence. It integrates multi-layered omics data with a knowledge-base that aggregates structured data from more than 140 databases as well as unstructured data from the scientific literature. Using this approach, a genetic interaction graph (GI-Graph) is inferred per patient, capturing functional relationships between genes in the specific context of the tumor. The GI-Graphs are subsequently used to train supervised machine learning algorithms for predicting gene functionality and potential as a target. Preliminary analysis was performed on transcriptomic data from 321 LumA and 162 TNBC samples from the TCGA Data Portal, and a predictive model was developed using two GI-Graphs from the two subgroups. Briefly, subgraphs containing genes with functional interactions with the known genes in OXPHOS and glycolysis pathways were used to extract gene attributes and to build an algorithm that predicts involvement of a gene in the Warburg effect. The performance of the model was assessed using a testing gene set extracted from the literature; it showed a true positive rate of 18% and a false positive rate of 0.36%, and outperformed classical bioinformatics tools by a factor of five.
Results: This model predicted 108 genes as the top 1% genes being involved in the metabolic reprogramming of TNBC. Additional information from MIMs’ platform, including differential gene expression between LumA and TNBC, gene pleiotropy and essentiality and the topological metrics, enabled the life scientists to further refine the gene lists, based on the expected characteristics of a good target in oncology. Following this process, 30 genes were selected as potential targets controlling the metabolic reprogramming of TNBC, among which 4 genes were already evaluated in TNBC clinical trials.
Conclusion: Our preliminary data strongly suggest that the predictive model based on the GI-Graphs has the potential to identify promising therapeutic targets for TNBC.
Citation Format: Sarah Jenna, Benjamin Boucher, Liebaut Dudragne, Abdoulaye Banir\'{e} Diallo. Augmented intelligence to define drug targets associated with triple negative breast cancer (TNBC) metabolic reprogramming [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2022; 2022 Apr 8-13. Philadelphia (PA): AACR; Cancer Res 2022;82(12_Suppl):Abstract nr 6375.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Naghashi, Vahid; Diallo, Abdoulaye Banire
A Model for the Prediction of Lifetime Profit Estimate of Dairy Cattle (Student Abstract) Journal article
In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, no. 11, pp. 13021–13022, 2022, ISSN: 2374-3468.
@article{naghashi_model_2022,
title = {A Model for the Prediction of Lifetime Profit Estimate of Dairy Cattle (Student Abstract)},
author = {Vahid Naghashi and Abdoulaye Banire Diallo},
url = {https://ojs.aaai.org/index.php/AAAI/article/view/21647},
doi = {10.1609/aaai.v36i11.21647},
issn = {2374-3468},
year = {2022},
date = {2022-06-01},
urldate = {2024-04-30},
journal = {Proceedings of the AAAI Conference on Artificial Intelligence},
volume = {36},
number = {11},
pages = {13021\textendash13022},
abstract = {In livestock management, the decision of animal replacement requires an estimation of the lifetime profit of the animal based on multiple factors and operational conditions. In dairy farms, this can be associated with the profit corresponding to milk production, health condition and herd management costs, which in turn may be a function of other factors including genetics and weather conditions. Estimating the profit of a cow can be expressed as a spatio-temporal problem where knowing the first batch of production (early-profit) makes it possible to predict the future batches of production (late-profit).
This problem can be addressed by either univariate or multivariate time series forecasting. Several approaches have been designed for time series forecasting, including auto-regressive approaches, recurrent neural networks such as the Long Short-Term Memory (LSTM) method, and very deep stacks of fully-connected layers. In this paper, we propose an LSTM-based approach coupled with attention and linear layers to better capture the dairy features. We compare the model with three other architectures, NBEATs, ARIMA and MUMU-RNN, using dairy production data of 292181 dairy cows. The results highlight the performance of the proposed model relative to the compared architectures. They also show that a univariate NBEATs could perform better than the multivariate approaches it is compared to. We also highlight that such an architecture could predict late-profit with an error of less than \$3 per month, opening the way to better resource management in the dairy industry.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Martin, Tomas; Fuentes, Victor; Valtchev, Petko; Diallo, Abdoulaye Baniré; Lacroix, René
Generalized graph pattern discovery in linked data with data properties and a domain ontology Conference paper
In: Proceedings of the 37th ACM/SIGAPP Symposium on Applied Computing, pp. 1890–1899, ACM, Virtual Event, 2022, ISBN: 9781450387132.
@inproceedings{martin_generalized_2022,
title = {Generalized graph pattern discovery in linked data with data properties and a domain ontology},
author = {Tomas Martin and Victor Fuentes and Petko Valtchev and Abdoulaye Banir\'{e} Diallo and Ren\'{e} Lacroix},
url = {https://dl.acm.org/doi/10.1145/3477314.3507301},
doi = {10.1145/3477314.3507301},
isbn = {9781450387132},
year = {2022},
date = {2022-04-01},
urldate = {2024-04-30},
booktitle = {Proceedings of the 37th ACM/SIGAPP Symposium on Applied Computing},
pages = {1890\textendash1899},
publisher = {ACM},
address = {Virtual Event},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
ICAR Annual Conference
Sharing solutions on digital transformation, animal welfare and environmental sustainability to support global food security proceedings of the 45th ICAR Annual Conference held in Montreal, CA, 30 May – 3 June 2022 Book
ICAR, Utrecht, 2022, ISBN: 9789295014220, (OCLC: 1356312312).
@book{icar_annual_conference_sharing_2022,
title = {Sharing solutions on digital transformation, animal welfare and environmental sustainability to support global food security proceedings of the 45th ICAR Annual Conference held in Montreal, CA, 30 May - 3 June 2022},
author = {{ICAR Annual Conference}},
editor = {A. M. Christensen},
isbn = {9789295014220},
year = {2022},
date = {2022-01-01},
number = {26},
publisher = {ICAR},
address = {Utrecht},
series = {ICAR Technical Series},
note = {OCLC: 1356312312},
keywords = {},
pubstate = {published},
tppubtype = {book}
}
Nguyen, Dung; Boc, Alix; Diallo, Abdoulaye Banire; Makarenkov, Vladimir
Etude de classification des bacteriophages Journal article
In: 2022.
@article{nguyen_etude_2022,
title = {Etude de classification des bacteriophages},
author = {Dung Nguyen and Alix Boc and Abdoulaye Banire Diallo and Vladimir Makarenkov},
url = {https://arxiv.org/abs/2201.00126},
doi = {10.48550/ARXIV.2201.00126},
year = {2022},
date = {2022-01-01},
urldate = {2024-04-30},
abstract = {Phages are one of the most present groups of organisms in the biosphere. Their identification continues and their taxonomies are divergent. However, due to their evolution mode and the complexity of their species ecosystem, their classification is not complete. Here, we present a new approach to the phages classification that combines the methods of horizontal gene transfer detection and ancestral sequence reconstruction.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
2021
Lebatteux, Dylan; Diallo, Abdoulaye Banire
Combining a genetic algorithm and ensemble method to improve the classification of viruses Conference paper
In: 2021 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 688–693, IEEE, Houston, TX, USA, 2021, ISBN: 9781665401265.
@inproceedings{lebatteux_combining_2021,
title = {Combining a genetic algorithm and ensemble method to improve the classification of viruses},
author = {Dylan Lebatteux and Abdoulaye Banire Diallo},
url = {https://ieeexplore.ieee.org/document/9669670/},
doi = {10.1109/BIBM52615.2021.9669670},
isbn = {9781665401265},
year = {2021},
date = {2021-12-01},
urldate = {2024-04-30},
booktitle = {2021 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)},
pages = {688\textendash693},
publisher = {IEEE},
address = {Houston, TX, USA},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Adelani, David Ifeoluwa; Abbott, Jade; Neubig, Graham; D’souza, Daniel; Kreutzer, Julia; Lignos, Constantine; Palen-Michel, Chester; Buzaaba, Happy; Rijhwani, Shruti; Ruder, Sebastian; Mayhew, Stephen; Azime, Israel Abebe; Muhammad, Shamsuddeen; Emezue, Chris Chinenye; Nakatumba-Nabende, Joyce; Ogayo, Perez; Aremu, Anuoluwapo; Gitau, Catherine; Mbaye, Derguene; Alabi, Jesujoba; Yimam, Seid Muhie; Gwadabe, Tajuddeen; Ezeani, Ignatius; Niyongabo, Rubungo Andre; Mukiibi, Jonathan; Otiende, Verrah; Orife, Iroro; David, Davis; Ngom, Samba; Adewumi, Tosin; Rayson, Paul; Adeyemi, Mofetoluwa; Muriuki, Gerald; Anebi, Emmanuel; Chukwuneke, Chiamaka; Odu, Nkiruka; Wairagala, Eric Peter; Oyerinde, Samuel; Siro, Clemencia; Bateesa, Tobius Saul; Oloyede, Temilola; Wambui, Yvonne; Akinode, Victor; Nabagereka, Deborah; Katusiime, Maurice; Awokoya, Ayodele; MBOUP, Mouhamadane; Gebreyohannes, Dibora; Tilaye, Henok; Nwaike, Kelechi; Wolde, Degaga; Faye, Abdoulaye; Sibanda, Blessing; Ahia, Orevaoghene; Dossou, Bonaventure F. P.; Ogueji, Kelechi; DIOP, Thierno Ibrahima; Diallo, Abdoulaye; Akinfaderin, Adewale; Marengereke, Tendai; Osei, Salomey
MasakhaNER: Named Entity Recognition for African Languages Miscellaneous
2021, (arXiv:2103.11811 [cs]).
@misc{adelani_masakhaner_2021,
title = {MasakhaNER: Named Entity Recognition for African Languages},
author = {David Ifeoluwa Adelani and Jade Abbott and Graham Neubig and Daniel D'souza and Julia Kreutzer and Constantine Lignos and Chester Palen-Michel and Happy Buzaaba and Shruti Rijhwani and Sebastian Ruder and Stephen Mayhew and Israel Abebe Azime and Shamsuddeen Muhammad and Chris Chinenye Emezue and Joyce Nakatumba-Nabende and Perez Ogayo and Anuoluwapo Aremu and Catherine Gitau and Derguene Mbaye and Jesujoba Alabi and Seid Muhie Yimam and Tajuddeen Gwadabe and Ignatius Ezeani and Rubungo Andre Niyongabo and Jonathan Mukiibi and Verrah Otiende and Iroro Orife and Davis David and Samba Ngom and Tosin Adewumi and Paul Rayson and Mofetoluwa Adeyemi and Gerald Muriuki and Emmanuel Anebi and Chiamaka Chukwuneke and Nkiruka Odu and Eric Peter Wairagala and Samuel Oyerinde and Clemencia Siro and Tobius Saul Bateesa and Temilola Oloyede and Yvonne Wambui and Victor Akinode and Deborah Nabagereka and Maurice Katusiime and Ayodele Awokoya and Mouhamadane MBOUP and Dibora Gebreyohannes and Henok Tilaye and Kelechi Nwaike and Degaga Wolde and Abdoulaye Faye and Blessing Sibanda and Orevaoghene Ahia and Bonaventure F. P. Dossou and Kelechi Ogueji and Thierno Ibrahima DIOP and Abdoulaye Diallo and Adewale Akinfaderin and Tendai Marengereke and Salomey Osei},
url = {http://arxiv.org/abs/2103.11811},
doi = {10.48550/arXiv.2103.11811},
year = {2021},
date = {2021-07-01},
urldate = {2024-04-30},
publisher = {arXiv},
abstract = {We take a step towards addressing the under-representation of the African continent in NLP research by creating the first large publicly available high-quality dataset for named entity recognition (NER) in ten African languages, bringing together a variety of stakeholders. We detail characteristics of the languages to help researchers understand the challenges that these languages pose for NER. We analyze our datasets and conduct an extensive empirical evaluation of state-of-the-art methods across both supervised and transfer learning settings. We release the data, code, and models in order to inspire future research on African NLP.},
note = {arXiv:2103.11811 [cs]},
keywords = {},
pubstate = {published},
tppubtype = {misc}
}
Diallo, A.; Tandjigora, N.; Ndiaye, S.; Jan, Tariq; Ahmad, I.; Maaza, M.
Green synthesis of single phase hausmannite Mn3O4 nanoparticles via Aspalathus linearis natural extract Journal article
In: SN Applied Sciences, vol. 3, no. 5, p. 562, 2021, ISSN: 2523-3963, 2523-3971.
@article{diallo_green_2021,
title = {Green synthesis of single phase hausmannite Mn3O4 nanoparticles via Aspalathus linearis natural extract},
author = {A. Diallo and N. Tandjigora and S. Ndiaye and Tariq Jan and I. Ahmad and M. Maaza},
url = {https://link.springer.com/10.1007/s42452-021-04550-3},
doi = {10.1007/s42452-021-04550-3},
issn = {2523-3963, 2523-3971},
year = {2021},
date = {2021-05-01},
urldate = {2024-04-30},
journal = {SN Applied Sciences},
volume = {3},
number = {5},
pages = {562},
abstract = {Nowadays, green synthesis of nanoparticles using plant precursors has been extensively studied. However, less attention has been given to Mn$_3$O$_4$. This contribution validates the synthesis of single-phase Hausmannite Mn$_3$O$_4$ nanoparticles by a green approach without using any standard acid/base compounds, surfactants, and organic/inorganic dissolving agents. The chemical chelation of the Mn precursor was performed via bioactive compounds of the extract of Aspalathus linearis, an African indigenous plant. Annealing at 400 °C for $\sim$1 h was required to crystallize the small amorphous nanoparticles with an initial bimodal size distribution peaking at $\left\langle \phi_1 \right\rangle \sim 4.21$ nm and $\left\langle \phi_2 \right\rangle \sim 8.51$ nm, respectively. Such annealing led to an increase in the diameter of the nanoparticles from 17 to 28 nm. The morphological, structural, vibrational, surface, and photoluminescence properties of the single-phase Hausmannite nanoparticles were comprehensively investigated by High Resolution Transmission Electron Microscopy (HRTEM), Energy Dispersive X-ray Spectroscopy (EDS), X-ray Diffraction (XRD), Raman spectroscopy and X-ray Photoelectron Spectroscopy (XPS), as well as room temperature photoluminescence. Structural and morphological investigations revealed the formation of quasi-spherical nanoparticles having a single phase Hausmannite Mn$_3$O$_4$ crystal structure. XPS results also validated the XRD results about the formation of Hausmannite Mn$_3$O$_4$ nanoparticles. Raman investigations allowed the Mn$_3$O$_4$ nature of the nanoparticles to be clearly distinguished from the potential $\gamma$-Mn$_2$O$_3$ phase, as both phases belong to the same space group and both assume tetragonally-distorted cubic lattices of nearly similar dimensions. The optical studies of the single phase Hausmannite crystalline nanoparticles exhibited a broad photoluminescence in the spectral range of 300\textendash700 nm, which is ideal for emission devices.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Karoui, Yasmine; Jacques, Amanda A. Boatswain; Diallo, Abdoulaye Baniré; Shepley, Elise; Vasseur, Elsa
A Deep Learning Framework for Improving Lameness Identification in Dairy Cattle Journal article
In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 35, no. 18, pp. 15811–15812, 2021, ISSN: 2374-3468, 2159-5399.
@article{karoui_deep_2021,
title = {A Deep Learning Framework for Improving Lameness Identification in Dairy Cattle},
author = {Yasmine Karoui and Amanda A. Boatswain Jacques and Abdoulaye Banir\'{e} Diallo and Elise Shepley and Elsa Vasseur},
url = {https://ojs.aaai.org/index.php/AAAI/article/view/17902},
doi = {10.1609/aaai.v35i18.17902},
issn = {2374-3468, 2159-5399},
year = {2021},
date = {2021-05-01},
urldate = {2024-04-30},
journal = {Proceedings of the AAAI Conference on Artificial Intelligence},
volume = {35},
number = {18},
pages = {15811\textendash15812},
abstract = {Lameness, characterized by an anomalous gait in cows due to a dysfunction in their locomotive system, is a serious welfare issue for cows and farmers. Prompt lameness detection methods can prevent the development of acute lameness in cattle. In this study, we propose a deep learning framework to help identify lameness based on motion curves of different leg joints on the cow. The framework combines data augmentation and a convolutional neural network using a LeNet architecture. Performance assessed using cross validation showed promising prediction accuracies above 99% and 91% for validation and test sets, respectively. This also demonstrates the usefulness of data generation in cases where the data set is originally small in size and difficult to generate.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Amin, Atia B.; Dubois, Georges Octave; Thurel, Sephora; Danyluk, Jean; Boukadoum, Mounir; Diallo, Abdoulaye Banire
Wireless Sensor Network and Irrigation System to Monitor Wheat Growth under Drought Stress Proceedings article
In: 2021 IEEE International Symposium on Circuits and Systems (ISCAS), pp. 1–4, IEEE, Daegu, Korea, 2021, ISBN: 9781728192017.
@inproceedings{amin_wireless_2021,
title = {Wireless Sensor Network and Irrigation System to Monitor Wheat Growth under Drought Stress},
author = {Atia B. Amin and Georges Octave Dubois and Sephora Thurel and Jean Danyluk and Mounir Boukadoum and Abdoulaye Banire Diallo},
url = {https://ieeexplore.ieee.org/document/9401545/},
doi = {10.1109/ISCAS51556.2021.9401545},
isbn = {9781728192017},
year = {2021},
date = {2021-05-01},
urldate = {2024-04-30},
booktitle = {2021 IEEE International Symposium on Circuits and Systems (ISCAS)},
pages = {1\textendash4},
publisher = {IEEE},
address = {Daegu, Korea},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Martin, Tomas; Fuentes, Victor; Valtchev, Petko; Diallo, Abdoulaye Baniré; Lacroix, René; Boukadoum, Mounir; Leduc, Maxime
Graph pattern mining on top of a domain ontology – preliminary results from a dairy production application Journal article
In: Procedia Computer Science, vol. 192, pp. 1227–1236, 2021, ISSN: 18770509.
@article{martin_graph_2021,
title = {Graph pattern mining on top of a domain ontology - preliminary results from a dairy production application},
author = {Tomas Martin and Victor Fuentes and Petko Valtchev and Abdoulaye Banir\'{e} Diallo and Ren\'{e} Lacroix and Mounir Boukadoum and Maxime Leduc},
url = {https://linkinghub.elsevier.com/retrieve/pii/S187705092101615X},
doi = {10.1016/j.procs.2021.08.126},
issn = {18770509},
year = {2021},
date = {2021-01-01},
urldate = {2024-04-30},
journal = {Procedia Computer Science},
volume = {192},
pages = {1227\textendash1236},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Uthamacumaran, Abicumaran; Suarez, Narjara Gonzalez; Diallo, Abdoulaye Baniré; Annabi, Borhane
Computational Methods for Structure-to-Function Analysis of Diet-Derived Catechins-Mediated Targeting of In Vitro Vasculogenic Mimicry Journal article
In: Cancer Informatics, vol. 20, p. 117693512110092, 2021, ISSN: 1176-9351, 1176-9351.
@article{uthamacumaran_computational_2021,
title = {Computational Methods for Structure-to-Function Analysis of Diet-Derived Catechins-Mediated Targeting of In Vitro Vasculogenic Mimicry},
author = {Abicumaran Uthamacumaran and Narjara Gonzalez Suarez and Abdoulaye Banir\'{e} Diallo and Borhane Annabi},
url = {http://journals.sagepub.com/doi/10.1177/11769351211009229},
doi = {10.1177/11769351211009229},
issn = {1176-9351, 1176-9351},
year = {2021},
date = {2021-01-01},
urldate = {2024-04-30},
journal = {Cancer Informatics},
volume = {20},
pages = {117693512110092},
abstract = {Background:
Vasculogenic mimicry (VM) is an adaptive biological phenomenon wherein cancer cells spontaneously self-organize into 3-dimensional (3D) branching network structures. This emergent behavior is considered central in promoting an invasive, metastatic, and therapy resistance molecular signature to cancer cells. The quantitative analysis of such complex phenotypic systems could require the use of computational approaches including machine learning algorithms originating from complexity science.
Procedures:
In vitro 3D VM was performed with SKOV3 and ES2 ovarian cancer cells cultured on Matrigel. Diet-derived catechins disruption of VM was monitored at 24 hours with pictures taken with an inverted microscope. Three computational algorithms for complex feature extraction relevant for 3D VM, including 2D wavelet analysis, fractal dimension, and percolation clustering scores were assessed coupled with machine learning classifiers.
Results:
These algorithms demonstrated the structure-to-function galloyl moiety impact on VM for each of the gallated catechins tested, and were shown to be applicable in quantifying the drug-mediated structural changes in VM processes.
Conclusions:
Our study provides evidence of how appropriate 3D VM compression and feature extractors coupled with classification/regression methods could be efficient to study in vitro drug-induced perturbation of complex processes. Such approaches could be exploited in the development and characterization of drugs targeting VM.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Martin, Tomas; Fuentes, Victor; Valtchev, Petko; Diallo, Abdoulaye Baniré; Lacroix, René; Leduc, Maxime; Boukadoum, Mounir
Towards Mining Generalized Patterns from RDF Data and a Domain Ontology Book section
In: Kamp, Michael; Koprinska, Irena; Bibal, Adrien; Bouadi, Tassadit; Frénay, Benoît; Galárraga, Luis; Oramas, José; Adilova, Linara; Krishnamurthy, Yamuna; Kang, Bo; Largeron, Christine; Lijffijt, Jefrey; Viard, Tiphaine; Welke, Pascal; Ruocco, Massimiliano; Aune, Erlend; Gallicchio, Claudio; Schiele, Gregor; Pernkopf, Franz; Blott, Michaela; Fröning, Holger; Schindler, Günther; Guidotti, Riccardo; Monreale, Anna; Rinzivillo, Salvatore; Biecek, Przemyslaw; Ntoutsi, Eirini; Pechenizkiy, Mykola; Rosenhahn, Bodo; Buckley, Christopher; Cialfi, Daniela; Lanillos, Pablo; Ramstead, Maxwell; Verbelen, Tim; Ferreira, Pedro M.; Andresini, Giuseppina; Malerba, Donato; Medeiros, Ibéria; Fournier-Viger, Philippe; Nawaz, M. Saqib; Ventura, Sebastian; Sun, Meng; Zhou, Min; Bitetta, Valerio; Bordino, Ilaria; Ferretti, Andrea; Gullo, Francesco; Ponti, Giovanni; Severini, Lorenzo; Ribeiro, Rita; Gama, João; Gavaldà, Ricard; Cooper, Lee; Ghazaleh, Naghmeh; Richiardi, Jonas; Roqueiro, Damian; Miranda, Diego Saldana; Sechidis, Konstantinos; Graça, Guilherme (Eds.): Machine Learning and Principles and Practice of Knowledge Discovery in Databases, vol. 1524, pp. 268–278, Springer International Publishing, Cham, 2021, ISBN: 9783030937355 9783030937362.
@incollection{kamp_towards_2021,
title = {Towards Mining Generalized Patterns from RDF Data and a Domain Ontology},
author = {Tomas Martin and Victor Fuentes and Petko Valtchev and Abdoulaye Banir\'{e} Diallo and Ren\'{e} Lacroix and Maxime Leduc and Mounir Boukadoum},
editor = {Michael Kamp and Irena Koprinska and Adrien Bibal and Tassadit Bouadi and Beno\^{i}t Fr\'{e}nay and Luis Gal\'{a}rraga and Jos\'{e} Oramas and Linara Adilova and Yamuna Krishnamurthy and Bo Kang and Christine Largeron and Jefrey Lijffijt and Tiphaine Viard and Pascal Welke and Massimiliano Ruocco and Erlend Aune and Claudio Gallicchio and Gregor Schiele and Franz Pernkopf and Michaela Blott and Holger Fr\"{o}ning and G\"{u}nther Schindler and Riccardo Guidotti and Anna Monreale and Salvatore Rinzivillo and Przemyslaw Biecek and Eirini Ntoutsi and Mykola Pechenizkiy and Bodo Rosenhahn and Christopher Buckley and Daniela Cialfi and Pablo Lanillos and Maxwell Ramstead and Tim Verbelen and Pedro M. Ferreira and Giuseppina Andresini and Donato Malerba and Ib\'{e}ria Medeiros and Philippe Fournier-Viger and M. Saqib Nawaz and Sebastian Ventura and Meng Sun and Min Zhou and Valerio Bitetta and Ilaria Bordino and Andrea Ferretti and Francesco Gullo and Giovanni Ponti and Lorenzo Severini and Rita Ribeiro and Jo\~{a}o Gama and Ricard Gavald\`{a} and Lee Cooper and Naghmeh Ghazaleh and Jonas Richiardi and Damian Roqueiro and Diego Saldana Miranda and Konstantinos Sechidis and Guilherme Gra\c{c}a},
url = {https://link.springer.com/10.1007/978-3-030-93736-2_21},
doi = {10.1007/978-3-030-93736-2_21},
isbn = {9783030937355 9783030937362},
year = {2021},
date = {2021-01-01},
urldate = {2024-04-30},
booktitle = {Machine Learning and Principles and Practice of Knowledge Discovery in Databases},
volume = {1524},
pages = {268\textendash278},
publisher = {Springer International Publishing},
address = {Cham},
keywords = {},
pubstate = {published},
tppubtype = {incollection}
}
2020
Almeida, Hayda; Palys, Sylvester; Tsang, Adrian; Diallo, Abdoulaye Baniré
TOUCAN: a framework for fungal biosynthetic gene cluster discovery Journal article
In: NAR Genomics and Bioinformatics, vol. 2, no. 4, p. lqaa098, 2020, ISSN: 2631-9268.
@article{almeida_toucan_2020,
title = {TOUCAN: a framework for fungal biosynthetic gene cluster discovery},
author = {Hayda Almeida and Sylvester Palys and Adrian Tsang and Abdoulaye Banir\'{e} Diallo},
url = {https://academic.oup.com/nargab/article/doi/10.1093/nargab/lqaa098/6007553},
doi = {10.1093/nargab/lqaa098},
issn = {2631-9268},
year = {2020},
date = {2020-11-01},
urldate = {2024-04-30},
journal = {NAR Genomics and Bioinformatics},
volume = {2},
number = {4},
pages = {lqaa098},
abstract = {Fungal secondary metabolites (SMs) are an important source of numerous bioactive compounds largely applied in the pharmaceutical industry, as in the production of antibiotics and anticancer medications. The discovery of novel fungal SMs can potentially benefit human health. Identifying biosynthetic gene clusters (BGCs) involved in the biosynthesis of SMs can be a costly and complex task, especially due to the genomic diversity of fungal BGCs. Previous studies on fungal BGC discovery present limited scope and can restrict the discovery of new BGCs. In this work, we introduce TOUCAN, a supervised learning framework for fungal BGC discovery. Unlike previous methods, TOUCAN is capable of predicting BGCs on amino acid sequences, facilitating its use on newly sequenced and not yet curated data. It relies on three main pillars: rigorous selection of datasets by BGC experts; combination of functional, evolutionary and compositional features coupled with outperforming classifiers; and robust post-processing methods. TOUCAN best-performing model yields 0.982 F-measure on BGC regions in the Aspergillus niger genome. Overall results show that TOUCAN outperforms previous approaches. TOUCAN focuses on fungal BGCs but can be easily adapted to expand its scope to process other species or include new features.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Frasco, Charlotte; Radmacher, Maxime; Lacroix, René; Cue, Roger; Valtchev, Petko; Robert, Claude; Boukadoum, Mounir; Sirard, Marc-André; Diallo, Abdoulaye
Towards an Effective Decision-making System based on Cow Profitability using Deep Learning Proceedings article
In: Proceedings of the 12th International Conference on Agents and Artificial Intelligence, pp. 949–958, SCITEPRESS – Science and Technology Publications, Valletta, Malta, 2020, ISBN: 9789897583957.
@inproceedings{frasco_towards_2020,
title = {Towards an Effective Decision-making System based on Cow Profitability using Deep Learning},
author = {Charlotte Frasco and Maxime Radmacher and Ren\'{e} Lacroix and Roger Cue and Petko Valtchev and Claude Robert and Mounir Boukadoum and Marc-Andr\'{e} Sirard and Abdoulaye Diallo},
url = {http://www.scitepress.org/DigitalLibrary/Link.aspx?doi=10.5220/0009174809490958},
doi = {10.5220/0009174809490958},
isbn = {9789897583957},
year = {2020},
date = {2020-01-01},
urldate = {2024-04-30},
booktitle = {Proceedings of the 12th International Conference on Agents and Artificial Intelligence},
pages = {949\textendash958},
publisher = {SCITEPRESS - Science and Technology Publications},
address = {Valletta, Malta},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
2019
Remita, Amine M.; Diallo, Abdoulaye Banire
Statistical Linear Models in Virus Genomic Alignment-free Classification: Application to Hepatitis C Viruses Proceedings article
In: 2019 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 474–481, IEEE, San Diego, CA, USA, 2019, ISBN: 9781728118673.
@inproceedings{remita_statistical_2019,
title = {Statistical Linear Models in Virus Genomic Alignment-free Classification: Application to Hepatitis C Viruses},
author = {Amine M. Remita and Abdoulaye Banire Diallo},
url = {https://ieeexplore.ieee.org/document/8983375/},
doi = {10.1109/BIBM47256.2019.8983375},
isbn = {9781728118673},
year = {2019},
date = {2019-11-01},
urldate = {2024-04-30},
booktitle = {2019 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)},
pages = {474\textendash481},
publisher = {IEEE},
address = {San Diego, CA, USA},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Almeida, Hayda; Tsang, Adrian; Diallo, Abdoulaye Banire
Supporting supervised learning in fungal Biosynthetic Gene Cluster discovery: new benchmark datasets Proceedings article
In: 2019 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 1280–1287, IEEE, San Diego, CA, USA, 2019, ISBN: 9781728118673.
@inproceedings{almeida_supporting_2019,
title = {Supporting supervised learning in fungal Biosynthetic Gene Cluster discovery: new benchmark datasets},
author = {Hayda Almeida and Adrian Tsang and Abdoulaye Banire Diallo},
url = {https://ieeexplore.ieee.org/document/8983041/},
doi = {10.1109/BIBM47256.2019.8983041},
isbn = {9781728118673},
year = {2019},
date = {2019-11-01},
urldate = {2024-04-30},
booktitle = {2019 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)},
pages = {1280\textendash1287},
publisher = {IEEE},
address = {San Diego, CA, USA},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Wu, Chao-Jung; Remita, Amine M.; Diallo, Abdoulaye Baniré
MirLibSpark: A Scalable NGS Plant MicroRNA Prediction Pipeline for Multi-Library Functional Annotation Proceedings article
In: Proceedings of the 10th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics, pp. 669–674, ACM, Niagara Falls NY USA, 2019, ISBN: 9781450366663.
@inproceedings{wu_mirlibspark_2019,
title = {MirLibSpark: A Scalable NGS Plant MicroRNA Prediction Pipeline for Multi-Library Functional Annotation},
author = {Chao-Jung Wu and Amine M. Remita and Abdoulaye Banir\'{e} Diallo},
url = {https://dl.acm.org/doi/10.1145/3307339.3343463},
doi = {10.1145/3307339.3343463},
isbn = {9781450366663},
year = {2019},
date = {2019-09-01},
urldate = {2024-04-30},
booktitle = {Proceedings of the 10th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics},
pages = {669\textendash674},
publisher = {ACM},
address = {Niagara Falls NY USA},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Vitae, Golrokh; Remita, Amine M.; Diallo, Abdoulaye Baniré
Revisiting the landscape of evolutionary breakpoints across human genome using multi-way comparison Miscellaneous
2019.
@misc{vitae_revisiting_2019,
title = {Revisiting the landscape of evolutionary breakpoints across human genome using multi-way comparison},
author = {Golrokh Vitae and Amine M. Remita and Abdoulaye Banir\'{e} Diallo},
url = {http://biorxiv.org/lookup/doi/10.1101/696245},
doi = {10.1101/696245},
year = {2019},
date = {2019-07-01},
urldate = {2024-04-30},
abstract = {Genome rearrangement is one of the major forces driving the processes of evolution and disease development. The chromosomal positions affected by these rearrangements are called breakpoints. The breakpoints occurring during the evolution of species are known to be non-randomly distributed. Detecting their landscape and mapping them to genomic features constitute an important task in both comparative and functional genomics. Several studies have attempted to provide such mapping based on pairwise comparison of genes as conservation anchors. With the availability of more accurate multi-way alignments, we design an approach to identify synteny blocks and evolutionary breakpoints based on UCSC 45-way conservation sequence alignments with 12 selected species. The multi-way approach, with mild flexibility in the presence of selected species, helped to better determine human lineage-specific evolutionary breakpoints. We identified 261,391 human lineage-specific evolutionary breakpoints across the genome and 2,564 dense regions enriched with biological processes involved in adaptive traits such as response to DNA damage stimulus, cellular response to stress and metabolic process. Moreover, we found 230 regions refractory to evolutionary breakpoints that carry genes associated with crucial developmental processes such as organ morphogenesis, skeletal system development, chordate embryonic development, nerve development and regulation of biological process. This initial map of the human genome will help to gain better insight into several studies including developmental studies and cancer rearrangement processes.},
keywords = {},
pubstate = {published},
tppubtype = {misc}
}
Lebatteux, Dylan; Remita, Amine M.; Diallo, Abdoulaye Baniré
Toward an Alignment-Free Method for Feature Extraction and Accurate Classification of Viral Sequences Journal article
In: Journal of Computational Biology, vol. 26, no. 6, pp. 519–535, 2019, ISSN: 1557-8666.
@article{lebatteux_toward_2019,
title = {Toward an Alignment-Free Method for Feature Extraction and Accurate Classification of Viral Sequences},
author = {Dylan Lebatteux and Amine M. Remita and Abdoulaye Banir\'{e} Diallo},
url = {https://www.liebertpub.com/doi/10.1089/cmb.2018.0239},
doi = {10.1089/cmb.2018.0239},
issn = {1557-8666},
year = {2019},
date = {2019-06-01},
urldate = {2024-04-30},
journal = {Journal of Computational Biology},
volume = {26},
number = {6},
pages = {519\textendash535},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Diallo, Abdoulaye Baniré; Nguifo, Engelbert Mephu; Dhifli, Wajdi; Azizi, Elham; Prabhakaran, Sandhya; Tansey, Wesley
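Selected Papers from the Workshop on Computational Biology: Joint with the International Joint Conference on Artificial Intelligence and the International Conference on Machine Learning, 2018 Journal article
In: Journal of Computational Biology, vol. 26, no. 6, pp. 507–508, 2019, ISSN: 1557-8666.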
@article{diallo_selected_2019,
title = {Selected Papers from the Workshop on Computational Biology: Joint with the International Joint Conference on Artificial Intelligence and the International Conference on Machine Learning, 2018},
author = {Abdoulaye Banir\'{e} Diallo and Engelbert Mephu Nguifo and Wajdi Dhifli and Elham Azizi and Sandhya Prabhakaran and Wesley Tansey},
url = {https://www.liebertpub.com/doi/10.1089/cmb.2019.29020.abd},
doi = {10.1089/cmb.2019.29020.abd},
issn = {1557-8666},
year = {2019},
date = {2019-06-01},
urldate = {2024-04-30},
journal = {Journal of Computational Biology},
volume = {26},
number = {6},
pages = {507\textendash508},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Lebatteux, Dylan; Remita, Amine M; Diallo, Abdoulaye Baniré
Toward an Alignment-Free Method for Feature Extraction and Accurate Classification of Viral Sequences. Journal article
In: Journal of computational biology : a journal of computational molecular cell biology, vol. 26, no. 6, pp. 519–535, 2019, ISSN: 1557-8666 (Electronic).
@article{CASTOR-KRFE,
title = {Toward an Alignment-Free Method for Feature Extraction and Accurate Classification of Viral Sequences.},
author = {Dylan Lebatteux and Amine M Remita and Abdoulaye Banir\'{e} Diallo},
doi = {10.1089/cmb.2018.0239},
issn = {1557-8666 (Electronic)},
year = {2019},
date = {2019-01-01},
journal = {Journal of computational biology : a journal of computational molecular cell biology},
volume = {26},
issue = {6},
pages = {519-535},
abstract = {The classification of pathogens in emerging and re-emerging viruses represents major interests in taxonomic studies, functional genomics, host-pathogen interplay, prevention, and disease treatments. It consists of assigning a given sequence to its related group of known sequences sharing similar characteristics and traits. The challenges to such classification could be associated with several virus properties including recombination, mutation rate, multiplicity of motifs, and diversity. In domains such as pathogen monitoring and surveillance, it is important to detect and quantify known and novel taxa without exploiting the full and accurate alignments or virus family profiles. In this study, we propose an alignment-free method, CASTOR-KRFE, to detect discriminating subsequences within known pathogen sequences to classify accurately unknown pathogen sequences. This method includes three major steps: (1) vectorization of known viral genomic sequences based on k-mers to constitute the potential features, (2) efficient way of pattern extraction and evaluation maximizing classification performance, and (3) prediction of the minimal set of features fitting a given criterion (threshold of performance metric and maximum number of features). We assessed this method through a jackknife data partitioning on a dozen of various virus data sets, covering the seven major virus groups and including influenza virus, Ebola virus, human immunodeficiency virus 1, hepatitis C virus, hepatitis B virus, and human papillomavirus. CASTOR-KRFE provides a weighted average F-measure >0.96 over a wide range of viruses. Our method also shows better performance on complex virus data sets than multiple subsequences extractor for classification (MISSEL), a subsequence extraction method, and the Discriminative mode of MEME patterns extraction tool.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
2018
Halioui, Ahmed; Valtchev, Petko; Diallo, Abdoulaye Banire
Bioinformatic Workflow Extraction from Scientific Texts based on Word Sense Disambiguation Journal article
In: IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 15, no. 6, pp. 1979–1990, 2018, ISSN: 1545-5963, 1557-9964, 2374-0043.
@article{halioui_bioinformatic_2018,
title = {Bioinformatic Workflow Extraction from Scientific Texts based on Word Sense Disambiguation},
author = {Ahmed Halioui and Petko Valtchev and Abdoulaye Banire Diallo},
url = {https://ieeexplore.ieee.org/document/8385176/},
doi = {10.1109/TCBB.2018.2847336},
issn = {1545-5963, 1557-9964, 2374-0043},
year = {2018},
date = {2018-11-01},
urldate = {2024-04-30},
journal = {IEEE/ACM Transactions on Computational Biology and Bioinformatics},
volume = {15},
number = {6},
pages = {1979\textendash1990},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Gao, Xin; Chen, Jake Y.; Zaki, Mohammed J.
Multiscale and Multimodal Analysis for Computational Biology Journal article
In: IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 15, no. 6, pp. 1951–1952, 2018, ISSN: 1545-5963, 1557-9964, 2374-0043.
@article{gao_multiscale_2018,
title = {Multiscale and Multimodal Analysis for Computational Biology},
author = {Xin Gao and Jake Y. Chen and Mohammed J. Zaki},
url = {https://ieeexplore.ieee.org/document/8573239/},
doi = {10.1109/TCBB.2018.2838658},
issn = {1545-5963, 1557-9964, 2374-0043},
year = {2018},
date = {2018-11-01},
urldate = {2024-04-30},
journal = {IEEE/ACM Transactions on Computational Biology and Bioinformatics},
volume = {15},
number = {6},
pages = {1951\textendash1952},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Ransy, Doris G.; Lord, Etienne; Caty, Martine; Lapointe, Normand; Boucher, Marc; Diallo, Abdoulaye Baniré; Soudeyns, Hugo
Subtle differences in selective pressures applied on the envelope gene of HIV-1 in pregnant versus non-pregnant women Journal article
In: Infection, Genetics and Evolution, vol. 62, pp. 141–150, 2018, ISSN: 15671348.
@article{ransy_subtle_2018,
title = {Subtle differences in selective pressures applied on the envelope gene of HIV-1 in pregnant versus non-pregnant women},
author = {Doris G. Ransy and Etienne Lord and Martine Caty and Normand Lapointe and Marc Boucher and Abdoulaye Banir\'{e} Diallo and Hugo Soudeyns},
url = {https://linkinghub.elsevier.com/retrieve/pii/S1567134818301989},
doi = {10.1016/j.meegid.2018.04.020},
issn = {15671348},
year = {2018},
date = {2018-08-01},
urldate = {2024-04-30},
journal = {Infection, Genetics and Evolution},
volume = {62},
pages = {141\textendash150},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Li, Qiang; Byrns, Brook; Badawi, Mohamed A.; Diallo, Abdoulaye Banire; Danyluk, Jean; Sarhan, Fathey; Laudencia-Chingcuanco, Debbie; Zou, Jitao; Fowler, D. Brian
Transcriptomic Insights into Phenological Development and Cold Tolerance of Wheat Grown in the Field Journal article
In: Plant Physiology, vol. 176, no. 3, pp. 2376–2394, 2018, ISSN: 0032-0889, 1532-2548.
@article{li_transcriptomic_2018,
title = {Transcriptomic Insights into Phenological Development and Cold Tolerance of Wheat Grown in the Field},
author = {Qiang Li and Brook Byrns and Mohamed A. Badawi and Abdoulaye Banir\'{e} Diallo and Jean Danyluk and Fathey Sarhan and Debbie Laudencia-Chingcuanco and Jitao Zou and D. Brian Fowler},
url = {https://academic.oup.com/plphys/article/176/3/2376-2394/6117008},
doi = {10.1104/pp.17.01311},
issn = {0032-0889, 1532-2548},
year = {2018},
date = {2018-03-01},
urldate = {2024-04-30},
journal = {Plant Physiology},
volume = {176},
number = {3},
pages = {2376\textendash2394},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
2017
Remita, Mohamed Amine; Halioui, Ahmed; Diouara, Abou Abdallah Malick; Daigle, Bruno; Kiani, Golrokh; Diallo, Abdoulaye Baniré
A machine learning approach for viral genome classification Journal article
In: BMC Bioinformatics, vol. 18, no. 1, p. 208, 2017, ISSN: 1471-2105.
@article{remita_machine_2017,
title = {A machine learning approach for viral genome classification},
author = {Mohamed Amine Remita and Ahmed Halioui and Abou Abdallah Malick Diouara and Bruno Daigle and Golrokh Kiani and Abdoulaye Banir\'{e} Diallo},
url = {http://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-017-1602-3},
doi = {10.1186/s12859-017-1602-3},
issn = {1471-2105},
year = {2017},
date = {2017-12-01},
urldate = {2024-05-03},
journal = {BMC Bioinformatics},
volume = {18},
number = {1},
pages = {208},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Halioui, Ahmed; Valtchev, Petko; Diallo, Abdoulaye Baniré
T-GOWler: Discovering Generalized Process Models Within Texts Journal article
In: Journal of Computational Biology, vol. 24, no. 8, p. 799–808, 2017, ISSN: 1557-8666.
@article{halioui_t-gowler_2017,
title = {T-GOWler: Discovering Generalized Process Models Within Texts},
author = {Ahmed Halioui and Petko Valtchev and Abdoulaye Banir\'{e} Diallo},
url = {http://www.liebertpub.com/doi/10.1089/cmb.2017.0085},
doi = {10.1089/cmb.2017.0085},
issn = {1557-8666},
year = {2017},
date = {2017-08-01},
urldate = {2024-04-30},
journal = {Journal of Computational Biology},
volume = {24},
number = {8},
pages = {799\textendash808},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Diallo, Abdoulaye Baniré; Nguifo, Engelbert Mephu; Zaki, Mohammed J.
Preface: Selected Papers from the Workshop Bioinformatics and Artificial Intelligence Joined with the International Joint Conference on Artificial Intelligence Journal article
In: Journal of Computational Biology, vol. 24, no. 8, p. 733–733, 2017, ISSN: 1557-8666.
@article{diallo_preface_2017,
title = {Preface: Selected Papers from the Workshop Bioinformatics and Artificial Intelligence Joined with the International Joint Conference on Artificial Intelligence},
author = {Abdoulaye Banir\'{e} Diallo and Engelbert Mephu Nguifo and Mohammed J. Zaki},
url = {http://www.liebertpub.com/doi/10.1089/cmb.2017.29008.abd},
doi = {10.1089/cmb.2017.29008.abd},
issn = {1557-8666},
year = {2017},
date = {2017-08-01},
urldate = {2024-04-30},
journal = {Journal of Computational Biology},
volume = {24},
number = {8},
pages = {733\textendash733},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Halioui, Ahmed; Martin, Tomas; Valtchev, Petko; Diallo, Abdoulaye Baniré
Ontology-based workflow pattern mining: application to bioinformatics expertise acquisition Proceedings article
In: Proceedings of the Symposium on Applied Computing, p. 824–827, ACM, Marrakech, Morocco, 2017, ISBN: 9781450344869.
@inproceedings{halioui_ontology-based_2017,
title = {Ontology-based workflow pattern mining: application to bioinformatics expertise acquisition},
author = {Ahmed Halioui and Tomas Martin and Petko Valtchev and Abdoulaye Banir\'{e} Diallo},
url = {https://dl.acm.org/doi/10.1145/3019612.3019866},
doi = {10.1145/3019612.3019866},
isbn = {9781450344869},
year = {2017},
date = {2017-04-01},
urldate = {2024-04-30},
booktitle = {Proceedings of the Symposium on Applied Computing},
pages = {824\textendash827},
publisher = {ACM},
address = {Marrakech, Morocco},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Remita, Mohamed Amine; Halioui, Ahmed; Diouara, Abou Abdallah Malick; Daigle, Bruno; Kiani, Golrokh; Diallo, Abdoulaye Baniré
A machine learning approach for viral genome classification Journal article
In: BMC Bioinformatics, vol. 18, no. 1, p. 208, 2017, ISSN: 1471-2105.
@article{CASTOR,
title = {A machine learning approach for viral genome classification},
author = {Mohamed Amine Remita and Ahmed Halioui and Abou Abdallah Malick Diouara and Bruno Daigle and Golrokh Kiani and Abdoulaye Banir\'{e} Diallo},
doi = {10.1186/s12859-017-1602-3},
issn = {1471-2105},
year = {2017},
date = {2017-01-01},
journal = {BMC Bioinformatics},
volume = {18},
number = {1},
pages = {208},
abstract = {BACKGROUND: Advances in cloning and sequencing technology are yielding a massive number of viral genomes. The classification and annotation of these genomes constitute important assets in the discovery of genomic variability, taxonomic characteristics and disease mechanisms. Existing classification methods are often designed for specific well-studied family of viruses. Thus, the viral comparative genomic studies could benefit from more generic, fast and accurate tools for classifying and typing newly sequenced strains of diverse virus families. RESULTS: Here, we introduce a virus classification platform, CASTOR, based on machine learning methods. CASTOR is inspired by a well-known technique in molecular biology: restriction fragment length polymorphism (RFLP). It simulates, in silico, the restriction digestion of genomic material by different enzymes into fragments. It uses two metrics to construct feature vectors for machine learning algorithms in the classification step. We benchmark CASTOR for the classification of distinct datasets of human papillomaviruses (HPV), hepatitis B viruses (HBV) and human immunodeficiency viruses type 1 (HIV-1). Results reveal true positive rates of 99%, 99% and 98% for HPV Alpha species, HBV genotyping and HIV-1 M subtyping, respectively. Furthermore, CASTOR shows a competitive performance compared to well-known HIV-1 specific classifiers (REGA and COMET) on whole genomes and pol fragments. CONCLUSION: The performance of CASTOR, its genericity and robustness could permit to perform novel and accurate large scale virus studies. The CASTOR web platform provides an open access, collaborative and reproducible machine learning classifiers. CASTOR can be accessed at http://castor.bioinfo.uqam.ca .},
keywords = {},
pubstate = {published},
tppubtype = {article}
}