Puzzle out Machine Learning Model-Explaining Disintegration Process in ODTs

Tablets are the most common dosage form of pharmaceutical products. Although they represent the majority of marketed products, a significant number of patients find conventional tablets difficult to swallow, which reduces patient compliance. Orally disintegrating tablets (ODTs), sometimes called oral dispersible tablets, are the dosage form of choice for patients with swallowing difficulties. ODTs are defined as a solid dosage form that disintegrates rapidly prior to swallowing. The disintegration time is therefore one of the most important and optimizable critical quality attributes (CQAs) of ODTs. Current strategies for optimizing ODT disintegration times rely on a conventional trial-and-error method in which a small number of samples serve as proxies for the compliance of whole batches. We present an alternative approach that optimizes the disintegration time using a wide variety of machine learning (ML) models built with the H2O AutoML platform. The models were trained on a database originally presented by Han et al., which was enhanced and curated to include chemical descriptors representing active pharmaceutical ingredient (API) characteristics. A deep learning model with a 10-fold cross-validation NRMSE of 8.1% and an R² of 0.84 was obtained. The critical parameters influencing the disintegration of directly compressed ODTs were identified using the SHAP method to explain the ML model predictions. A reusable, open-source tool, the ODT calculator, is now available on the Heroku platform.
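
The two goodness-of-fit figures quoted above (NRMSE and R²) can be illustrated with a minimal sketch. The abstract does not state how the RMSE was normalized; the code below assumes normalization by the range of the observed values, which is one common convention, and the numbers are hypothetical disintegration times rather than data from the study. In a 10-fold cross-validation, the same functions would be applied to the held-out predictions.

```python
import math


def rmse(observed, predicted):
    """Root-mean-square error between observed and predicted values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def nrmse_percent(observed, predicted):
    """RMSE as a percentage of the observed range (assumed normalization)."""
    span = max(observed) - min(observed)
    if span == 0:
        raise ValueError("observed values have zero range")
    return 100.0 * rmse(observed, predicted) / span


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical disintegration times in seconds (not taken from the paper).
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 37.0]

nrmse_value = nrmse_percent(observed, predicted)
r2_value = r_squared(observed, predicted)
print(f"NRMSE = {nrmse_value:.2f}%, R2 = {r2_value:.3f}")
```

Normalizing by the range makes the error comparable across datasets with different scales; normalizing by the mean is an equally plausible alternative and would give a different percentage.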


About this article: Szlęk, J.; Khalid, M.H.; Pacławski, A.; Czub, N.; Mendyk, A. Puzzle out Machine Learning Model-Explaining Disintegration Process in ODTs. Pharmaceutics 2022, 14, 859. https://doi.org/10.3390/pharmaceutics14040859

Conclusions
The rapid development of artificial intelligence and machine learning tools, together with the increasing computational capacity of modern computers, creates a great opportunity for the pharmaceutical industry. These changes are also reflected in the activities of regulators such as the FDA, which has developed guidelines and pilot programs targeting the application of AI/ML in healthcare. Among the FDA initiatives, the development of model-based products is worth mentioning. Concepts such as model-informed drug development (MIDD) and machine learning have drawn the agency's attention [81]. The areas in which predictive models can be used include understanding the production process and knowledge discovery.

The data-driven modeling paradigm on which current AI/ML is based demands both high-quality data and large quantities of it. The former ensures precision (high predictability), whereas the latter determines the scope of the developed models. Given the highly automated manner of contemporary AI/ML implementations, the search for crucial variables and the handling of missing data via, for example, data imputation have also become a domain of AI/ML. That said, we presume that when the dataset is extended in both the number of cases and the number of features, the resulting models retain or improve their efficacy while broadening their scope. The quantity of data could also help in handling incomplete features, since the remaining complete cases provide the means for data imputation. This is, of course, case-dependent, yet AI/ML works surprisingly well at filling gaps in the data when it is given a large number of cases to analyze.
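
As an illustration of how the remaining cases can supply a missing value, here is a minimal sketch of nearest-neighbour imputation. It is not the procedure used in the study: the function name `knn_impute`, the distance measure, and the toy rows are all assumptions made for this example.

```python
import math


def knn_impute(rows, k=2):
    """Fill None entries with the mean of that feature over the k nearest rows.

    Distance is the root-mean-square difference over the features that both
    rows have observed. The input rows are left unchanged.
    """
    filled = [list(row) for row in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            candidates = []
            for m, other in enumerate(rows):
                # A donor row must be a different case with feature j observed.
                if m == i or other[j] is None:
                    continue
                shared = [(a, b) for a, b in zip(row, other)
                          if a is not None and b is not None]
                if not shared:
                    continue
                dist = math.sqrt(sum((a - b) ** 2 for a, b in shared) / len(shared))
                candidates.append((dist, other[j]))
            if candidates:
                candidates.sort(key=lambda item: item[0])
                nearest = candidates[:k]
                filled[i][j] = sum(v for _, v in nearest) / len(nearest)
    return filled


# Hypothetical cases: [first feature, second feature]; None marks a missing value.
cases = [
    [1.0, 10.0],
    [2.0, 20.0],
    [3.0, 30.0],
    [2.1, None],
]
completed = knn_impute(cases, k=2)
print(completed[3])
```

With more cases in the dataset, the nearest neighbours tend to lie closer to the incomplete case, which is the intuition behind the claim that larger datasets make imputation more reliable.
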
ODTs manufactured by a direct compression process are complex systems. Many factors affect their critical quality attributes, including the tablet disintegration time. In the present study, a new database focused on disintegration time was created from the literature and then used to develop the ML model with the H2O AutoML platform. An explainability analysis using Shapley values and partial dependence plots was carried out to understand the basis of the final model's predictions. Our findings on the effect that various formulation components exert on the disintegration time are corroborated by the existing literature and by expert knowledge, showing that AutoML-based approaches are suitable for modeling complex pharmaceutical tasks. However, ML is constrained by the availability of data; thus, such models can be improved by extending well-structured and labeled datasets.
The source data and scripts used in this work, together with the online version of the model, are freely available, as stated in the Supplementary Materials section.
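
To make the idea of Shapley values concrete, the following minimal sketch computes them exactly for a tiny model by enumerating all feature subsets. It is an illustration only: the SHAP method approximates this quantity for realistic models, and the toy model, its coefficients, and the choice to represent an "absent" feature by a baseline value are assumptions of this example, not details taken from the study.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, instance, baseline):
    """Exact Shapley values of each feature for one prediction.

    A feature outside the coalition takes its baseline value (an assumption
    of this sketch). The cost grows as 2**n, so this only suits small n.
    """
    n = len(instance)

    def coalition_value(subset):
        x = [instance[i] if i in subset else baseline[i] for i in range(n)]
        return model(x)

    contributions = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        phi = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                members = set(subset)
                gain = coalition_value(members | {i}) - coalition_value(members)
                phi += weight * gain
        contributions.append(phi)
    return contributions


def toy_model(x):
    """Hypothetical linear model of disintegration time in seconds."""
    disintegrant_percent, hardness = x
    return 40.0 - 3.0 * disintegrant_percent + 0.5 * hardness


instance = [5.0, 20.0]   # formulation being explained (hypothetical)
baseline = [2.0, 10.0]   # reference formulation (hypothetical)
phi = shapley_values(toy_model, instance, baseline)
print(phi)
```

The contributions sum to the difference between the prediction for the instance and the prediction for the baseline, which is the additivity property that makes Shapley values convenient for attributing a prediction to individual formulation variables.
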
