What Makes a Potent Nitrosamine? Statistical Validation of Expert-Derived Structure–Activity Relationships

The discovery of carcinogenic nitrosamine impurities above safe limits in pharmaceuticals has created an urgent need for methods that extend structure–activity relationship (SAR) analyses from relatively limited datasets. The level of confidence required in that SAR means there is significant value in investigating the effect of individual substructural features in a statistically robust manner. This is challenging on a small dataset because, in practice, compounds contain a mixture of different features, which may confound both expert SAR and statistical quantitative structure–activity relationship (QSAR) methods. Isolating the effect of a single structural feature is difficult for two reasons: other functionality in the molecule confounds it, and testing a large number of potential variables concurrently on a small dataset makes statistical significance hard to establish. A naïve QSAR model does not find any feature to be significant after correction for multiple testing. We propose a variation on Bayesian multiple linear regression that estimates the effects of each feature simultaneously yet independently, taking into account the combinations of features present in the dataset and reducing the impact of multiple testing; with it, some features are shown to have a statistically significant impact. This method can be used to provide statistically robust validation of expert SAR approaches to the differences in potency between different structural groupings of nitrosamines. Structural features that lead to the highest and lowest carcinogenic potency can be isolated using this method, and novel nitrosamine compounds can be assigned to potency categories with high accuracy.

Introduction

The recent discovery of nitrosamine impurities in marketed drugs has led to a rapid evolution of regulatory activity (1−4) and, in response, analysis of the synthetic and formulation pathways for existing drug products (DPs) as well as novel active pharmaceutical ingredients (APIs) and DPs. Due to the extreme carcinogenic potency (5,6) of some nitrosamines such as nitrosodiethylamine (NDEA), these compounds are considered to be in the cohort of concern, (7−9) and a class-specific acceptable intake (AI) of 18 ng/day has been set by the European Medicines Agency (EMA) and other regulators, based on the 5th percentile of known nitrosamine TD50 values (the TD50 is the dose that induces tumors in 50% of animals over control, and it can be extrapolated to a standardized AI for humans). Read-across to the harmonic mean TD50s of NDEA (26.5 μg/kg/day) and NDMA (96 μg/kg/day), corresponding to AI limits of 26.5 and 96 ng/day, respectively, has been proposed for a number of common nitrosamines by the EMA, (1) U.S. Food and Drug Administration (FDA), (4) and others. However, the carcinogenic potencies of nitrosamines span a range of at least 4 orders of magnitude, (10) and these class-based AI limits can be increased (1,4) not only for those compounds that have reliable carcinogenicity data but also for those for which a structurally close analogue with reliable carcinogenicity data can be identified.
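
The correspondence between a TD50 and its AI can be made concrete with a short calculation. The sketch below assumes the linear extrapolation commonly used for mutagenic impurities (a 50% tumor incidence scaled to a 1-in-100,000 risk, for a 50 kg adult); these constants and the function name are illustrative assumptions and are not taken from the text above. NDEA's harmonic-mean TD50 is entered as 0.0265 mg/kg/day, i.e. 26.5 μg/kg/day.

```python
# Assumed constants for a linear TD50 -> AI extrapolation (not stated in the article).
BODY_WEIGHT_KG = 50.0    # assumed standard adult body weight
RISK_DIVISOR = 50_000.0  # scales 50% incidence down to a 1-in-100,000 risk
NG_PER_MG = 1e6


def acceptable_intake_ng_per_day(td50_mg_per_kg_day: float) -> float:
    """Convert a TD50 in mg/kg/day into an acceptable intake in ng/day."""
    mg_per_day = td50_mg_per_kg_day / RISK_DIVISOR * BODY_WEIGHT_KG
    return mg_per_day * NG_PER_MG


if __name__ == "__main__":
    for name, td50 in (("NDEA", 0.0265), ("NDMA", 0.096)):
        print(name, round(acceptable_intake_ng_per_day(td50), 1), "ng/day")
```

Under these assumptions the AI in ng/day is numerically equal to the TD50 in μg/kg/day, which is why the two pairs of values quoted for NDEA and NDMA match.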

 

This, however, raises the question of “what is structurally similar?”. One measure of structural similarity that is often used is the Tanimoto coefficient, calculated for the whole molecule; however, this by itself would be a poor method to use for nitrosamines since the carcinogenic potential is critically dependent on the metabolic potential, (11−13) which is itself dependent on the local environment around the nitrosamine substructure. (12−14) Subjective, expert-led approaches to nitrosamine structure–activity relationships (SAR) have been made; (12−14) however, the step from “this feature may affect potency” to “this feature has a statistically significant effect on potency” has hitherto not been made for nitrosamines. This work presents a method by which that step can be taken. In addition, a comparable method is used to classify features according to whether they affect the likelihood that a nitrosamine is carcinogenic at all (positive prevalence). These two models are referred to as the “regression” and “classification” models henceforth.
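
To illustrate why a whole-molecule Tanimoto coefficient can mislead here, the following minimal sketch computes the coefficient on sets of fingerprint bits. The bit sets are invented for illustration and do not correspond to real molecules or to any particular fingerprint scheme; returning 1.0 for two empty sets is an assumed convention.

```python
def tanimoto(bits_a, bits_b) -> float:
    """Tanimoto (Jaccard) coefficient: shared bits divided by all bits set in either."""
    a, b = set(bits_a), set(bits_b)
    union = a | b
    if not union:
        return 1.0  # assumed convention for two empty fingerprints
    return len(a & b) / len(union)


# Hypothetical fingerprints: the two molecules share nine of their ten bits...
whole_a = set(range(1, 11))        # bits 1..10
whole_b = set(range(1, 10)) | {11} # bits 1..9 and 11

# ...but the bits assumed to describe the environment of the N-nitroso group differ.
local_a = {9, 10}
local_b = {9, 11}

if __name__ == "__main__":
    print("whole molecule:", round(tanimoto(whole_a, whole_b), 3))
    print("local environment:", round(tanimoto(local_a, local_b), 3))
```

In this toy case the whole-molecule score is high while the score restricted to the local environment is low, which is the situation the paragraph above warns about.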

 

While the cohort of concern was defined (7−9) based on the N-nitroso substructure (N–N═O) and thus can be considered to include all N-nitroso compounds (NOCs), the main focus of both SAR work and regulatory attention has been on dialkyl nitrosamines, as opposed to nitrosoureas, nitrosoamides, and others (as defined in Figure 2 in Cross and Ponting (14)). These other classes have been observed to have comparable potency to dialkyl nitrosamines but have different requirements for metabolic activation. Results are presented here for analysis performed both on the entire set of N-nitroso compounds and on the subset of dialkyl nitrosamines alone (henceforth referred to as the “NOC” and “nitrosamine” datasets).

 

We have previously shown (15) that the carcinogenic potencies of N-nitroso compounds and nitrosamines as classes of compounds follow a log-normal distribution, and that the same can be said of the various subclasses proposed in that work. Subsequent research by a collaborative cross-industry working group (14) has refined the potential structural features to provide a list of over 80 features, encoded as SMARTS (SMILES (Simplified Molecular-Input Line-Entry System) Arbitrary Target Specification) patterns. In this work, we bring these two strands together: statistical methods are used to show that a number of expert-derived features have statistically significant effects on the carcinogenic potency and prevalence of nitrosamines. Furthermore, the statistical analysis of the impact of the features was compared with an independent subjective assessment, performed by an expert in SAR analysis previously uninvolved with this work but familiar with nitrosamine safety assessment.
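
A log-normal distribution of potencies means that log10(TD50) is approximately normal, so a percentile of the potency distribution can be read off a normal fit to the logged values. The sketch below shows that calculation with Python's standard library; the TD50 values are hypothetical placeholders, not data from the study, and the function names are illustrative.

```python
import math
from statistics import NormalDist, mean, stdev


def fit_log_normal(td50s):
    """Fit a normal distribution to log10(TD50); needs at least two values."""
    logs = [math.log10(v) for v in td50s]
    return NormalDist(mu=mean(logs), sigma=stdev(logs))


def td50_percentile(td50s, fraction):
    """TD50 below which the given fraction of the fitted distribution lies."""
    return 10 ** fit_log_normal(td50s).inv_cdf(fraction)


# Hypothetical TD50 values in mg/kg/day, spanning two orders of magnitude.
example_td50s = [0.01, 0.1, 1.0]

if __name__ == "__main__":
    print("median:", td50_percentile(example_td50s, 0.5))
    print("5th percentile:", td50_percentile(example_td50s, 0.05))
```

The same idea underlies a class-specific limit derived from the 5th percentile of known TD50 values: the more potent tail of the fitted distribution sets the default.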

 

A key complexity in moving from expert assessment to statistically significant results, which this work seeks to address, is that any given nitrosamine is likely to be a member of multiple substructural categories. For example, N-nitrosonornicotine (NNN, see Figure 8b) contains a pyrrolidine ring with an isopropyl-like α-carbon that is also benzylic, and these different features may variously increase or decrease potency. They may also mask each other's effects, especially in the relatively small dataset that is available for nitrosamines. Deconvoluting these effects requires a statistical technique (discussed subsequently) that can take dependencies in the data into account, which precludes analyzing individual features in isolation. Figure 1 shows, using the set of features described subsequently, the overlaps between categories for the dataset of nitrosamines with available carcinogenicity data. These methods could also be applied to other complex structural classes (e.g., aromatic amines), once an expert-derived list of potentially impactful features is created. Returning to the question of defining the relevance of an analogue for potential read-across to a novel nitrosamine compound, the presence or absence of particular features should be evaluated, especially those shown to have a statistically significant impact on the potency.
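
The masking effect described above can be seen in a toy example. The sketch below is not the authors' model: it is a plain Bayesian linear regression with zero-mean Gaussian priors on the feature weights (whose posterior mean coincides with ridge regression), with unit noise variance and an unpenalized intercept assumed. The four "compounds", their two binary features, and their log10(TD50) values are invented so that feature 1 raises log10(TD50) by 1 and feature 2 lowers it by 1.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def posterior_mean(X, y, prior_precision=1e-9):
    """Posterior mean weights: (X^T X + precision * I)^-1 X^T y.

    Column 0 is the intercept and is left without a prior (an assumption).
    """
    p = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X)
            + (prior_precision if i == j and i > 0 else 0.0)
            for j in range(p)] for i in range(p)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    return solve(xtx, xty)


def naive_effect(X, y, j):
    """Mean response with feature j minus mean response without it."""
    with_f = [yi for row, yi in zip(X, y) if row[j] == 1]
    without = [yi for row, yi in zip(X, y) if row[j] == 0]
    return sum(with_f) / len(with_f) - sum(without) / len(without)


# Hypothetical data: columns are [intercept, feature 1, feature 2].
X = [[1, 0, 0], [1, 1, 1], [1, 1, 1], [1, 0, 1]]
y = [0.0, 0.0, 0.0, -1.0]  # invented log10(TD50) values

if __name__ == "__main__":
    print("naive effect of feature 1:", naive_effect(X, y, 1))
    print("joint estimates:", [round(w, 3) for w in posterior_mean(X, y)])
```

Because feature 1 only ever appears alongside feature 2 in this toy set, comparing group means understates its effect, whereas fitting all features jointly recovers the values used to construct the data. A larger prior precision would shrink the estimates toward zero, which is one way such a model can temper the multiple-testing problem on small datasets.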

 

 


Source: What Makes a Potent Nitrosamine? Statistical Validation of Expert-Derived Structure–Activity Relationships, Robert Thomas, Rachael E. Tennant, Antonio Anax F. Oliveira, and David J. Ponting, Chemical Research in Toxicology Article ASAP, DOI: 10.1021/acs.chemrestox.2c00199