QSAR , Molecular Docking and Dynamics Studies of Pyrrolo [ 2 , 3-b ] Pyridine Derivatives as Bruton ’ s Tyrosine Kinase Inhibitors

Article history: Received on: 02/09/2017 Accepted on: 21/10/2017 Available online: 30/12/2017 Bruton’s tyrosine kinase (BTK) is involved in multiple signaling pathways regulating the B cell receptor, which is identified as an attractive drug target for lymphoid malignancies. The aims of the present study were to develop a model of Quantitative Structure Activity Relationship (QSAR), and to perform molecular docking and molecular dynamics study of some pyrrolo[2,3-b]pyridine derivatives as potential inhibitor of BTK. The selection and calculation of suitable descriptors was performed by using Molecular Operating Environment (MOE 2009.10), while Multiple Linear Regression (MLR) was used to generate QSAR models. The result of study revealed that the validated QSAR model satisfied the statistical criteria for correlation coefficient, leaveone-out validation coefficient, Fischer’s value, and external validation at 0.944, 0.740, 14.873, and 0.792, respectively. Using the validated QSAR model, a novel compound was proposed, which had IC50 lower than that of parent compound. It was then docked into the active site of BTK. The molecular dynamics simulation showed that the new compound was stable during 40 ns dynamics run. The MM-PBSA calculation showed that the new compound had lower binding free energy than those of native ligand and parent compound, which indicated that the new compound could be researched further.


INTRODUCTION
Bruton's tyrosine kinase (BTK) is a non-receptor tyrosine kinase, which is critical for B-cell development, differentiation, and signaling (Mohamed et al., 2009;Tsukada et al., 1993;Vetrie et al., 1993).BTK is considered as an attractive drug target, particularly for the treatment of several diseases including cancer, due to the fact that it is over expressed in many B cell leukaemias (Zhao et al., 2015;Hendriks et al., 2014).
Several BTK inhibitors have been developed including ibrutinib, which showed antitumor activity against various B cell malignancies.In accordance with those facts, Zhao et al. (2015) synthesized a series of pyrrolo [2,3-b]pyridine-based BTK inhibitors using scaffold-hopping drug design strategy.They found that one compound (3P) showed superior activity to that .RN486 and pyrrolo[2,pyrimidine derivative 2 both in BTK enzymatic (IC 50 = 6.0 nM) and cellular inhibition (IC 50 = 14 nM) assays.To further develop novel compound of BTK inhibitor based on the pyrrolo [2,3-b]pyridine, the present study was devoted to identify a quantitative structure-activity relationship of the pyrrolo [2,3-b]pyridine derivatives and to determine their binding modes and structural stabilities as BTK inhibitor.In the computeraided drug design methods, the quantitative structure-activity relationship (QSAR) study combined with molecular docking and molecular dynamics simulation is the accepted method for predicting novel and potent lead compounds (Arba et al., 2016).While QSAR develops a model for predicting new potent compound (Kumar et al., 2016;Shinde et al., 2017), molecular docking predicts the binding mode of a ligand in the active site of its protein target (El-Sawy et al., 2017;Abdalsalam, 2017;Khan et al., 2017).On the other hand, molecular dynamics simulation coupled with MM-PBSA calculation is useful to evaluate the stability and binding free energy of a ligand to its protein target.

Data set
The pyrrolo[2,3-b]pyridine derivatives (Fig. 1 and Table 1) reported by Zhao et al. (2015) were used as data set.The biological activities (IC 50 ) of those compounds were converted into their negative logarithmic scale (pIC 50 = −log IC 50 ).The data set comprised of 22 compounds which covered a wide range of IC 50 values, spanning from 6 nM to 562.2 nM.Before proceeding to further analysis, recognition of outliers was performed based on their studentized deleted residual values.A compound was considered as an outlier if it has studentized deleted residual value higher than 2 or less than −2.Next, all compounds were grouped into training set and test set.The biological activities of the compounds were sorted, and 25 % of the data was randomly selected as the test set, while the remaining compounds were used as the training set.

QSAR model calculation and validation
The QSAR model was developed using the multiple linear regression analysis by using SPSS for Windows (version 19; SPSS Inc., Chicago, IL, USA).The QSAR model was generated using descriptors as the independent variable (X) and biological activity values (pIC 50 ) as the dependent variable (Y).The statistical reliability of the QSAR model was evaluated based on several criteria, i.e. squared correlation coefficient (R 2 ), Fischer's value for statistical significance (F), adjusted squared correlation coefficient (R adj 2 ), and standard error of estimation (SEE) (Dearden et al., 2009).In addition, internal validation of the developed model was performed by using leave-one-out (LOO) cross-validation coefficient (q 2 ) ( Golbraikh and Tropsha, 2002), in which each compound in the training set was eliminated in the calculation of linear regression analysis.The value of q 2 was calculated according to the following formula: Where, q 2 is coefficient of internal cross validation; y is observed activity of compound i; ŷ is predicted activity of compound i; and ȳ is the average observed activity of compound i.The value of q 2 > 0.5 indicated the predictive ability of the developed model (Golbraikh et al. 2003;Tropsha et al., 2003).In addition, Tropsha et al. (2003) explained that the internal cross validation should be accomplished by external validation using test set, which was represented by the value of external cross validation coefficient (R 2pred ).A model was considered to be valid if it possessed R 2pred value higher than 0.6.

New compound design and molecular docking
Using the validated QSAR model, new compound was designed and its biological activity was then predicted.The compound 3P was used as parent compound as it had the lowest IC 50 (6 nM) in the experimental study.One compound that had lower predicted IC 50 than that of the parent compound (3P) was then docked to its protein target (BTK).Structure of BTK was downloaded from protein data bank (PDB id : 4OTR, X-ray resolution : 1.95 Å) (Lou et al., 2015).The docking site was centered into the native ligand (2V3) binding site with a grid point spacing of 0.375 Å and a dimension of 50 in each x y z direction.Other docking parameters were used as default (Arba et al., 2017).

Molecular dynamics simulation
Each parent compound (3P), new designed compound, and native ligand (2V3), complexed with BTK, obtained from the docking studies, was taken for molecular dynamics simulations using the GPU version of the PMEMD engine of Amber 16 package (Case et al., 2015;Salomon-Ferrer et al., 2013) by employing the ff14SB force field (Maier et al., 2015).The GAFF2 force field (Wang et al., 2004) and AM1-BCC (Jakalian et al., 2002) were used to parameterize each ligand.The water model of a truncated octahedron TIP3P was used with a minimum distance of 10 Å around the complex.The Na+ counterions were added to keep the system electroneutral.The Sander module of Amber 16 was used for minimization, heating, and equilibration.The first minimization consisted of 500 steps of steepest descents and 5500 steps of conjugate gradients with restrained protein (k = 500 kcal mol -1 Å -2 ), which was followed by the second minimization with the restrained backbone atoms of protein and the third minimization without any restraint, respectively.
The heating step was performed gradually from 0 to 100, 100 to 200, and 200 to 300 K; each of which was carried out for 50 ps in NVT ensemble with a time step of 0.0005 ps and restraints (k) of 5 kcal mol -1 Å -2 .Next, each system underwent relaxation from 5 to 3 and 0 kcal mol -1 Å -2 during three 100 ps equilibration steps in NPT ensemble.Lastly, production step was performed for 40 ns by using pmemd.cudamodule in Amber 16 in an isothermal isobaric ensemble.The Langevin thermostat with a collision rate of 1.0 ps -1 was used to keep the system in 300 K thermal bath.All covalent bonds which involve hydrogen atoms were restrained using SHAKE algorithm (Ryckaert et al., 1977).
The PME method was employed to treat long-range electrostatic interactions with an integration step of 2 fs (Darden et al., 1993).The long-range non-bonded interactions were calculated with a cutoff distance of 9.0 Å by applying periodic boundary conditions.Analyses and extraction of structural snapshots were carried out using CPPTRAJ module (Roe and Cheatham, 2013), Visual Molecular Dynamics, and Discovery Studio Visualizer softwares (Humphrey et al., 1996), respectively.

Binding free energy calculations
The calculation of binding free energy for each system followed the Molecular Mechanics-Poisson Boltzmann solvent accessible surface area (MM-PBSA) method as described by Kollman et al. (2000) and Arba et al. (2017).The binding free energy was calculated based on 200 snapshots extracted from the last 5 ns trajectories (Miller et al. 2012).

RESULTS AND DISCUSSION
To shed light on the structure-activity relationship, QSAR study was performed using the data set shown in the Table 1.The biological activity of each compound was changed into pIC 50 and considered as the dependent variables.The values of 13 descriptors were used as the independent variables.First, determination of outlier compounds was performed by calculating their studentized deleted residual values.The results showed that four compounds (3J, 3T, 3F, 3Q) were identified as the outliers (Table 2).The four compounds were excluded from data set.Furthermore, the data set was divided into training set (15 compounds) and test set (3 compounds).The division into the two sets was performed based on their pIC 50 (Table 2).Next, the training set consisting of fifteen compounds was used to perform the multiple linear regression analysis for developing QSAR model.The multiple linear regression results were then compiled and ranked based on several statistical parameters, including the correlation coefficient (R), the determination coefficient (R 2 ), and the Fischer's value (F).The best QSAR model obtained was as follows: pIC 50 = 19.40158+ 0.00007(AM1_E) + 3.01906(AM1_HOMO) -0.04234(Apol) -0.033587(LogP) + 0.10249(vol) The model's equation satisfied the statistical requirements as shown in Table 3. Evaluation of the model with leave-one-out (LOO) crossvalidation coefficient (q 2 ) showed that the q 2 value was 0.740, which indicated that the model was valid.According to the developed QSAR model, the biological activity of a compound was influenced by total energy (AM1_E), HOMO energy (AM1_HOMO), polarity (Apol), partition coefficient (Log P), and van der Waals volume (Vol).The model implied that the positive contributions came from AM1_E, AM1_HOMO, and vol, while the negative contribution originated from Apol and Log P. The relationship between the predicted pIC 50 and observed pIC 50 are shown in Fig. 2. The reliability of the model was further assured by an external validation of the test compounds as shown in Fig. 2 with R 2pred = 0.79, implying that the developed model was valid (Golbraikh and Tropsha, 2002).

New compound design and molecular docking
Using the validated QSAR model, a new compound was designed.The new compound had lower predicted IC 50 than that of parent compound (3P).The result of calculation was shown in Table 4.The new designed compound, i.e. 1I, which had lower predicted IC 50 was then docked to the target protein (Bruton's Tyrosine Kinase).First, native ligand was redocked to the BTK and the docking result showed that both X-ray and docking conformations had root-mean-square deviation (RMSD) of 0.89 Å, indicating the reliability of the molecular docking protocol (Jones et al., 1997;Morris et al., 1998).The interaction between 2V3 and BTK occurred through hydrogen bonds with Lys430, Met477, and Ser538 (Fig. 3b).The amino acid residues constituting active site of the BTK, for instance Phe413 and Asp539, were observed to interact with 2V3 through van der Waals interactions.Fig. 3 depicts crystallographic and docking conformations of 2V3.
In case of 3P and 1I, hydrogen bonds were observed with Lys430.Hydrogen bond with Asp539 was also formed, in which ligand acted as donor of hydrogen bonds.The hydrogen bonds were also detected between 1I and Met477, in which 1I acted as both donor and acceptor of hydrogen bonds.Hydrogon bonds, in which ligand acted as the donor, were also observed with Asp521 and Tyr551 in 3P.In addition, residues of the active site surrounding the ligand interacted through hydrophobic interaction with each ligand.Fig. 4 depicts interaction of 3P and 1I in the ligand binding domain of BTK.

Molecular dynamics simulations
Based on best docking conformation, molecular dynamics simulations were carried out on each ligand which was complexed with BTK to provide structural, dynamical, and energetic information on the interactions between ligands and BTK to determine the binding affinity of those ligands to BTK.MD simulation was also performed for 2V3-BTK complex as a reference structure.Before proceeding to energetics analysis, each system equilibration was monitored by checking the values of root-mean-square-deviation (RMSD) of heavy atoms of protein with respect to the initial structure.Fig. 5 shows the RMSD plot of heavy atoms of protein for each complex with respect to simulation time.Generally, the RMSD value below 2 Å indicates the stability of each complex during 40 ns dynamics simulation.The effect of ligand bindings on the mobility of the BTK residues was reflected in root-mean-square fluctuation (RMSF) of the backbone atoms of protein that was plotted against the residue numbers, as depicted in Fig. 6.Most protein residues in each complex had RMSF values lower than 2 Å, which indicated that the ligand binding did not induce conformation change of the protein (BTK).

MM-PBSA binding free energy
Binding free energies of all complexes were calculated using the MM-PBSA method to assess the energetic aspect of association of ligands to BTK.  −33.58 kcal/mol) was extremely higher than the experimental data (ΔG exp = −11.79kcal/mol, ΔG = RT ln K d ≈ RT ln IC 50 , where IC 50 = 4 nM) (Lou et al., 2015).Despite the fact that the MM-PBSA calculation tends to be over predictive (Genheden and Ryde, 2015), the exlusion of the entropy term in the present study clearly contributes to the large discrepancy.Moreover, the binding free energies of BTK with designed ligand (1I) was more negative (−48.84) than that of 2V3 (−33.58), which indicated that 1I bound more tightly to BTK than 2V3.In the complex of 1I-BTK, the nonpolar contribution (ΔE NONPOLAR ) originated from the van der Waals energy (ΔE VDW ) and the nonpolar solvation energy (ΔE PBSUR ) was more negative than those of both 2V3 and 3P.It was also the case for the van der Waals (ΔE VDW ) and electrostatic (ΔE ELE ) energy terms.However, the overall electrostatic contributions (ΔE PBELE ) arising from favorable Coulomb interactions (ΔE ELE ) and unfavorable polar energy of desolvation (ΔE PBCAL ), did not favor the ligand binding.The MM-PBSA prediction showed that hydrophobic interactions were the main factor in the recognition of ligand to BTK.

CONCLUSION
QSAR is a mathematical relationship that correlates chemical structure and biological activity for a series of compounds.In this study, 13 descriptors were used to represent the pyrrolo[2,3-b]pyridine structures, which were taken as the independent variable, while the biological activities (pIC 50 ) were used as the dependent variable.Analysis on MLR resulted in a QSAR model that was used to predict the activity of pyrrolo [2,3b]pyridine derivative against BTK.The new compound (1I), with lower predicted IC 50 , was then docked to BTK, and it was then stable during 40 ns dynamic simulation.The MM-PBSA calculation evidenced that the designed ligand might be used for further experimental study.
Financial support and sponsorship: Nil.

Conflict of Interests:
There are no conflicts of interest.

Table 2 :
The studentized deleted residual values of each compound.Compounds which were identified as outliers and test set were assigned by * and **, respectively.

Table 4 :
The predicted IC50 of the new compound.

Table 5
reports binding free energies and separate energy terms for each ligand-BTK complex.It is shown that MM-PBSA prediction on affinity of 2V3 (ΔG pred =

Table 5 :
The binding free energies and the corresponding components (kcal/mol) of each compound bound to the BTK.