Journal of Pharmaceutics & Drug Delivery Research ISSN: 2325-9604

Research Article, J Pharm Drug Deliv Res Vol: 1 Issue: 2

Prediction of Aqueous Solubility of Organic Solvents as a Function of Selected Molecular Properties

Kamal I. Al-Malah*
Department of Chemical Engineering, University of Hail, Hail, Saudi Arabia
Corresponding author : Kamal I. Al-Malah
Department of Chemical Engineering, University of Hail, Hail, Saudi Arabia
Tel: +966-556334640
E-mail: [email protected]
Received: August 21, 2012 Accepted: October 17, 2012 Published: October 22, 2012
Citation: Al-Malah KI (2012) Prediction of Aqueous Solubility of Organic Solvents as a Function of Selected Molecular Properties. J Pharm Drug Deliv Res 1:2. doi:10.4172/2325-9604.1000106

Abstract

The aqueous solubility of 66 organic solvents was examined. As an organic solvent molecule is not always a simple (single-carbon) organic species, the aqueous solubility was tested as a function of selected molecular properties (descriptors), namely, in VolSurf+ notation: CP, D1, D2, D3, FLEX, G, HSA, LogKow, MW, PHSAR, POL, PSA, PSAR, R, S, V, W1, W2, and W3. In general, the organic solvent solubility data can be best described by the two molecular descriptors LogKow and R, with PHSAR as the third refining or tuning-up factor (weight function).

Keywords: Organic solvent; Solubility; Molecular descriptor; Rugosity; Polar surface area; Octanol-water partition coefficient; VolSurf+ software

Introduction

Organic solvents nowadays have many applications in almost all avenues of industrial life. They are used in emulsion and microemulsion formulation as either solvent or co-surfactant as is the case in detergent, cosmetic, paint, and pharmaceutical industries. They are also used in liquid-liquid extraction and absorption processes. Moreover, solvents may be used as a reaction medium to bring reactants together, as a reactant to react with a solute when it cannot be dissolved, and as a carrier, to deliver chemical compounds in solutions to their point of use in the required amounts [1]. In pharmaceutical industries, crystallization from solution is widely used for the purification of pharmaceutical active ingredients and excipients during the final stages of manufacture. The type of solvent being used influences the morphology of crystals obtained. For example, the polarity of the solvent affects the crystal morphology of ibuprofen. The use of methanol results in symmetrical and smooth crystals while the use of a low polarity solvent (such as acetone) results in elongated crystals [2].
Barwick [3] explained that three schemes are considered in solvent classification: the Rohrschneider-Snyder classification scheme, the solvatochromic scheme, and the Hildebrand polarity scale. The Rohrschneider-Snyder scheme classifies solvents according to their “polarity” or chromatographic strength (P) and selectivity (xi). The solvatochromic approach is based on gas-liquid partition coefficients correlated with the solvatochromic scales describing solvent dipolarizability-polarizability (π*), hydrogen bond acidity (α) and basicity (β). The Hildebrand solubility parameter, δ, is reported as the most widely applied index of solvent and solute polarity. It is shown that the molecular size of a solute affects its relative solubility: the larger the molecular volume, the greater the effect of a change in solvent polarity on the solubility of the solute. Solute solubility is at a maximum when the solute and solvent have the same δ value.
Klopman et al. [4] proposed an estimation of the aqueous solubility of organic molecules by the group contribution approach. The learning set consisted of 1168 organic compounds with experimental data taken from the literature after critical evaluation. The best method, based on a new fragment atom scheme, led to a squared correlation coefficient of 0.95.
Nordström et al. [5] investigated the relationships between solubility, temperature dependence of solubility, melting temperature and melting enthalpy for the purpose of finding relations that could significantly reduce the need for experimental work in the selection of the solvent for processing of organic fine chemicals and pharmaceuticals. The relationships were investigated theoretically and by evaluation of experimental data for 41 organic and pharmaceutical compounds comprising a total of 115 solubility curves in organic and aqueous solvents. The work considers selection of the equation for correlation of solubility data based on thermodynamic considerations and ability to predict melting properties of the solute from solubility data.
Wang et al. [6] developed four reliable aqueous solubility models, ASM-ATC (aqueous solubility model based on atom type counts), ASM-ATC-LOGP (aqueous solubility model based on atom type counts and ClogP as an additional descriptor), ASM-SAS (aqueous solubility model based on solvent accessible surface areas), and ASM-SAS-LOGP (aqueous solubility model based on solvent accessible surface areas and ClogP as an additional descriptor), using a diverse data set of 3664 compounds. All four models were extensively validated by various cross-validation tests, and encouraging predictability was achieved. ASM-ATC-LOGP was the best model; it achieved a squared correlation coefficient (R²) of 0.832 and a root-mean-square error (RMSE) of 0.840 logarithm unit.
In previous works, the aqueous solubility of simple inorganic [7] and simple (single-carbon) organic [8] molecules was examined and expressed in terms of important molecular properties. In the present article, the aqueous solubility of some organic solvents is examined as a function of selected molecular descriptors that are thought to affect the solvation process. Non-linear regression, coupled with the correlation coefficient, R², is used to find the fitted model parameters that best describe the objective function, that is, the aqueous solubility of a given organic solvent as a function of its relevant molecular descriptors. Finally, it should be mentioned that this is not a typical QSAR study, which needs a large, broad, diversified and well-distributed set of compounds that is then randomly divided into a training set and a smaller test set. Instead, this is a curve-fitting study in which the dependent variable is expressed as a function of, at most, two independent variables, chosen at a time from the list of pertinent variables.

Theory

Different packages for the calculation of molecular descriptors are available via the World Wide Web. DRAGON® (http://www.talete.mi.it/dragon.htm), MarvinSketch® (http://www.chemaxon.com/products/marvin/), and VolSurf+® (http://www.moldiscovery.com/) are just examples of such packages. The chemometric packages of VolSurf+® were used here to evaluate the molecular descriptors for a given solvent. VolSurf+® requires the Simplified Molecular Input Line Entry System (SMILES) formula of the given solvent as an input argument. The aqueous solubility, at room temperature, of all examined solvents was experimentally measured and given by the Estimation Programs Interface (EPI) Suite™, developed by the US Environmental Protection Agency’s Office of Pollution Prevention and Toxics and Syracuse Research Corporation (SRC). It can be downloaded from: http://www.epa.gov/oppt/exposure/pubs/episuitedl.htm.
The aqueous solubility at 25°C, WS25C, will be expressed as a function of the best two variables out of the following list of pertinent molecular parameters. It should be pointed out that the nomenclature of VolSurf+® software was used, except for LogKow. The following molecular properties were chosen and presented in alphabetical order: CP, D1, D2, D3, FLEX, G, HSA, LogKow, MW, PHSAR, POL, PSA, PSAR, R, S, V, W1, W2, and W3.
The definition of each molecular descriptor is shown below.
CP: Critical packing (CP) parameter defining a ratio between the hydrophilic and lipophilic part of a molecule. In contrast to the hydrophilic-lipophilic balance, critical packing refers just to molecular shape.
It is defined as:
CP= (Lipophilic Volume)/(Hydrophilic Surface*Lipophilic Length)
D1, D2, D3: Hydrophobic volumes (D1 - D8). VolSurf+® uses a probe called DRY to generate 3D lipophilic fields. In analogy to hydrophilic regions, hydrophobic regions may be defined as the molecular envelope generating attractive hydrophobic interactions. VolSurf+ computes hydrophobic descriptors at eight different energy levels adapted to the usual energy range of hydrophobic interactions (from -0.2 to -1.6 kcal/mol). It should be pointed out here that the first three hydrophobic volumes were chosen as it turns out that, beyond D3, the numerical values of D4-D8 generally reduce to zero.
FLEX: A flexibility parameter which represents the maximum flexibility of a molecule. For each molecule 50 conformers (random) are produced, and FLEX represents the Log, averaged on atom ‘i’, result of the differences between the maximum and minimum distances of atom ‘i’ in a selected conformer with the atom ‘i’ in all the other conformers.
G: Molecular globularity which is defined as S/Sequiv with Sequiv = surface area of a sphere having the same volume V, given that S and V are the molecular surface and volume, respectively. Globularity is 1.0 for a perfect spherical molecule. It assumes values greater than 1.0 for a real spheroidal molecule.
HSA: Hydrophobic surface area which is calculated via the sum of hydrophobic region contributions.
LogKow: The logarithm of the partition coefficient between 1-octanol and water. All values were experimentally measured except for the following solvents: 2-Ethyl-1-hexanol, Isopropyl Acetate, Dimethyl Carbonate, Dimethylpropylene Urea, and 2-Methylpentane. LogKow values, whether experimentally measured or estimated, were all given by the Estimation Programs Interface (EPI) Suite™ (see the first paragraph of this section).
MW: Molecular weight of solvent.
PHSAR: The ratio between the polar surface area (PSA) and the hydrophobic surface area (HSA).
POL: An estimation of the average molecular polarizability. This method is based on the structure of the compounds (and not any molecular field) and is therefore independent of the number and type of probes used. There is a strong correlation between the experimental molecular polarizability and the polarizability calculated with VolSurf+® for more than 300 chemicals.
PSA: Polar Surface Area. It is calculated via the sum of polar region contributions.
PSAR: The ratio between the polar surface area (PSA) and the Surface (S).
R: Rugosity, a measure of how wrinkled the molecular surface is; it is the ratio of volume to surface (V/S). The smaller the ratio, the more wrinkled the surface. For a sphere, the ratio is the radius of the sphere divided by 3. The more the shape departs from a sphere, the smaller the ratio R.
S: Molecular surface; represents the accessible surface (in Å²) traced out by a water probe interacting at +0.2 kcal/mol when a water molecule rolls over the target molecule.
V: Molecular volume; represents the water-excluded volume (in Å³), i.e., the volume enclosed by the water-accessible surface computed at a repulsive value of +0.2 kcal/mol.
W1, W2, W3: Hydrophilic volumes. Each describes the molecular envelope which is accessible to and attractively interacts with water molecules. The volume of this envelope varies with the level of interaction energies. Hydrophilic descriptors computed from molecular fields of -0.2 to -1.0 kcal/mol (W1 - W3) account for polarizability and dispersion forces which are available in all types of solvents whether they are polar or non-polar.
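The ratio-type descriptors defined above (G, R, PSAR, and PHSAR) follow directly from S, V, PSA, and HSA. The sketch below is only an illustration of those definitions, not VolSurf+® code; the function names and all numerical inputs (including the PSA and HSA values) are hypothetical. It checks the two statements made for a sphere (G = 1 and R = radius/3) and compares a cube of equal volume.

```python
import math

def sphere_equivalent_surface(volume):
    """Surface area of the sphere that has the given volume (S_equiv)."""
    radius = (3.0 * volume / (4.0 * math.pi)) ** (1.0 / 3.0)
    return 4.0 * math.pi * radius ** 2

def ratio_descriptors(surface, volume, psa, hsa):
    """G, R, PSAR and PHSAR from S, PSA, HSA (area units) and V (volume units)."""
    return {
        "G": surface / sphere_equivalent_surface(volume),  # globularity, S/S_equiv
        "R": volume / surface,                             # rugosity, V/S
        "PSAR": psa / surface,                             # polar fraction of the surface
        "PHSAR": psa / hsa,                                # polar / hydrophobic area
    }

# Hypothetical sphere of radius 3 (PSA and HSA values are made up).
radius = 3.0
sphere_volume = 4.0 / 3.0 * math.pi * radius ** 3
sphere_surface = 4.0 * math.pi * radius ** 2
sphere = ratio_descriptors(sphere_surface, sphere_volume, psa=40.0, hsa=60.0)

# Hypothetical cube with the same volume: larger surface, hence smaller R.
side = sphere_volume ** (1.0 / 3.0)
cube = ratio_descriptors(6.0 * side ** 2, sphere_volume, psa=40.0, hsa=60.0)

print(sphere)
print(cube)
```

For equal volume, the cube has the larger surface, so its G exceeds 1 and its R falls below that of the sphere, which is the behavior the definitions of G and R describe.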
Table 1 lists the organic solvents to be examined, along with their SMILES formula, and aqueous solubility at 25°C. Table 2 shows VolSurf+®-calculated molecular descriptors for the examined solvents.
Table 1: The organic solvents, their SMILES formula, and their experimental aqueous solubility at 25°C, WS25C, as given by the Estimation Programs Interface (EPI) Suite™ (http://www.epa.gov/oppt/exposure/pubs/episuitedl.htm).
Table 2: The molecular descriptors of organic solvents, calculated using VolSurf+®.

Results and Discussion

The MATLAB® surface fitting tool was used to fit a multi-dimensional, non-linear regression problem as is the case here. The general formula for curve-fitting is:
Z = f(X, Y)
where Z is the dependent variable and X and Y are the independent variables.
To facilitate the process of curve-fitting and gain a better insight, two independent variables only at a time are chosen to make the regression process reliable. The aqueous solubility of a solvent at 25°C (WS25C) was fitted as a function of only two variables (X) and (Y) out of the list of molecular descriptors given in Table 2. The following simple model was used as a tool to conduct a comparison among different pair-wise combinations of molecular descriptors to see which will better predict the variability in solvent aqueous solubility:
WS25C = a + b·X + c·Y       (1)
It should be pointed out here that the model form (i.e., a polynomial of first degree in both X and Y) is identical for X and Y. Thus, the order of variables is immaterial. All possible pair-wise permutations (170 non-repeated pairs) were tested. Table 3 shows the top five cases with the highest correlation coefficient, R².
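The pair-wise screening described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure actually run in MATLAB®: ordinary least squares from NumPy stands in for the surface fitting tool, the helper names `fit_plane` and `rank_pairs` are hypothetical, and the data are synthetic (only four descriptor names are used, with random values, not the values of Table 2).

```python
from itertools import combinations

import numpy as np

def fit_plane(x, y, z):
    """Least-squares fit of z = a + b*x + c*y; returns ((a, b, c), R^2)."""
    design = np.column_stack([np.ones_like(x), x, y])
    coeffs, *_ = np.linalg.lstsq(design, z, rcond=None)
    residuals = z - design @ coeffs
    ss_res = float(np.sum(residuals ** 2))
    ss_tot = float(np.sum((z - z.mean()) ** 2))
    return coeffs, 1.0 - ss_res / ss_tot

def rank_pairs(descriptors, z, top=5):
    """Fit Eq. (1) for every unordered descriptor pair and keep the best R^2."""
    results = []
    for name_x, name_y in combinations(sorted(descriptors), 2):
        _, r2 = fit_plane(descriptors[name_x], descriptors[name_y], z)
        results.append((r2, name_x, name_y))
    results.sort(reverse=True)
    return results[:top]

# Synthetic stand-in data: 66 "solvents", four descriptors with random values.
rng = np.random.default_rng(0)
n_solvents = 66
descriptors = {name: rng.normal(size=n_solvents)
               for name in ["LogKow", "R", "PSA", "MW"]}
# Synthetic response built from LogKow and R plus a little noise.
ws25c = (2.0 - 1.5 * descriptors["LogKow"] - 0.8 * descriptors["R"]
         + rng.normal(scale=0.1, size=n_solvents))

ranking = rank_pairs(descriptors, ws25c)
best_r2, best_x, best_y = ranking[0]
for r2, name_x, name_y in ranking:
    print(f"{name_x:>7s} + {name_y:<7s} R^2 = {r2:.4f}")
```

Because the model is symmetric in X and Y, `itertools.combinations` visits each pair once, which matches the remark that the order of variables is immaterial.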
Table 3: List of the best five cases, out of all possible pairwise permutations (170 non-repeated cases), with the highest correlation coefficient, R².
From Table 3, it can be seen that almost any of the five cases may be used to predict (or, describe) the variability of organic solvent aqueous solubility. One more thing to notice is that LogKow is present in all five cases. Further deciphering of the aqueous solubility relationship can be done by incorporating the weight function in the non-linear regression process.
If a weighted non-linear regression is carried out, further improvement can be achieved. This was done by repeating the curve-fitting while the weight function was permuted over all variables other than the two being fitted. Table 4 shows the results of the weighted curve-fitting.
WS25C = f(Molecular descriptor from Table 3, LogKow)       (2)
Table 4: The best five weighted non-linear regression cases out of those possible permutated cases, presented in Table 3.
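A weighted fit of Eq. (1) can be sketched as below. This is an assumption-laden illustration rather than the MATLAB® procedure itself: each residual is weighted by a third descriptor (here a stand-in for PHSAR), the weighted sum of squared residuals is minimized, and R² is computed from weighted sums of squares. The function name `fit_plane_weighted`, the weighted R² convention, and all data are hypothetical.

```python
import numpy as np

def fit_plane_weighted(x, y, z, w):
    """Weighted least squares for z = a + b*x + c*y.

    Minimizes sum(w * (z - a - b*x - c*y)**2); weights must be non-negative.
    Returns ((a, b, c), weighted R^2).
    """
    design = np.column_stack([np.ones_like(x), x, y])
    sqrt_w = np.sqrt(w)
    # Scaling rows by sqrt(w) turns the weighted problem into an ordinary one.
    coeffs, *_ = np.linalg.lstsq(design * sqrt_w[:, None], z * sqrt_w, rcond=None)
    residuals = z - design @ coeffs
    z_mean = np.sum(w * z) / np.sum(w)
    ss_res = float(np.sum(w * residuals ** 2))
    ss_tot = float(np.sum(w * (z - z_mean) ** 2))
    return coeffs, 1.0 - ss_res / ss_tot

# Synthetic stand-in data (not the values of Table 2).
rng = np.random.default_rng(1)
n_solvents = 66
r_vals = rng.normal(size=n_solvents)                 # stand-in for R
logkow = rng.normal(size=n_solvents)                 # stand-in for LogKow
phsar = rng.uniform(0.1, 2.0, size=n_solvents)       # stand-in weights (PHSAR)
ws25c = 2.0 - 0.8 * r_vals - 1.5 * logkow + rng.normal(scale=0.05, size=n_solvents)

coeffs, r2_weighted = fit_plane_weighted(r_vals, logkow, ws25c, phsar)
print("a, b, c =", coeffs, " weighted R^2 =", r2_weighted)
```

With PHSAR as the weight, solvents with a large polar-to-hydrophobic surface ratio pull the fitted surface more strongly than the others, which is one way to read its role as a "tuning-up" factor.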
From Table 4 one can see that, in all 5 best cases, the log partition coefficient between octanol and water, LogKow, is the most important factor among all examined molecular descriptors, followed by the molecular rugosity, R.
Figure 1 shows the aqueous solubility of the examined organic solvents as a function of the R and LogKow molecular descriptors, with PHSAR as the weight function, which represents case 1 in Table 4. The red zone means maximum solubility and the dark blue means minimum. The maximum solubility is found where both R and LogKow are very low. R², with PHSAR as a weight factor, is 0.9696. Figure 2 shows the contour plot of the aqueous solubility of the examined organic solvents as a function of R and LogKow, weighted by the function PHSAR. Table 5 shows the curve-fitted parameters for the organic solvent aqueous solubility, WS25C, as given by Eq. (1) fitted to R and LogKow, and augmented by the weight function, PHSAR.
Figure 1: The organic solvent aqueous solubility, WS25C, as a function of R and LogKow.
Figure 2: The contour plot for the organic aqueous solubility, WS25C, as a function of R and LogKow.
Table 5: Curve-fitted parameters of the solubility model (Equation 1), accompanied by the 95% confidence interval for different organic solvents. Solubility data are in mg/L (ppm) and evaluated at 25°C. WS25C = a + b·X + c·Y = a + b·R + c·LogKow.
The following proposition is presented here in light of traits shown in figures 1 and 2:
To have an organic solvent with high aqueous solubility, it is proposed here that it has to have a low value of both LogKow and R, accompanied by a high value of the polar to hydrophobic surface area ratio, PHSAR. Examples of organic solvents meeting this criterion are: formamide, dimethyl sulfoxide, methanol, ethylene glycol, formic acid, ethanol, pyridine, and acetic acid. Such solvents are characterized by the lowest values of R and LogKow and the highest PHSAR. In contrast, the following materials, which have low WS25C, are characterized by the highest values of R and LogKow and the lowest PHSAR: n-octane, heptane, hexane, methylcyclohexane, cyclohexane, tetrachloroethylene, mesitylene, and ethylbenzene. Other organic solvents more or less violate the criterion in one aspect or another. Looking at figure 2 and starting from the bottom left corner, where the maximum solubility occurs, one may notice that the aqueous solubility generally decreases as one moves diagonally away from that corner.
The proposition that high aqueous solubility, WS25C, requires a low value of LogKow and a high PHSAR is readily understood. A low value of LogKow means that the organic solvent partitions such that its concentration in water is higher than that in octanol. A high PHSAR value means that the exterior surface of the organic solvent is predominantly hydrophilic (or polar) in nature. This is in harmony with the report by Martínez-Aragón et al. [9] that dielectric constant, dipole moment, and hydrogen bonding were indirect indicators of polarity and, therefore, less sensitive to changes in polarity than direct indicators such as log P (i.e., LogKow in our case). Therefore, log P seems to describe the changes in polarity better.
On the other hand, WS25C was found to increase with decreasing R and, as pointed out earlier, the more a molecule's shape departs from a sphere, the smaller the ratio R = V/S. Consider two molecules with the same volume, V, one perfectly spherical and the other of any other shape or geometry (so that their S values differ): the spherical molecule will have the lower aqueous solubility. Thus, the smaller R is, the larger S is for the same volume. At high WS25C, the organic solvent molecule is essentially characterized by a more hydrophilic (or polar) surface area or patch; increasing S then favors dissolution of the organic solvent in water, as no entropic forces counteract the dissolution process and van der Waals attractive forces dominate. In contrast, at low WS25C, the organic solvent molecule has a more hydrophobic (or non-polar) surface area or patch and hence more resistance to dissolution, because entropic driving forces counteract the dissolution process and lead to separation of the organic solvent phase from the water phase. Under such conditions, decreasing R (or increasing S) makes dissolution even less favorable. Since the molecular surface, S, represents the accessible surface (in Å²) traced out by a water probe interacting at +0.2 kcal/mol when a water molecule rolls over the target molecule, this explains, in principle, the typical behavior of a non-polar (hydrophobic) solvent, in which the contact area between the solute (the organic solvent in this case) and the solvent (water in this case) is minimized by assuming a spherical shape in the form of a droplet.
The fact that the dark blue region (minimum WS25C) in figure 2 lies on the top right corner where it is characterized by high R (i.e., low S or spherical in shape) and high LogKow (hydrophobic or non-polar in nature) is in line with this notion.
It is worth mentioning in this regard that Torrens and Castellano [10] calculated the fractal dimension (a measure of surface accessibility toward different solvents) and other descriptors for homologous series of phenyl alcohols and 4-alkylanilines, as models for drugs that could be administered in a patch attached to the skin. They compared 4-alkylanilines with phenyl alcohols and showed that the smaller polar character of the former caused their less negative Gibbs free energy of solvation in water and their greater hydrophobicity. Both series were distinguished by the molecular rugosity, G′. Note that the molecular rugosity, G′, as defined by Torrens and Castellano [10] has the unit of inverse ångström (Å⁻¹); i.e., it is the ratio of surface/volume (S/V), which is exactly the reciprocal of R in our case.

Conclusions

1. In general, the organic solvent solubility data can be best described by the two molecular properties: LogKow and R; with PHSAR as the third refining or tuning-up factor (weight function).
2. To have an organic solvent with high aqueous solubility, it is proposed that it has to have a low value of both the log partition coefficient between octanol and water, LogKow, and the molecular rugosity, R=V/S, accompanied by a high value of polar to hydrophobic surface area ratio, PHSAR. This is in line with the fact that hydrophobic organic solvents, which are the least soluble in water, assume spherical (droplet) shape in an aqueous environment.
3. Examples of organic solvents meeting the aforementioned criterion are: formamide, dimethyl sulfoxide, methanol, ethylene glycol, formic acid, ethanol, pyridine, and acetic acid. Such solvents are characterized by the lowest values of R and LogKow and the highest PHSAR.
4. The following organic solvents, which have low WS25C, are characterized by the highest values of R and LogKow and the lowest PHSAR: n-octane, heptane, hexane, methylcyclohexane, cyclohexane, tetrachloroethylene, mesitylene, and ethylbenzene.

References










