Commit

updated sdfs to remove failed molecules and updated scripts
hechth committed Mar 19, 2024
1 parent 5c91113 commit b4954ce
Showing 11 changed files with 2,504 additions and 2,110 deletions.
264 changes: 254 additions & 10 deletions analysis/Python_scripts/comparisons.ipynb

Large diffs are not rendered by default.

8 changes: 8 additions & 0 deletions analysis/Python_scripts/reference_datasets_means.csv
@@ -0,0 +1,8 @@
,n_atoms,aromatic_nitrogens,molecular_complexity,molecular_flexibility,rotatable_bonds,stereo_centers,electronegative_atoms
RCX2024,33.014,0.452,0.730,0.365,3.264,0.749,4.741
Wang2020,21.954,0.056,0.473,0.330,2.654,0.663,1.783
Wang2022_ES,22.825,0.000,0.468,0.302,1.700,0.375,1.325
Wang2022_TMS,33.673,0.126,0.527,0.630,4.408,0.332,2.396
Schreckenbach2021,31.031,0.031,0.594,0.515,4.219,0.594,5.938
Asgeirsson2017,19.762,0.000,0.271,0.401,1.429,0.286,1.762
Lee2022,18.278,1.392,0.756,0.043,0.000,0.000,5.494
8 changes: 8 additions & 0 deletions analysis/Python_scripts/study_comparisons.csv
@@ -0,0 +1,8 @@
,natomsmean,aromaticnitrogensmean,molecularcomplexitymean,molecularflexibilitymean,rotatablebondsmean,stereocentersmean,electronegativeatomsmean,natomsmin,aromaticnitrogensmin,molecularcomplexitymin,molecularflexibilitymin,rotatablebondsmin,stereocentersmin,electronegativeatomsmin,natomsmax,aromaticnitrogensmax,molecularcomplexitymax,molecularflexibilitymax,rotatablebondsmax,stereocentersmax,electronegativeatomsmax
this,33.01,0.45,0.73,0.36,3.26,0.75,4.74,12.00,0.00,0.38,0.00,0.00,0.00,0.00,80.00,3.00,1.18,0.85,21.00,9.00,14.00
Wang2020,21.95,0.06,0.47,0.33,2.65,0.66,1.78,7.00,0.00,0.12,0.00,0.00,0.00,0.00,59.00,3.00,0.80,0.86,10.00,6.00,8.00
Wang2022_ES,22.82,0.00,0.47,0.30,1.70,0.38,1.32,8.00,0.00,0.27,0.00,0.00,0.00,0.00,56.00,0.00,0.77,0.69,8.00,3.00,5.00
Wang2022_TMS,33.67,0.13,0.53,0.63,4.41,0.33,2.40,17.00,0.00,0.26,0.22,1.00,0.00,1.00,58.00,4.00,0.84,0.91,14.00,4.00,6.00
Schreckenbach2021,31.03,0.03,0.59,0.51,4.22,0.59,5.94,11.00,0.00,0.35,0.00,0.00,0.00,2.00,74.00,1.00,0.85,0.90,16.00,8.00,12.00
Asgeirsson2017,19.76,0.00,0.27,0.40,1.43,0.29,1.76,6.00,0.00,0.00,0.00,0.00,0.00,0.00,49.00,0.00,0.69,0.89,5.00,6.00,8.00
Lee2022,18.28,1.39,0.76,0.04,0.00,0.00,5.49,12.00,0.00,0.67,0.00,0.00,0.00,4.00,24.00,4.00,0.82,0.23,0.00,0.00,8.00
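The column names in study_comparisons.csv follow a visible pattern: the descriptor name with underscores removed, followed by `mean`, `min`, or `max`, with all means first, then all minima, then all maxima. The notebook that produced the file is not rendered in this commit view, so the following is only a minimal sketch of how one such row could be derived from per-molecule descriptor values. The function name `summarize` and the two-decimal rounding are assumptions for illustration.

```python
from statistics import fmean


def summarize(rows, descriptors):
    """Build one summary row from per-molecule descriptor dicts.

    `rows` is a list of dicts mapping descriptor name -> numeric value.
    Keys in the result mimic the CSV header, e.g. "n_atoms" -> "natomsmean".
    """
    summary = {}
    # Outer loop over statistics so that all means come first, then mins, then maxes.
    for suffix, stat in (("mean", fmean), ("min", min), ("max", max)):
        for name in descriptors:
            key = name.replace("_", "") + suffix
            summary[key] = round(stat([row[name] for row in rows]), 2)
    return summary


if __name__ == "__main__":
    # Made-up values, not taken from any of the datasets.
    molecules = [
        {"n_atoms": 10, "rotatable_bonds": 2},
        {"n_atoms": 20, "rotatable_bonds": 5},
    ]
    print(summarize(molecules, ["n_atoms", "rotatable_bonds"]))
```

Each dataset would contribute one such row, keyed by its study name in the first column.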
22 changes: 22 additions & 0 deletions analysis/Python_scripts/study_comparisons_t.csv
@@ -0,0 +1,22 @@
,this,Wang2020,Wang2022_ES,Wang2022_TMS,Schreckenbach2021,Asgeirsson2017,Lee2022
natomsmean,33.01,21.95,22.82,33.67,31.03,19.76,18.28
aromaticnitrogensmean,0.45,0.06,0.00,0.13,0.03,0.00,1.39
molecularcomplexitymean,0.73,0.47,0.47,0.53,0.59,0.27,0.76
molecularflexibilitymean,0.36,0.33,0.30,0.63,0.51,0.40,0.04
rotatablebondsmean,3.26,2.65,1.70,4.41,4.22,1.43,0.00
stereocentersmean,0.75,0.66,0.38,0.33,0.59,0.29,0.00
electronegativeatomsmean,4.74,1.78,1.32,2.40,5.94,1.76,5.49
natomsmin,12.00,7.00,8.00,17.00,11.00,6.00,12.00
aromaticnitrogensmin,0.00,0.00,0.00,0.00,0.00,0.00,0.00
molecularcomplexitymin,0.38,0.12,0.27,0.26,0.35,0.00,0.67
molecularflexibilitymin,0.00,0.00,0.00,0.22,0.00,0.00,0.00
rotatablebondsmin,0.00,0.00,0.00,1.00,0.00,0.00,0.00
stereocentersmin,0.00,0.00,0.00,0.00,0.00,0.00,0.00
electronegativeatomsmin,0.00,0.00,0.00,1.00,2.00,0.00,4.00
natomsmax,80.00,59.00,56.00,58.00,74.00,49.00,24.00
aromaticnitrogensmax,3.00,3.00,0.00,4.00,1.00,0.00,4.00
molecularcomplexitymax,1.18,0.80,0.77,0.84,0.85,0.69,0.82
molecularflexibilitymax,0.85,0.86,0.69,0.91,0.90,0.89,0.23
rotatablebondsmax,21.00,10.00,8.00,14.00,16.00,5.00,0.00
stereocentersmax,9.00,6.00,3.00,4.00,8.00,6.00,0.00
electronegativeatomsmax,14.00,8.00,5.00,6.00,12.00,8.00,8.00
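study_comparisons_t.csv holds the same values as study_comparisons.csv with rows and columns swapped. How the author generated it is not shown here; as a sketch under that caveat, a transpose of this kind can be done with the standard library alone. The helper name `transpose_csv` is hypothetical.

```python
import csv
import io


def transpose_csv(text: str) -> str:
    """Swap rows and columns of a rectangular CSV given as a string."""
    rows = list(csv.reader(io.StringIO(text)))
    out = io.StringIO()
    # zip(*rows) yields the columns; it assumes every row has the same length.
    csv.writer(out, lineterminator="\n").writerows(zip(*rows))
    return out.getvalue()


if __name__ == "__main__":
    sample = ",natomsmean,natomsmin\nthis,33.01,12.00\nWang2020,21.95,7.00\n"
    print(transpose_csv(sample), end="")
```

Because the values are handled as strings, the formatting of the numbers (for example `12.00`) is carried over unchanged.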
3 changes: 2 additions & 1 deletion analysis/Python_scripts/utils.py
@@ -161,12 +161,13 @@ def sdf_to_dataframe(molecules: Chem.SDMolSupplier) -> pd.DataFrame:
pd.DataFrame: The converted DataFrame.
"""
df = pd.DataFrame({
"n_atoms": [int(AddHs(m).GetNumAtoms()) for m in molecules],
"aromatic_nitrogens": [int(m.GetProp("Aromatic Nitrogens")) for m in molecules],
"molecular_complexity": [float(m.GetProp("Molecular Complexity")) for m in molecules],
"molecular_flexibility": [float(m.GetProp("Molecular Flexibility")) for m in molecules],
"rotatable_bonds": [int(m.GetProp("Rotatable Bonds")) for m in molecules],
"stereo_centers": [int(m.GetProp("Stereo Centers")) for m in molecules],
-"electronegative_atoms": [int(m.GetProp("Electronegative Atoms")) for m in molecules]
+"electronegative_atoms": [int(m.GetProp("Electronegative Atoms")) for m in molecules],
})
return df
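
The hunk above reads each descriptor from an SD property string and converts it to `int` or `float`. A minimal sketch of that mapping, written so it runs without RDKit: `FakeMol` is a hypothetical stand-in that mimics only `GetProp`, and `props_to_columns` is an illustrative name, not part of utils.py. The `n_atoms` column is left out because it depends on RDKit's `AddHs`.

```python
# Column name, SD property name, and converter, copied from the dict in the diff.
PROPERTY_COLUMNS = [
    ("aromatic_nitrogens", "Aromatic Nitrogens", int),
    ("molecular_complexity", "Molecular Complexity", float),
    ("molecular_flexibility", "Molecular Flexibility", float),
    ("rotatable_bonds", "Rotatable Bonds", int),
    ("stereo_centers", "Stereo Centers", int),
    ("electronegative_atoms", "Electronegative Atoms", int),
]


class FakeMol:
    """Hypothetical stand-in exposing only GetProp, like an RDKit Mol."""

    def __init__(self, props):
        self._props = props

    def GetProp(self, name):
        # Raises KeyError when the property is missing.
        return self._props[name]


def props_to_columns(molecules):
    """Return {column: [typed values]} for the property-backed columns."""
    mols = list(molecules)  # materialize once so each column sees the same molecules
    return {
        column: [convert(m.GetProp(prop)) for m in mols]
        for column, prop, convert in PROPERTY_COLUMNS
    }
```

The resulting dict has the same shape as the one passed to `pd.DataFrame` in the hunk, so it could be handed to that constructor directly.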

8 changes: 8 additions & 0 deletions analysis/data/reference/reference_datasets_means.csv
@@ -0,0 +1,8 @@
,n_atoms,aromatic_nitrogens,molecular_complexity,molecular_flexibility,rotatable_bonds,stereo_centers,electronegative_atoms
RCX2024,33.013623978201636,0.45231607629427795,0.7298077929155314,0.36499000844686647,3.2643051771117166,0.7493188010899182,4.741144414168938
Wang2020,21.607981220657276,0.0539906103286385,0.45897370892018785,0.32600432394366197,2.57981220657277,0.6572769953051644,1.7511737089201878
Wang2022_ES,24.0,0.0,0.39790638297872344,0.30278372340425525,1.6808510638297873,0.7446808510638298,1.425531914893617
Wang2022_TMS,33.675879396984925,0.12562814070351758,0.5245158542713568,0.6300572110552763,4.400753768844221,0.3417085427135678,2.408291457286432
Schreckenbach2021,30.90909090909091,0.030303030303030304,0.5764693939393939,0.502788393939394,4.121212121212121,0.696969696969697,6.03030303030303
Asgeirsson2017,19.761904761904763,0.0,0.27145,0.40097523809523805,1.4285714285714286,0.2857142857142857,1.7619047619047619
Lee2022,18.27848101265823,1.3924050632911393,0.7562775949367087,0.04332087215189873,0.0,0.0,5.493670886075949
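
The RCX2024, Asgeirsson2017, and Lee2022 rows here agree with the copy under analysis/Python_scripts once rounded to three decimals; the other rows differ between the two files. A small sketch of that rounding, using the RCX2024 values from this file. The helper name `format_row` is hypothetical.

```python
def format_row(name, values, decimals=3):
    """Render one CSV row with fixed-precision numbers."""
    return ",".join([name] + [f"{v:.{decimals}f}" for v in values])


# RCX2024 means as listed in analysis/data/reference/reference_datasets_means.csv.
rcx2024 = [
    33.013623978201636,
    0.45231607629427795,
    0.7298077929155314,
    0.36499000844686647,
    3.2643051771117166,
    0.7493188010899182,
    4.741144414168938,
]

print(format_row("RCX2024", rcx2024))
# prints: RCX2024,33.014,0.452,0.730,0.365,3.264,0.749,4.741
```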