A Mathis-OpenBabel type conversion format to SMILES and vice-versa


Post by Cr6 on Wed Apr 08, 2015 12:27 am

Was looking at this program, Open Babel:

http://sourceforge.net/projects/openbabel/files/latest/download

The SMILES specification (OpenSMILES):
http://www.opensmiles.org/opensmiles.html

And this program FROWNS:
http://frowns.sourceforge.net/

I've been thinking about the requirements for converting .mol files to a Mathis-style, SMILES-like syntax.
Since Mathis doesn't use electron bonding, a separate syntax for the Charge Field will be needed, one that can be converted to SMILES and back. It will need to be more encompassing and more granular than the current valence electron-bonding theory.

(Simplified Molecular Input Line Entry System) 
http://en.wikipedia.org/wiki/Simplified_molecular-input_line-entry_system

Of course, the Mathis-style syntax will need to be expressive enough for, and consistent with, his mechanical theory based on slots, holes, the photon charge field, neutrons, and photon spin direction.

SMILES format
----------

Oc1cc2c(CCN)c[nH]c2cc1    SEROTONIN
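
To make the SMILES line a bit more concrete, here is a minimal Python sketch that splits the string into tokens and counts heavy atoms and rings. It is only an illustration: the token pattern covers a small subset of SMILES (no charges or isotopes outside brackets), every bracket atom is counted as a heavy atom, and the names TOKEN, tokenize and summarize are made up for this example, not part of Open Babel or FROWNS.

```python
import re

# Token pattern for a small subset of SMILES (an assumption of this sketch).
TOKEN = re.compile(
    r"(?P<bracket>\[[^\]]+\])"               # bracket atom, e.g. [nH]
    r"|(?P<atom>Br|Cl|[BCNOPSFI]|[bcnops])"  # organic-subset atom
    r"|(?P<ring>%\d\d|\d)"                   # ring-closure label
    r"|(?P<other>[()=#/\\.:\-])"             # branches and bond symbols
)


def tokenize(smiles):
    """Return a list of (kind, text) tokens, or raise on an unknown character."""
    tokens = []
    pos = 0
    while pos < len(smiles):
        match = TOKEN.match(smiles, pos)
        if match is None:
            raise ValueError(f"unexpected character {smiles[pos]!r} at {pos}")
        tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


def summarize(smiles):
    """Count heavy atoms and rings (each ring uses its label twice)."""
    tokens = tokenize(smiles)
    atoms = [text for kind, text in tokens if kind in ("bracket", "atom")]
    ring_labels = [text for kind, text in tokens if kind == "ring"]
    return {"heavy_atoms": len(atoms), "rings": len(ring_labels) // 2}


print(summarize("Oc1cc2c(CCN)c[nH]c2cc1"))
```

For serotonin this reports 13 heavy atoms and 2 rings, which agrees with the Mol file below: 25 atoms minus 12 hydrogens.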


Mol file
----------

SEROTONIN
  WLViewer          3D                             0

 25 26  0  0  0  0  0  0  0  0  0
    6.9030    1.5260   -0.0830 H   0  0  0  0  0  0  0  0  0  1
    3.0680    6.5290    0.0410 O   0  0  0  0  0  0  0  0  0  2
   11.9930    5.8600    0.1510 H   0  0  0  0  0  0  0  0  0  3
   10.9480    7.0240   -0.4660 H   0  0  0  0  0  0  0  0  0  4
   11.0560    6.2870    0.2380 N   0  0  0  0  0  0  0  0  0  5
   10.0540    5.2590    0.0370 C   0  0  0  0  0  0  0  0  0  6
    8.6080    5.8190    0.1170 C   0  0  0  0  0  0  0  0  0  7
    6.9260    2.5480   -0.0460 N   0  0  0  0  0  0  0  0  0  8
    8.1080    3.3620    0.0010 C   0  0  0  0  0  0  0  0  0  9
    7.6950    4.6450    0.0450 C   0  0  0  0  0  0  0  0  0 10
    5.3130    5.7840    0.0500 C   0  0  0  0  0  0  0  0  0 11
    6.2320    4.6610    0.0240 C   0  0  0  0  0  0  0  0  0 12
    5.8050    3.4110   -0.0300 C   0  0  0  0  0  0  0  0  0 13
    4.4070    3.0670   -0.0600 C   0  0  0  0  0  0  0  0  0 14
    3.5220    4.0750   -0.0380 C   0  0  0  0  0  0  0  0  0 15
    3.9970    5.5040    0.0200 C   0  0  0  0  0  0  0  0  0 16
    3.5420    7.4090    0.0800 H   0  0  0  0  0  0  0  0  0 17
   10.1900    4.8490   -0.8650 H   0  0  0  0  0  0  0  0  0 18
   10.1660    4.5590    0.7420 H   0  0  0  0  0  0  0  0  0 19
    8.4740    6.3080    0.9790 H   0  0  0  0  0  0  0  0  0 20
    8.4350    6.4390   -0.6490 H   0  0  0  0  0  0  0  0  0 21
    9.0546    3.0397    0.0011 H   0  0  0  0  0  0  0  0  0 22
    5.6440    6.7269    0.0886 H   0  0  0  0  0  0  0  0  0 23
    4.1074    2.1136   -0.0958 H   0  0  0  0  0  0  0  0  0 24
    2.5420    3.8771   -0.0602 H   0  0  0  0  0  0  0  0  0 25
  1  8  1  0  0  0
  2 16  1  0  0  0
  2 17  1  0  0  0
  3  5  1  0  0  0
  4  5  1  0  0  0
  5  6  1  0  0  0
  6 19  1  0  0  0
  6  7  1  0  0  0
  6 18  1  0  0  0
  7 20  1  0  0  0
  7 10  1  0  0  0
  7 21  1  0  0  0
  8 13  1  0  0  0
  8  9  1  0  0  0
  9 22  1  0  0  0
  9 10  2  0  0  0
 10 12  1  0  0  0
 11 23  1  0  0  0
 11 16  2  0  0  0
 11 12  1  0  0  0
 12 13  2  0  0  0
 13 14  1  0  0  0
 14 24  1  0  0  0
 14 15  2  0  0  0
 15 25  1  0  0  0
 15 16  1  0  0  0
M  END
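
As a rough sketch of how the Mol block above is laid out (three header lines, a counts line, then the atom and bond tables), the following Python reads a short excerpt of it. The excerpt is atoms 2, 16 and 17 from the serotonin file, renumbered 1 to 3 for illustration. The function names are hypothetical, and splitting on whitespace is an assumption that works for this file; real Mol files use fixed-width columns.

```python
from collections import Counter

# Excerpt of the serotonin Mol block: atoms 2 (O), 16 (C) and 17 (H),
# renumbered 1-3 so the fragment is self-consistent. Illustration only.
MOL_EXCERPT = """\
SEROTONIN (excerpt)
  WLViewer          3D                             0

  3  2  0  0  0  0  0  0  0  0  0
    3.0680    6.5290    0.0410 O   0  0  0  0  0  0  0  0  0  1
    3.9970    5.5040    0.0200 C   0  0  0  0  0  0  0  0  0  2
    3.5420    7.4090    0.0800 H   0  0  0  0  0  0  0  0  0  3
  1  2  1  0  0  0
  1  3  1  0  0  0
M  END
"""


def parse_mol(text):
    """Return (atoms, bonds) from a Mol block, splitting fields on whitespace."""
    lines = text.splitlines()
    # Lines 1-3 are the header; line 4 holds the atom and bond counts.
    counts = lines[3].split()
    n_atoms, n_bonds = int(counts[0]), int(counts[1])
    atoms = []
    for line in lines[4:4 + n_atoms]:
        x, y, z, symbol = line.split()[:4]
        atoms.append((symbol, float(x), float(y), float(z)))
    bonds = []
    for line in lines[4 + n_atoms:4 + n_atoms + n_bonds]:
        first, second, order = (int(value) for value in line.split()[:3])
        bonds.append((first, second, order))
    return atoms, bonds


def element_counts(atoms):
    """Count how many atoms of each element appear."""
    return Counter(symbol for symbol, _x, _y, _z in atoms)


atoms, bonds = parse_mol(MOL_EXCERPT)
print(element_counts(atoms), bonds)
```

Run on the full serotonin block, the same function should give 25 atoms and 26 bonds, matching the counts line.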


Last edited by Cr6 on Thu Apr 09, 2015 1:21 am; edited 3 times in total

Cr6
Admin

Posts : 676
Join date : 2014-08-09


cclib - parses output files from ADF, Firefly, GAMESS, GAMESS-UK, Gaussian, Jaguar and ORCA

Post by Cr6 on Thu Apr 09, 2015 12:55 am

This is a Python library for working with the output of molecular modeling programs.

Description


cclib is an open source library, written in Python, for parsing and interpreting the results of computational chemistry packages. It currently parses output files from ADF, Firefly, GAMESS, GAMESS-UK, Gaussian, Jaguar and ORCA.

IMPORTANT! As of version 1.2, cclib development has moved to GitHub. Please use the following pages for up-to-date information about cclib:

Repository (source code, tracker) - https://github.com/cclib/cclib
Online documentation - http://cclib.github.io/

class ccData(object):
"""Stores data extracted by cclib parsers
Description of cclib attributes:
aonames -- atomic orbital names (list of strings)
aooverlaps -- atomic orbital overlap matrix (array[2])
atombasis -- indices of atomic orbitals on each atom (list of lists)
atomcharges -- atomic partial charges (dict of arrays[1])
atomcoords -- atom coordinates (array[3], angstroms)
atommasses -- atom masses (array[1], daltons)
atomnos -- atomic numbers (array[1])
atomspins -- atomic spin densities (dict of arrays[1])
charge -- net charge of the system (integer)
ccenergies -- molecular energies with Coupled-Cluster corrections (array[2], eV)
coreelectrons -- number of core electrons in atom pseudopotentials (array[1])
enthalpy -- sum of electronic and thermal enthalpies (float, hartree/particle)
entropy -- entropy (float, hartree/particle)
etenergies -- energies of electronic transitions (array[1], 1/cm)
etoscs -- oscillator strengths of electronic transitions (array[1])
etrotats -- rotatory strengths of electronic transitions (array[1], ??)
etsecs -- singly-excited configurations for electronic transitions (list of lists)
etsyms -- symmetries of electronic transitions (list of string)
freeenergy -- sum of electronic and thermal free energies (float, hartree/particle)
fonames -- fragment orbital names (list of strings)
fooverlaps -- fragment orbital overlap matrix (array[2])
fragnames -- names of fragments (list of strings)
frags -- indices of atoms in a fragment (list of lists)
gbasis -- coefficients and exponents of Gaussian basis functions (PyQuante format)
geotargets -- targets for convergence of geometry optimization (array[1])
geovalues -- current values for convergence of geometry optimization (array[1])
grads -- current values of forces (gradients) in geometry optimization (array[3])
hessian -- elements of the force constant matrix (array[1])
homos -- molecular orbital indices of HOMO(s) (array[1])
mocoeffs -- molecular orbital coefficients (list of arrays[2])
moenergies -- molecular orbital energies (list of arrays[1], eV)
moments -- molecular multipole moments (list of arrays[], a.u.)
mosyms -- orbital symmetries (list of lists)
mpenergies -- molecular electronic energies with Møller-Plesset corrections (array[2], eV)
mult -- multiplicity of the system (integer)
natom -- number of atoms (integer)
nbasis -- number of basis functions (integer)
nmo -- number of molecular orbitals (integer)
nocoeffs -- natural orbital coefficients (array[2])
nooccnos -- natural orbital occupation numbers (array[1])
optdone -- flags whether an optimization has converged (Boolean)
scancoords -- geometries of each scan step (array[3], angstroms)
scanenergies -- energies of potential energy surface (list)
scannames -- names of variables scanned (list of strings)
scanparm -- values of parameters in potential energy surface (list of tuples)
scfenergies -- molecular electronic energies after SCF (Hartree-Fock, DFT) (array[1], eV)
scftargets -- targets for convergence of the SCF (array[2])
scfvalues -- current values for convergence of the SCF (list of arrays[2])
temperature -- temperature used for Thermochemistry (float, kelvin)
vibanharms -- vibrational anharmonicity constants (array[2], 1/cm)
vibdisps -- cartesian displacement vectors (array[3], delta angstrom)
vibfreqs -- vibrational frequencies (array[1], 1/cm)
vibirs -- IR intensities (array[1], km/mol)
vibramans -- Raman intensities (array[1], A^4/Da)
vibsyms -- symmetries of vibrations (list of strings)
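
A minimal sketch of how these attributes might be read in practice, assuming a cclib version that provides cclib.io.ccread and reports scfenergies in eV as the docstring above says (newer releases may differ). The function names summarize_output and ev_to_hartree are made up for this example, and the file path is whatever output file you have on hand.

```python
HARTREE_TO_EV = 27.211386245988  # CODATA 2018 value


def ev_to_hartree(energy_ev):
    """Convert an energy from eV to hartree."""
    return energy_ev / HARTREE_TO_EV


def summarize_output(path):
    """Read one output file with cclib and report a few attributes."""
    # Third-party import kept local so the converter above works without cclib.
    from cclib.io import ccread

    data = ccread(path)
    if data is None:
        raise ValueError(f"cclib could not identify {path!r}")
    summary = {"natom": data.natom, "atomnos": list(data.atomnos)}
    # Not every attribute is present for every calculation, so probe first.
    if hasattr(data, "scfenergies"):
        summary["final_scf_hartree"] = ev_to_hartree(data.scfenergies[-1])
    return summary
```

Calling summarize_output on a Gaussian or ORCA log file would then give the atom count, the atomic numbers, and the last SCF energy converted to hartree.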


Re: A Mathis-OpenBabel type conversion format to SMILES and vice-versa

Post by Cr6 on Thu Apr 09, 2015 1:04 am

http://www.opensmiles.org/opensmiles.html


This document is intended for developers designing or improving a SMILES parser or writer. Readers are expected to be acquainted with SMILES. Due to the formality of this document, it is not a good tutorial for those trying to learn SMILES. This document is written with precision as the primary goal; readability is secondary.

What is a Molecule? The Valence Model of Chemistry

Before defining the SMILES language, it is important to state the physical model on which it is based: the valence model of chemistry, which uses a mathematician’s graph to represent a molecule. In a chemical graph, the nodes are atoms, and the edges are semi-rigid bonds that can be single, double, or triple according to the rules of valence bond theory.
This simple mental model has little resemblance to the underlying quantum-mechanical reality of electrons, protons and neutrons, yet it has proved to be a remarkably useful approximation of how atoms behave in close proximity to one another. However, the valence model is an imperfect representation of molecular structure, and the SMILES language inherits these imperfections. Chemical bonds are often tautomeric, aromatic or otherwise fractional rather than neat integer multiples. Delocalized bonds, bond-centered bonds, hydrogen bonds and various other inter-atom forces that are well characterized by a quantum-mechanics description simply don’t fit into the valence model.
"If you can build a molecule from a modeling kit, you can name it."
McLeod Peters
McLeod and Peters’ quip captures the deficiencies of SMILES well: if you can’t build a molecule from a modeling kit, the deficiencies of SMILES and other connection-table formats become apparent.
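
The valence model quoted above (atoms as nodes, bonds as edges with integer orders) can be sketched in a few lines of Python. The element symbols and bonds here are transcribed from the serotonin Mol file in the first post. The valence table (H 1, C 4, N 3, O 2) and the function name valence_errors are assumptions of this sketch, which only handles neutral atoms.

```python
from collections import defaultdict

# Element symbols for atoms 1-25 of the serotonin Mol file, in file order.
SYMBOLS = "H O H H N C C N C C C C C C C C H H H H H H H H H".split()

# (atom, atom, bond order) for all 26 bonds, 1-based as in the file.
BONDS = [
    (1, 8, 1), (2, 16, 1), (2, 17, 1), (3, 5, 1), (4, 5, 1), (5, 6, 1),
    (6, 19, 1), (6, 7, 1), (6, 18, 1), (7, 20, 1), (7, 10, 1), (7, 21, 1),
    (8, 13, 1), (8, 9, 1), (9, 22, 1), (9, 10, 2), (10, 12, 1), (11, 23, 1),
    (11, 16, 2), (11, 12, 1), (12, 13, 2), (13, 14, 1), (14, 24, 1),
    (14, 15, 2), (15, 25, 1), (15, 16, 1),
]

# Usual valences for neutral atoms (an assumption of this sketch).
EXPECTED_VALENCE = {"H": 1, "C": 4, "N": 3, "O": 2}


def valence_errors(symbols, bonds):
    """Return {atom number: (symbol, bond-order sum)} for atoms that do not
    match the expected valence."""
    total = defaultdict(int)
    for first, second, order in bonds:
        total[first] += order
        total[second] += order
    return {
        number: (symbol, total[number])
        for number, symbol in enumerate(symbols, start=1)
        if total[number] != EXPECTED_VALENCE[symbol]
    }


print(valence_errors(SYMBOLS, BONDS))
```

Every atom in the serotonin file satisfies its usual valence, so the result is empty. This is the bookkeeping that a Charge Field syntax would have to replace or extend, since it has no place for anything other than integer bond orders.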
