Friday, February 18, 2011

An acronym in cheminformatics: GEAM for group environment annotation marker

GEAM stands for group environment annotation marker. In chemistry and molecular graph theory, the word group means subgraph or substructure. A group is a formal part of a molecule: a formal building block. Where it coincides with an observable chemical (sub)species, it is typically called a fragment.

The word group is best known from its occurrence in the term group contribution method or group contribution model (GCM), referring to the concept of group additivity that rationalizes certain molecular properties as an additive function of group contributions (incremental values associated with particular groups).
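As a rough illustration of group additivity, the following Python sketch estimates a property as the sum of count times contribution over all groups. The function name, the group labels and the numbers are hypothetical placeholders chosen for this example; they are not parameters of any published GCM.

```python
def estimate_property(group_counts, contributions):
    """Additive estimate: sum of (group count * group contribution)."""
    return sum(count * contributions[group]
               for group, count in group_counts.items())


# Arbitrary placeholder values, for illustration only (not real GCM parameters).
example_counts = {"CH3": 2, "CH2": 4}
example_contributions = {"CH3": 1.0, "CH2": 0.5}

print(estimate_property(example_counts, example_contributions))
```

A real GCM would supply its own group definitions and fitted contribution values, and often a constant term or correction factors in addition to the plain sum.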

GCMs typically define their own notations to specify groups. The chemical language CurlySMILES [1] provides an independent approach to encoding groups: a substructure is encoded like the structure of a whole molecule in SMILES. Then, each substructure-defining open bond is encoded as an annotation to the atomic node at which the open bond occurs. For example, the cyclobutanecarbonyl group of the corresponding acyl halides can be encoded in CurlySMILES as follows:

C1CCC1C{-X}=O

The GEAM -X indicates the possible group environment: a halogen atom (F, Cl, Br or I). To indicate an environment of alkyl groups instead of halogen atoms, one would use the notation C1CCC1C{-R}=O for the cyclobutanecarbonyl group. The environment is restricted to alkyl groups with 4 to 7 C atoms by the notation C1CCC1C{-Rn=4-7}=O. Further examples of group environment notations, including terminal groups as well as multivalent groups, are available [2].
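To make the notation concrete, here is a minimal Python sketch that separates the curly-brace annotations from the SMILES skeleton and reads the three GEAM forms shown in this post. It is not the CurlySMILES reference implementation: the function names and the regular expressions are assumptions made for this example, the leading character of the annotation is assumed to be a bond symbol, and only the forms -X, -R and -Rn=4-7 are covered.

```python
import re

# Matches one curly-brace annotation such as "{-X}" or "{-Rn=4-7}".
ANNOTATION = re.compile(r"\{([^{}]*)\}")

# Hypothetical pattern covering only the GEAM forms used in this post:
# a leading bond symbol (assumed), one marker letter, and an optional
# "n=lo-hi" range for the number of C atoms.
GEAM = re.compile(r"^(?P<bond>[-=#])(?P<marker>[A-Z])"
                  r"(?:n=(?P<lo>\d+)-(?P<hi>\d+))?$")


def split_annotations(curly_smiles):
    """Return (skeleton without annotations, list of annotation strings)."""
    annotations = ANNOTATION.findall(curly_smiles)
    skeleton = ANNOTATION.sub("", curly_smiles)
    return skeleton, annotations


def parse_geam(annotation):
    """Parse one annotation; return None if it is not a recognized GEAM."""
    match = GEAM.match(annotation)
    if match is None:
        return None
    carbon_range = None
    if match.group("lo") is not None:
        carbon_range = (int(match.group("lo")), int(match.group("hi")))
    return {"bond": match.group("bond"),
            "marker": match.group("marker"),
            "carbon_range": carbon_range}


for code in ("C1CCC1C{-X}=O", "C1CCC1C{-R}=O", "C1CCC1C{-Rn=4-7}=O"):
    skeleton, annotations = split_annotations(code)
    print(skeleton, [parse_geam(a) for a in annotations])
```

The sketch does not record which atom each annotation belongs to; a fuller parser would keep the position of every annotation so that the open bond stays attached to its atomic node.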

References
[1] Axel Drefahl: CurlySMILES: a chemical language to customize and annotate encodings of molecular and nanodevice structures. Journal of Cheminformatics 2011, 3:1.
DOI: 10.1186/1758-2946-3-1.
[2] Axel Drefahl: CurlySMILES: group environment annotation [www.axeleratio.com/csm/proj/grpenvann.htm].
