/** @page libsbml-python-math Mathematical Expressions and their Manipulation
This section describes libSBML's facilities for working with SBML
representations of mathematical expressions.
@section math-overview Basic Concepts
LibSBML uses Abstract Syntax
Trees (ASTs) to provide a canonical, in-memory representation for all
mathematical formulas regardless of their original format (i.e., C-like
infix strings or MathML). In
libSBML, an AST is a collection of one or more objects of type ASTNode_t.
An AST @em node in libSBML is a recursive structure containing a pointer to
the node's value (which might be, for example, a number or a symbol) and a
list of children nodes. Each ASTNode_t node may have zero, one, two, or
more children depending on its type. The following diagram illustrates an
example of how the mathematical expression "1 + 2" is represented as an AST
with one @em plus node having two @em integer children nodes for the
numbers 1 and 2. The figure also shows the corresponding MathML 2.0 representation:
@image html astnode-illustration.jpg "Example AST representation of a mathematical expression."
@image latex astnode-illustration.jpg "Example AST representation of a mathematical expression."
The following are noteworthy about the AST representation in libSBML:
@li A numerical value represented in MathML 2.0 as a real number with an
exponent is preserved as such in the AST node representation, even if the
number could be stored in a @c float data type. This is done so that when
an SBML model is read in and then written out again, the amount of change
introduced by libSBML to the SBML during the round-trip activity is
minimized.
@li Rational numbers are represented in an AST node using separate
numerator and denominator values. These can be retrieved using the
methods ASTNode.getNumerator() and ASTNode.getDenominator().
@li The children of an ASTNode are other ASTNode objects. The list of
children is empty for nodes that are leaf elements, such as numbers.
For nodes that are actually roots of expression subtrees, the list of
children points to the parsed objects that make up the rest of the
expression.
For many applications, the details of ASTs are irrelevant because the
applications can use the text-string based translation functions such as
libsbml.formulaToString() and libsbml.parseFormula(). If you find the
complexity of using the AST representation of expressions too high for your
purposes, perhaps the string-based functions will be more suitable.
Finally, it is worth noting that the AST and MathML handling code in
libSBML remains written in C, not C++, as all of libSBML was originally
written in C. Readers may occasionally wonder why some aspects are more
C-like than following a C++ style, and that's the reason.
@section math-convert Converting between ASTs and Text Strings
SBML Level 2 represents mathematical expressions using MathML 2.0 (more specifically, a
subset of the content portion of MathML 2.0), but most
software applications using libSBML do not use MathML directly. Instead,
applications generally either interact with mathematics in text-string
form, or else they use the API for working with Abstract Syntax Trees
(described below). LibSBML provides support for both approaches. The
libSBML formula parser has been carefully engineered so that
transformations from MathML to infix string notation and back are
possible with a minimum of disruption to the structure of the mathematical
expression.
The example below shows a simple program that, when run, takes a MathML
string compiled into the program, converts it to an AST, converts
that to an infix representation of the formula, compares it to the
expected form of that formula, and finally translates that formula back to
MathML and displays it. The output displayed on the terminal should have
the same structure as the MathML it started with. The program is a simple
example of using the various MathML and AST reading and writing methods,
and shows that libSBML preserves the ordering and structure of the
mathematical expressions.
@verbatim
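import libsbml

expected = "1 + f(x)"

# MathML content for the expression 1 + f(x).
xml = ("<math xmlns='http://www.w3.org/1998/Math/MathML'>"
       "<apply><plus/><cn type='integer'> 1 </cn>"
       "<apply><ci> f </ci><ci> x </ci></apply>"
       "</apply></math>")

# MathML -> AST -> infix string.
ast = libsbml.readMathMLFromString(xml)
result = libsbml.formulaToString(ast)

if result == expected:
    print("Got expected result")
else:
    print("Mismatch after readMathMLFromString()")

# Infix string -> AST -> MathML.
new_mathml = libsbml.parseFormula(result)
new_string = libsbml.writeMathMLToString(new_mathml)

print("Result of writing AST to string: ")
print(new_string)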
@endverbatim
The text-string form of mathematical formulas produced by
libsbml.formulaToString() and read by libsbml.parseFormula() is a simple
C-inspired infix notation taken from SBML Level 1. It is summarized
in the next section below. A formula in this text-string form therefore
can be handed to a program that understands SBML Level 1 mathematical
expressions, or used as part of a translation system. In summary, the
functions available are the following:
@li \link formulaToString()
libsbml.formulaToString(ASTNode_t)
\endlink \f$\rightarrow\f$
@c string
reads an AST, converts it to a text string in SBML Level 1 formula
syntax, and returns it. In Python, the result is an ordinary string, so
the caller does not need to free it explicitly.
@li \link parseFormula() libsbml.parseFormula(string)
\endlink
\f$\rightarrow\f$ @c ASTNode_t
reads a text-string containing a mathematical expression in
SBML Level 1 syntax, and returns an AST corresponding to the
expression.
@section math-diffs The String Formula Syntax and Differences with MathML
The text-string formula syntax is an infix notation essentially derived
from the syntax of the C programming language and was originally used in
SBML Level 1. The formula strings may contain operators, function
calls, symbols, and white space characters. The allowable white space
characters are tab and space. The following are illustrative examples of
formulas expressed in the syntax:
@verbatim
0.10 * k4^2
@endverbatim
@verbatim
(vm * s1)/(km + s1)
@endverbatim
The following table shows the precedence rules in this syntax. In the
Class column, @em operand implies the construct is an operand, @em prefix
implies the operation is applied to the following arguments, @em unary
implies there is one argument, and @em binary implies there are two
arguments. The values in the Precedence column show how the order of
different types of operation are determined. For example, the expression
a * b + c is evaluated as (a * b) + c because the @c *
operator has higher precedence. The Associates column shows how the order
of similar precedence operations is determined; for example, a - b +
c is evaluated as (a - b) + c because the @c + and @c -
operators are left-associative. The precedence and associativity rules are
taken from the C programming language, except for the symbol @c ^, which is
used in C for a different purpose. (Exponentiation can be invoked using
either @c ^ or the function @c power.)
@image html string-syntax.jpg "Table of precedence rules."
@image latex string-syntax.jpg "Table of precedence rules."
A program parsing a formula in an SBML model should assume that names
appearing in the formula are the identifiers of Species, Parameter,
Compartment, FunctionDefinition, or Reaction objects defined in a model.
When a function call is involved, the syntax consists of a function
identifier, followed by optional white space, followed by an opening
parenthesis, followed by a sequence of zero or more arguments separated by
commas (with each comma optionally preceded and/or followed by zero or more
white space characters), followed by a closing parenthesis. There is an
almost one-to-one mapping between the list of predefined functions
available, and those defined in MathML. All of the MathML functions are
recognized; this set is larger than the functions defined in SBML Level 1.
In the subset of functions that overlap between MathML and SBML Level 1,
there exist a few differences. The following table summarizes the
differences between the predefined functions in SBML Level 1 and the MathML
equivalents in SBML Level 2:
| Text string formula functions | MathML equivalents in SBML Level 2 |
|-------------------------------|------------------------------------|
| acos                          | arccos                             |
| asin                          | arcsin                             |
| atan                          | arctan                             |
| ceil                          | ceiling                            |
| log                           | ln                                 |
| log10(x)                      | log(10, x)                         |
| pow(x, y)                     | power(x, y)                        |
| sqr(x)                        | power(x, 2)                        |
| sqrt(x)                       | root(2, x)                         |
@section math-ast Methods for working with libSBML's Abstract Syntax Trees
Every ASTNode in a libSBML AST has an associated type, a value taken from
the enumeration
ASTNodeType_t. The list of possible types is quite long,
because it covers all the mathematical functions that are permitted in
SBML. The values are shown in the following table; their names hopefully
evoke the construct that they represent:
|                      |                         |                    |
|----------------------|-------------------------|--------------------|
| AST_UNKNOWN          | AST_FUNCTION_ARCCOTH    | AST_FUNCTION_POWER |
| AST_PLUS             | AST_FUNCTION_ARCCSC     | AST_FUNCTION_ROOT  |
| AST_MINUS            | AST_FUNCTION_ARCCSCH    | AST_FUNCTION_SEC   |
| AST_TIMES            | AST_FUNCTION_ARCSEC     | AST_FUNCTION_SECH  |
| AST_DIVIDE           | AST_FUNCTION_ARCSECH    | AST_FUNCTION_SIN   |
| AST_POWER            | AST_FUNCTION_ARCSIN     | AST_FUNCTION_SINH  |
| AST_INTEGER          | AST_FUNCTION_ARCSINH    | AST_FUNCTION_TAN   |
| AST_REAL             | AST_FUNCTION_ARCTAN     | AST_FUNCTION_TANH  |
| AST_REAL_E           | AST_FUNCTION_ARCTANH    | AST_LOGICAL_AND    |
| AST_RATIONAL         | AST_FUNCTION_CEILING    | AST_LOGICAL_NOT    |
| AST_NAME             | AST_FUNCTION_COS        | AST_LOGICAL_OR     |
| AST_NAME_TIME        | AST_FUNCTION_COSH       | AST_LOGICAL_XOR    |
| AST_CONSTANT_E       | AST_FUNCTION_COT        | AST_RELATIONAL_EQ  |
| AST_CONSTANT_FALSE   | AST_FUNCTION_COTH       | AST_RELATIONAL_GEQ |
| AST_CONSTANT_PI      | AST_FUNCTION_CSC        | AST_RELATIONAL_GT  |
| AST_CONSTANT_TRUE    | AST_FUNCTION_CSCH       | AST_RELATIONAL_LEQ |
| AST_LAMBDA           | AST_FUNCTION_EXP        | AST_RELATIONAL_LT  |
| AST_FUNCTION         | AST_FUNCTION_FACTORIAL  | AST_RELATIONAL_NEQ |
| AST_FUNCTION_ABS     | AST_FUNCTION_FLOOR      |                    |
| AST_FUNCTION_ARCCOS  | AST_FUNCTION_LN         |                    |
| AST_FUNCTION_ARCCOSH | AST_FUNCTION_LOG        |                    |
| AST_FUNCTION_ARCCOT  | AST_FUNCTION_PIECEWISE  |                    |
There are a number of methods for interrogating the type of an ASTNode and
for testing whether a node belongs to a general category of constructs.
The methods are the following:
@li ASTNodeType_t ASTNode.getType()
returns the type of
this AST node.
@li bool ASTNode.isConstant()
returns @c true if this
AST node is a MathML constant (@c true, @c false, @c pi, @c exponentiale),
@c false otherwise.
@li bool ASTNode.isBoolean()
returns @c true if this
AST node returns a boolean value (by being either a logical operator, a
relational operator, or the constant @c true or @c false).
@li bool ASTNode.isFunction()
returns @c true if this
AST node is a function (i.e., a MathML defined function such as @c exp or
else a function defined by a FunctionDefinition in the Model).
@li bool ASTNode.isInfinity()
returns @c true if this
AST node is the special IEEE 754 value infinity.
@li bool ASTNode.isInteger()
returns @c true if this
AST node is holding an integer value.
@li bool ASTNode.isNumber()
returns @c true if this
AST node is holding any number.
@li bool ASTNode.isLambda()
returns @c true if this
AST node is a MathML @c lambda construct.
@li bool ASTNode.isLog10()
returns @c true if this
AST node represents the @c log10 function, specifically, that its type is
AST_FUNCTION_LOG and it has two children, the first of which is an integer
equal to 10.
@li bool ASTNode.isLogical()
returns @c true if this
AST node is a logical operator (@c and, @c or, @c not, @c xor).
@li bool ASTNode.isName()
returns @c true if this
AST node is a user-defined name or (in SBML Level 2) one of the two special
@c csymbol constructs "delay" or "time".
@li bool ASTNode.isNaN()
returns @c true if this
AST node has the special IEEE 754 value "not a number" (NaN).
@li bool ASTNode.isNegInfinity()
returns @c true if this
AST node has the special IEEE 754 value of negative infinity.
@li bool ASTNode.isOperator()
returns @c true if this
AST node is an operator (e.g., @c +, @c -, etc.).
@li bool ASTNode.isPiecewise()
returns @c true if this
AST node is the MathML @c piecewise function.
@li bool ASTNode.isRational()
returns @c true if this
AST node is a rational number having a numerator and a denominator.
@li bool ASTNode.isReal()
returns @c true if this
AST node is a real number (specifically, AST_REAL, AST_REAL_E, or AST_RATIONAL).
@li bool ASTNode.isRelational()
returns @c true if this
AST node is a relational operator.
@li bool ASTNode.isSqrt()
returns @c true if this
AST node is the square-root operator.
@li bool ASTNode.isUMinus()
returns @c true if this
AST node is a unary minus.
@li bool ASTNode.isUnknown()
returns @c true if this
AST node's type is unknown.
Programs manipulating AST node structures should check the type of a given
node before calling methods that return a value from the node. The
following methods are available for returning values from nodes:
@li long ASTNode.getInteger()
@li char ASTNode.getCharacter()
@li string ASTNode.getName()
@li long ASTNode.getNumerator()
@li long ASTNode.getDenominator()
@li float ASTNode.getReal()
@li float ASTNode.getMantissa()
@li long ASTNode.getExponent()
Finally (and rather predictably), libSBML provides methods for setting the
values of AST nodes.
@li ASTNode.setCharacter(char)
sets the value of
this ASTNode to the given character. If the character is one of @c +, @c -,
@c *, @c / or @c ^, the node type will be set to the appropriate operator type.
For all other characters, the node type will be set to AST_UNKNOWN.
@li ASTNode.setName(string)
sets the value of
this AST node to the given name. The node type will be set (to AST_NAME)
only if the AST node was previously an operator (ASTNode.isOperator()
returns @c true) or a number (ASTNode.isNumber() returns @c true). This
allows names to be set for AST_FUNCTIONs and the like.
@li ASTNode.setValue(int)
sets the value of the
node to the given integer. Equivalent to the next method.
@li ASTNode.setValue(long)
sets the value of the
node to the given integer.
@li ASTNode.setValue(long, long)
sets the value of this ASTNode to the given rational in two parts: the
numerator and denominator. The node type is set to AST_RATIONAL.
@li ASTNode.setValue(float)
sets the value of
this ASTNode to the given real (float) and sets the node type to AST_REAL.
@li ASTNode.setValue(float, long)
sets the value of this ASTNode to the given real (float) in two parts: the
mantissa and the exponent. The node type is set to AST_REAL_E.
The following are some miscellaneous methods for manipulating ASTs:
@li ASTNode ASTNode.ASTNode(ASTNodeType_t)
creates a new
ASTNode object and returns a pointer to it. The returned node will have
the given type, or a type of AST_UNKNOWN if no type is explicitly given.
@li unsigned int ASTNode.getNumChildren()
returns the number
of children of this AST node, or 0 if this node has no children.
@li ASTNode.addChild(ASTNode)
adds the given node
as a child of this AST node. Child nodes are added in left-to-right order.
@li ASTNode.prependChild(ASTNode)
adds the given
node as a child of this AST node. This method adds child nodes in
right-to-left order.
@li ASTNode ASTNode.getChild (unsigned int)
returns the nth
child of this AST node, or NULL if this node has no nth child (that is, if
n > ASTNode.getNumChildren() - 1).
@li ASTNode ASTNode.getLeftChild()
returns the left child of
this AST node. This is equivalent to ASTNode.getChild(0).
@li ASTNode ASTNode.getRightChild()
returns the right child of this AST node or NULL if this node has no right
child.
@li ASTNode.swapChildren(ASTNode)
swaps the
children of this ASTNode with the children of @c that ASTNode.
@li ASTNode.setType(ASTNodeType_t)
sets the type of this ASTNode to the given ASTNodeType_t enumeration value.
@section math-reading Reading and Writing MathML from/to ASTs
As mentioned above, applications often can avoid working with raw MathML by
using either libSBML's text-string interface or the AST API. However, when
needed, reading MathML content directly and creating ASTs, as well as the
converse task of writing MathML, is easily done using two methods designed
for this purpose:
@li ASTNode readMathMLFromString(string)
reads raw
MathML from a text string, constructs an AST from it, then returns the root
ASTNode of the resulting expression tree.
@li string writeMathMLToString(ASTNode)
writes an AST to a
string. In Python, the result is an ordinary string, so the caller does
not need to free it explicitly.
*/