Skip to content

Latest commit

 

History

History
313 lines (238 loc) · 13.7 KB

expressiontree.md

File metadata and controls

313 lines (238 loc) · 13.7 KB

ExpressionTree

A container that holds arithmetic or logical expressions.

1. Motivation

The main goal of the ExpressionTree is to effectively store a tree-like structure of parsed expression and to effectively build this structure during sequential parsing.

2. Description

template <typename OperationType, typename SubTree, int holdSize, typename... Ts>
class ExpressionTree;
  • OperationType - a type that represents the expression operators (arithmetic or logical)
  • SubTree - a type that represents a head of subtree (subexpression)
  • holdSize - count of nodes that should be stored on the stack without allocations
  • Ts... - types of the expression arguments.

ExpressionTree does not support operator precedence. You can support it manually as it done in QueryEntries and SelectIteratorContainer, or by enclosing higher priority operators in brackets as it done in SortExpression. Here is not used the traditional way for constructing trees with inheritance of nodes, allocations of separate nodes and holding pointers to them. ExpressionTree holds all nodes by value in a vector (container_) sequentially in type Node based on variant. In order to support lazy copying Node can hold a reference to payload of another Node by using ExpressionTree::Ref<T> type.

Warning: The lazy copy node shall not live longer than the original one.

Subtree is stored in container_ just behind its head (SubTree) which holds occupied space. For details see examples. This architecture allows to reduce count of allocations and virtual functions calls.

2.1. Examples

Expression

A + (B - C - (D + E)) - F

would be stored like

0 1 2 3 4 5 6 7
+, A +, 6 +, B -, C -, 3 +, D +, E -, F

Every cell holds an operator (here + or -) and payload.

Lets see cells 4-5: they correspond to subexpression -(D + E).

  • Cell 4 is a head of subtree: - - operator before the bracket and 3 - count of cells occupied by subtree (from 4 to 6).
  • Cell 5 holds default operator + and value D.
  • Cell 6 holds operator + and value E.

Similarly, cell 1 stores 6 - count of cells occupied by subtree (from 1 to 6).

This structure provides effective forward iteration and doesn't allow to iterate backwards.

Default building process of ExpressionTree is to append elements from beginning to end by using methods Append(), OpenBracket() and CloseBracket(). activeBrackets_ - list of cells of heads of open brackets which sizes should be incremented when new element is added. Lets trace evolution of container_ and activeBrackets_ during construction expression (A + B) - (C + (D - E) + F) - G:

  1. OpenBracket(+). Constructed expression: (
0
+, 1
0
  1. Append(+, A). Constructed expression: (A
0 1
+, 2 +, A
0
  1. Append(+, B). Constructed expression: (A + B
0 1 2
+, 3 +, A +, B
0
  1. CloseBracket(). Constructed expression: (A + B)
0 1 2
+, 3 +, A +, B

activeBrackets_ is empty

  1. OpenBracket(-). Constructed expression: (A + B) - (
0 1 2 3
+, 3 +, A +, B -, 1
3
  1. Append(+, C). Constructed expression: (A + B) - (C
0 1 2 3 4
+, 3 +, A +, B -, 2 +, C
3
  1. OpenBracket(+). Constructed expression: (A + B) - (C + (
0 1 2 3 4 5
+, 3 +, A +, B -, 3 +, C +, 1
3 5
  1. Append(+, D). Constructed expression: (A + B) - (C + (D
0 1 2 3 4 5 6
+, 3 +, A +, B -, 4 +, C +, 2 +, D
3 5
  1. Append(-, E). Constructed expression: (A + B) - (C + (D - E
0 1 2 3 4 5 6 7
+, 3 +, A +, B -, 5 +, C +, 3 +, D -, E
3 5
  1. CloseBracket(). Constructed expression: (A + B) - (C + (D - E)
0 1 2 3 4 5 6 7
+, 3 +, A +, B -, 5 +, C +, 3 +, D -, E
3
  1. Append(+, F). Constructed expression: (A + B) - (C + (D - E) + F
0 1 2 3 4 5 6 7 8
+, 3 +, A +, B -, 6 +, C +, 3 +, D -, E +, F
3
  1. CloseBracket(). Constructed expression: (A + B) - (C + (D - E) + F)
0 1 2 3 4 5 6 7 8
+, 3 +, A +, B -, 6 +, C +, 3 +, D -, E +, F

activeBrackets_ is empty

  1. Append(-, G). Constructed expression: (A + B) - (C + (D - E) + F) - G
0 1 2 3 4 5 6 7 8 9
+, 3 +, A +, B -, 6 +, C +, 3 +, D -, E +, F -, G

activeBrackets_ is empty

3. class Bracket

Default type for SubTree template argument. Just holds size of the subtree that is incremented by Append() and is reduced by Erase().

4. class ExpressionTree::Node

Node is type of container_'s cells. It contains operation (value of OperationType) and a value of one of the types:

  • SubTree (Bracket by default) if it is head of subexpression.
  • one of Ts... types.
  • ExpressionTree::Ref<T>, where T is one of Ts... types, it means that the Node holds reference to payload of another Node which holds value of type T.

4.1. Methods

  • T& Node::Value<T>(), const T& Node::Value<T>() const return reference to payload value if it holds value of type T or Ref<T>, fail otherwise.

  • size_t Node::Size() const returns 1 if it is not head of subexpression or count of cells occupied by subexpression otherwise.

  • bool Node::IsSubTree() const, bool Node::IsLeaf() const test is the Node head of subexpression or vice versa.

  • bool Node::Holds<T>() const returns true if it holds value of type T.

  • void Node::Append() increments size of subexpression if it is head of subexpression, fails otherwise.

  • void Node::Erase(size_t) reduces size of subexpression if it is head of subexpression, fails otherwise.

  • template <typename... Args>
    void Node::ExecuteAppropriate(const std::function<void(Args&)>&... funcs);
    template <typename... Args>
    void Node::ExecuteAppropriate(const std::function<void(const Args&)>&... funcs) const;

    invokes appropriate functor if the Node holds value of one of Args... types or Ref<T>, where T is one of Args... types, no functor will be invoked otherwise.

  • template <typename R>
    R Node::CalculateAppropriate(const std::function<R(const SubTree&)>& f, const std::function<R(const Ts&)>&... funcs) const;

    invokes appropriate functor depending on type of value is held by Node and provides returned value.

  • Node Node::MakeLazyCopy()&

    • returns copy of origin one if it is head of subexpression or holds value of Ref<T> type.
    • returns new Node that holds Ref<T> which references to payload of origin one if it holds T (one of Ts...).

    Warning the copy shall not live longer than the origin.

  • Node Node::MakeDeepCopy() const &

    • returns copy of origin one if it is head of subexpression or holds value of one of Ts... types.
    • returns new Node which holds copy of value that Ref<T> references to if origin one holds value of Ref<T> type.
  • Node Node::MakeDeepCopy() &&

    • returns move-copy of origin one if it is head of subexpression or holds value of one of Ts... types.
    • returns new Node which holds copy of value that Ref<T> references to if origin one holds value of Ref<T> type.
  • bool Node::IsRef() const returns true if it holds reference to payload of another Node.

  • void Node::SetValue<T>(T&&) sets the Node to hold new value of type T.

5. class ExpressionTree::Ref<T>

Ref<T> is a wrapper on pointer to T. It is used for lazy coping of Node to do not make copy of its payload but to simply create a reference to it.

6. class ExpressionTree::iterator and class ExpressionTree::const_iterator

They are forward iterators which iterates over nodes of one level and do not go into subexpressions. So if an iterator it points not to head of a subexpression after operation ++it it will point to next cell. And if an iterator it points to head of a subexpression after operation ++it it will point to the cell next after the last cell of the subexpression. To iterate into subexpression use methods

iterator iterator::begin();
const_iterator iterator::cbegin();
const_iterator const_iterator::begin();
const_iterator const_iterator::cbegin();
iterator iterator::end();
const_iterator iterator::cend();
const_iterator const_iterator::end());
const_iterator const_iterator::cend();

if the iterator points to head of subexpression these methods return an iterator that points to the first or next after the last cell of the subexpression or fail otherwise.

For example, for expression A + B - (C - D + (E - F) - G)

0 1 2 3 4 5 6 7 8
+, A +, B -, 7 +, C -, D +, 3 +, E -, F -, G
  • if it points to cell 1 when it.begin() (and similar) fails and after ++it it will point to cell 2.
  • if it points to cell 2 when it.begin() returns an iterator that points to cell 3, it.end() returns an iterator that points to cell after 8 and ++it makes it to point to cell after 8.
  • if it points to cell 3 when it.begin() (and similar) fails and after ++it it will point to cell 4.
  • if it points to cell 5 when it.begin() returns an iterator that points to cell 6, it.end() returns an iterator that points to cell 8 and ++it makes it to point to cell 8.

iterator can be converted to const_iterator but not vice versa.

7. Methods

  • Copy constructor and copy assignment operator make deep copy for all copying nodes.
  • Get iterators, which point to the first or next after the last cell of the expression:
iterator ExpressionTree::begin();
const_iterator ExpressionTree::begin() const;
const_iterator ExpressionTree::cbegin() const;
iterator ExpressionTree::end();
const_iterator ExpressionTree::end() const;
const_iterator ExpressionTree::cend() const;
  • Get iterator that points to the first cell of current active subexpression (last subexpression for which OpenBracket() was called and CloseBracket() was not) and returns begin() if no active subexpression:
const_iterator ExpressionTree::begin_of_current_bracket() const
  • Append an operand to the end or to the beginning of the expression. T must be one of Ts... types:
void ExpressionTree::Append<T>(OperationType, const T&);
void ExpressionTree::Append<T>(OperationType, T&&);
void ExpressionTree::AppendFront<T>(OperationType, T&&);
  • Append deep or lazy copy of a part of another expression. !Warning! lazy copy should not live over the original expression:
void ExpressionTree::Append(const_iterator begin, const_iterator end);
void ExpressionTree::LazyAppend(iterator begin, iterator end);
  • Start and finish subexpression. args... are forwarded to constructor of SubTree:
void ExpressionTree::OpenBracket<Args...>(OperationType, Args... args);
void ExpressionTree::CloseBracket();
  • Get or set operation of node in cell number i:
OperationType ExpressionTree::GetOperation(size_t i) const ;
void ExpressionTree::SetOperation(OperationType op, size_t i);
  • void ExpressionTree::SetLastOperation(OperationType) - set operation to last appended leaf or last closed subtree or last open subtree if it is empty.

  • bool ExpressionTree::Empty() const - test if the expression empty.

  • size_t ExpressionTree::Size() const - get count of cells the expression occupies.

  • size_t ExpressionTree::Size(size_t i) const - get count of cells subexpression occupies if cell i is head of the subexpression or 1 otherwise.

  • bool ExpressionTree::IsValue(size_t i) const - test if the cell i is not head of a subexpression.

  • void ExpressionTree::Erase(size_t from, size_t to) - remove nodes with indexes from from to to - 1.

  • size_t ExpressionTree::Next(size_t i) const - get index of cell after the last cell of subexpression if cell i is head of the subexpression or i + 1 otherwise. For example, for expression A + B - (C - D + (E - F) - G)

    0 1 2 3 4 5 6 7 8
    +, A +, B -, 7 +, C -, D +, 3 +, E -, F -, G

    -# Next(1) returns 2. -# Next(2) returns 9. -# Next(3) returns 4. -# Next(5) returns 8.

  • Invoke appropriate functor depending on content type for each node, skip if no appropriate functor (invoke ExecuteAppropriate(funcs) for each node):

template <typename... Args>
void ExpressionTree::ExecuteAppropriateForEach(const std::function<void(const Args&)>&... funcs) const;
template <typename... Args>
void ExecuteAppropriateForEach(const std::function<void(Args&)>&... funcs);