SQL's newer Left Outer Join operation is hierarchical in nature. It preserves its left argument over its right argument when there is no match. This enables ANSI SQL to model and define hierarchical structures using SQL's Left Join syntax. This enables the SQL processor to directly process the defined hierarchical structures by naturally following the semantics of the Left Outer Join. Unlike today's XML hierarchical processing, this is a full multipath hierarchical processing allowing SQL queries to reference and process multiple pathways of the structure. The hierarchical data preservation spreading across multiple pathways works correctly by the Left Outer Join syntax shown below.
A /\ BE /\/\ CD FG
SQL Structure Definition
DEFINE VIEW TestView AS SELECT * FROM A LEFT JOIN B ON A.a=B.b LEFT JOIN C ON B.b=C.c LEFT JOIN D ON B.b=D.d LEFT JOIN E ON A.a=E.e LEFT JOIN F ON E.e=F.f LEFT JOIN G ON E.e=G.g
The SQL above models the hierarchical structure above it. You can see how the hierarchical paths are extended and when new paths are started off of existing paths by comparing the SQL to the data structure. The SQL syntax acts as the hierarchical data modeling language with left structure arguments placed over right structure argument and the ON clause specifying the node link points in the structure. The syntax's associated semantics describes the SQL hierarchical preservation operation when directly processed by the SQL engine.
Often overlooked in the XML industry is an extremely imported higher level hierarchical processing logic necessary to assure that hierarchical multipath queries produce correct meaningful results. Take for example an SQL query selecting data from one path of a hierarchical structure based on a data value from another path. Such as SELECT C.c FROM TestView WHERE G.g=4 using the structure and SQL view above. There is a semantic hierarchical relationship expressed across the referenced pathways that is not represented in the Left Join operation. The utilization of this hierarchical structure semantic information is necessary to processes multipath queries.
This hierarchical structure semantics processing is based on Lowest Common Ancestor (LCA) logic. This occurs because the LCA is the node between the two referenced points that defines or limits the range of a valid meaningful relationship between the two points. We determined the existence of this LCA multipath required operation in ANSI SQL and traced its operation to the relational Cartesian product.
ELECT LCA Logic Usage
In the query above repeated here: SELECT C.c FROM TestView WHERE G.g=4, the Lowest Common Ancestor (LCA) node derived from C.c and G.g is node A. The LCA node limits the range of the relationship. In this query: SELECT F.f FROM TestView WHERE G.g=2, the LCA node in this case is Node E. There can be multiple LCA nodes from a single query as in: SELECT C.c, F.e FROM TestView WHERE G.g=2, which utilizes LCA node A and node E applied separately for each SELECTed data value.
This automatic use of LCA logic in multipath processing is called a Schema-free query in academic research. This is because the navigation for multipath queries needs to be automatic and this means that the user does not need to know the data structure. XQuery support of LCA and Schema-free query is still in academic research.
WHERE LCA Logic Usage
LCA logic is also required for a complex WHERE clause filtering which references multiple pathways as is in: SELECT A.a FROM TestView WHERE C=4 AND D=5. In this case node B is the LCA node and all combination of matches is applied under the B node. This is called XML Keyword Search in academic research; it is also currently in XQuery academic research. Both types of LCA logic usage can be applied together.
The automatic processing of multipath queries means that they are always processed correctly and automatically without user navigation or knowledge of the data structure necessary. To try this first hand, click on the demo below.