Re: [Still LONG] Re: [LONG] reasons why "advanced" expressions are needed

From: Giulio Cesare Solaroli (slrgcs..bn-italy.com)
Date: Thu Oct 02 2003 - 03:49:20 EDT

  • Next message: Dirk Olmes: "Re: DataContext delegate?"

    Andrus,

    thank you very much for your reply. It is very interesting.
    Today I will be out of the office almost all time, so I will take time
    to read it carefully and reply to it.

    I am very interested in this topic, and I would be very pleased to help
    in bringing these feature to Cayenne.

    Stay tuned. I'll come back tomorrow! ;-]

    Giulio Cesare

    On Thursday, Oct 2, 2003, at 08:35 Europe/Rome, Andrus Adamchik wrote:

    > On Tuesday, September 30, 2003, at 01:18 PM, Giulio Cesare Solaroli
    > wrote:
    >> I hope this LONG message have shown you my point.
    >
    > It sure did. I perfectly understand the deficiencies of Cayenne
    > expression API in this respect (and "vanilla" EOF for that matter). I
    > am glad you are not just pointing them out, but also offering an O/R
    > solution. I am aware of SQL solutions; we just needed a push from
    > somebody to actually start bringing this to Cayenne :-).
    >
    >> Now the main question is: which is the best place, given Cayenne
    >> current architecture, to place the logic to handle these cases?
    >> I will be very pleased to "migrate" our current EOF/Objective-C
    >> implementation into Cayenne if this is achievable with a reasonable
    >> effort.
    >
    >
    > CAYENNE API
    >
    > 1. Expressions (org.objectstyle.cayenne.exp)
    >
    > As you noticed, Expression class simply defines semantics of the
    > expression, and doesn't contain any processing logic. Processing of
    > SQL generation is done by the access layer. In-memory evaluation is
    > done by ExpressionEval (called from Expression.eval(), still
    > incomplete). Since creating expressions directly is sort of
    > counter-intuitive (at least in their current form), ExpressionFactory
    > static methods are used instead. Their names (ideally should :-) )
    > follow the "common logic".
    >
    > Also note that some of currently defined expression types (mostly
    > aggregate functions and things like ALL, EXISTS, etc.) are not used
    > or supported in Cayenne. We may reuse some of them where it makes
    > sense in the new API described below.
    >
    >
    > 2. QueryTranslator/QualifierTranslator
    >
    > These access layer classes define algorithms for SQL translation.
    > There is a default implementation, which can be (optionally)
    > customized by each DbAdapter (e.g. if some database doesn't support
    > feature X, or implements it differently).
    >
    >
    > IMPLEMENTATION IDEAS/NOTES
    >
    > 0. Preparation.
    >
    > My +1 for the idea to start by creating tests for all the cases we
    > plan to cover.... Creating upfront user documentation is another thing
    > that will help us here.... Document cases that absolutely require
    > EXISTS support in the database and will blow without it.
    >
    >
    > 1. Unit Tests
    >
    > Though Cayenne is not using DBUnit, it has an extensive testing
    > framework of its own, and it should be easy to write all relevant
    > tests. A few hints:
    >
    > - Test cases are located under cayenne/src/tests/java.
    > - Ant scripts will run all matching "*Tst" as unit tests, so adding
    > a new test suite is as simple as creating an XYZTst class in an
    > appropriate package.
    > - Subclass CayenneTestCase to get access to the Cayenne stack during
    > testing.
    > - Test DataMap is located under src/tests/resources/test-resources.
    > Db schema is dropped and recreated on each run (but not on each test
    > of course).
    >
    > There were some proposals to use DBUnit in the past, but it didn't get
    > too far, so we have to use our own API to create test data sets.
    >
    >
    > 2. Expression Semantics.
    >
    > The fact that Expressions are abstract and do not have any processing
    > logic should allow to define semantics regardless of how it may be
    > eventually translated to SQL...as long as there is enough info/hints
    > collected in the expression.
    >
    > From what I can tell from the cases provided (if there are more,
    > please bring them on), we are dealing with a general problem of
    > matching a *collection* of values against a relationship path (with
    > special cases being : matching an attribute value instead of
    > relationship, matching a to-one relationship, matching against a
    > single-object collection and, finally, matching against a wildcard
    > value, e.g. "relationship not empty"). Additional logical operation
    > applied after the match is "none", "any", "all". Since collections can
    > consist of either DataObjects or scalar values, we may introduce
    > another variable into the equation - match type (e.g use ">" instead
    > of "="), but lets not do it just yet, or my head will explode :-).
    > This is how it can possibly be defined using ExpressionFactory:
    >
    > [please let me know if this analysis attempt fails to cover any of the
    > cases we are trying to solve]
    >
    >
    > // Wildcards...
    >
    > /** Qualifier for not empty relationship. Creates unary expression
    > of type EXISTS.
    > * Translated to "FK is not null" for to-one, and "EXISTS" for
    > to-many.
    > */
    > public static Expression hasAnyExp(String relationshipPath);
    >
    > /** Qualifier for an empty relationship. Creates unary expression of
    > type NOT_EXISTS.
    > * Translated to "FK is null" for to-one, and "NOT EXISTS" for
    > to-many.
    > */
    > public static Expression hasNoneExp(String relationshipPath);
    >
    >
    >
    > // Single objects... some existing API can be reused...
    >
    > /** Don't think current implementation has to change. */
    > public static Expression matchExp(String relationshipPath, Object
    > value);
    >
    >
    > /** Redo translator for the existing expression to use NOT EXISTS
    > for to-many. */
    > public static Expression noMatchExp(String relationshipPath, Object
    > value);
    >
    >
    > // Collections...
    > // additionally have to handle "doNotSplit"... internally i suggest
    > making OBJ_PATH
    > // expression a binary expression containing two operands to describe
    > the full path and the doNotSplit part....
    > // This can be easily made backwards compatible, assuming "doNotSplit
    > is fullPath"
    >
    > /** Qualifier for matching all of collection values. Creates binary
    > expression of type EQUAL_TO with
    > * second parameter being a collection. Translated to the list of
    > joins taking split policy into account.
    > * Should blow during execution if used with to-one relationship
    > and collection with size > 1. (??)
    > */
    > public static Expression hasAllOfExp(String relationshipPath, String
    > doNotSplitPath, Collection values);
    >
    > /** Qualifier for matching any of collection values. Creates binary
    > expression of type IN with
    > * second parameter being a collection.
    > * Notes:
    > * - Is this any different from our current "inExp", other that
    > in doNotSplitPath?
    > * - Matching on a compound PK DataObject will definitely
    > prevent using IN, instead a group of OR
    > * statements will be needed: ((PK1 = v1 AND PK2 = v2) or ...)
    > */
    > public static Expression hasAnyOfExp(String relationshipPath, String
    > doNotSplitPath, Collection values);
    >
    >
    > /** Qualifier for matching none of collection values. Creates binary
    > expression of type NOT_IN with
    > * second parameter being a collection. Translated to NOT EXISTS
    > with correct joins for to-many,
    > * or NOT IN for to-one, or NOT ((PK1 = v1 AND PK2 = v2) or ...)
    > for compound PK DataObjects.
    > */
    > public static Expression hasNoneOfExp(String relationshipPath,
    > String doNotSplitPath, Collection values);
    >
    >
    >
    > 3. SQL Translation
    >
    > SQL Translators is arguably the messiest part of our current codebase.
    > Anyway, I can identify the following pieces needed in the translator:
    >
    > 1. I like Scott's idea about having support for EXIST and subqueries.
    > I've been toying with this idea from the day one of Cayenne, but never
    > got back to actually implementing it.... Doing it independently might
    > as well give us building blocks to do the spec above.
    >
    > 2. Add support for "doNotSplit" OBJ_PATH and DB_PATH.. This shouldn't
    > be too hard to do (in a backwards compatible way too - current policy
    > is "doNotSplit = fullPath").
    >
    > 3. (1) and (2) being prerequisites, implement support for all the new
    > expressions... may turn out to be less work than it seems now. Esp.
    > after all the translator refactoring that might be needed by [1], and
    > fresh eyes looking at the translator's messy flow :-)
    >
    > I suspect we will need a separate discussion of SQL Translators... I
    > don't know if it is time for a redesign contest :-/ Maybe it is....
    > It'll definitely be a good thing to discuss alternative translator
    > implementations once we stumble on any problems while extending them
    > (if we don't...oh well). Current test cases should give us a good
    > cushion in case we have to seriously redo them.
    >
    > Andrus
    >



    This archive was generated by hypermail 2.0.0 : Thu Oct 02 2003 - 03:48:38 EDT