Re: Use of DISTINCT in "to-many" qualifiers [Was: Multiple flattened relationships...]

From: Laszlo Spoor (lspoor_cayenn..otmail.com)
Date: Tue May 13 2003 - 15:54:06 EDT

  • Next message: martin ruff: "strange behaviour"

    Hi Andrus,

    You are right. Still, I would like to plead for a small note in the JavaDoc,
    explaining this. :-)

    Thanks, Laszlo

    >From: "Andrus Adamchik" <andru..bjectstyle.org>
    >Reply-To: andru..bjectstyle.org
    >To: <cayenne-use..bjectstyle.org>
    >Subject: Use of DISTINCT in "to-many" qualifiers [Was: Multiple flattened
    >relationships...]
    >Date: Tue, 13 May 2003 12:28:03 -0400 (EDT)
    >
    >Laszlo,
    >
    >I kind of expected such result, but your message prompted me to look for
    >an alternative solution. I did some research on various queries
    >performance on Oracle. In my case there are 2 tables with to-many from one
    >to another, both having around 100000 to 200000 rows. I tried running 3
    >flavors of the same query, directly on Oracle, with no JDBC involved:
    >
    >1. joining on to-many with no DISTINCT
    >2. joining on to-many with DISTINCT
    >3. Using correlated subquery with EXISTS
    >
    >(1) was the fastest - 2 seconds in my case, but produced incorrect results
    >(duplicate rows), (2) was the slowest - 4 seconds. (3) was in between (1)
    >and (2) - 3 to 4 seconds. So using correlated subquery with EXISTS may
    >yield to faster queries in some cases. Though I believe that playing with
    >indexes may alter the results dramatically either way.
    >
    >We may look into adding this optimization in the future. It will have to
    >be optional though, since many databases do not support correlated
    >subqueries and EXISTS (MySQL prior to 4.*).
    >
    >Andrus
    >
    >
    >
    > > Hi Andrus,
    > >
    > > About the 'DISTINCT' part that I would write you back about.
    > >
    > > To be sure I formulated my case well, I had to look it up in a couple of
    > > guides. As you will notice I will use the Oracle database as the
    > > datastore, simply because I have the most experience with this database.
    > >
    > > I'd got confirmed that the most important reason not to use DISTINCT is
    > > the fact that is forces the database to do a 'SORT UNIQUE' which can
    > > cost more disk reads and more processing time when reading the results.
    > >
    > > This information comes from the 'Oracle 9i Performance and Tuning Guide,
    > > p2-14):
    > > The distinct part results in a SORT-UNIQUE action... SORT UNIQUE occurs
    > > if a user specifies a DISTINCT clause (Example 2û15)or if an operation
    > > requires unique values for the next step (Example 2û16). Example 2û15
    > > DISTINCT Clause Causing SORT UNIQUE
    > >
    > > SELECT DISTINCT last_name, first_name
    > > FROM per_all_people_f
    > > WHERE full_name LIKE :b1;
    > >
    > > Plan
    > > --------------------------------------------------
    > > SELECT STATEMENT
    > > SORT UNIQUE
    > > TABLE ACCESS BY INDEX ROWID PER_ALL_PEOPLE_F
    > > INDEX RANGE SCAN PER_PEOPLE_F_N54
    > >
    > > Another, not technical, reason is that in general it is a bad practice
    > > to use
    > > DISTINCT, because the query writer 'admits' that his 'where-logic' is
    > > not strong
    > > enough to return the results that are required. However, I don't quite
    > > think this is 100% valid for Cayenne, because Cayenne can not always
    > > tell if that is
    > > the case.
    > >
    > > Still I do feel that the developer can have the option leaving the
    > > distinct out
    > > by using query.setDistinct(false), because it gives somewhat more space
    > > to tuning.
    > >
    > > Regards, Laszlo
    > >
    > > _________________________________________________________________
    > > MSN 8 with e-mail virus protection service: 2 months FREE*
    > > http://join.msn.com/?page=features/virus
    >
    >
    >

    _________________________________________________________________
    Help STOP SPAM with the new MSN 8 and get 2 months FREE*
    http://join.msn.com/?page=features/junkmail



    This archive was generated by hypermail 2.0.0 : Tue May 13 2003 - 15:53:41 EDT