Re: [question] Multiple flattened relationships in a single SelectQuery

From: Andrus Adamchik (andru..bjectstyle.org)
Date: Mon May 05 2003 - 16:19:51 EDT

  • Next message: Jean-Paul Le Fèvre: "Re: location of map and driver xml files"

    > Thanks for responding. I don't hope Craig is moving out of the Cayenne
    > development team - he helped me pretty good too.

    I sure hope he is not. I was referring to this message on cayenne-devel:

      http://objectstyle.org/cayenne/lists/cayenne-devel/2003/05/0000.html

    > I will log this as a bug in a few days

    That would be great!

    > and will send you something that
    > will show you that distinct can cost performance. It all depends on how
    > big the dataset is of course. But it basically means that it will have
    > to do a whole table scan to determine if all rows returned are
    > 'distinct' (if the column is a non-unique one).

    When doing this comparison, take into account the cost of NOT using
    DISTINCT, and reading all the duplicate rows in the memory and sorting
    them out on the Java side (basically doing the same thing that DISTINCT
    does, but in Java). I think the alternative is even more costly.

    If there was a third option to achieve the same result... Maybe something
    like "EXISTS" in Oracle (not sure if this is supported by other engines).
    But joins are still cheaper than EXISTS in most cases...

    Andrus

    >
    > From: Andrus Adamchik <andru..bjectstyle.org>
    >>To: cayenne-use..bjectstyle.org
    >>Subject: Re: [question] Multiple flattened relationships in a single
    >> SelectQuery
    >>Date: Sun, 4 May 2003 18:37:41 -0400
    >>
    >>Laszlo,
    >>
    >>Since Craig, who did most of the original flattened relationships
    >> implementation, is moving to other things, I guess I might dig into
    >> this more myself. This does look like a query translator bug. This
    >> will take some time though, since I am working on many other things.
    >> Could you file this as a bug, so that we do not overlook it?
    >>
    >>I know the answer to the "DISTINCT" question though: any time the
    >> qualifier is built using "to-many" relationships, DISTINCT is enforced,
    >> since joining from primary key, will result in duplication of the
    >> root entity rows. There is no reason to bring them all to the client.
    >> Do you have any confirmation that DISTINCT leads to serious
    >> performance degradation?
    >>
    >>Thanks
    >>
    >>
    >>
    >>On Saturday, May 3, 2003, at 07:15 AM, Laszlo Spoor wrote:
    >>
    >>>Cayenne 1.0.beta2
    >>>MySQL 3.23 / 4.x
    >>>
    >>>Hi Cayenne Team,
    >>>
    >>>I have a problem when I write a SelectQuery where multiple flattened
    >>> relationships occur.
    >>>
    >>>Some of my Object Model is:
    >>>Photo --> Metainformation about a picture
    >>>PhotoType --> E.g. scenery, overview, blueprint, etc.
    >>>Complex --> Collection of houses
    >>>HousingType --> E.g. appartment, villa, row house, etc
    >>>[...etc...]
    >>>
    >>>Some of my Relationships:
    >>>HousingType |(1:n)| Complexes
    >>>Photo |(n:n)| PhotoTypes
    >>>Photo |(n:n)| Complexes
    >>>[...etc...]
    >>>
    >>>My (partial) data model:
    >>>fotos maps to Photo
    >>>photo_types maps to PhotoTypes
    >>>complexes maps to Complex
    >>>cpx_pto is an intersection table between complexes and Photos
    >>> pte_pto is an intersection table between phototypes and Photos
    >>>
    >>>The flattened Relationships are mapped using the modeler, like (my ;-)
    >>> example:
    >>>http://www.objectstyle.org/cayenne/userguide/advmodel/
    >>> flattenedrel.html. The Java code is:
    >>>
    >>> [...]
    >>> SelectQuery qry = new SelectQuery (Photo.class);
    >>> qry.setPageSize(15);
    >>> qry.setDistinct(false);
    >>>
    >>> Expression qualifier
    >>> = ExpressionFactory.matchExp("complexes.toHousingType.name",
    >>>"villa");
    >>>
    >>> qry.setQualifier(qualifier);
    >>>
    >>> Expression pIt
    >>> = ExpressionFactory.matchExp("fototypes.name", "blueprint");
    >>>
    >>> qry.andQualifier(pIt);
    >>> qry.setLoggingLevel(Level.WARN);
    >>>
    >>> mFotoList = super.getDataContext().performQuery(qry);
    >>> [...]
    >>>
    >>>
    >>>This SQL-code that Cayenne generated is:
    >>>
    >>>SELECT DISTINCT t0.file_name
    >>> , t0.url_foto
    >>> , t0.url_thumbnail
    >>> , t0.id
    >>>FROM photos t0
    >>> , cpx_pto t1
    >>> , complexes t2
    >>> , housing_types t3
    >>> , fte_fto t4
    >>> , phototypes t5
    >>>WHERE t0.id = t1.fto_id
    >>>AND t1.cpx_id = t2.id
    >>>AND t2.hwe_id = t3.id
    >>>AND t0.id = t4.fto_id
    >>>AND t4.fte_id = t5.id
    >>>AND ((t3.name = ?)
    >>>AND (t4.name = ?))
    >>>[params: 'villa', 'blueprint']
    >>>
    >>>Now, there are two points that I don't understand:
    >>>- t4 should be t5
    >>> As you can see, t4.name points to the intersection table, so the
    >>>execution of
    >>> the query fails. However, I expect it to point to phototypes.name
    >>> (thus t5.name). If I replace it with t5, the query returns what I
    >>> expect.
    >>>
    >>>- distinct
    >>> qry.setDistinct(false) or qry.setDistinct(true), does not result in
    >>>
    >>>ommitting
    >>> the distinct part of the query. Distincts are usually a bad idea,
    >>> cause
    >>>it
    >>> results in a full table scan. It this expected behaviour?
    >>>
    >>>Can anyone please explain to me what is happening, why and what I
    >>> should do to overcome this?
    >>>
    >>>Thanks in advance, Laszlo Spoor
    >>
    >
    >
    > _________________________________________________________________
    > MSN 8 with e-mail virus protection service: 2 months FREE*
    > http://join.msn.com/?page=features/virus



    This archive was generated by hypermail 2.0.0 : Mon May 05 2003 - 16:19:51 EDT