Re: Idea for index/limit support

From: Andrus Adamchik (andru..bjectstyle.org)
Date: Tue Sep 06 2005 - 11:06:10 EDT

    In fact it will be fairly easy to extend Cayenne to do that... Just
    swap ToManyList (current List implementation in a relationship) with
    IncrementalFaultList. This is 100% db-independent. Of course this
    would require a hint in the mapping.

    Andrus
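
The swap described above can be illustrated with a small, self-contained sketch. It is not Cayenne's IncrementalFaultList; the class name PagedFaultList, the pageFetcher function, and the fetchCount counter are all hypothetical. It assumes every primary key is already in memory and that full objects are resolved one page at a time, with one query per page.

```java
import java.util.AbstractList;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.Function;

// Hypothetical sketch; this is NOT Cayenne's IncrementalFaultList.
final class PagedFaultList<K, T> extends AbstractList<T> {
    private final List<K> keys;        // every primary key, loaded up front
    private final List<T> resolved;    // null slots until a page is faulted in
    private final int pageSize;
    private final Function<List<K>, List<T>> pageFetcher; // assumed: one query per call
    private int fetchCount = 0;        // how many page queries have run

    PagedFaultList(List<K> keys, int pageSize, Function<List<K>, List<T>> pageFetcher) {
        this.keys = new ArrayList<>(keys);
        this.resolved = new ArrayList<>(Collections.<T>nCopies(keys.size(), null));
        this.pageSize = pageSize;
        this.pageFetcher = pageFetcher;
    }

    int fetchCount() { return fetchCount; }

    @Override
    public int size() { return keys.size(); }

    @Override
    public T get(int index) {
        if (index < 0 || index >= keys.size()) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        if (resolved.get(index) == null) {
            // Resolve the whole page that contains the requested index.
            int from = (index / pageSize) * pageSize;
            int to = Math.min(from + pageSize, keys.size());
            // Assumption: the fetcher returns one object per key, in key order.
            List<T> page = pageFetcher.apply(keys.subList(from, to));
            for (int i = 0; i < page.size(); i++) {
                resolved.set(from + i, page.get(i));
            }
            fetchCount++;
        }
        return resolved.get(index);
    }

    public static void main(String[] args) {
        List<Integer> pks = new ArrayList<>();
        for (int i = 0; i < 100; i++) pks.add(i);
        // The lambda stands in for "SELECT ... WHERE pk IN (...)".
        PagedFaultList<Integer, String> images = new PagedFaultList<Integer, String>(pks, 10, page -> {
            List<String> rows = new ArrayList<>();
            for (Integer pk : page) rows.add("image-" + pk);
            return rows;
        });
        System.out.println(images.get(20) + ", page queries: " + images.fetchCount());
    }
}
```

Because the class extends AbstractList, existing code that only sees a List keeps working. The trade-off is that all keys are still fetched up front, which is the cost debated in the quoted messages.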

    On Sep 6, 2005, at 10:59 AM, Gili wrote:

    >
    > Michael, to the best of my knowledge Hibernate provides index/limit
    > support in a database-agnostic way without fetching all the
    > results, so this *is* possible. Even if not all databases support
    > this feature, we should support it for those that do (which is the
    > majority). The performance benefit is huge.
    >
    > If Cayenne wants to know how many results there are, issuing a
    > "select count(*)" query should be enough and far more efficient
    > than actually retrieving a list of all PKs.
    >
    > Lastly, I don't think 100,000 records is a lot. I mean, this is
    > what databases are all about: massive amounts of data. We should be
    > designing for this kind of thing, as it is not the exceptional
    > case. Currently I've got 2,464 records in one of my tables and my
    > application is just at its birth stages. In another year I wouldn't
    > be surprised if I reach or surpass 100,000 records.
    >
    > Gili
    >
    > Gentry, Michael (Contractor) wrote:
    >
    >> Cayenne (or another ORM) will always need to know how many records
    >> are in the relationship. It doesn't matter if it fetches the entire
    >> object or the primary key, it still needs to fetch all of them. There
    >> is no database-agnostic way of fetching N records starting at record
    >> 300, so Cayenne/etc needs to have them all (or at least the PKs) in
    >> order to function. If it only fetched the PKs, it can still do 1
    >> SELECT passing the DB N PKs (I'd like to buy a vowel there), which is
    >> at least fairly efficient. If you have 100,000 records in a
    >> relationship, you might have bigger issues, too ... :-)
    >>
    >> /dev/mrg
    >>
    >> -----Original Message-----
    >> From: Gili [mailto:cowwo..bs.darktech.org]
    >> Sent: Tuesday, September 06, 2005 10:41 AM
    >> To: cayenne-deve..bjectstyle.org
    >> Subject: Re: Idea for index/limit support
    >>
    >> Right, this is related to pagination. All I'm saying is that if
    >> I have 100,000 results (not necessarily blobs) and I only want to
    >> pick up the first five results, I should be able to do just that.
    >> Hitting all 100,000 results when I know in advance I will not read
    >> them is a huge waste of both memory and time.
    >>
    >> Regarding my suggestion, the default would be the current behavior
    >> (i.e. retrieve all values) but I'm saying it would be nice to have
    >> the option to tweak this further. To my understanding, even with
    >> pagination, Cayenne will retrieve all primary keys from a result. I
    >> saw an open JIRA issue for this, I think.
    >>
    >> Gili
    >>
    >> Gentry, Michael (Contractor) wrote:
    >>
    >>> I really can't imagine my to-many faults not returning everything in
    >>> the relationship. I certainly wouldn't want it to only fault in what I
    >>> try to read (that would require far too many SELECTs and hurt
    >>> performance). And it's probably too arbitrary to pick N records for it
    >>> to read in at a time.
    >>>
    >>> Given your example, it seems you still have your BLOB data in a
    >>> table with other data instead of isolated. So, when you fault your
    >>> to-many relationship, you are getting all of the BLOB data (N rows)
    >>> in addition to the other attributes. This is almost always going to
    >>> be a losing situation with an ORM.
    >>>
    >>> All that being said, perhaps a paginated query could help you out?
    >>>
    >>> http://www.objectstyle.org/cayenne/userguide/perform/paged-queries.html
    >>>
    >>> /dev/mrg
    >>>
    >>> -----Original Message-----
    >>> From: Gili [mailto:cowwo..bs.darktech.org]
    >>> Sent: Monday, September 05, 2005 3:29 AM
    >>> To: cayenne-deve..bjectstyle.org
    >>> Subject: Idea for index/limit support
    >>>
    >>>
    >>>
    >>> Aside from the obvious idea that queries should be able to set
    >>> index/limit, I'm thinking we should modify the behavior of
    >>> relationship getter methods. Currently they return a List from
    >>> xxxArray relationships. I'm thinking we should implement the List
    >>> interface and provide special index/limit/fetchSize methods in our
    >>> implementation. This way old code continues working as is and new
    >>> code that wishes to optimize performance does something like this:
    >>>
    >>> List images = dataObject.getImageArray();
    >>> CayenneList typedImages = (CayenneList) images;
    >>>
    >>> // (optional) configure List before using it
    >>> typedImages.setFetchSize(10);
    >>> typedImages.setMinIndex(15);
    >>> typedImages.setMaxIndex(30);
    >>>
    >>> // use list like we always did
    >>> typedImages.get(20);
    >>>
    >>> Now, the min/max indexes are hints to the List implementation as to
    >>> what index/limit values to issue the query with. If a user tries to
    >>> get() an index outside this range we will issue a second query, but
    >>> this is unlikely to happen. In the typical usage scenario, if the
    >>> above had 100,000 rows in the result, we'd see a noticeable
    >>> performance benefit from retrieving only 16 rows (indexes 15 - 30
    >>> inclusive). And best of all, this change is completely backwards
    >>> compatible with older releases. What do you think?
    >>>
    >>> Gili
    >>>
    >
    > --
    > http://www.desktopbeautifier.com/
    >
    >
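
The proposal at the bottom of the thread (min/max index hints, a second query for out-of-range access, and a count query for the size) can be sketched as follows. This is only an illustration under stated assumptions: WindowedList, RangeFetcher, and queryCount are hypothetical names, the IntSupplier stands in for a "select count(*)" query, and the fetcher stands in for whatever offset/limit query a given database supports. With hints 15 and 30 the first fetch covers indexes 15 through 30 inclusive, which is 16 rows.

```java
import java.util.AbstractList;
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntSupplier;

// Hypothetical sketch of the proposed "CayenneList"; none of this is Cayenne API.
final class WindowedList<T> extends AbstractList<T> {
    /** Stand-in for a query issued with an offset and a limit. */
    interface RangeFetcher<R> {
        List<R> fetch(int offset, int limit);
    }

    private final IntSupplier counter;     // stand-in for SELECT COUNT(*)
    private final RangeFetcher<T> fetcher;
    private int minIndex = 0;              // inclusive hint
    private int maxIndex = 0;              // inclusive hint
    private int windowStart = 0;           // index of the first row held in memory
    private List<T> window = null;         // rows currently held in memory
    private int cachedSize = -1;           // -1 until the count query has run
    private int queryCount = 0;            // range queries issued so far

    WindowedList(IntSupplier counter, RangeFetcher<T> fetcher) {
        this.counter = counter;
        this.fetcher = fetcher;
    }

    void setMinIndex(int minIndex) { this.minIndex = minIndex; }
    void setMaxIndex(int maxIndex) { this.maxIndex = maxIndex; }
    int queryCount() { return queryCount; }

    @Override
    public int size() {
        if (cachedSize < 0) cachedSize = counter.getAsInt();
        return cachedSize;
    }

    @Override
    public T get(int index) {
        if (index < 0 || index >= size()) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        boolean miss = window == null
                || index < windowStart
                || index >= windowStart + window.size();
        if (miss) {
            int width = Math.max(1, maxIndex - minIndex + 1);
            boolean hinted = index >= minIndex && index <= maxIndex;
            // Inside the hinted range: fetch exactly that range.
            // Outside it: issue another query starting at the requested index.
            windowStart = hinted ? minIndex : index;
            window = fetcher.fetch(windowStart, width);
            queryCount++;
        }
        return window.get(index - windowStart);
    }

    public static void main(String[] args) {
        List<Integer> rows = new ArrayList<>();
        for (int i = 0; i < 100000; i++) rows.add(i);
        WindowedList<Integer> images = new WindowedList<Integer>(
                rows::size,
                (offset, limit) -> rows.subList(offset, Math.min(offset + limit, rows.size())));
        images.setMinIndex(15);
        images.setMaxIndex(30);
        System.out.println(images.get(20) + ", range queries: " + images.queryCount());
    }
}
```

Unlike a list that preloads every primary key, this variant only ever holds one window of rows, at the cost of an extra query whenever access leaves the hinted range.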


