DB query: Finding only the id values

jason_sam · June 15, 2014, 10:29am

My solution works, but I wonder if there is a better one.

I have a model (:cards), which has a foreign key :box_id. I am
interested in the id’s of those cards which have a certain box_id.
Currently I assume that I can easily hold an array of all thos :cards in
memory (i.e. no cursor needed).

This is my current solution:

Card.where(box_id: params[:box_id]).map {|c| c.id }.each do |cid|

… Do something with cid

end

This is compact, but I don’t like the fact that first, all data from the
retrieved Card objects needs to be stored in memory at least temporarily
(there is even a ‘text’ field in Card!), but only the id is needed.

I thought as an alternative to use find_by_sql, but the API
documentation warns that this should be only used as a “last resort”,
because it makes us dependent on the syntax for a particular database.
Although in my case, the SQL query would be so simple that I don’t fear
I would run into compatibility problems when exchanging one database for
another, I wonder whether there is a simpler solution, using just
ActiveRecord functions.

Ronald

ronaldfischer · June 15, 2014, 10:40am

On 15 June 2014 09:29, Ronald F. [email protected] wrote:

My solution works, but I wonder if there is a better one.

I have a model (:cards), which has a foreign key :box_id. I am
interested in the id’s of those cards which have a certain box_id.
Currently I assume that I can easily hold an array of all thos :cards in
memory (i.e. no cursor needed).

Assuming that you have the relationships setup accordingly (so card
belongs_to box and box has_many cards or something similar) then to
get the cards belonging to a certain box you can just use
@cards = @box.cards
then to get the id of each box in just use box.id

If the above does not make sense then I suggest you work right through
a good tutorial such as railstutorial.org, which is free to use
online, which will show you the basics of Rails.

Colin

ronaldfischer · June 15, 2014, 3:47pm

On Sun, Jun 15, 2014 at 1:29 AM, Ronald F. [email protected]
wrote:

Card.where(box_id: params[:box_id]).map {|c| c.id }.each do |cid|

… Do something with cid

end

This is compact, but I don’t like the fact that first, all data from the
retrieved Card objects needs to be stored in memory at least temporarily
(there is even a ‘text’ field in Card!), but only the id is needed.

I wonder whether there is a simpler solution, using just
ActiveRecord functions.

As Colin pointed out,

(assume that ) box = Box.find(box_id)

box.cards # has the cards you want if you need all the attributes

box.cards.pluck(:id) # builds a query to fetch only the card ids

HTH,

Hassan S. ------------------------ [email protected]

twitter: @hassan

ronaldfischer · June 15, 2014, 3:53pm

On Jun 15, 2014, at 9:45 AM, Hassan S. wrote:

I wonder whether there is a simpler solution, using just
HTH,
–
Hassan S. ------------------------ [email protected]
Hassan Schroeder | about.me
twitter: @hassan

What about box.card_ids ? Doesn’t that do much the same thing?

Walter

ronaldfischer · June 15, 2014, 4:14pm

On Sun, Jun 15, 2014 at 6:52 AM, Walter Lee D. [email protected]
wrote:

box.cards.pluck(:id) # builds a query to fetch only the card ids

What about box.card_ids ? Doesn’t that do much the same thing?

Yes, but ‘pluck’ is more all-purpose – it enables fetching
arbitrary
(and if desired, multiple) fields.

–
Hassan S. ------------------------ [email protected]

twitter: @hassan

ronaldfischer · June 16, 2014, 6:46am

Hassan S. wrote in post #1149759:

On Sun, Jun 15, 2014 at 1:29 AM, Ronald F. [email protected]
wrote:
(assume that ) box = Box.find(box_id)

box.cards # has the cards you want if you need all the attributes

box.cards.pluck(:id) # builds a query to fetch only the card ids

May I ask how this works (internally)? For pluck() to be applied, Ruby
has first to execute box.cards() to get an Array of the cards. Unless
the Card objects are implemented as proxies, which only fetch their data
from the database if one asks for it, I would at this point, at least
temporarily, have all the selected Card data in memory, isn’t it?

Ronald

ronaldfischer · June 16, 2014, 6:40am

Colin L. wrote in post #1149753:

On 15 June 2014 09:29, Ronald F. [email protected] wrote:
Assuming that you have the relationships setup accordingly (so card
belongs_to box and box has_many cards or something similar)

Actually I have both (belongs_to in :cards and has_many :cards in box);
would it be sufficient to have only one?

then to
get the cards belonging to a certain box you can just use
@cards = @box.cards
then to get the id of each box in just use box.id

I see. Thus, applying your suggestion to my case, it would be:

Box.find_by(params[:box_id]).cards.map {|c| c.id }.each do |cid|

… Do something with cid

end

But this solution still has the effect of having an array of all the
Cards, so I don’t really see an improvement over my original solution.
Or did I miss something?

ronaldfischer · June 16, 2014, 9:55pm

On Sunday, 15 June 2014 23:47:31 UTC-5, Ruby-Forum.com User wrote:

May I ask how this works (internally)? For pluck() to be applied, Ruby
has first to execute box.cards() to get an Array of the cards. Unless
the Card objects are implemented as proxies, which only fetch their data
from the database if one asks for it, I would at this point, at least
temporarily, have all the selected Card data in memory, isn’t it?

The short version: box.cards doesn’t actually load the records until
they
are needed; that’s why things like box.cards.limit(10) can work.
box.cards returns a Relation, which can return records (if you call
something on it that requires them, like to_a or each) but can also
be
used to construct SQL queries.

This is a bit tricky to see in the console, since typing box.cards
there
calls inspect on the Relation, which loads all the records.

For more detail, see the source.

–Matt J.

ronaldfischer · June 16, 2014, 7:23am

On Sun, Jun 15, 2014 at 9:46 PM, Ronald F. [email protected]
wrote:

May I ask how this works (internally)? For pluck() to be applied, Ruby
has first to execute box.cards() to get an Array of the cards. Unless
the Card objects are implemented as proxies, which only fetch their data
from the database if one asks for it, I would at this point, at least
temporarily, have all the selected Card data in memory, isn’t it?

No. You can watch the queries generated for confirmation. Or look
at the source. Or both.

–
Hassan S. ------------------------ [email protected]

twitter: @hassan

ronaldfischer · June 18, 2014, 4:55pm

In rails 3, I believe find_by(params[:box_id]) is insecure and creates a
SQL injection attack vector.

I think in Rails 4 that is fixed and is secure.

Either way, why would you write such a complicated statement using map
(this has go to be inefficient because you’re creating a lot of objects
then plucking their ids with map – remember, object instantiation is
expensive in active record)

did you try something like this?

box = Box.find(params[:box_id])
box.cards.each do |card|
#… do something with each card
end

ronaldfischer · June 18, 2014, 5:00pm

It took me several years of working with Rails to understand this
nuance, and I believe it is poorly documented in the AR guide. “reading
the source” may be a good idea for some, but remember the Rails source
isn’t easy for everyone to read (although a good idea!).

Since it is so important to how AR works, I think this facet of AR
should be documented better in the AR guide (specifically, that the AR
methods return ActiveRelation objects which don’t actually fetch
anything until you want to look at them). It’s a brilliant
implementation pattern, but counter-intuitive to newbies.

-Jason

ronaldfischer · June 18, 2014, 4:46pm

Colin is right here, just use relationships.

However, as you are considering writing raw SQL, keep in mind the danger
of SQL injection. Beyond the reason stated to keep your SQL
database-independant (which is strange advice IMO since it is very rare
to move between data stores on a large project, and even if you do it is
pretty easy to re-write SQL), the most important thing here is that you
don’t a security vulnerability for SQL injection.

In your original example, Card.where(box_id: params[:box_id]) is
actually safe, since where with a hash of parameters sanitizes the data.
However, Card.where(“box_id: #{params[:box_id]”) is NOT SAFE – don’t
ever do that.

Check out http://rails-sqli.org and take some time to learn how to run
http://brakemanscanner.org against your code (it’s very easy)

But actually Colin is right, you’re over engineering, just use
relationships and don’t worry about writing your own SQL. FWIW, I do
sometimes write my own SQL, but only when I absolutely have to because
ActiveRecord can’t capture what I need to do in SQL (which is very, very
rare)

-Jason

ronaldfischer · June 18, 2014, 6:29pm

Jason Fb wrote in post #1150058:

However, as you are considering writing raw SQL, keep in mind the danger
of SQL injection. Beyond the reason stated to keep your SQL
database-independant (which is strange advice IMO since it is very rare
to move between data stores on a large project,

Not so rare. For example, I do the development with SQLite, but the
production is on Heroku and uses Postgres, and customers might want to
use MySql.

and even if you do it is
pretty easy to re-write SQL), the most important thing here is that you
don’t a security vulnerability for SQL injection.

Oh, you are absolutely right. I see the danger.

These are the times where I’m missing Perl’s concept of “tainted”
strings…

Ronald

ronaldfischer · June 18, 2014, 6:47pm

You say tainted, and I hear Soft Cell…

Walter

ronaldfischer · June 18, 2014, 7:11pm

On Jun 18, 2014, at 12:36 PM, Ronald F. [email protected]
wrote:

Since it is so important to how AR works, I think this facet of AR
should be documented better in the AR guide (specifically, that the AR
methods return ActiveRelation objects which don’t actually fetch
anything until you want to look at them). It’s a brilliant
implementation pattern, but counter-intuitive to newbies.

It’s just “lazy evaluation” at work, and as such not so much
“counter-intuitive”. It’s only that it is not obvious that the design
choice was done in this way…

Perhaps “counter-intuitive” is the wrong word choice. It is something
many people new to the framework get tripped up on. I think this is due
to the fact that in console, if you type Person.with_blue_eyes (assuming
with_blue_eyes is a scope), they see the SQL execute and the result
immediately.

if you type Person.with_blue_eyes; puts “x”; you don’t see the SQL
execute (in fact, it doesn’t). It’s the fact that console calls inspect
on the last thing typed that makes it counter-intutive to newbies.

And of course, you have to know what lazy evaluation is and why AR does
it, which is the secret sauce, also not something newbies can grok
easily.

I guess all I really want is for the AR guide to have a section that
discusses this nuance (it doesn’t) as I find this is something that is
easy to trip up on and something I find myself explaining to people who
are new to Rails all the time.

-Jason

ronaldfischer · June 18, 2014, 6:36pm

Jason Fb wrote in post #1150062:

It took me several years of working with Rails to understand this
nuance, and I believe it is poorly documented in the AR guide. “reading
the source” may be a good idea for some, but remember the Rails source
isn’t easy for everyone to read (although a good idea!).

I don’t mind so much reading the source then relying on the
information we get from it. Even if you see that something is
implemented in a particular way, you don’t know whether it is just
incidentally (and in the next minor release will be implemented
differently), or whether it is an undocumented feature (which might or
might not make it in the API), or whether it is really part of the
“official” interface. If something is specified in the API, there is at
least good hope that we don’t run into incompatibilities (although when
looking at the evolution of Rails, this principle seems to have been
violated occasionally).

Since it is so important to how AR works, I think this facet of AR
should be documented better in the AR guide (specifically, that the AR
methods return ActiveRelation objects which don’t actually fetch
anything until you want to look at them). It’s a brilliant
implementation pattern, but counter-intuitive to newbies.

It’s just “lazy evaluation” at work, and as such not so much
“counter-intuitive”. It’s only that it is not obvious that the design
choice was done in this way…

Ronald

-Jason

ronaldfischer · June 19, 2014, 7:40am

Walter D. wrote in post #1150083:

You say tainted, and I hear Soft Cell…

This too, of course

Ronald