Hello all!
Background: I have a table consisting of roughly 14k rows running on
postgresql. I want to do this in a performance friendly way.
y TimedAsset.count
(1.6ms) SELECT COUNT(*) FROM “timed_assets”
— 13982
y TimedAsset.pluck(:id).count
(7.5ms) SELECT “timed_assets”.“id” FROM “timed_assets”
— 13982
This shows me that pluck is indeed going over all the rows.
y TimedAsset.first.id
TimedAsset Load (0.7ms) SELECT “timed_assets”.* FROM “timed_assets”
ORDER BY “timed_assets”.“id” ASC LIMIT 1
— 44
Shows me the first row contains an ID of 44
y TimedAsset.pluck(:id).first(5)
(7.2ms) SELECT “timed_assets”.“id” FROM “timed_assets”
- 5700
- 5701
- 5702
- 5703
- 5704
I would expect the first ID would be 44, not 5700, so I assume that
pluck
gives me an un-ordered list?
y TimedAsset.order(“id”).pluck(:id).last(10)
(14.4ms) SELECT “timed_assets”.“id” FROM “timed_assets” ORDER BY id
- 14047
- 14048
- 14049
- 14050
- 14051
- 14052
- 14053
- 14054
- 14055
- 14056
So instead, it seems I have to order the entire table first, then pluck,
and then getting the last few rows. This seems extremely inefficient
since
I have to order all rows of it first, even though I am specifically
using
pluck over select for performance and memory reasons. Does anyone have
any
suggestions or ideas for a higher performance method of getting the a
specific column of the last few rows of a table ordered by the primary
key?
I also extremely reccomend mentioning in the pluck API documentation
that
pluck returns an un-ordered list so others do not fall into the same
mistake I did.