What is the best way to get the Top-N values from a table in Oracle? In the past the answer to this question has always been a little convoluted. There are a couple of clear ways to perform this task but they can feel kludgey when explaining them to an inexperienced Oracle developer or analyst. The processes work, but they don't seem as clear cut as they should.
With 12c Oracle has added a feature to allow for Top-N queries with FETCH FIRST|NEXT|PERCENT clauses. This approach appears more elegant than the "old" ways since it is easier to explain. Also there is some nice functionality with this syntax to allow for offsetting the values returned, including duplicate values, and even allowing to fetch the top X percent of values. See the Oracle documentation for examples.
In this post I want to do a quick comparision between the new Top-N approach and some older approaches. In the example I ran one case of the old approach ran much better when a particular index was involved, so it goes to show that something "new" isn't always "better".
As with any feature, be sure to investigate the performance against representative amounts of data as you develop.
In the following example I'll query against big_tab which is a copy of the dba_objects table. After initial tests I'll unhide an index and see that only one query uses that index.
Note, all of the queries return the same data so there are no issues seen with functionality.
The tests were using the parameters
compatible 12.1.0.0.0
optimizer_features_enable 12.1.0.1
The FETCH FIRST clause runs in 2.4 seconds and does 48,190 logical reads. This is comparable to the other methods investigated which included a RANK() clause and the older subquery with rownum<=10.
Option 1.
SELECT /*+ gather_plan_statistics*/ owner, object_id FROM op.big_tab
ORDER BY object_id DESC
FETCH FIRST 10 ROWS ONLY
Option 2.
select /*+ gather_plan_statistics */ owner, object_id
from
(select owner, object_id, rank() over (order by object_id desc ) as rank
from big_tab )
where rank <=10
Option 3.
select /*+ gather_plan_statistics */ stage.owner, stage.object_id
from ( select owner, object_id from op.big_tab order by object_id desc) stage
where rownum <=10
The detailed execution plans are listed out in the following sections. These show the details for Actual time and Buffer Gets to allow for comparing behavior.
Execution plan for Option 1.
SQL_ID 7thwrzawu40gz, child number 0
-------------------------------------
SELECT /*+ gather_plan_statistics*/ owner, object_id FROM op.big_tab
ORDER BY object_id DESC FETCH FIRST 10 ROWS ONLY
Plan hash value: 4205637774
----------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads | OMem | 1Mem | Used-Mem |
----------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 10 |00:00:02.40 | 48190 | 48185 | | | |
|* 1 | VIEW | | 1 | 2866K| 10 |00:00:02.40 | 48190 | 48185 | | | |
|* 2 | WINDOW SORT PUSHED RANK| | 1 | 2866K| 10 |00:00:02.40 | 48190 | 48185 | 80896 | 80896 |71680 (0)|
| 3 | TABLE ACCESS FULL | BIG_TAB | 1 | 2866K| 2866K|00:00:08.15 | 48190 | 48185 | | | |
----------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("from$_subquery$_002"."rowlimit_$$_rownumber"<=10)
2 - filter(ROW_NUMBER() OVER ( ORDER BY INTERNAL_FUNCTION("OBJECT_ID") DESC )<=10)
Execution plan for Option 2
SQL_ID 4y49s6tk4crha, child number 0
-------------------------------------
select /*+ gather_plan_statistics */ owner, object_id from (select
owner, object_id, rank() over (order by object_id desc ) as rank from
big_tab ) where rank <=10
Plan hash value: 4205637774
----------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads | OMem | 1Mem | Used-Mem |
----------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 10 |00:00:02.47 | 48190 | 48185 | | | |
|* 1 | VIEW | | 1 | 2866K| 10 |00:00:02.47 | 48190 | 48185 | | | |
|* 2 | WINDOW SORT PUSHED RANK| | 1 | 2866K| 11 |00:00:02.47 | 48190 | 48185 | 95232 | 95232 |83968 (0)|
| 3 | TABLE ACCESS FULL | BIG_TAB | 1 | 2866K| 2866K|00:00:08.52 | 48190 | 48185 | | | |
----------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("RANK"<=10)
2 - filter(RANK() OVER ( ORDER BY INTERNAL_FUNCTION("OBJECT_ID") DESC )<=10)
Execution plan for Option 3
SQL_ID 7x3nv0d1r475r, child number 0
-------------------------------------
SELECT /*+ gather_plan_statistics */ stage.owner, stage.object_id
from ( select owner, object_id FROM op.big_tab ORDER BY object_id DESC)
stage where rownum <=10
Plan hash value: 2431194888
---------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads | OMem | 1Mem | Used-Mem |
---------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 10 |00:00:02.54 | 48198 | 48187 | | | |
|* 1 | COUNT STOPKEY | | 1 | | 10 |00:00:02.54 | 48198 | 48187 | | | |
| 2 | VIEW | | 1 | 2866K| 10 |00:00:02.54 | 48198 | 48187 | | | |
|* 3 | SORT ORDER BY STOPKEY| | 1 | 2866K| 10 |00:00:02.54 | 48198 | 48187 | 80896 | 80896 |71680 (0)|
| 4 | TABLE ACCESS FULL | BIG_TAB | 1 | 2866K| 2866K|00:00:13.03 | 48198 | 48187 | | | |
---------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(ROWNUM<=10)
3 - filter(ROWNUM<=10)
Note that when an index was created on the object_id field Options 1 and 2 did not utilize the index. Option 3 (with the subquery), however, did utilize the index as we can see below. This is an interesting result that should be kept in mind as these types of queries are developed.
Execution plan for Option 3 (utilizing index)
SQL_ID 7x3nv0d1r475r, child number 0
-------------------------------------
SELECT /*+ gather_plan_statistics */ stage.owner, stage.object_id
from ( select owner, object_id FROM op.big_tab ORDER BY object_id DESC)
stage where rownum <=10
Plan hash value: 1151062491
------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads |
------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 10 |00:00:00.01 | 4 | 4 |
|* 1 | COUNT STOPKEY | | 1 | | 10 |00:00:00.01 | 4 | 4 |
| 2 | VIEW | | 1 | 10 | 10 |00:00:00.01 | 4 | 4 |
| 3 | TABLE ACCESS BY INDEX ROWID| BIG_TAB | 1 | 2866K| 10 |00:00:00.01 | 4 | 4 |
| 4 | INDEX FULL SCAN DESCENDING| BIG_IDX | 1 | 10 | 10 |00:00:00.01 | 3 | 3 |
------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(ROWNUM<=10)
In conclusion, in most cases it looks like the new FETCH FIRST query will run just as well as the pre-12c methods for collecting the TOP-N queries. In this test, the subquery method (Option 3) was able to utilize an index the others could not, so if there are performance consideration you should check the plans against production level amounts of data.