Oracle Exadata Smart Scan: Why 40% of Your Queries Don't Use It

What Smart Scan Actually Does

In a conventional Oracle full table scan, the database layer reads blocks from storage into the SGA buffer cache, then applies predicates and projection in the database layer. For a 2TB table with a selective predicate returning 0.1% of rows, you're still moving 2TB from storage to the database node, then discarding 99.9% of it after reading.

Exadata Smart Scan pushes the predicate evaluation and column projection down to the storage cells. The cells read the data, apply the WHERE clause filter, extract only the requested columns, and return only the qualifying rows to the database layer. For the same 2TB table with 0.1% selectivity, you return roughly 2GB instead of 2TB. The I/O reduction is the source of Exadata's analytics performance advantage.

Smart Scan also enables storage index offloading (eliminating regions of storage that can't contain qualifying rows) and columnar cache (in-memory columnar format on the storage cells). None of these work without Smart Scan being active.

The Seven Eligibility Requirements

Smart Scan activation requires all seven of the following conditions to be true simultaneously. Meeting six out of seven gives you zero Smart Scan.

1. Full Segment Scan

Smart Scan only activates for full segment scans — full table scans and fast full index scans. Index range scans, index unique scans, and rowid-based lookups do not use Smart Scan. The query must be doing a full scan of a segment (table, partition, or index) to qualify.

This is why high-selectivity OLTP queries on Exadata don't benefit from Smart Scan — they use indexes and return small result sets via rowid lookups, which is already efficient and doesn't need offloading.

2. Direct Path I/O

Smart Scan requires direct path I/O, which bypasses the SGA buffer cache entirely and reads directly from storage into PGA. Direct path I/O is not used for:

Tables that have any blocks currently cached in the buffer cache (small tables below a threshold are cached)
Segments smaller than _small_table_threshold (default: 2% of buffer cache, approximately 20MB on a typical configuration)
Tables with active parallel query at high parallelism degrees where the coordinator forces cache usage

The small table threshold is one of the most common Smart Scan disqualifiers in mixed workloads. A "large" lookup table that sits at 15MB — just below the threshold — will be cached and never Smart Scanned, even if you're running analytical queries against it.

-- Check current small table threshold
SELECT name, value
FROM v$parameter
WHERE name = '_small_table_threshold';

-- Check whether a specific segment is below the threshold
SELECT
  segment_name,
  bytes / 1024 / 1024 AS size_mb,
  blocks
FROM dba_segments
WHERE owner = 'SCHEMA_NAME'
  AND segment_name = 'TABLE_NAME';

3. The Object Must Reside on Exadata Storage

Objects in the KEEP buffer cache pool (STORAGE (BUFFER_POOL KEEP)) are explicitly kept in the SGA and bypass direct path I/O. Smart Scan is not used for KEEP pool objects. This is a surprisingly common issue in systems that were tuned for conventional storage and then migrated to Exadata without revisiting the storage hints.

-- Find tables assigned to KEEP pool that may be blocking Smart Scan
SELECT owner, table_name, buffer_pool
FROM dba_tables
WHERE buffer_pool = 'KEEP'
  AND owner NOT IN ('SYS', 'SYSTEM');

-- Reassign to DEFAULT pool to re-enable Smart Scan eligibility
ALTER TABLE schema_name.table_name STORAGE (BUFFER_POOL DEFAULT);

4. No Row Chaining or Migration for LOB Columns

Tables with LOB columns store the LOB data out-of-line (in a separate LOB segment) by default when the LOB exceeds the inline threshold. Smart Scan on the base table is still possible, but Smart Scan cannot offload LOB data retrieval — the database layer must fetch LOB values via rowid after the Smart Scan returns qualifying rows. If your query SELECTs LOB columns and the LOBs are stored out-of-line, Smart Scan handles the row identification but not the LOB fetch, which limits the I/O benefit.

5. Not a Temporary Segment

Temporary segments (TEMP tablespace usage from sort operations, hash joins, or GTTs) are never Smart Scanned. Queries that generate large temp segment I/O during execution don't benefit from Exadata offloading for that I/O component. This matters for queries that produce large intermediate result sets in temp — the sort or hash join phases run at conventional I/O speed regardless of Exadata.

6. Parallel Query or _serial_direct_read Enabled

Smart Scan requires either parallel query execution or serial direct read enabled for the session. In serial execution mode (no parallel hints, no parallel table degree), Smart Scan is only used if _serial_direct_read is set to always or if the segment is large enough to trigger automatic serial direct read (which requires the segment to exceed the serial direct read threshold, approximately 5x the buffer cache size).

On most Exadata configurations, _serial_direct_read is set to always at the database level to ensure serial analytics queries also benefit from Smart Scan. If it's not set, large serial full scans fall back to buffer cache reads:

-- Check serial direct read setting
SELECT name, value
FROM v$parameter
WHERE name = '_serial_direct_read';

-- Set at system level (requires restart or alter system)
ALTER SYSTEM SET "_serial_direct_read" = always SCOPE=BOTH;

7. No Column Encryption

Transparent Data Encryption (TDE) at the column level disables Smart Scan for the encrypted columns. TDE tablespace encryption (encrypting the entire tablespace) is compatible with Smart Scan on Exadata X6 and later hardware, but column-level TDE is not. Teams that implemented column-level TDE for compliance on conventional storage and then moved to Exadata often find their encrypted columns never Smart Scan.

Diagnosing Smart Scan Usage

The primary diagnostic is the cell physical IO bytes eligible for predicate offload and cell physical IO interconnect bytes returned by smart scan statistics from V$SQL_STATISTICS or V$SESSTAT:

-- For a specific SQL_ID, check Smart Scan statistics
SELECT
  ss.name,
  ss.value
FROM v$sql s
JOIN v$sql_optimizer_env e ON s.sql_id = e.sql_id
JOIN v$statname sn ON sn.name IN (
    'cell physical IO bytes eligible for predicate offload',
    'cell physical IO interconnect bytes returned by smart scan',
    'cell IO uncompressed bytes'
)
CROSS JOIN v$sesstat ss
WHERE s.sql_id = '&sql_id'
  AND ss.statistic# = sn.statistic#;

-- Better: use the session-level statistics immediately after running a query
SELECT
  n.name,
  s.value
FROM v$sesstat s
JOIN v$statname n ON s.statistic# = n.statistic#
WHERE s.sid = SYS_CONTEXT('USERENV','SID')
  AND n.name LIKE 'cell%'
  AND s.value > 0
ORDER BY s.value DESC;

The ratio that matters:

Smart Scan efficiency =
  "cell physical IO interconnect bytes returned by smart scan"
  / "cell physical IO bytes eligible for predicate offload"

A ratio of 0.01 means Smart Scan filtered 99% of data at the cell layer.
A ratio of 1.0 means Smart Scan returned all bytes — predicate was not selective.
A value of 0 for "bytes returned by smart scan" while "bytes eligible" is non-zero
means Smart Scan was attempted but produced no benefit (or the session stat wasn't captured).

If both values are zero, Smart Scan was not used at all — the query ran conventional I/O. Check the eligibility requirements against that specific query's execution context.

The AWR Report Smart Scan Section

AWR reports on Exadata include a "Cell Statistics" section that aggregates Smart Scan usage across all SQL during the snapshot window:

-- Direct query of DBA_HIST_SYSSTAT for Smart Scan trend
SELECT
  s.snap_id,
  TO_CHAR(sn.end_interval_time, 'YYYY-MM-DD HH24') AS hour,
  SUM(CASE WHEN n.name = 'cell physical IO bytes eligible for predicate offload'
           THEN s.value END) / 1024/1024/1024 AS eligible_gb,
  SUM(CASE WHEN n.name = 'cell physical IO interconnect bytes returned by smart scan'
           THEN s.value END) / 1024/1024/1024 AS returned_gb
FROM dba_hist_sysstat s
JOIN dba_hist_snapshot sn ON s.snap_id = sn.snap_id
JOIN v$statname n ON s.stat_id = n.statistic#
WHERE n.name IN (
    'cell physical IO bytes eligible for predicate offload',
    'cell physical IO interconnect bytes returned by smart scan'
)
  AND sn.end_interval_time > SYSDATE - 7
GROUP BY s.snap_id, sn.end_interval_time
ORDER BY s.snap_id;

If eligible_gb is consistently much lower than your total I/O throughput, a significant fraction of queries are not qualifying for Smart Scan. Cross-reference the top I/O SQL from AWR against the eligibility requirements to find the disqualifiers.

Smart Scan and Exadata Migration Planning

When teams are evaluating whether to stay on Exadata or migrate to AWS (typically to Redshift, Aurora, or RDS), the Smart Scan utilization number is a critical input. If actual Smart Scan usage is low — because the workload is OLTP-dominant, tables are below the small table threshold, or column-level TDE is widespread — the Exadata Smart Scan advantage is already not being realized. The migration cost-benefit calculation changes significantly when the feature you're paying Exadata licensing costs for is running at 30% utilization.

Measure actual Smart Scan eligibility ratios over 30 days before making platform decisions. The number is frequently lower than the team expects.

Running Exadata and uncertain about actual Smart Scan utilization?

Start with a free 20-minute assessment call to scope the problem and decide whether a paid diagnostic or implementation step is worthwhile. Written findings are not included in the free call.

Get a Free Assessment → More Articles