Select Journal - First Quarter 2013 - (Page 23)
Retrieving Large Volumes
of Data
By Andrei Dzianisau
Edited by Ian Abramson
T
What are the things you can do today with a basic Oracle installation? This
article will discuss numerous strategies and approaches we at EPAM Systems
have seen to be very successful in optimizing how we use and deploy data
solutions. The top concepts we will discuss include the following:
•
•
•
•
•
Narrowing down amount of data to access
Using aggregates to optimize data retrieve and minimize calculations
Changing the design of your solutions
Using some of the built-in functions supplied by Oracle
Combining processes to streamline processing
These are just a few of the approaches one could take to support and ensure
that your data warehouse performs well and consistently. Each approach we
will discuss may appear basic on the surface, but the value each can provide
can significantly impact your overall performance and save your organization
from investing in unneeded hardware.
oday’s database systems are being asked to store
more information than ever before. According to
research firm IDC, the size of data — which it calls
the “digital universe” — will grow to 2.7 zetabytes in
2012, up 48 percent from 2011’s record year. This growth is
unprecedented, and we must be able to manage databases
that contain significantly more information that needs to be
retrieved in less and less time. It is more important than ever
to understand the options Oracle provides to users within the
database that can truly empower your applications regardless
of the size of your data sets.
Selecting Only the Data You Need
One of the most basic concepts you need to embrace is to minimize how
much information is retrieved from disks. Disks tend to be the slowest part of
the I/O equation, and, therefore, you need to ensure that only the blocks you
are interested in are being read. Selecting data and reducing the number of
full table scans is a lifelong journey for many Oracle professionals.
Hardware appliances such as Oracle’s Exadata have helped significantly, but
keeping the laws of physics in mind, there is a limit. Whether you use the
faster memory, solid state disks or read technology, sometimes even very
powerful hardware doesn’t help. For instance, it’s not possible to read 60 GB
of data off from disk in just a few seconds. There are limits to how powerful
a storage system can be. As a result, we need to understand how to take
advantage of Oracle and the features and facilities available to us. Not
everyone can afford to purchase an Exadata machine to solve their data
volume issues, so this article investigates some of these capabilities that all
Oracle databases can take advantage of.
The following code shows statement with very few filters with wide selection
criteria defined:
It is a very common situation that a business asks for information to answer
important business questions on daily/hourly basis. It then demands that
these questions are answered within seconds (or, at very most, a matter of
minutes). It is also very common to expect business intelligence (BI) systems
to create reports that require vast amounts of data to support complex
reporting needs and be able to use that data quickly and efficiently. The tools
today try to optimize how they retrieve data, but, sometimes, with poorly
deployed data warehouses, this is not always possible.
Filters with low selectivity bring a database to conditions when full table scans
of very big fact tables are the most efficient way to access and calculate data.
For instance, report with the time dimension filter applied on year level like
year = 2011 and no other filters provided for products or departments will
cause the cost-based optimizer (CBO) to switch to full-table scan (FTS).
Here is a simple example, following query extracts data for whole year 1998:
create bitmap index x1 on times(fiscal_month_number) compute statistics;
create bitmap index x2 on times(fiscal_year) compute statistics;
SELECT
t.fiscal_year
, t.fiscal_month_number
, c.channel_desc
, SUM(s.quantity_sold)
, SUM(s.amount_sold)
, count(distinct promo_id)
FROM sales s
, times t
, channels c
WHERE s.time_id = t.time_id
AND s.channel_id = c.channel_id
and t.fiscal_year = 1998
group by t.fiscal_year, t.fiscal_month_number
, c.channel_desc;
continued on page 24
1st Qtr 2013 ■ Page 23
Table of Contents for the Digital Edition of Select Journal - First Quarter 2013
Select Journal - First Quarter 2013
Table of Contents
From the Editor
From the IOUG President
Enterprise Manager 12c Cloud Control: What’s Changed, What’s New
Introduction to Oracle Enterprise Manager Command Line Interface
Users Group Calendar
Retrieving Large Volumes of Data
A Multilayered Approach to Oracle Database Availability
Advertisers’ Index
Ask an Oracle ACE
Select Journal - First Quarter 2013
https://www.nxtbook.com/nxtbooks/smithbucklin/ioug_bestpractices2013
https://www.nxtbook.com/nxtbooks/smithbucklin/selectjournal_2013q2
https://www.nxtbook.com/nxtbooks/smithbucklin/selectjournal_2013q1
https://www.nxtbook.com/nxtbooks/smithbucklin/selectjournal_2012q4
https://www.nxtbook.com/nxtbooks/smithbucklin/selectjournal_2012q3
https://www.nxtbook.com/nxtbooks/smithbucklin/selectjournal_2012q2
https://www.nxtbook.com/nxtbooks/smithbucklin/selectjournal_2012q1
https://www.nxtbook.com/nxtbooks/smithbucklin/selectjournal_2011q4
https://www.nxtbook.com/nxtbooks/smithbucklin/selectjournal_2011q3
https://www.nxtbook.com/nxtbooks/smithbucklin/selectjournal_2011q2
https://www.nxtbook.com/nxtbooks/smithbucklin/selectjournal_2011q1
https://www.nxtbookmedia.com