OBIEE and Database
Performance Tuning
Antony Heljula
Technical Architect
© Yogi Indicators Limited
Agenda
q Aim of Presentation
q Test Scenario
q Performance Tests
q Summary & Conclusion
q And Finally….
© Yogi Indicators Limited
2
q Aim of Presentation
© Yogi Indicators Limited
3
OBIEE Performance Issues
q When reports take a long time
to return
results, end users tend to say “OBIEE does
not perform well”
q The truth is that most (99%?)
of the
processing typically takes place on the
underlying data sources
Ø If another BI tool was used
to deliver the
same reports, the queries would probably
still run just as slowly
q It would therefore be more
accurate to say
that it is the databases that are not
performing well!
WebLogic
BI Presentation Services
© Yogi Indicators Limited
4
“Analytics”
BI Plug-In
BI Server
Aim of Presentation
q We are going to investigate
how a set of OBIEE Dashboard queries can be
tuned to deliver satisfactory performance
q We’ll begin testing with a
set of dashboards that perform very poorly
q We’ll then implement a number
of tuning features one-by-one to see how
things improve….hopefully by the end of it we’ll have
decent
performance!
q The aim will be to get all
reports to return in less than 10 seconds
q The performance test results
will be captured and reported on using
Usage Tracking
© Yogi Indicators Limited
5
Aim of Presentation
q We are going to assume that
the database has an optimum and balanced
configuration, so that any performance issues are
“software” related and
not “hardware”
© Yogi Indicators Limited
6
Aim of Presentation
q On an Oracle database, there
are many known ways to improve query
performance. For
example:
Ø Gather statistics
Ø Remove snow-flakes
Ø Star Transformation
Ø Partitioning
Ø Bitmap Indexes
q We will attempt to answer the
following questions:
Ø Do they actually work?
Ø What performance gains do
they actually deliver?
Ø In what situations are they
most effective?
Ø Bitmap Join Indexes
Ø Compression
Ø Parallel Query
Ø Aggregation
Ø Denormalization
© Yogi Indicators Limited
7
Aim of Presentation
Notes
q The aim is to deliver
satisfactory performance using standard relational
database features
Ø The customer won’t be happy
if they are told mid-way through UAT that they
need to purchase additional software licenses e.g. Oracle
OLAP / Essbase
Ø We will however be looking at
“Partitioning”, although this option does
require additional license cost it is generally purchased
by most/all customers
who have large data volumes
q We are going to avoid the use
of “hints”
Ø Hints effectively hard-code
the optimization rules for database queries
Ø Hints are “old” technology
dating back to the Rule Base Optimizer (RBO), they
do not take into account the size and complexity of the
query in the way that
the Cost Based Optimizer (CBO) does
Ø Hints should always be the
last resort (in my view)
© Yogi Indicators Limited
8
q Test Scenario
© Yogi Indicators Limited
9
Test Scenario
Hardware
q Dell Latitude E6400:
Ø Windows 7 64-bit
Ø 2.54Ghz dual-core CPU
Ø 8GB RAM
Ø 250GB SATA internal hard disk
(7,200 RPM)
q Software:
Ø Oracle Database Enterprise
Edition 11g R2
Ø Oracle BI Enterprise Edition
11.1.1.3
© Yogi Indicators Limited
10
Test Scenario
Data-Model
q To conduct the investigation,
a database data-model was built entirely
from scratch:
Product Dimension
Fact (Daily Summary)
Time Dimension
Organization Dimension
Customer Dimension
© Yogi Indicators Limited
11
Test Scenario
Data Volumes
q The tables were then
populated with a completely fabricated set of data
Ø Number of records in each
table is show in red
Ø Approximately 10GB total
volume
500
5,000
30 Million
11,000
500,000
5,000
10,000
9,000
© Yogi Indicators Limited
12
1,000
Test Scenario
Fabricated Data
q Examples of the fabricated
data that was generated:
© Yogi Indicators Limited
13
Test Scenario
RPD
q An RPD was developed using
all the modern best-practices for a starschema
data-model:
© Yogi Indicators Limited
14
Test Scenario
Dashboards
q A “Sales Orders” dashboard
was created containing 6 pages with only 1
analysis per page
Ø 4 pages contained a “summary”
analysis (month level or above)
Ø 2 pages contained a “detail”
analysis (day/week level)
© Yogi Indicators Limited
15
Test Scenario
Test Rules
q Each performance test was
conducted manually but in a strict sequence:
1. Log
on
2. Go
to each dashboard page one-by-one (in the same order every time)
3. Only
move to the next page once the current page has returned results
4. Log
off
q To ensure every performance
test was fair, the following steps were taken
before each test was conducted:
Ø Restart database (to purge
all database cache)
Ø Purge BI Server cache
Ø Purge BI Presentation
Services cache
© Yogi Indicators Limited
16
Test Scenario
Final Notes
q Before starting the tests, we
had no indication as to what the results would be
– there was no certainty any firm conclusions could be
made afterwards
q No attempt was made to
“prepare” the data-model or the performance tests
so that the final results would look good or bad
q When each feature was tested,
no effort was made to tune the particular
feature – we simply used the default settings to see if it
would work straight
“out of the box”
q If we ran the exact same test
twice, the timings could vary by about 5-10
seconds or 10%. We
will therefore assume that timings have to be different by
10% or >10 seconds in order to conclude that any tuning
feature has made a
difference
© Yogi Indicators Limited
17
q Performance Tests
© Yogi Indicators Limited
18
1. Starting Point
Overview
q To begin with, the data-model
had the following features / issues:
Ø Plenty of snow-flaking
Ø No statistics generated (RBO
is therefore in use)
Ø B-Tree indexes used
throughout
Ø Star Transformation disabled
© Yogi Indicators Limited
19
1. Starting Point
Result
q 446 seconds in total to run
all 6 dashboard pages in sequence
q The 4 “summary” reports all
perform poorly
q The 2 “detail” reports are
returning in less than 10 seconds
© Yogi Indicators Limited
20
2. Gather Stats 30%
Overview
q If you have a performance
issue with a database query, one of the first
questions you will get asked is “have you analysed your
tables and
indexes?”
q The Oracle Database comes
with a “Cost Based Optimizer” (CBO) which
determines the most appropriate way to process your query
based on the
size and contents of the required tables and their indexes
Ø But you need to analyze your
tables (or “gather statistics”) in order for the
CBO to be used, otherwise the “Rule Based Optimizer” (RBO)
is used
q With large data warehouses,
it is sometimes not possible to analyze all the
data and indexes as the process will take too long, so
this first test will see
if gathering statistics using only a 30% sample of data is
sufficient to make
the CBO work efficiently
EXEC
DBMS_STATS.GATHER_TABLE_STATS (ownname => 'YOGITEST',
tabname
=> 'WH_CUSTOMER',
estimate_percent => 30);
© Yogi Indicators Limited
21
2. Gather Stats 30%
Result
© Yogi Indicators Limited
22
2. Gather Stats 30%
Summary
q Surprisingly, performance got
worse overall by 52%!
q “Summary” Reports
Ø 2 reports had significant
improvements >65%
Ø 2 reports did not change more
than 10%
q “Detail” Reports
Ø Both detail reports suffered
worse performance, one of them actually took
125 times longer than before!
Ø The over-use of “hash joins”
is causing the issue
© Yogi Indicators Limited
23
3. Gather Stats 100%
q If you have performance
issues after gathering statistics using a 30%
sample of data, it is quite likely that Oracle Support or
a DBA will
recommend that you try gathering statistics using a
greater sample of data
q So this next test will
provide an indication as to whether gathering
statistics using a “full” 100% sample will make a
significant difference…
Overview
EXEC
DBMS_STATS.GATHER_TABLE_STATS (ownname => 'YOGITEST',
tabname
=> 'WH_CUSTOMER',
estimate_percent
=> 100);
© Yogi Indicators Limited
24
2. Gather Stats 30%
Result
© Yogi Indicators Limited
25
3. Gather Stats 100%
Summary
q Performance improved overall
by 9%
q “Summary” Reports
Ø Performance was essentially
the same as with a 30% statistics sample
Ø Performance levels are still
far from acceptable
q “Detail” Reports
Ø Performance did improve
overall by 14%
Ø Performance levels are still
far from acceptable
© Yogi Indicators Limited
26
4. Star Transformation (No
Bitmaps)
Overview
q The Oracle database has a
special tuning feature designed for optimising
“star schemas” queries, it is enabled by setting the
following parameter:
Ø STAR_TRANSFORMATION_ENABLED =
TRUE
q It is however often
overlooked that the documentation recommends that,
for this feature to work properly, all the foreign key
columns on your fact
tables should have “bitmap indexes” created and not just
“b-tree indexes”:
q This test will see what the
effect is if you don’t have bitmap indexes on
your foreign key columns…
© Yogi Indicators Limited
27
4. Star Transformation (No
Bitmaps)
Result
© Yogi Indicators Limited
28
4. Star Transformation (No
Bitmaps)
Summary
q Performance overall got
slightly worse, but not by much
q “Summary” Reports
Ø Enabling Star Transformation
without bitmap indexes had no real effect
q “Detail” Reports
Ø 1 report improved by 50% (11
seconds) but overall there was no significant
difference
© Yogi Indicators Limited
29
5. Star Transformation
(Bitmaps)
Overview
q This test will see what the
impact is when Star Transformation is enabled
with bitmap indexes created for the foreign keys on the
fact table
(as
recommended by Oracle)
CREATE BITMAP INDEX WH_SALES_FACT_DAY_XFK_1 ON
WH_SALES_FACT_DAY_XFK (CUSTOMER_KEY);
CREATE BITMAP INDEX WH_SALES_FACT_DAY_XFK_2 ON
WH_SALES_FACT_DAY_XFK (PRODUCT_KEY);
CREATE BITMAP INDEX WH_SALES_FACT_DAY_XFK_3 ON
WH_SALES_FACT_DAY_XFK (ORG_KEY);
CREATE BITMAP INDEX WH_SALES_FACT_DAY_XFK_4 ON
WH_SALES_FACT_DAY_XFK (TIME_DAY_KEY);
© Yogi Indicators Limited
30
5. Star Transformation
(Bitmaps)
Result
© Yogi Indicators Limited
31
5. Star Transformation
(Bitmaps)
Summary
q Performance improved overall
by 50%!
Ø Finally things are starting
to get under control, but we still need to do better
q “Summary” Reports
Ø 15% overall improvement
Ø 2 of the reports had
significant improvements
q “Detail” Reports
Ø A 78% improvement overall,
although one of the reports increased from 8 to
60 seconds
© Yogi Indicators Limited
32
6. Remove Snow-Flakes (Dim
Cols.)
Overview
q Snow-flaking occurs when you
have a chain of Dimension tables joined
together
q The Oracle Data-Warehousing
states that performance can be improved
(especially with “Star Transformation”) if you data-model
does not consist
of snow-flakes
q So this test will show what
happens when we eliminate snow-flaking by
combining the snow-flaked tables into the main dimension
table (forming a
pure star-schema):
© Yogi Indicators Limited
33
6. Remove Snow-Flakes (Dim
Cols.)
Result
© Yogi Indicators Limited
34
6. Remove Snow-Flakes (Dim
Cols.)
Summary
q Surprisingly, it made things
slightly worse, no reports showed improvement!
q “Summary” Reports
Ø Moving snow-flaked dimension
columns into the main dimension table had no
real effect, just a few seconds worse overall
q “Detail” Reports
Ø Moving snow-flaked dimension
columns into the main dimension table had no
real effect, just a few seconds worse overall
© Yogi Indicators Limited
35
7. Remove Snow-Flakes (Add
FKs)
Overview
q As the previous method of
eliminating snow-flakes did not work, this time
we will try a slightly different approach
q In this test we will
eliminate snow-flakes by adding extra foreign keys to the
central fact table which join directly to the snow-flaked
dimension tables
(with “bitmap indexes” created)
q Will this more perfect
“star-schema” improve performance significantly?
© Yogi Indicators Limited
36
7. Remove Snow-Flakes (Add
FKs)
Result
© Yogi Indicators Limited
37
* Test 6 was backed out
7. Remove Snow-Flakes (Add
FKs)
Summary
q It made things much
worse….response times more than doubled!
Ø Adding more FKs made the fact
table much larger, making it longer to scan?
Ø More FKs more complexity?
q “Summary” Reports
Ø 136% worse overall
q “Detail” Reports
Ø 90% worse overall
© Yogi Indicators Limited
38
8. Bitmap Join Indexes
Overview
q Oracle promote “bitmap join
indexes” as a way of increasing performance by
“orders of magnitude”
q A bitmap join index is a
bitmap index which stores the actual results of a join
between two or more tables. For example, here is a bitmap join index
that
stores the result of the joins from the fact table to four
columns in the “Day”
dimension table:
CREATE BITMAP INDEX
WH_SALES_FACT_DAY_BMJ_8
ON WH_SALES_FACT_DAY (PER_NAME_YEAR,
PER_NAME_QTR, PER_NAME_MONTH,PER_NAME_WEEK)
FROM WH_SALES_FACT_DAY F, WH_TIME_DAY TD
WHERE F.TIME_DAY_KEY = TD.TIME_DAY_KEY;
q For this test 8 bitmap join
indexes were created to store the join results of all
the joins made between the fact and dimension tables
across the 6
dashboard pages
© Yogi Indicators Limited
39
8. Bitmap Join Indexes
Result
* Tests 6 and 7 were backed out
© Yogi Indicators Limited
40
8. Bitmap Join Indexes
Summary
q Bitmap join indexes improved
performance overall by 55 seconds (18%)
Ø However only 1 “detail”
report actually used a bitmap join index
Ø It seems the Oracle database
will only use a bitmap join index if it deems the query is
at a low enough granularity to make it worthwhile
q “Summary” Reports
Ø No changes in performance
Ø Bitmap join indexes did not
get used for any of the summary reports!
q “Detail” Reports
Ø One report reduced from 60
down to only 3 seconds. A massive
improvement!
Ø The other detail report did
not change, bitmap join indexes did not get used
© Yogi Indicators Limited
41
8. Bitmap Join Indexes
Overview
q Bitmap Join indexes do have
some restrictions:
© Yogi Indicators Limited
42
9. Partition by Month
Overview
q Table “partitioning” is a
very common recommendation when dealing with very large
tables. On a data
warehouse the most obvious thing to do is partition your large fact
tables by Day / Week / Month
q It is essential to make sure
that your partition strategy will result in effect “partition
pruning”. This
means that, when OBIEE runs a query, the database will know it only
needs to scan a sub-set of the table partitions, rather
than scanning the whole table
q In our case, we test out
partitioning tables by “Month” based on the existing
“TIME_DAY_KEY” foreign key column on the fact table, it is
in YYYYMMDD format
q When OBIEE queries for a
specific time period, the database will lookup the
TIME_DAY_KEY values in the “Time” dimension table and know
exactly which fact
table partitions to scan:
TIME_DAY_KEY
TIME_DAY_KEY
© Yogi Indicators Limited
43
9. Partition by Month
Overview
q Here is the SQL statement
used to build our partitioned fact table which has a
partition for each of the 24 months of data that exist:
© Yogi Indicators Limited
44
9. Partition by Month
q Note that when you partition
a table, you need to create “LOCAL” bitmap indexes on
the foreign key columns
q “LOCAL” means that the bitmap
index will also be broken down into partitions,
potentially improving performance even further as the
smaller index partitions will
not take so long to process:
CREATE BITMAP INDEX WH_SALES_FACT_DAY_1 ON
WH_SALES_FACT_DAY (CUSTOMER_KEY) LOCAL;
CREATE BITMAP INDEX WH_SALES_FACT_DAY_2 ON
WH_SALES_FACT_DAY (PRODUCT_KEY) LOCAL;
CREATE BITMAP INDEX WH_SALES_FACT_DAY_3 ON
WH_SALES_FACT_DAY (ORG_KEY) LOCAL;
CREATE BITMAP INDEX WH_SALES_FACT_DAY_4 ON
WH_SALES_FACT_DAY (TIME_DAY_KEY) LOCAL;
Overview
q Remember to analyze / gather
statistics afterwards!
© Yogi Indicators Limited
45
9. Partition by Month
Result
* Tests 6 and 7 were backed out
© Yogi Indicators Limited
46
9. Partition by Month
Summary
q Overall performance improved
by 57 seconds (23%)
q “Summary” Reports
Ø 3 out of the 4 summary
reports had excellent improvement >50%
Ø 1 summary report took 30
seconds (33%) longer. This report
contains two
“Year Ago” and “Year-to-Date” calculations which meant
that most/all the table
partitions had to be scanned (scanning lots of small partitions may
therefore be
less efficient than scanning a smaller number of larger
partitions)
q “Detail” Reports
Ø A good overall improvement of
6 seconds (35%)
© Yogi Indicators Limited
47
10. Parallel Query (Auto)
Overview
q The Oracle database offers
parallel query capability, the idea being that one
sequential task can be broken down in the multiple tasks
running in parallel
q Parallel query is a good
contender once you have implemented table
partitioning, is the Oracle database can easily process
each partition in
parallel
Ø On our test environment we
have 1 CPU and 1 disk – not the most appropriate
environment for testing parallel query
q For this test, we are
enabling parallel query simply by enabling the “auto
tuning” parallel query feature for the Oracle Database 11g
R2
Ø This feature ignores any
“parallel” degree settings you may have manually set for
your tables:
alter system set PARALLEL_DEGREE_POLICY=AUTO;
© Yogi Indicators Limited
48
10. Parallel Query (Auto)
Result
* Tests 6 and 7 were backed out
© Yogi Indicators Limited
49
q A slight performance
improvement overall to the “Summary” reports, the
10. Parallel Query (Auto)
Summary
“Detail” reports were unaffected
q NOTE:
Ø Even with no parallelism
enabled, our single disk was already operating at >80% capacity
Ø So by enabling parallelism
there was not much for improvement as our disk was already
near to bottlenecking
Ø It would be better to run
this test on a system with >1 CPU and >1 disk
q “Summary” Reports
Ø The longest running report
showed the best improvement of 30 seconds (25%)
q “Detail” Reports
Ø The detail reports had no
change, the explain plans showed that the Oracle
Database did not attempt to use parallel query for the
granular queries (an
excellent feature)
© Yogi Indicators Limited
50
10. Parallel Query (Auto)
Explain Plans with
PARALLEL_DEGREE_POLICY=AUTO
“Summary” report
has parallel query
“Detail” report has
no parallel query
© Yogi Indicators Limited
51
11. Compression
Overview
q The Oracle database allows
you to store data in a compressed (zipped) format,
with three possible benefits:
Ø Reduced overall storage
requirements
Ø Performance improvement due
to less disk reads (each record uses less space)
Ø Improved memory efficiency
(data is stored in memory in compressed format)
q The potential drawback is
with higher CPU activity since all data blocks have to be
uncompressed at run-time in order to be processed
Ø Compression may be a good
option in our test environment, since the CPU usage is
small compared to the disk usage (20-40% versus 80-100%)
q Compressing our fact table
reduced the data volume from 1.46GB down to
1.09GB, a reduction of about 25%
Ø Database was given a default
8K block size, increasing to 32/64K could increase the
compression ratio significantly
q NOTE: bitmap indexes always
store data in compressed format
© Yogi Indicators Limited
52
11. Compression
Overview
q Here is the SQL statement
used to build our compressed and partitioned fact table:
NOTE: There are limitations!
e.g. insert records with the hint:
INSERT /*+ APPEND */
© Yogi Indicators Limited
53
11. Compression
Result
* Tests 6 and 7 were backed out
© Yogi Indicators Limited
54
11. Compression
Summary
q With a 25% compression ratio,
performance improved by about 15% across
the board, although it had more effect on the “Summary”
reports where
more data was being extracted from disk
Ø NOTE: Oracle also has an
“Advanced Compression” feature which is designed to
provide even greater compression ratios
q “Summary” Reports
Ø Approximately 15% improvement
q “Detail” Reports
Ø No real change
© Yogi Indicators Limited
55
12. Aggregation (MVs)
Overview
q Many customers can find
themselves in this situation: we have
already implemented
several different tuning mechanisms but our “Summary”
dashboards are still taking far
too long…..an average of 41 seconds each (with only 1
user!)
q Apart from adding more
hardware (CPUs/memory/disks), what is the next option?
q Sometimes there is no
alternative to summarising the data in order to reduce the
amount of data being processed
© Yogi Indicators Limited
56
12. Aggregation (MVs)
Overview
q Implementing an aggregate
strategy is often a balancing act:
Ø You want to keep the number
of aggregates to an absolute minimum
Ø You want to summarise the
data as much as possible
BUT
Ø You want aggregates that can
serve multiple reports/dashboards (plus ad hoc)
Ø You want aggregates to
support multiple drill-down levels across many hierarchies
q Sometimes you have no choice
other than to create an aggregate for each dashboard
(in which case you may need to consider cube engines, more
hardware etc):
© Yogi Indicators Limited
57
12. Aggregation (MVs)
Overview
q In our test environment, we
have managed to build a single aggregate which can:
Ø Reduce the number of records
from 31M down to 8M
Ø Support all 4 “Summary”
dashboards
Ø Provide at least 2 levels of
drill-down across all the hierarchies
© Yogi Indicators Limited
58
12. Aggregation (MVs)
Overview
q HINT: Use the “Number of
Elements” in the RPD to help you determine how much
more efficient an aggregate table could be
© Yogi Indicators Limited
59
12. Aggregation (MVs)
Overview
q Aggregate tables on an Oracle
database can be either:
Ø Standard physical tables
(updated or rebuilt during ETL)
Ø Materialized Views
q In theory there is no
difference between the two options as they both contain
snapshots of the summarised data
Ø Materialized Views are easier
to maintain and quicker to develop
q If you use Materialized
Views, then please note:
Ø Do not rely too heavily on
the “Query Rewrite” feature as OBIEE can generate
queries that are too complex for the database to rewrite
Ø It is often better to model
the MVs into the RPD as if they were normal physical
aggregate tables (so you rely on OBIEE “aggregate
navigation” rather than
database “query rewrite”)
Ø Remember to partition your
MVs and create bitmap indexes on any foreign keys
© Yogi Indicators Limited
60
12. Aggregation (MVs)
Overview
q s
MV partitioned by
month
© Yogi Indicators Limited
61
LOCAL bitmap indexes
(don’t forget to analyze)
12. Aggregation (MVs)
Result
* Tests 6 and 7 were backed out
© Yogi Indicators Limited
62
12. Aggregation (MVs)
Summary
q Finally! All queries are now taken 10 seconds or less
q “Summary” Reports
Ø All reports improved, running
over 7 times faster than before! (a
total of 143
seconds down to 23)
q “Detail” Reports
Ø The aggregate MVs were
designed for the summary reports, so the detail
reports were unaffected and perform exactly the same as
before
© Yogi Indicators Limited
63
q Summary & Conclusion
© Yogi Indicators Limited
64
Summary
The biggest gains for
“Summary” reports:
• Gathering Stats 30%
• Aggregation
• Star Transformation
• Partitioning
The biggest gains
for “Detail” reports:
• Star Transformation
• Bitmap Join Indexes
• Partitioning
© Yogi Indicators Limited
65
Conclusion
q The Oracle database provides
a wide variety of tuning features, but you need to
adopt a combination of tuning features
q The various tuning mechanisms
suit different types of reports
q If you have performance
issues with “Summary” reports then consider:
Ø Gathering statistics 30/100%
Ø Star Transformation
Ø Partitioning
Ø Parallel Query
Ø Aggregation
Ø Compression
q If you have performance
issues with “Detail” reports then consider:
Ø Gathering statistics
100% (as opposed to 30%)
Ø Star Transformation
Ø Partitioning
Ø Bitmap Join Indexes
© Yogi Indicators Limited
66
Conclusion
q Always test thoroughly! Whilst the various tuning mechanisms can lead
to an
overall positive improvement, there
can be surprises where specific reports
suffer worse performance (sometimes significantly worse):
© Yogi Indicators Limited
67
And Finally….
q What happens if none of the
tuning mechanisms discussed give you the
desired level of performance?
Ø Buy bigger/better
hardware (more CPUs+Disks+Memory)
Ø Archive data to reduce volume
Ø Oracle OLAP (now properly supported with OBIEE
11.1.1.5)
Ø Oracle Essbase
Ø Oracle Exadata
© Yogi Indicators Limited
68
q Questions?
© Yogi Indicators Limited
Helping Your Business Intelligence Journey
© Yogi Indicators Limited
No comments:
Post a Comment