Academic Tutorials





Challenges of Using Data Warehouses









As business decisions become more real-time, the systems that support those decisions need to keep up. It is only natural that data warehouse, business intelligence, decision support, and OLAP systems are beginning to incorporate real-time data. Data warehouses and business intelligence applications are designed to answer exactly the types of questions that users would like to pose against real-time data. They can analyze vast quantities of data over time to determine the best offer to make to a customer, or to identify potentially fraudulent, illegal, or suspicious activity. Today's OLAP tools make ad-hoc reporting easy. What remains is to make these existing systems and applications work off real-time data. The following are a few of the challenges of adding real-time data to these systems:


Enabling Real-time ETL

One of the most difficult parts of building any data warehouse is the process of extracting, transforming, cleansing, and loading the data from the source system. Performing ETL of data in real-time introduces additional challenges. Almost all ETL tools and systems, whether based on off-the-shelf products or custom-coded, operate in a batch mode. They assume that the data becomes available as some sort of extract file on a certain schedule, usually nightly, weekly, or monthly. Then the system transforms and cleanses the data and loads it into the data warehouse.
This process typically involves downtime of the data warehouse, so no users are able to access it while the load takes place. Since these loads are usually performed late at night, this scheduled downtime typically does not inconvenience many users.
When data is loaded continuously in real time, there cannot be any system downtime. The heaviest periods of data warehouse usage may very well coincide with the peak periods of incoming data. The requirement for continuous updates with no warehouse downtime is generally incompatible with traditional batch-oriented ETL tools and systems. Fortunately, there are newer tools on the market that specialize in real-time ETL and data loading. There are also ways of modifying existing ETL systems to perform real-time or near real-time warehouse loading.
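
One way to picture near real-time loading is as a series of very small batches, each cleansed and committed in its own short transaction so the warehouse never has to go offline. The following is only a minimal sketch under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in for a warehouse, and the table `fact_orders`, the function `load_micro_batch`, and the cleansing rules are hypothetical names and rules chosen for illustration, not part of any particular ETL product.

```python
import sqlite3

def load_micro_batch(conn, rows):
    """Cleanse and load one small batch inside a single short transaction."""
    cleaned = [
        (order_id, customer.strip().upper(), round(float(amount), 2))
        for order_id, customer, amount in rows
        if amount is not None  # assumed cleansing rule: skip rows with no amount
    ]
    with conn:  # commits on success, rolls back if anything fails
        conn.executemany(
            "INSERT OR REPLACE INTO fact_orders (order_id, customer, amount) "
            "VALUES (?, ?, ?)",
            cleaned,
        )
    return len(cleaned)

# An in-memory database stands in for the warehouse (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE fact_orders ("
    "order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
)

# Two small batches arriving minutes apart instead of one nightly extract.
batches = [
    [(1, " acme ", "10.50"), (2, "globex", None)],
    [(3, "Initech", "7.25"), (1, "acme", "11.00")],  # order 1 arrives again, corrected
]
loaded = [load_micro_batch(conn, batch) for batch in batches]
total = conn.execute("SELECT COUNT(*), SUM(amount) FROM fact_orders").fetchone()
```

Because each batch is small, the transaction is held only briefly and queries can run between batches. Whether this is frequent enough to count as "real-time" depends on the application.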

Modeling Real-time Fact Tables

Introducing real-time data into an existing data warehouse, or modeling real-time data for a new one, raises some interesting data modeling issues. For instance, a warehouse that aggregates all of its data at various levels along a time dimension must consider that the aggregates may be out of sync with the real-time data. Some metrics, such as month-to-date and week-to-date, may also behave strangely when they include a partial day of data that is continuously changing. The main modeling issue, however, is where the real-time data is stored and how best to link it into the rest of the data model.
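
One possible answer to the question of where real-time data lives is to keep it in a separate real-time partition and expose it together with the historical fact table through a view. The sketch below illustrates that idea with sqlite3. The names `fact_sales_hist`, `fact_sales_rt`, `fact_sales`, `month_to_date`, and `roll_over` are all hypothetical, and this is one modeling option among several rather than a recommended design.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE fact_sales_hist (sale_date TEXT, store TEXT, amount REAL);
CREATE TABLE fact_sales_rt   (sale_date TEXT, store TEXT, amount REAL);
-- Reports that want live data query the view; others query history only.
CREATE VIEW fact_sales AS
    SELECT sale_date, store, amount FROM fact_sales_hist
    UNION ALL
    SELECT sale_date, store, amount FROM fact_sales_rt;
""")

# Closed days sit in the historical table; today's partial day is in the
# real-time partition.
conn.executemany(
    "INSERT INTO fact_sales_hist VALUES (?, ?, ?)",
    [("2024-03-01", "A", 100.0), ("2024-03-02", "A", 150.0)],
)
conn.executemany(
    "INSERT INTO fact_sales_rt VALUES (?, ?, ?)",
    [("2024-03-03", "A", 40.0)],
)

def month_to_date(conn, month_prefix, include_partial_day=True):
    """Month-to-date total, with or without the still-changing partial day."""
    table = "fact_sales" if include_partial_day else "fact_sales_hist"
    row = conn.execute(
        f"SELECT COALESCE(SUM(amount), 0) FROM {table} WHERE sale_date LIKE ?",
        (month_prefix + "%",),
    ).fetchone()
    return row[0]

def roll_over(conn):
    """End-of-day step: move the real-time rows into history in one transaction."""
    with conn:
        conn.execute("INSERT INTO fact_sales_hist SELECT * FROM fact_sales_rt")
        conn.execute("DELETE FROM fact_sales_rt")

mtd_closed = month_to_date(conn, "2024-03", include_partial_day=False)
mtd_live = month_to_date(conn, "2024-03")
```

Here the two month-to-date figures differ by exactly the partial day, which makes the "strange" behavior explicit and lets each report choose which definition it wants.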

OLAP Queries vs. Changing Data

OLAP and query tools were designed to operate on top of unchanging, static historical data. Because they assume that the underlying data is not changing, they take no precautions to ensure that their results are unaffected by data changes that occur while a query is executing. In some cases, this can lead to inconsistent and confusing query results.
Relational OLAP tools are particularly sensitive to this problem because they perform all but the simplest data analysis operations by issuing multi-pass SQL. A multi-pass SQL statement is made up of many smaller SQL statements that sequentially operate on a set of temporary tables.
This presents two problems. The first is that the results of a query that takes even one minute to run are arguably no longer real-time. While this data latency may be acceptable to a retail division manager, it might not be acceptable for an application that is looking for atmospheric trends that indicate the presence of a tornado, or for an application detecting credit card or telecommunications fraud as it happens.
The second problem is that, given the multiple passes of SQL required to perform almost any relational OLAP reporting or analytical operation, any real-time warehouse is likely to suffer from the internal result-set inconsistency described above: data that changes between passes makes the passes disagree with one another. There's nothing like the numbers not adding up properly to make a user skeptical of a report. For more complex product affinity or trend detection analytics, the results may be so confusing as to be completely useless.
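
The inconsistency can be reproduced in a few lines. The sketch below is an illustration under stated assumptions, not how any particular OLAP tool works: a two-pass report computes a grand total and then a per-store breakdown, and a row arrives between the passes. The concurrent insert is simulated with a callback on the same sqlite3 connection. One possible mitigation shown here, pinning both passes to a cutoff on a hypothetical `load_seq` column, is an assumption of this example; database snapshot isolation is another option.

```python
import sqlite3

def new_warehouse():
    """Tiny stand-in warehouse with a monotonically increasing load sequence."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE fact_sales ("
        "load_seq INTEGER PRIMARY KEY AUTOINCREMENT, store TEXT, amount REAL)"
    )
    conn.executemany(
        "INSERT INTO fact_sales (store, amount) VALUES (?, ?)",
        [("A", 60.0), ("B", 40.0)],
    )
    return conn

def late_insert(conn):
    """Simulates real-time data arriving while the report is still running."""
    conn.execute("INSERT INTO fact_sales (store, amount) VALUES ('B', 25.0)")

def two_pass_report(conn, between_passes=None, pin_snapshot=False):
    """Return (grand_total, {store: amount}) computed in two separate passes."""
    where, params = "", ()
    if pin_snapshot:
        # Capture a cutoff once and apply it to every pass.
        cutoff = conn.execute("SELECT MAX(load_seq) FROM fact_sales").fetchone()[0]
        where, params = "WHERE load_seq <= ?", (cutoff,)

    # Pass 1: grand total.
    total = conn.execute(
        f"SELECT SUM(amount) FROM fact_sales {where}", params
    ).fetchone()[0]

    if between_passes:
        between_passes(conn)  # data changes between the two passes

    # Pass 2: per-store breakdown.
    by_store = dict(conn.execute(
        f"SELECT store, SUM(amount) FROM fact_sales {where} GROUP BY store", params
    ))
    return total, by_store

total, by_store = two_pass_report(new_warehouse(), between_passes=late_insert)
pinned_total, pinned_by_store = two_pass_report(
    new_warehouse(), between_passes=late_insert, pin_snapshot=True
)
```

Without the cutoff, the breakdown sums to more than the grand total, which is exactly the kind of report that makes users skeptical. With the cutoff, both passes see the same rows, at the cost of ignoring data that arrived after the report started.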

Scalability & Query Contention

Query contention and scalability are the most difficult issues facing organizations that deploy real-time data warehouse solutions. Data warehouses were separated from transactional systems in the first place because the complex analytical queries run against warehouses don't "play well" with lots of simultaneous inserts, updates, or deletes.
The scalability of data warehouse and OLAP solutions is usually a direct function of the amount of data being queried and the number of users running queries simultaneously. Given a fixed amount of data, query response time grows roughly in proportion to the number of users on the system: heavy concurrent usage causes reports to take longer to execute.
While this is still true in a real-time system, the additional burden of continuously loading and updating data further strains system resources. Unfortunately, a continuous data load is not merely equivalent to one or two additional querying users, because data inserts contend with typical OLAP select statements. While it depends on the database, the contention between complex selects and continuous inserts tends to severely limit scalability. Surprisingly quickly, the continuous data loading process may become blocked, or what used to be fast queries may begin to take intolerably long to return.

Real-time Alerting

Most alerting applications associated with data warehouses to date have been mainly used to distribute email versions of reports after the nightly data warehouse load. The availability of real-time data in a data warehouse makes alerting applications much more appealing, as users can be alerted to real-time conditions as they occur in the warehouse, not just on a nightly basis.
The availability of real-time data makes products such as MicroStrategy's NarrowCaster and similar products from Cognos and Business Objects very valuable. But real-time alerting using these products brings its own set of challenges, as surprisingly these products, like many query tools from the same vendors, were not designed to operate on or tested against real-time data feeds.
These products operate on a schedule or event basis, so they can either trigger an alert every few minutes or hours, or need to be triggered by an external system. Solutions that address these challenges are available on the market. There is also the issue of threshold management. When alerts are triggered frequently, there needs to be a mechanism in place to make sure that once an alert has been sent for a condition in the warehouse, the same alert is not sent over and over again during each alerting cycle.
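
Threshold management can be sketched as a small state machine: an alert fires when the condition first becomes true, then stays silent until the condition has cleared. The example below is a minimal illustration in Python, not the mechanism of any vendor's product. The class `ThresholdAlert`, its `reset_below` level, and the sample readings are all assumptions made for the example.

```python
from dataclasses import dataclass, field

@dataclass
class ThresholdAlert:
    threshold: float          # fire when the value rises above this
    reset_below: float        # re-arm only after the value drops below this
    active: bool = False      # True while the condition is considered ongoing
    sent: list = field(default_factory=list)

    def check(self, cycle, value):
        """Evaluate one alerting cycle; return True only when an alert is sent."""
        if not self.active and value > self.threshold:
            self.active = True
            self.sent.append((cycle, value))
            return True
        if self.active and value < self.reset_below:
            self.active = False  # condition cleared, alert is re-armed
        return False

alert = ThresholdAlert(threshold=100.0, reset_below=90.0)
readings = [80, 105, 110, 95, 85, 120]  # one reading per alerting cycle
fired = [alert.check(cycle, value) for cycle, value in enumerate(readings)]
```

The gap between `threshold` and `reset_below` keeps a value hovering around the threshold from re-triggering the alert on every cycle. In this run, alerts are sent only on cycles 1 and 5, even though the value is above the threshold on cycle 2 as well.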

As we have seen in this article, real-time data warehousing and OLAP are possible using today's technology, but challenges lurk seemingly every step of the way. For a determined team armed with the right knowledge and experience, it is possible to make real-time reporting, analysis, and alerting systems work. The challenge is to make the right tradeoffs along the way, so that the systems meet the needs of the user base without collapsing under their own weight or causing existing production warehouses to malfunction.
It is likely that a lot of the challenges discussed above will become less challenging over time, as database, ETL, OLAP, reporting, and alerting tool vendors begin to add features to their systems to make them work better with real-time data streams. In the meantime, it is important to make sure real-time warehousing systems are well planned and designed, and thoroughly tested under realistic data and user load conditions before they are deployed.
The benefits of data warehousing in real-time are becoming clearer every day. With the right tools, designs, advice, approaches, and in some cases tricks, real-time data warehousing is possible using today's technologies, and will only become easier in the future.



Copyright © 2008. Academic Tutorials.com. All rights reserved
Copyright © 2003-2024 Vyom Technosoft Pvt. Ltd., All Rights Reserved.