Data Warehouse Principles
A data warehouse is a large repository of historical data that is integrated for decision support. The use of a data warehouse is markedly different from the use of operational systems. Operational systems contain the data required for the day-to-day operations of an organization. This operational data tends to change quickly and constantly. The table sizes in operational systems are kept manageably small by periodically purging old data. The data warehouse, by contrast, periodically receives historical data in batches, and grows over time. Data warehouses can run to hundreds of gigabytes, or even terabytes. The problem that drives data warehouse design is the need for quick results to queries posed against huge amounts of data.
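The contrast between the two kinds of system can be illustrated with a minimal sketch, which uses Python's built-in sqlite3 module and an in-memory database. The table names (operational_orders, warehouse_orders), the columns and the cutoff date are hypothetical and chosen only for illustration. A real warehouse load would normally go through a dedicated ETL process rather than a single function.

```python
import sqlite3


def archive_batch(conn, cutoff_date):
    """Copy orders older than cutoff_date into the warehouse, then purge them.

    Dates are ISO strings (YYYY-MM-DD), so plain string comparison
    orders them correctly.
    """
    with conn:  # one transaction: the copy and the purge succeed or fail together
        cur = conn.execute(
            "INSERT INTO warehouse_orders (order_id, order_date, amount) "
            "SELECT order_id, order_date, amount FROM operational_orders "
            "WHERE order_date < ?",
            (cutoff_date,),
        )
        moved = cur.rowcount
        conn.execute(
            "DELETE FROM operational_orders WHERE order_date < ?",
            (cutoff_date,),
        )
    return moved


# Hypothetical schema: both tables share the same layout.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE operational_orders "
    "(order_id INTEGER PRIMARY KEY, order_date TEXT, amount REAL)"
)
conn.execute(
    "CREATE TABLE warehouse_orders "
    "(order_id INTEGER PRIMARY KEY, order_date TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO operational_orders VALUES (?, ?, ?)",
    [(1, "2023-01-15", 120.0), (2, "2023-02-20", 75.5), (3, "2024-03-01", 40.0)],
)
conn.commit()

# Periodic batch: history moves to the warehouse, the operational table shrinks.
moved = archive_batch(conn, "2024-01-01")
operational_left = conn.execute("SELECT COUNT(*) FROM operational_orders").fetchone()[0]
warehouse_total = conn.execute("SELECT COUNT(*) FROM warehouse_orders").fetchone()[0]
```

Each run of the batch leaves the operational table small while the warehouse table only grows, which is the pattern described above.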
A DW can quickly become a quagmire if it's not designed, implemented and maintained properly. Following are the seven principles of data warehousing that help keep a DW design and implementation on the road to achieving your desired results.
Data Warehouse Business Principles
- Organizational Consensus :
From the outset of the data warehousing effort, there should be a consensus-building process that helps guide the planning, design and implementation process. If your knowledge workers and managers see the DW as an unnecessary intrusion - or worse, a threatening intrusion - into their jobs, they won't like it and won't use it. Make every effort to gain acceptance for, and minimize resistance to, the DW. If you involve the stakeholders early in the process, they're much more likely to embrace the DW, use it and, hopefully, champion it to the rest of the company.
- Data Integrity :
The brass ring of data warehousing - of any business intelligence (BI) project - is a single version of the truth about organizational data. The path to this brass ring begins with achieving data integrity in your DW. Therefore, any design for your DW should begin by minimizing the chances for data replication and inconsistency. It should also promote data integration and standardization. Any reasonable methodology you choose to achieve data integrity should work, as long as you implement the methodology effectively with the end result in mind.
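As one illustration of what standardization and duplicate control can look like before data is loaded, here is a minimal Python sketch. The field names (customer_id, country), the alias table and the sample rows are all hypothetical. It shows one possible approach under those assumptions and is not a prescribed methodology.

```python
# Hypothetical alias table mapping free-text country names to one standard code.
COUNTRY_ALIASES = {
    "usa": "US",
    "united states": "US",
    "uk": "GB",
    "united kingdom": "GB",
}


def standardize(record):
    """Return a copy of the record with trimmed, consistently cased values."""
    country = record["country"].strip()
    return {
        "customer_id": record["customer_id"].strip().upper(),
        "country": COUNTRY_ALIASES.get(country.lower(), country.upper()),
    }


def integrate(records):
    """Standardize records, drop exact duplicates, and report conflicts.

    The first record seen for a customer_id is kept. A later record with
    the same key but different values is returned as a conflict instead
    of silently overwriting the first.
    """
    clean = {}
    conflicts = []
    for raw in records:
        rec = standardize(raw)
        key = rec["customer_id"]
        if key not in clean:
            clean[key] = rec
        elif clean[key] != rec:
            conflicts.append((clean[key], rec))
    return list(clean.values()), conflicts


# Rows as they might arrive from two source systems (made-up sample data).
source_rows = [
    {"customer_id": " c001", "country": "USA"},
    {"customer_id": "C001 ", "country": "United States"},  # duplicate after cleanup
    {"customer_id": "c002", "country": "uk"},
    {"customer_id": "C002", "country": "FR"},  # same key, different country
]
clean_rows, conflicts = integrate(source_rows)
```

Here the two C001 rows collapse into one record, while the disagreement over C002 is surfaced for someone to resolve rather than loaded as a second version of the truth.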
- Implementation Efficiency :
To help meet the needs of your company as early as possible and minimize project costs, the DW design should be straightforward and efficient to implement. This is truly a fundamental design issue. You can design a technically elegant DW, but if that design is difficult to understand or implement or doesn't meet user needs, your DW project will be mired in difficulty and cost overruns almost from the start. Opt for simplicity in your design plans and choose (to the most practical extent) function over beautiful form. This choice will help you stay within budgetary constraints, and it will go a long way toward meeting user needs effectively.
- User Friendliness :
User friendliness and ease of use issues, though they are addressed by the technical people, are really business issues. Why? Because, again, if the end business users don't like the DW or if they find it difficult to use, they won't use it, and all your work will be for naught. To help achieve a user-friendly design, the DW should leverage a common front-end across the company - based on user roles and security levels, of course. It should also be intuitive enough to have a minimal learning curve for most users. Of course, there will be exceptions, but your rule of thumb should be that even the least technical users will find the interface reasonably intuitive.
- Operational Efficiency :
This principle is really a corollary to the principle of implementation efficiency. Once implemented, the data warehouse should be easy to support and should facilitate rapid responses to business change requests. Errors and exceptions should also be easy to remedy, and support costs should be moderate over the life of the DW.
The reason I say that this principle is a corollary to the implementation efficiency principle is that operational efficiency can be achieved only with a DW design that is easy to implement and maintain. Again, a technically elegant solution might be beautiful, but a practical, easy-to-maintain solution can yield better results in the long run.
Data Warehouse IT Principles
- Scalability :
Scalability is often a big problem with DW design. The solution is to build in scalability from the start. Choose toolsets and platforms that support future expansions of data volumes and types as well as changing business requirements. It's also a good idea to look at toolsets and platforms that support integration of, and reporting on, unstructured content and document repositories.
- Compliance with IT Standards :
Perhaps the most important IT principle to keep in mind is to not reinvent the wheel when you build your DW. That is, the toolsets and platforms you choose to implement your DW should conform to and leverage existing IT standards. You also want, as much as possible, to leverage existing skill sets of IT and business users. In a way, this is a corollary of the user friendliness principle. The more your users know going in, the easier they'll find the DW to use once they see it.
Following these principles won't guarantee you will always achieve your desired results in designing and implementing your DW. Beware of any vendors that tell you it's a slam-dunk if you follow their methodology. There will almost always be problems that seem intractable at first - and may eventually prove to be so. Nevertheless, if you build your DW following these seven principles, you should be in a better position to recognize and address potential problems before they turn into project killers.