Self Service BI using Excel 2010 + PowerPivot

Excel has always been used by end users as a self-service BI analysis tool. Its biggest limitations are the size of the data it can store and the complexity of the BI analysis it can perform. When end users hit those limits, they need help from the IT department. This ultimately leads to the typical cycle in which IT must assign staff to analyze the requirements with the user, then write technical specifications, and finally build an application.

The good news is that with Excel 2010 you can download the free PowerPivot add-in, which removes the Excel limitations mentioned above. You can then start to enjoy an excellent tool for Self-Service Business Intelligence.

What is PowerPivot?

PowerPivot for Excel takes Self-Service BI to an unprecedented level. It exploits the familiarity of Excel while adding a powerful BI analysis engine, along with new data-compression algorithms that let you load large data sets into memory. Technically, PowerPivot is the implementation of VertiPaq, since renamed xVelocity, defined as an In-Memory Analytics Engine.

With PowerPivot, users can process huge amounts of data at remarkable speed. Using Data Analysis Expressions (DAX), a new formula language that complements the standard features of Excel, advanced users can easily create more complex applications based on relationships between tables from different origins, including databases and other sources such as Excel spreadsheets you maintain on your own computer. In many cases PowerPivot for Excel can establish relationships between tables automatically; when it cannot, you can easily set up the relationships yourself.
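To give a flavor of the formula language, here is what a couple of DAX measures might look like; the table and column names (Sales, Amount, ProductCategory) are hypothetical, for illustration only:

```dax
-- A basic measure: total of a numeric column in a fact table
Total Sales := SUM ( Sales[Amount] )

-- A measure that filters through a table relationship:
-- the same total, restricted to a single product category
Bike Sales := CALCULATE ( [Total Sales], ProductCategory[Name] = "Bikes" )
```

SUM and CALCULATE are standard DAX functions; the point is that, unlike plain Excel formulas, DAX expressions operate over whole tables and columns and can follow the relationships between them.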

Praxis

To illustrate all this, the following is a case study tested with a database that includes a table of 10,000,000 transactions (TxMaster) and three related tables: Product Categories (ProdCat), Products (Prods), and Locations (Stores). All of these tables are maintained in a SQL Server database.

Let us start with the end result, achieved after establishing a connection to a SQL Server database to load the data, setting up the table relationships, and finally assembling the analysis report, charts, and filters (Slicers).

Final Result : Excel Dashboard

If we look at what is new in the Excel Ribbon, we can see the PowerPivot tab, with the PowerPivot Window item as the first option on the left. Clicking it switches you to the new PowerPivot window, which is where all the handling of the data to be used by the Excel spreadsheet, via a PivotTable, actually takes place.

In the next picture we have selected Data View, in particular the TxMaster table with 10 million transactions (loaded into memory on, in this case, a modest notebook with an Intel Core Duo and 4 GB of RAM). Using the tabs above, you can navigate through the rest of the related tables.

PowerPivot Window – TxMaster Table

Note that you can add columns to any loaded table: just click under Add Column and enter the formula. The formulas are similar to Excel formulas, except that they reference field names in the tables rather than individual cells. Additionally, you can use the new DAX language for more specific calculations. We will look at DAX in more detail in a future post.
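As an illustration, and sticking with the tables of this case study, a calculated column on TxMaster could combine a field of the transactions table with a field fetched from the related Prods table; the Quantity and UnitPrice column names below are assumptions, not actual fields of the sample database:

```dax
-- Hypothetical calculated column on TxMaster: a line amount
-- computed as the local Quantity times the UnitPrice looked up
-- in the related Prods table through the DAX RELATED() function
= TxMaster[Quantity] * RELATED ( Prods[UnitPrice] )
```

Note how the formula references columns by table and field name, never by cell address.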

Below is the Products table.

PowerPivot Window – Products Table

If you click on Diagram View, you will see a diagram of the relationships between the tables we are using.

PowerPivot Window – Diagram View

It is interesting to note that you can do a lot in this diagram: for example, create hierarchies, as in the Category / Product case. Among other things, you can also define KPIs, and, most importantly, in the Get External Data section you can add data from virtually anywhere (SQL Server, Oracle, DB2, and so on).

If we return to the dashboard: it was created using a PivotTable applied to the data defined in the PowerPivot window.

In this dashboard there are four important elements to highlight. The first is the PowerPivot Field List, i.e., the set of measures and dimensions for analyzing our cube, which is displayed automatically when you create the PivotTable. We will not go into the details of assembling the table and chart analysis, but it is achieved simply by dragging and dropping fields from the tables onto the Row Labels, Column Labels, and Σ Values sections.

Regarding the Excel file: it can take up a lot of space, since all the data in your tables is stored in the file. However, PowerPivot achieves a very good data-compression ratio; in this case the final size of the Excel file was approximately 62% of the space occupied by the tables in SQL Server.

It is important to note that nothing defined in PowerPivot is lost when you need to refresh your information, that is, when you reload the data sources to incorporate new records or updates to the tables. PowerPivot will apply the relationships already defined, recalculate all the calculated columns, and update all the information for a fresh analysis.

Well, to finish, this is just a sample of what you can do today with PowerPivot for Excel, without going into further detail. As you can see, PowerPivot is a real opportunity to enter the world of Self-Service Business Intelligence (SSBI).

SSBI – Self-Service Business Intelligence.

You could call it a Data Analytics approach that allows users to access and work with corporate information, performing their own queries and reports without depending on the IT department.

The conflict between IT departments and end users is well known, primarily around the reporting requirements that end users submit to IT and the slow response they receive, which leads to long waiting lists to satisfy those requirements. For a long time end users have resorted to tools that let them meet at least some of their needs in a self-sufficient fashion. In this sense, the widespread use of Excel is the clearest example of how many users have worked around the slow response, or lack of response, from the IT department.

This conflict has been characterized by the term Shadow IT, or Shadow Systems: non-IT departments begin to produce their own solutions, supported by end-user-oriented tools, and in some cases end up building systems "in the shadows", outside the control and management of the IT department.

We can see this conflict in the area of Business Intelligence as well, clearly intensified by the times we live in, where requests for information and analysis change constantly and the information is needed now, because tomorrow may be too late to make an important business decision.

Throughout the evolution of BI systems there have always been end-user tools for creating analysis reports. However, the continuous change we face means that the original DWHs and Data Marts can no longer meet the current demand for information analysis, which is based not only on the data stored in those DWHs and Data Marts but also on other kinds of external data and/or data subject to continuous change.

This is how the SSBI approach and trend was born. The general idea behind the many tools available on the market is that both the IT department and end users can benefit from the technology.

Thus, a good SSBI solution should offer easy-to-use features that let end users create reports and chart analyses over corporate data sources as well as ad-hoc sources: external data and/or data with very particular characteristics.

In such a scenario, the IT department focuses on maintaining good infrastructure and access to its DWH, Data Marts, and OLAP cubes, allowing end users to develop their reports with data controlled by IT together with external data handled by the user. At the same time, IT can provide facilities to maintain a centralized repository of user-generated reports and keep security and confidentiality under control.

No less important in this approach is IT's collaboration in educating end users about the content and structure of the DWH, Data Marts, and OLAP cubes, so that users are clear about what information is available and the levels of detail that can be analyzed.

Thus, an SSBI solution seems to be a happy world for IT and end users alike. IT does not feel that user-created reports are Shadow Systems, since they depend on databases maintained by IT, and the long list of reporting requirements shrinks; end users, with the SSBI tool, can iterate as many times as needed to get exactly what they want, with no requirements misunderstood by IT along the way. Of course, IT will have to monitor its databases closely so that no user-created query degrades services, and it can help by creating new database objects that allow the more complex queries to run efficiently.

Finally, nothing more to add except that today's trend toward adopting the SSBI approach is illustrated by the interest of vendors and the wide range of products on offer: for example, Microsoft (PowerPivot), SAP (BusinessObjects Explorer), IBM (Cognos Express), and Information Builders (WebFOCUS Visual Discovery), with Targit, MicroStrategy, and SAS Institute representing the segment of pure BI vendors.

According to the HP Alpha End-of-Life Roadmap, support for Tru64 UNIX on AlphaServer stops in 2012

Most likely your systems on AlphaServer boxes are running quietly without significant operational support. However, that situation will change in the future without official HP support and without spare parts. Add to this that Tru64 UNIX specialists are increasingly hard to find.

Perhaps the main problem with Tru64 UNIX AlphaServer systems is the risk of hardware and configuration failures due to the age of the equipment. In the coming years, spare parts for AlphaServers will be a real problem for maintaining normal availability levels on systems running this technology, because buying them will be difficult and expensive.

Are you prepared to continue using Tru64 UNIX AlphaServer today and in the near future?

 

The recommendation is to have a clear roadmap for the continuity of systems running under Tru64 UNIX on AlphaServer. The roadmap will depend on each particular case, but the most typical solutions are migrating to another platform (Linux, for example) or running your systems on AlphaServer emulation software. Both solutions should use modern hardware such as Intel x64. In either case, an analysis of the pros and cons of each solution must be put on the table.

It would be interesting to know some aspects of how Tru64 UNIX AlphaServer is used in your company. Important points to take into account are:

  • Main server purpose: application server, web server, database server
  • Programming languages
  • Databases
  • Installation date
  • Replacement plans: migration to Linux, emulation, or simple retirement
  • HW support: contract with a 3rd party for spare parts and services? Own spare parts?
  • SW support: in-house or contracted with a 3rd party