A Database That Responds Like a Spreadsheet
Our team has spent years analyzing data and complaining (mostly to each other) about the limited tools available to us. Along the way, we built a couple of spend analysis platforms that helped to address some of our issues. In 2001, ExpenseMap pioneered dashboard-wide multi-filtering and a speedy in-memory OLAP database. In 2005, BIQ introduced in-system DIY familying and mapping.
With Spendata, we’ve cracked the code on the rest of what’s bothered us. To paraphrase James Carville, “It’s the analysis, stupid.” Finding out what the data means has always involved dumping subsets of it elsewhere, typically to a database or spreadsheet. But now, inside Spendata, we can quickly build models at database scale that automatically reconfigure and recalculate themselves as we alter filters, mappings, and data. We get real-time responsiveness – re-derives, re-mappings, and so on – at a mouse click level.
Our users are busy leveraging these capabilities. They do RFP analysis, Scope 3 calculations, pre- and post-merger integration, sales analysis, sourcing project modeling, savings tracking, and more. That’s because Spendata is a database that responds like a spreadsheet.
Why is this important? To answer that, we need to look for the reasons why business analysts remain reliant on spreadsheets, despite the problems they create. Everyone agrees that the lack of rigor and auditability around spreadsheets promotes errors, prevents accountability, and lowers the quality of data analysis. Spreadsheets are too slow once data volumes exceed a few hundred thousand transactions, and they’re entirely useless on datasets with more than a million transactions. [Every data transformation in Spendata is auditable and repeatable, and Spendata remains fast and efficient with millions of transactions.]
The reason that analysts are addicted to spreadsheets is simple. Spreadsheet models gain their analytical power from dependencies – that is, a change in one area of a model will propagate smoothly and predictably through the rest of the model, causing a waterfall of changes. One can create new columns as functions of others, and use those columns as indexes to look up values, create pivot tables, and so on. [Spendata works similarly: users build relationships and transformations as they go, and the resulting complex dependencies are maintained and updated automatically.]
When spreadsheets can’t handle the data volume, databases are the last resort. But databases don’t maintain dependencies, they are slow to respond to queries on transactional data, and they require specialized DBA and SQL skills. That’s why point-and-click BI tools with built-in OLAP databases have proliferated. [Spendata’s UI provides BI-type functionality, along with filtering and display options unavailable in BI tools.]
A BI tool is also not a spreadsheet substitute, because BI tools neither maintain dependencies nor offer any solution to the fixed schemas and inflexibility of their underlying databases. Worse, the pre-computed dimensions of a BI tool must be recalculated when anything changes, an offline process that can take hours. This means a BI system is effectively read-only, and queries can be performed only on predefined dimensions built by the owner of the database. [In Spendata, the schema is dynamic and maintained automatically. Users can make changes to mappings, create new dimensions, build new computations, and see the results of their efforts in seconds.]
The bottom line: analysts now have a tool with
- spreadsheet-equivalent automatic dependency maintenance
- database-like controls and auditability
- BI tool equivalent functionality, and
- the ability to alter mappings, transformations, derivations, and input data in real time.
Spendata’s mission is to continue to find new ways to make data analysis easier. Just last year we introduced View-based measures, where we take rolled up totals at the dimension level and fold them back into the dataset as a new measure. That’s like extracting rolled-up data into a file, then linking it back in – except we do it dynamically, so that real-time filtering on the dimension causes real-time changes in the measure (and all of its dependent Views, derivations, and calculations).