Details of Recent Projects
Reverse Engineering of Existing Data Warehouse
This client, a chain of bakery café's, was approaching the end of their contract with their existing data warehouse provider and was not renewing the contract. They needed to build a new data warehouse in Azure in order to replace all the reports and dashboards, using PowerBI, prior to losing access to the existing system, but their only in-house expert in that system had left a few weeks before.
- Reviewed existing data warehouse model and ETL processes
- Provided complete documentation of data processing from source to target, including all lookups, conversions and calculations, to allow client staff to create matching processes in Azure.
- Reviewed and advised on implementation of ETL processes in Azure, including identifying multiple discrepancies and errors in the new implementation.
- Retrieved the entire catalog of reports and dashboards from the old system and instructed client staff on how they worked, so they could be recreated in Power BI
- Also researched several data problems in the data warehouse and identified fixes for those in the extraction engine supplying data to the old data warehouse, so that the new data warehouse would not have the same problems.
- New data warehouse was ready to go on time, in less than a month.
Student Performance Data Pipeline & Dashboards
This client provides training courses, group coaching, and certification exams intended primarily for teams of employees, and needs to accumulate data from two different web systems into a data warehouse and provide a variety of dashboards for the students & their management, as well as for internal use by the client for sales and customer success. Prior to building the data pipeline, we worked with the client to analyze a number of options for obtaining the data from Thinkific directly or via Segment and storing, managing, and reporting on the data, including: Azure SQL Data Warehouse, Amazon RedShift, Azure Functions, Amazon Lambda Functions, Klipfolio, Airtable, Looker, and others.
- Data Pipeline is a Python program running as a timer trigger in Azure Functions, running hourly to use REST APIs to pull data from the learning management system (Thinkific) and the certification exam system (Brillium) and load it into an Azure SQL Server
- An Azure Function running Python on an HTTP trigger manages just-in-time user provisioning of reporting users in the reporting tool (Klipfolio) and mediate single-sign-on so the users can view the reports embedded in a custom page within Thinkific. This function also logs users logging in to the reporting tool to the data warehouse for usage analysis.
- Another HTTP triggered Azure Function accepts webhook events from Thinkific to record user sign-ins to the LMS to the SQL database for usage analysis.
- Reports and dashboards are built in Klipfolio, with dashboard access security based on group membership and data security embedded in the SQL queries for the reports.
Multi-Lingual Fleet Tracking Data Warehouse
This client, located in Belgium, uses sensors and mobile devices to monitor fleets of vehicles for vehicle performance, driver behavior, vehicle event & alarm monitoring, telephone usage, and activity & task completion. They needed to build an analytics & reporting platform that could be embedded in their existing user portal for fleet managers. They also needed to be able to get the project off the ground fairly quickly while attempting to limit repeating work as they change database platforms and hosting models in the future. In addition, they needed to start off supporting English and French, translating both UI elements and some data components, with the plan of eventually adding support for another 6 languages with a minimum of redevelopment.
- Collaborated with the client to design a data delivery mechanism that delivers incremental data with geo-coding to text files on Amazon S3 where it can be picked up by Birst's S3 application connector.
- Designed and implemented a standard star schema data warehouse with minimal ETL in Birst.
- Designed models for delivering translation of UI elements (column names, dashboard titles, etc.) and certain types of data elements (event types, vehicle types, etc. that are translated in the client's app)
- Implemented several dashboards, each with several reports, to cover the client's release 1 report requirements
- Designed and documented Birst development lifecycle
- Implemented V1.1 dashboard updates
- Preparing to deploy V1.2 with new data sources and additional reports
- Working on transition to Birst appliance with custom Snowflake data warehouse
Major Data Warehouse Replacement/Reimplementation
Our client for this project is an American beverage manufacturing company, located in Connecticut & Ohio, who had a poorly performing, unwieldy Birst data warehouse and needed to make significant improvements. The existing Birst environment was loaded nightly from an existing data warehouse that received data from three sales/ERP systems and took 3-4 hours per night to process a full load of data from those systems before delivering a subset of the data to Birst, which then took hours more to process it. On top of that, the data model forced the generation of extremely complex queries that were difficult to maintain and performed very poorly - one report, generated daily, routinely took 90 minutes to render, so could only be delivered by email. On top of that, there were significant questions about data quality because there were frequently differences between the reports and accounting's information.
- Worked with the client to design a consistent data extract format so that all three ERPs would deliver the data as similarly as possible; the data is delivered incrementally to a central location at the client site where another process converts it from XML to CSV for optimal delivery to Birst via web service APIs.
- Designed and implemented ETL within Birst to further conform the data so that each business unit's data can always be queried exactly the same way as each other business unit, which will substantially streamline query creation, maintenance, and performance.
- Added data sources for freight cost estimating for undelivered orders and freight invoice acquisition for delivered orders, which were not available in the old system, including splitting freight amounts among orders that were shipped together by weight.
- Created a budget data model that can accommodate alternate forecasts, to make it easy for them to add that data in the future.
The new system is so much less complex and works so much faster that the client's new Birst team were able to rebuild all their original reports almost entirely on their own. End-to-end data processing on a nightly basis now takes less than half an hour, with the exact cycle time depending on the amount of data; the client is considering starting to run hourly incremental updates throughout the business day. The report that used to take 90 minutes now takes 60-70 seconds, so they are able to use it in a dashboard (we may be able to optimize that further, if they decide it is worthwhile). QA of the data has shown that it is 100% accurate both compared against the source system and the accounting information.
We also mentored the client on management of the old and new Birst spaces, report development, testing & troubleshooting, and future model enhancements.
Mentoring, Troubleshooting and Other Guidance
A number of clients engage with us to provide mentoring, troubleshooting and other guidance to their staff who are involved with Birst, including:
- ETL performance analysis - review ETL processing that performs more slowly than desired and make recommendations for improvement ranging from minor ETL code optimizations to major delivery & model changes
- Query performance analysis - review query performance either for specific slow reports & dashboards or for entire Birst dashboard systems and make recommendations for improvement, ranging from minor tweaks to queries, optimization of queries for cache usage, to significant model changes
- Data troubleshooting - assist client staff with determining the source of problems in their data and provide guidance on how to correct the problem as efficiently as possible
- Visualization suggestions - provide guidance on effective ways to present information for easy consumption by report users, as well as on ways to create visualizations to substitute for chart types Birst does not support.
- Data model guidance- provide guidance and assistance in creating data models in a variety of systems including SQL Server, Birst Data Warehousing and Birst Live Access
Long-Term Birst Architect for Fleet Management Device Provider
The client is a large, US-based fleet tracking and management service provider who were using Birst Live Access to provide a logical data model for reporting against a heavily-used transactional database. Performance under this configuration was extremely poor, and they were considering dropping Birst despite their investment in it. During our initial engagement, we discussed some ideas of ways to optimize the Live Access model and the underlying database, but quickly concluded that the best course was to move to loading the most relevant data into a Birst data model. We provided extensive guidance on designing the Birst data model, optimizing the extraction and delivery of data, managing the data volumes of their many customers, suggested a number of important improvements to the Birst system that were implemented for this client, and acted as a liaison to the Birst support and product teams.
Over the 5 years since then, we have continued to provide additional guidance and mentoring to the client's staff, helping them become more comfortable with Birst, reviewing their planned data model changes and additions, designing web-based ETL and deployment management software, troubleshooting data and software problems, etc. In addition, we took part in two annual business intelligence summits held by the client and some of their sister companies; one year, to help the sister companies understand how best to plan fitting Birst into their own environments and, the next, providing 12 hours of custom Birst training sessions to the client and several of their sister companies.
Analytics Roadmap Planning
This client was a startup that needed to implement an analytics platform and develop an analytics roadmap to include in their engineering plans. We helped them evaluate a number of options for collecting their data from various systems and delivering it all to Birst in a manageable fashion, what data sets to prioritize and which to leave for later, and how to plan the Birst model in order to ensure easy addition of new data sets later. We also built their initial data models and dashboards in Birst.
HR Equal Employment Opportunity Reporting Reimplementation
This client is a multinational food-manufacturing company with over 30 thousand employees worldwide, based in Michigan. They have a mandate to report on Equal Employment Opportunity compliance, for which they were already using Birst. The existing Birst data warehouse implementation was not working well for them, because of a couple of model design problems. The new Birst implementation substantially reduced the ETL processing time, model & query complexity, and query execution time, resulting in new reports that were more accurate, easier to validate & maintain, and faster.
HR Equal Employment Opportunity Reporting Reimplementation II
When the client was migrating from one HR system to another, they needed to rebuild their Birst Equal Employment Opportunity compliance reporting system to accommodate the new data formats and asked Birst to have us, specifically, to do the work. Initially, we agreed that we would just format the extracted data from the new system so that it could be loaded into the existing Birst space, but when they realized that they wanted to remove some components and change all the column names to match the new system, we concluded that it was best to build a new system mirroring the previous with new names. The transition to the new environment went very well, with minimal impact on end users.