With this new data architecture in place, Gu’s team has been able to make an enormous amount of progress in a very short time: “In one year we onboarded Snowflake, automated all the deployments, migrated all of our Power BI reports from Postgres to Snowflake, migrated over 160 reports, 200 different DAGs over from Airflow to Dagster, replicating 66 databases from Postgres onto Snowflake,” he says. “Nearly 4,000 table feeds were being loaded every single day. Almost four terabytes of data, uncompressed, has been migrated to Snowflake to date, in less than a year with just a team of five, including myself.”
Gu estimates that had he taken a code-first approach or used legacy transformation solutions, he would have needed a team five times the size, and the work would have taken twice as long.
“Overall, I’ve always found Coalesce to be about 10 times more productive,” he says. “It was very easy to add columns, introduce columns, refactor some of the tables, redeploy, rerun. Coalesce makes it a lot simpler and faster for even making one change cycle. Part of it is column-level lineage, and part of it is the fact you have highly templated development from a standards perspective. You can have a more nimble, lightweight team to do the same amount of work because you’re more productive, while at the same time maintaining higher levels of standards.”
Together, these changes have brought about a fundamental, positive shift in the company. “About a month ago we had an initiative to optimize the productivity organization,” Gu recalls, “so we sat down with the users and identified the key outcomes and business processes, got the data points, created the data models, and deployed the insight. It took about two days from idea to insight. This is unheard of in our organization. This is something that usually takes three months, so to reduce this to just two days is a whole new paradigm.”
As for his future plans, Gu’s goal is to extend the platform services his team has built to many different groups across the organization. He hopes this will allow them to become more data-centric without having to rebuild the end-to-end platform and integrations from scratch: “Now other groups within the Group 1001 organization can start using what we’ve developed to build their own data models.”