I typically do not get excited when I open up the BigQuery UI, but today we got a late Christmas present. Google Cloud has rolled out some new features for the BigQuery UI. There are layout changes, enhancements to existing features, and altogether new features that should help change the way we use BigQuery.

First, to get to the new UI, make your way over to the BigQuery console and select ‘SHOW PREVIEW FEATURES’ in the top navigation bar. …


Photo by Shane Hauser on Unsplash

As a Snowflake fan, I love it when Snowflake-based projects come down the pipe. They are much simpler than SQL Server projects: I do not need to jump through hoops to get started, such as downloading SSMS or Data Studio, getting provisioned, etc. There are great native tools like the Python client library. And probably the best thing is that it is just plain fast.

However, one of my least favorite things about Snowflake has always been the UI. …


It may not be the leaderboard at Augusta, but the #PanderaFitnessClub is all about having a leaderboard. In the past few posts I’ve gone through various topics leading up to this, the capstone of my series on building a Fitness Leaderboard. In my last three posts I covered how I went about:

Designing the Architecture I would need
Implementing a Data Vault in BigQuery
Building of the Pipelines

In this article I will focus on building out a Data Studio dashboard that I can actually serve to the members of the #PanderaFitnessClub community. To circle back on the…


In my previous posts, I went over how we went about implementing a fitness program at Pandera, the need for a custom solution to handle a little friendly competition, a supporting architecture, and how I would structure a Data Vault.

Designing a Fitness Leaderboard in GCP

Implementing a Data Vault in BigQuery

Now though, we’re building the pipeline. This will take the Strava data and get it into BigQuery. As a reference point, here is the architecture I outlined in my first post:

Google Cloud Architecture

I previously mentioned my main focus in discussing the code involved in this application is going to…


In my last post I went over how we went about implementing a fitness program at Pandera, the need for a custom solution to handle a little friendly competition, and an architecture that would support that.

Designing a Fitness Leaderboard in GCP

In this post I want to focus on what a Data Vault model is and what it looks like in BigQuery. I am covering this first instead of the data pipeline as it really shapes the way I handle the messages and events that come in.

First, I want to start off by saying Data Vault is extremely…


Team Muddern Data Architects at Tough Mudder — 2019

With the coronavirus shutting down everyone’s typical routine, we at Pandera have been trying to stay active, both individually and as part of our culture. That culture has always been about being there to support one another, striving for the best, and, of course, data. This is how I used some of my free time to support that culture.

About a year ago I started playing around with the idea of doing a multi-sport event, specifically a duathlon. I wanted a challenge and an excuse to get back on my neglected road bike. However, one of…


In my last post I wrote about building a data pipeline to land leads data from noCRM inside BigQuery, but I really didn’t get much into BigQuery itself.

To catch up on my last post check it out here:

BigQuery is an ANSI SQL-based managed data warehouse solution by Google. Since it’s managed, you do not need to worry about scaling, and you only pay for what you use. It also has the benefit of being loaded with additional features like BigQuery ML, which I’ll get into later.

I built my tables using the gcloud SDK command and referencing a…


I started my career in a world where data moved via ETL. Getting started was not easy because the barrier to entry was so high: licensing costs, configuring an environment, and not a whole lot of accessible training.

Today, however, I lead the Data Services Practice at Pandera and am surrounded by a great number of resources and opportunities to continue learning how to do things with data, both in the traditional ETL space and in cloud platforms that are redefining data movement.

In my last post I talked about data quality, and the…


Typically when we engage with customers, I like to reflect on what they are asking us to do. It helps set the trend for topics I engage in with new customers, and keeps me up-to-date. Twice this week alone, the topics of Data Quality and Data Governance have come up.

Reflection on those topics came while digging through some data for a new forecasting and capacity planning tool we are working on in-house. We are a consulting company, so the majority of our revenue stream is through statements of work where we are staffing resources on projects — and anything…

Daniel Zagales
