What will you find in this article?

  • Some context and history on how and why we are shifting to the cloud paradigm
  • A complete pattern example of how to migrate (or create from scratch) your PySpark jobs to GCP with Dataproc workflow templates (the same logic applies to Spark and Hadoop migrations, and further references will be given)
  • A GitHub repo that you can copy and adapt for your own migration

Who might find this article useful?

  • Anyone who wants to migrate their on-premise Spark/Hadoop infrastructure to GCP, or who just wants to implement their Spark/Hadoop workflows on GCP with Dataproc workflow templates
  • Anyone curious

Introduction: a bit of history & context

Ali Ghodsi, co-founder of Databricks


Introduction:

Google BigQuery is a great piece of technology that solves many of today’s big data challenges. It abstracts away the pain of storage, so you no longer have to think about how big your dataset is in order to store it, and of compute, so you no longer have to think about how to distribute operations across multiple nodes. Billing is still up to you.

Still, sometimes you want to process your dataset, or a subset of it, on your local machine, and in order to do that, at least to the best of my knowledge in Python…


What is multi-tenancy?

Briefly, multi-tenancy is an architecture that allows one instance of a piece of software to serve multiple clients (called tenants). One of its advantages is cost saving: suppose you have a licence for software you use (MS SQL Server, for example); it would be nice to handle all your clients in one instance while paying for just one licence.
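To make the idea concrete, here is a minimal sketch in Python, not a production design. It assumes a purely in-memory store; the class name `MultiTenantStore` and the tenant ids are hypothetical, chosen only for illustration. A single object plays the role of the "one instance", and each tenant gets its own namespace inside it.

```python
class MultiTenantStore:
    """One shared instance that keeps each tenant's records in its own namespace."""

    def __init__(self):
        # Maps tenant_id -> {key: value}. Hypothetical in-memory layout.
        self._data = {}

    def put(self, tenant_id, key, value):
        # Create the tenant's namespace on first write, then store the value there.
        self._data.setdefault(tenant_id, {})[key] = value

    def get(self, tenant_id, key):
        # Look only inside the caller's namespace; unknown tenants or keys give None.
        return self._data.get(tenant_id, {}).get(key)


# A single instance serves every tenant.
store = MultiTenantStore()
store.put("acme", "plan", "premium")
store.put("globex", "plan", "free")
```

Both tenants use the same key, `"plan"`, yet `store.get("acme", "plan")` and `store.get("globex", "plan")` each return only that tenant's own value, because every lookup is scoped by tenant id.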

Where might you need multi-tenancy?

The basic use case is when you are building a SaaS app: you want to serve multiple clients (tenants) while keeping their data isolated. There are many ways to achieve a multi-tenant architecture; describing those ways is outside of…

Senhaji Rhazi hamza

Full-stack DevOps based in Paris, Python advocate. Let’s connect on LinkedIn: https://www.linkedin.com/in/hamza-senhaji-rhazi-72170678/
