Big Data Test Infrastructure (BDTI)

Mage is an open-source tool designed as a modern alternative to Airflow for building, running, and managing data pipelines. It facilitates the integration and transformation of data by allowing users to develop pipelines in Python, SQL, or R.

It emphasises ease of use, efficient pipeline development and testing, and scalability without requiring a large dedicated team. It supports deployment on the major cloud platforms (AWS, GCP, Azure) and offers features such as real-time and batch processing, interactive code previews, and comprehensive observability for operational management.
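
As a rough illustration of what developing a pipeline in Python can look like, the sketch below defines a loader block and a transformer block. The decorator import follows the pattern used in Mage's generated block templates; the function names, the sample records, and the fallback decorators (used only when mage_ai is not installed) are hypothetical and chosen for illustration.

```python
# Sketch of two Mage-style Python blocks: a data loader and a transformer.
try:
    # Inside Mage, these decorators mark a function as a pipeline block.
    from mage_ai.data_preparation.decorators import data_loader, transformer
except ImportError:
    # Assumption: fallback so the sketch also runs without mage_ai installed.
    # The stand-in decorators simply return the function unchanged.
    def data_loader(func):
        return func

    def transformer(func):
        return func


@data_loader
def load_orders(*args, **kwargs):
    """Return a few hypothetical order records (stand-in for a real source)."""
    return [
        {"order_id": 1, "amount": 120.0, "country": "BE"},
        {"order_id": 2, "amount": 80.5, "country": "FR"},
        {"order_id": 3, "amount": 42.0, "country": "BE"},
    ]


@transformer
def total_by_country(orders, *args, **kwargs):
    """Aggregate the order amounts per country."""
    totals = {}
    for order in orders:
        country = order["country"]
        totals[country] = totals.get(country, 0.0) + order["amount"]
    return totals


if __name__ == "__main__":
    # Outside Mage, the blocks can be chained by hand to try the logic.
    print(total_by_country(load_orders()))
```

In a Mage project each block would normally live in its own file and be chained through the pipeline editor; calling the functions directly, as above, is only a convenient way to check the logic locally.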

Mage AI

 

Use Cases

  • Scenario 1: You are trying out Mage for the first time. Launching Mage with the project name myproject creates a folder of the same name and a user management database containing a default owner user. After logging in as the default owner, update the initial credentials promptly (see section below). Additional Mage users can then be managed through the user interface, and all data is tied to the myproject project.
  • Scenario 2: You want to enhance the capabilities of an existing Mage deployment associated with myproject. Stop the current instance and start a new one with the upgraded settings. If you use the same project name, the new deployment automatically connects to your existing project data, so you log in with the credentials that were previously set up.
  • Scenario 3: You want to create a new Mage deployment associated with a new project. Creating a new project, such as myproject2, triggers the same setup process as in Scenario 1, and initial access requires the default owner's credentials.
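
The behaviour described in the three scenarios can be summarised in a small simulation. This is not Mage's or BDTI's API: the launch function, the users.txt file standing in for the user management database, and the DEFAULT_OWNER value are all hypothetical. The sketch only models the rule that a known project name reuses existing data, while a new name creates a fresh project with a default owner.

```python
from dataclasses import dataclass
from pathlib import Path
import tempfile

DEFAULT_OWNER = "owner"  # hypothetical placeholder, not Mage's real default


@dataclass
class Deployment:
    project: str
    project_dir: Path
    created: bool  # True when the project folder was newly created
    users: list


def launch(base_dir: Path, project: str) -> Deployment:
    """Simulate starting a deployment for `project` under `base_dir`."""
    project_dir = base_dir / project
    users_file = project_dir / "users.txt"  # stand-in for the user database
    if project_dir.exists():
        # Scenario 2: same project name, so existing data and users are reused.
        users = users_file.read_text().split()
        return Deployment(project, project_dir, False, users)
    # Scenarios 1 and 3: new project name, fresh folder with a default owner.
    project_dir.mkdir(parents=True)
    users_file.write_text(DEFAULT_OWNER + "\n")
    return Deployment(project, project_dir, True, [DEFAULT_OWNER])


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        base = Path(tmp)
        first = launch(base, "myproject")    # Scenario 1: first launch
        again = launch(base, "myproject")    # Scenario 2: relaunch, same name
        other = launch(base, "myproject2")   # Scenario 3: new project
        print(first.created, again.created, other.created)
```

The point of the model is that the project name acts as the key: credentials changed in one deployment carry over to any later deployment that uses the same name.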

 

Libraries Reference

In addition to the base image, the following Mage add-ons are included:

Package                 Description
azure                   Related packages for Azure
clickhouse              Data import or export with ClickHouse
dbt                     Packages for dbt
google-cloud-storage    Data import or export with Google Cloud Storage
hdf5                    Processing of data in HDF5 file format
mysql                   Data import or export with MySQL
postgres                Data import or export with PostgreSQL
redshift                Data import or export with Redshift
s3                      Data import or export with S3
snowflake               Data import or export with Snowflake
spark                   Integration of Spark (EMR) in Mage pipelines
streaming               Streaming pipelines

 

Resources

The following links provide more information on Mage:

Mage Website

Mage Documentation

Mage Community

Mage Tutorial