Big Data Test Infrastructure (BDTI)

Mage AI

Mage AI is an open-source tool designed as a modern alternative to Airflow for building, running, and managing data pipelines. It facilitates data integration and transformation by letting users develop pipelines in Python, SQL, or R.

It emphasises ease of use, efficient pipeline development and testing, and scalability without requiring a large dedicated team. It can be deployed on the major cloud platforms (AWS, GCP, Azure) and offers features such as real-time and batch processing, interactive code previews, and comprehensive observability for operational management.


Use Cases

  • Scenario 1: You are trying out Mage AI for the first time. Launching Mage AI with the project name myproject creates a folder of that name and a user management database containing a default owner user. Once logged in as the default owner, update the initial credentials promptly (see section below). Further Mage AI users can then be managed through the user interface, and all data is tied to the myproject project.
  • Scenario 2: You want to enhance the capabilities of an existing Mage AI deployment that is associated with myproject. Stop the current instance and start a new one with the upgraded settings. If you use the same project name, the new deployment automatically connects to your existing project data, so you log in with the credentials that were set up previously.
  • Scenario 3: You want to create a new Mage AI deployment and associate it with a new project. Creating a new project, such as myproject2, starts the same setup process as in Scenario 1, so initial access requires the default owner's credentials.
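
The three scenarios come down to one rule: a project name that has not been used before triggers a fresh setup, while a reused name reattaches to the existing data. The sketch below is a toy model of that rule only, not Mage AI's own code. The function launch, the root directory, and the returned labels are hypothetical names introduced for illustration.

```python
from pathlib import Path
import tempfile


def launch(root: Path, project_name: str) -> str:
    """Toy model of a launch: report whether the project is new or existing.

    Hypothetical helper for illustration; it only mimics the folder-based
    behaviour described in the scenarios.
    """
    project_dir = root / project_name
    if project_dir.exists():
        # Scenario 2: same name, so existing data and credentials are reused.
        return "existing"
    # Scenarios 1 and 3: unseen name, so a new project folder is set up and
    # the first login would use the default owner's credentials.
    project_dir.mkdir(parents=True)
    return "new"


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    first = launch(root, "myproject")    # Scenario 1: first launch
    second = launch(root, "myproject")   # Scenario 2: relaunch, same name
    third = launch(root, "myproject2")   # Scenario 3: different project name
    print(first, second, third)
```

In this model only the project name decides which credentials apply: a relaunch under the same name expects the credentials set earlier, whereas any new name starts again from the default owner.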

 

Libraries Reference

In addition to the base image, the following Mage AI add-ons are included:

Package               Description
azure                 Related packages for Azure
clickhouse            Data import or export with ClickHouse
dbt                   Packages for dbt
google-cloud-storage  Data import or export with Google Cloud Storage
hdf5                  Processing of data in HDF5 file format
mysql                 Data import or export with MySQL
postgres              Data import or export with PostgreSQL
redshift              Data import or export with Redshift
s3                    Data import or export with S3
snowflake             Data import or export with Snowflake
spark                 Integration of Spark (EMR) in Mage pipelines
streaming             Streaming pipelines