Published on July 13, 2024
Data Engineering Zoomcamp | Week 2.5 ETL, API to GCP using Parquet with Partitioning through PyArrow. Pipeline in Mage
Tags: Mage, GCP, PyArrow, Parquet, Partitioning
Load data to GCP. We will leverage PyArrow to output the NY_TAXI dataset as Parquet files partitioned by date.

Published on July 12, 2024
Data Engineering Zoomcamp | Week 2.4 Configuring GCP for Mage
Tags: Mage, GCP, Docker
Set up Google Cloud components and configure Mage to read and write data using Google Cloud Storage and BigQuery.

Published on July 11, 2024
Data Engineering Zoomcamp | Week 2.3. ETL API to Postgres
Tags: Guide, Mage, Postgres, Python, Docker, Pandas
Loading data from an API in the form of a compressed CSV file and writing it to a local Postgres database.

Published on July 6, 2024
Data Engineering Zoomcamp | Week 2.2. Configuring Postgres in Mage
Tags: Guide, Mage, Postgres, Python, SQL, Docker
Configuring Postgres in preparation for I/O in Mage pipelines.

Published on July 5, 2024
Data Engineering Zoomcamp | Week 2.1. Mage Intro and Setup
Tags: Guide, Mage, BigQuery, GCP, Python, SQL, Docker
A high-level overview of Mage and how to configure it so you are ready to start building pipelines.