🌱 PoC ETL from Azure Storage to CosmosDB

A proof of concept (PoC) that transfers a CSV file from Azure Storage to Azure CosmosDB via Azure Data Factory.

TL;DR

Concept

Azure Storage Blob → Azure Data Factory → CosmosDB

Deployment

Create a resource group and deploy the template:

az group create --name poc-datafactory --location "East US"

az deployment group create \
    --resource-group poc-datafactory \
    --template-file poc.bicep \
    --parameters dataFactoryName=etl
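
After the deployment completes, a quick way to confirm what was created is to list the resources in the group (a standard az command; it makes no assumptions about the template's contents):

az resource list --resource-group poc-datafactory --output table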

There are four available parameters, all optional; dataFactoryName, used in the deployment command above, is one of them.

All other entities are named after their type, e.g.:

resource databaseContainer 'Microsoft.DocumentDB/databaseAccounts/sqlDatabases/containers@2022-05-15' = {
  parent: database
  name: 'databasecontainer'
  // ...
}
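
For illustration, this is a minimal sketch of how the rest of the parent chain could look under the same convention. The account and database definitions below are assumptions about what poc.bicep contains, not copied from it; the property values are the bare minimum Cosmos DB requires:

resource databaseAccount 'Microsoft.DocumentDB/databaseAccounts@2022-05-15' = {
  // Cosmos DB account names must be globally unique, hence the suffix
  name: 'databaseaccount${uniqueString(resourceGroup().id)}'
  location: resourceGroup().location
  properties: {
    databaseAccountOfferType: 'Standard'
    locations: [
      {
        locationName: resourceGroup().location
        failoverPriority: 0
      }
    ]
  }
}

resource database 'Microsoft.DocumentDB/databaseAccounts/sqlDatabases@2022-05-15' = {
  parent: databaseAccount
  name: 'database'
  properties: {
    // For SQL databases the inner resource id must match the resource name
    resource: {
      id: 'database'
    }
  }
}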

Testing

  1. Upload a CSV file to the storage container (a sample file and CLI commands follow this list)
    1. Must be delimited by ; (semicolon)
    2. Must contain a header row
    3. Must contain the name, protein and rating fields
  2. Manually trigger the pipeline inside Data Factory
  3. Check the output inside the Cosmos database
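
A hypothetical input file that satisfies the constraints in step 1, saved here as foods.csv (the rows are made-up illustration data, not from the project):

name;protein;rating
chicken breast;31;5
lentils;9;4
tofu;8;3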
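
Steps 1 and 2 can also be done from the CLI instead of the portal. In this sketch the storage account name (pocstorage), container name (input) and pipeline name (pipeline) are assumptions; the real names come from the template. The factory name etl matches the dataFactoryName parameter passed at deployment:

# Upload the CSV to the blob container (step 1)
az storage blob upload \
    --account-name pocstorage \
    --container-name input \
    --name foods.csv \
    --file foods.csv

# Trigger a pipeline run (step 2); requires: az extension add --name datafactory
az datafactory pipeline create-run \
    --resource-group poc-datafactory \
    --factory-name etl \
    --name pipeline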

Clean up

Delete the entire resource group to avoid paying for idle resources:

az group delete --name poc-datafactory
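
The command above asks for confirmation and blocks until deletion finishes; both behaviors can be skipped with the standard --yes and --no-wait flags:

az group delete --name poc-datafactory --yes --no-wait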

Next


Resources


🌱 Seedlings are ideas I have only just had and that still need cultivation; they have not been reviewed or refined. What is this?