
Cloud Computing for Data Professionals: Getting Started With AWS, GCP, and Azure in Africa


Not long ago, building data infrastructure meant buying physical servers, renting space in a data centre, hiring IT staff to maintain hardware, and spending months before any data could flow. For most African startups and organisations, that model was simply out of reach — the upfront capital cost alone was prohibitive.

Cloud computing changed everything. Today, a data engineer in Accra or a data analyst in Nairobi can spin up a fully managed data warehouse, run a Spark cluster across hundreds of machines, or deploy a machine learning model to production — all within minutes, paying only for what they use, with no hardware to buy or maintain.

This guide explains what cloud computing actually is, how the three major platforms compare, and how you as a data professional in Africa can start using cloud services to build real infrastructure today.


What Is Cloud Computing?

Cloud computing is the delivery of computing resources — servers, storage, databases, networking, software — over the internet, on demand, from a provider who owns and manages the underlying hardware.

Instead of buying a server and running it in your office, you rent computing power from Amazon, Google, or Microsoft. You access it through a browser or command line, use what you need, and pay a bill at the end of the month based on consumption.

The Three Service Models

Cloud services are typically grouped into three categories:

  • IaaS (Infrastructure as a Service) — you rent raw computing resources: virtual machines, storage, networking. You manage the operating system and everything above it. Example: an AWS EC2 virtual machine.
  • PaaS (Platform as a Service) — the cloud provider manages the infrastructure and runtime. You focus on deploying your application or pipeline. Example: Google App Engine.
  • SaaS (Software as a Service) — fully managed applications you use directly. Example: Google Workspace, Snowflake.

As a data professional, you will mostly work across IaaS and PaaS, with SaaS tools such as managed data warehouses sitting on top.


The Big Three Platforms

Amazon Web Services (AWS)

AWS is the oldest and largest cloud provider, launched in 2006. It has the widest range of services and the largest global infrastructure. For data work, the key services are:

  • S3: Object storage — store any file at any scale
  • EC2: Virtual machines — run any workload
  • RDS: Managed relational databases (PostgreSQL, MySQL)
  • Redshift: Data warehouse for analytics
  • Glue: Managed ETL — extract, transform, load
  • Lambda: Serverless functions — run code without servers
  • EMR: Managed Apache Spark for big data processing

AWS has the strongest market share in Africa, with a Cape Town region (af-south-1) launched in 2020 — meaning lower latency for users across sub-Saharan Africa.

Google Cloud Platform (GCP)

GCP is Google's cloud offering, and it is arguably the best platform specifically for data and machine learning workloads. Its data services are mature, well-integrated, and in many cases simpler to use than AWS equivalents:

  • Cloud Storage: Object storage — equivalent to S3
  • BigQuery: Serverless data warehouse — no cluster to manage
  • Dataflow: Managed Apache Beam for stream and batch ETL
  • Vertex AI: End-to-end machine learning platform
  • Looker Studio: Free data visualisation and dashboarding
  • Cloud Composer: Managed Apache Airflow

BigQuery deserves special mention. It is one of the best tools available for analytics at any scale — you write standard SQL, it runs across petabytes of data in seconds, and you pay only for the data your query scans. For analysts and engineers just getting started with cloud, BigQuery is one of the fastest paths to value.
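
BigQuery's pay-per-scan model makes cost easy to estimate before you run anything. A minimal Python sketch, assuming an illustrative on-demand rate per TiB scanned (check current pricing for your region and billing model):

```python
def query_cost_usd(bytes_scanned: int, price_per_tib: float = 6.25) -> float:
    """Estimate an on-demand BigQuery query cost from bytes scanned.

    price_per_tib is an illustrative rate, not a quoted price.
    """
    TIB = 1024 ** 4  # one tebibyte in bytes
    return round(bytes_scanned / TIB * price_per_tib, 4)

# A query scanning 10 GiB at the illustrative rate: a few US cents
print(query_cost_usd(10 * 1024 ** 3))
```

This is also why partitioning tables and selecting only the columns you need matters so much in BigQuery: you pay for what the query scans, not for how long it runs.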

Microsoft Azure

Azure is Microsoft's cloud platform and dominates in enterprise environments, particularly where organisations already use Microsoft products like Office 365 or SQL Server. Key data services include:

  • Azure Blob Storage: Object storage
  • Azure SQL Database: Managed relational database
  • Azure Synapse: Integrated analytics platform
  • Azure Data Factory: ETL and data integration pipelines
  • Azure Databricks: Managed Apache Spark
  • Power BI: Business intelligence and dashboards

If you work in a corporate environment in Africa — banking, telecoms, government — there is a good chance the organisation is already invested in the Microsoft ecosystem, making Azure the path of least resistance.


How to Choose a Platform

For most data professionals starting out, the choice comes down to a few practical factors:

  • Your employer's existing stack — if your company already uses AWS, learn AWS
  • The job market in your city — search LinkedIn for data engineer jobs in Lagos, Nairobi, or Johannesburg and see which platforms appear most in job descriptions
  • The type of work you want to do — heavy analytics and ML? GCP is hard to beat. Enterprise integration? Azure. Broadest range of options? AWS.

There is no wrong answer. The core concepts — storage, compute, networking, pipelines — transfer across platforms. Learn one deeply and the others become much easier to pick up.


Your First Cloud Project: A Data Pipeline on GCP

Here is a practical walkthrough of building a simple data pipeline using Google Cloud — entirely within the free credits GCP offers new users.

Step 1 — Create a GCP Account

Go to cloud.google.com and sign up. New accounts receive $300 in free credits valid for 90 days, which is more than enough to complete this project multiple times.

Step 2 — Create a Storage Bucket

Cloud Storage is where your raw data files will land — think of it as an infinitely scalable folder in the cloud.

# Install and initialise the Google Cloud CLI first
gcloud init

# Create a storage bucket (bucket names must be globally unique)
gsutil mb -l africa-south1 gs://datafrik-pipeline-demo
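
Because bucket names live in a single global namespace, a common habit is to append a random suffix to a base name. A small Python sketch (the base name and the validation pattern are illustrative, not the full set of GCS naming rules):

```python
import re
import uuid

def unique_bucket_name(base: str) -> str:
    """Append a random 8-character hex suffix so the name is unlikely to collide.

    Checks a simplified version of the GCS rules: lowercase letters,
    digits, and hyphens, 3 to 63 characters, no leading/trailing hyphen.
    """
    name = f"{base.lower()}-{uuid.uuid4().hex[:8]}"
    if not re.fullmatch(r"[a-z0-9][a-z0-9-]{1,61}[a-z0-9]", name):
        raise ValueError(f"invalid bucket name: {name}")
    return name

print(unique_bucket_name("datafrik-pipeline"))
```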

Step 3 — Upload Your Dataset

# Upload a local CSV file to your bucket
gsutil cp nigeria_economic_data.csv gs://datafrik-pipeline-demo/raw/

Step 4 — Create a BigQuery Dataset and Table

# Create a BigQuery dataset
bq mk --dataset --location=africa-south1 datafrik_analytics

# Load your CSV directly from Cloud Storage into BigQuery
bq load \
  --source_format=CSV \
  --autodetect \
  datafrik_analytics.nigeria_economic_data \
  gs://datafrik-pipeline-demo/raw/nigeria_economic_data.csv

Step 5 — Query Your Data

Now open the BigQuery console in your browser and run a SQL query:

SELECT
  year,
  gdp_growth_rate,
  inflation_rate,
  unemployment_rate,
  ROUND(gdp_growth_rate - inflation_rate, 2) AS real_growth_estimate
FROM
  `datafrik_analytics.nigeria_economic_data`
WHERE
  year >= 2015
ORDER BY
  year DESC
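
The derived column is plain row-by-row arithmetic, which you can sanity-check locally before running it at scale. The same calculation in Python over made-up placeholder rows (not real Nigerian statistics):

```python
# Placeholder rows for illustration only
rows = [
    {"year": 2016, "gdp_growth_rate": 2.1, "inflation_rate": 15.7},
    {"year": 2015, "gdp_growth_rate": 2.7, "inflation_rate": 9.0},
]

def real_growth_estimate(row: dict) -> float:
    # Mirrors ROUND(gdp_growth_rate - inflation_rate, 2) in the SQL above
    return round(row["gdp_growth_rate"] - row["inflation_rate"], 2)

# Same ordering as the SQL: most recent year first
for row in sorted(rows, key=lambda r: r["year"], reverse=True):
    print(row["year"], real_growth_estimate(row))
```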

You just ran a query against a cloud data warehouse. That is the foundation of what data engineers and analysts do at scale every single day.


Key Cloud Concepts Every Data Professional Must Understand

Regions and Zones

Cloud providers divide the world into regions — geographic areas containing data centres. Always deploy your resources in the region closest to your users or data sources to minimise latency.

For African workloads: AWS af-south-1 (Cape Town), GCP africa-south1 (Johannesburg), and Azure South Africa North (Johannesburg) are your primary options.

IAM — Identity and Access Management

Security in the cloud is managed through IAM. Every resource has permissions that control who can read, write, or administer it. Never use your root account for day-to-day work. Create IAM users or service accounts with only the permissions they need — a principle called least privilege.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:PutObject"
      ],
      "Resource": "arn:aws:s3:::datafrik-pipeline-demo/*"
    }
  ]
}

This AWS IAM policy allows reading and writing to one specific S3 bucket — and nothing else.
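
To make least privilege concrete, here is a toy evaluator for policies of this shape. It handles only Allow statements and simple wildcards; real IAM evaluation involves far more (Deny statements, conditions, multiple policy types):

```python
from fnmatch import fnmatch

# The same policy as above, as a Python dict
policy = {
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:PutObject"],
            "Resource": "arn:aws:s3:::datafrik-pipeline-demo/*",
        }
    ]
}

def is_allowed(policy: dict, action: str, resource: str) -> bool:
    """Toy check: allowed only if some Allow statement matches
    both the action and the resource (wildcards via fnmatch)."""
    for stmt in policy["Statement"]:
        if stmt["Effect"] != "Allow":
            continue
        if any(fnmatch(action, a) for a in stmt["Action"]) and \
                fnmatch(resource, stmt["Resource"]):
            return True
    return False
```

Under this policy, reading an object in the demo bucket is allowed, but deleting it, or touching any other bucket, is not.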

Cost Management

The biggest mistake beginners make in the cloud is accidentally leaving resources running and receiving a surprise bill. Develop these habits from day one:

  • Set billing alerts — every major cloud platform lets you set email alerts when your spend crosses a threshold
  • Delete resources when done — spinning up a virtual machine and forgetting about it is the most common source of unexpected costs
  • Use free tiers wisely — AWS, GCP, and Azure all have generous always-free tiers for certain services
  • Use spot or preemptible instances — for non-critical batch workloads, spot instances can be 60 to 90 percent cheaper than standard pricing
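
For cost habits, orders of magnitude matter more than exact prices. A quick back-of-envelope helper in Python (the hourly rate and the 70 percent spot discount are illustrative assumptions, not quoted prices):

```python
def monthly_cost(hourly_rate_usd: float, hours: float = 730) -> float:
    """Cost of leaving a resource running all month (~730 hours)."""
    return round(hourly_rate_usd * hours, 2)

def spot_price(on_demand_hourly: float, discount: float = 0.7) -> float:
    """Spot/preemptible hourly price at an assumed discount."""
    return round(on_demand_hourly * (1 - discount), 4)

# A forgotten VM at an assumed $0.10/hour is real money by month's end
print(monthly_cost(0.10))
# The same workload on spot capacity at the assumed discount
print(monthly_cost(spot_price(0.10)))
```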

Cloud Certifications Worth Pursuing

Certifications will not replace hands-on experience, but they are a credible signal to employers that you have structured knowledge of a platform. The most widely recognised cloud certifications for data professionals are:

  • AWS Certified Cloud Practitioner: AWS, beginner
  • AWS Certified Data Engineer Associate: AWS, associate
  • Google Cloud Professional Data Engineer: GCP, professional
  • Azure Data Fundamentals (DP-900): Azure, beginner

Each of these has official free or low-cost study materials available. The Google Cloud Skills Boost platform offers free labs that let you practice in a real GCP environment without spending your own credits.


The Cloud Is the Default Now

Five years ago, cloud skills were a differentiator. Today they are a baseline expectation for anyone working in data. Almost every modern data stack — whether at a Lagos fintech, a Nairobi healthtech startup, or a South African bank — runs on cloud infrastructure.

The good news is that getting started has never been easier or cheaper. Every major provider offers free tiers, extensive documentation, and hands-on tutorials. You can go from zero to running a real data pipeline in the cloud in a single afternoon.

At DatAfrik, cloud computing is integrated throughout our Data Engineering bootcamp and our self-learning tracks. We use real African datasets and build on the same cloud platforms you will encounter in the industry — so you graduate with practical, portfolio-ready experience, not just theoretical knowledge.

The servers are ready. The only question is whether you are.

datafrik.co

Copyright © 2024 Datafrik.co