Cloud Computing for Data Professionals: Getting Started With AWS, GCP, and Azure in Africa

Not long ago, building data infrastructure meant buying physical servers, renting space in a data centre, hiring IT staff to maintain hardware, and spending months before any data could flow. For most African startups and organisations, that model was simply out of reach — the upfront capital cost alone was prohibitive.
Cloud computing changed everything. Today, a data engineer in Accra or a data analyst in Nairobi can spin up a fully managed data warehouse, run a Spark cluster across hundreds of machines, or deploy a machine learning model to production — all within minutes, paying only for what they use, with no hardware to buy or maintain.
This guide explains what cloud computing actually is, how the three major platforms compare, and how you as a data professional in Africa can start using cloud services to build real infrastructure today.
What Is Cloud Computing?
Cloud computing is the delivery of computing resources — servers, storage, databases, networking, software — over the internet, on demand, from a provider who owns and manages the underlying hardware.
Instead of buying a server and running it in your office, you rent computing power from Amazon, Google, or Microsoft. You access it through a browser or command line, use what you need, and pay a bill at the end of the month based on consumption.
The Three Service Models
Cloud services are typically grouped into three categories:
- IaaS (Infrastructure as a Service) — you rent raw computing resources: virtual machines, storage, networking. You manage the operating system and everything above it. Example: an AWS EC2 virtual machine.
- PaaS (Platform as a Service) — the cloud provider manages the infrastructure and runtime. You focus on deploying your application or pipeline. Example: Google App Engine.
- SaaS (Software as a Service) — fully managed applications you use directly. Example: Google Workspace, Snowflake.
As a data professional, you will mostly work across IaaS and PaaS, with some SaaS tools like managed data warehouses sitting at the top of the stack.
The Big Three Platforms
Amazon Web Services (AWS)
AWS is the oldest and largest cloud provider, launched in 2006. It has the widest range of services and the largest global infrastructure. For data work, the key services are:
| Service | Purpose |
|---|---|
| S3 | Object storage — store any file at any scale |
| EC2 | Virtual machines — run any workload |
| RDS | Managed relational databases (PostgreSQL, MySQL) |
| Redshift | Data warehouse for analytics |
| Glue | Managed ETL — extract, transform, load |
| Lambda | Serverless functions — run code without servers |
| EMR | Managed Apache Spark for big data processing |
AWS has the strongest market share in Africa, with a Cape Town region (af-south-1) launched in 2020 — meaning lower latency for users across sub-Saharan Africa.
Google Cloud Platform (GCP)
GCP is Google's cloud offering, and it is arguably the best platform specifically for data and machine learning workloads. Its data services are mature, well-integrated, and in many cases simpler to use than AWS equivalents:
| Service | Purpose |
|---|---|
| Cloud Storage | Object storage — equivalent to S3 |
| BigQuery | Serverless data warehouse — no cluster to manage |
| Dataflow | Managed Apache Beam for stream and batch ETL |
| Vertex AI | End-to-end machine learning platform |
| Looker Studio | Free data visualisation and dashboarding |
| Cloud Composer | Managed Apache Airflow |
BigQuery deserves special mention. It is one of the best tools available for analytics at any scale — you write standard SQL, it runs across petabytes of data in seconds, and you pay only for the data your query scans. For analysts and engineers just getting started with cloud, BigQuery is one of the fastest paths to value.
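To make the pay-per-scan model concrete, here is a rough cost estimator. The $6.25 per TiB rate and the 1 TiB monthly free tier are illustrative assumptions close to BigQuery's published on-demand pricing; always check Google's current pricing page before relying on the numbers.

```python
# Rough estimator for BigQuery on-demand query cost.
# PRICE_PER_TIB_USD and the 1 TiB monthly free tier are illustrative
# assumptions -- check Google's current pricing page for real figures.
PRICE_PER_TIB_USD = 6.25
TIB = 2**40  # bytes in one tebibyte

def estimate_query_cost(bytes_scanned: int, tib_used_so_far: float = 0.0,
                        free_tib_per_month: float = 1.0) -> float:
    """Estimated USD cost of one on-demand query, net of the free tier."""
    tib_scanned = bytes_scanned / TIB
    free_remaining = max(free_tib_per_month - tib_used_so_far, 0.0)
    billable_tib = max(tib_scanned - free_remaining, 0.0)
    return billable_tib * PRICE_PER_TIB_USD

# A 500 GiB scan with the month's free tier already exhausted:
print(round(estimate_query_cost(500 * 2**30, tib_used_so_far=1.0), 2))
```

In practice you rarely need to guess: `bq query --dry_run` reports how many bytes a query would scan before you commit to running it.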
Microsoft Azure
Azure is Microsoft's cloud platform and dominates in enterprise environments, particularly where organisations already use Microsoft products like Office 365 or SQL Server. Key data services include:
| Service | Purpose |
|---|---|
| Azure Blob Storage | Object storage |
| Azure SQL Database | Managed relational database |
| Azure Synapse | Integrated analytics platform |
| Azure Data Factory | ETL and data integration pipelines |
| Azure Databricks | Managed Apache Spark |
| Power BI | Business intelligence and dashboards |
If you work in a corporate environment in Africa — banking, telecoms, government — there is a good chance the organisation is already invested in the Microsoft ecosystem, making Azure the path of least resistance.
How to Choose a Platform
For most data professionals starting out, the choice comes down to a few practical factors:
- Your employer's existing stack — if your company already uses AWS, learn AWS
- The job market in your city — search LinkedIn for data engineer jobs in Lagos, Nairobi, or Johannesburg and see which platforms appear most in job descriptions
- The type of work you want to do — heavy analytics and ML? GCP is hard to beat. Enterprise integration? Azure. Broadest range of options? AWS.
There is no wrong answer. The core concepts — storage, compute, networking, pipelines — transfer across platforms. Learn one deeply and the others become much easier to pick up.
Your First Cloud Project: A Data Pipeline on GCP
Here is a practical walkthrough of building a simple data pipeline on Google Cloud, entirely within the free credits GCP offers new users.
Step 1 — Create a GCP Account
Go to cloud.google.com and sign up. New accounts receive $300 in free credits valid for 90 days, which is more than enough to complete this project multiple times.
Step 2 — Create a Storage Bucket
Cloud Storage is where your raw data files will land — think of it as an infinitely scalable folder in the cloud.
```bash
# Install and initialise the Google Cloud CLI first
gcloud init

# Create a storage bucket (bucket names must be globally unique)
gsutil mb -l africa-south1 gs://datafrik-pipeline-demo
```

Step 3 — Upload Your Dataset
```bash
# Upload a local CSV file to your bucket
gsutil cp nigeria_economic_data.csv gs://datafrik-pipeline-demo/raw/
```

Step 4 — Create a BigQuery Dataset and Table
```bash
# Create a BigQuery dataset
bq mk --dataset --location=africa-south1 datafrik_analytics

# Load your CSV directly from Cloud Storage into BigQuery
bq load \
  --source_format=CSV \
  --autodetect \
  datafrik_analytics.nigeria_economic_data \
  gs://datafrik-pipeline-demo/raw/nigeria_economic_data.csv
```

Step 5 — Query Your Data
Now open the BigQuery console in your browser and run a SQL query:
```sql
SELECT
  year,
  gdp_growth_rate,
  inflation_rate,
  unemployment_rate,
  ROUND(gdp_growth_rate - inflation_rate, 2) AS real_growth_estimate
FROM
  `datafrik_analytics.nigeria_economic_data`
WHERE
  year >= 2015
ORDER BY
  year DESC
```

You just ran a query against a cloud data warehouse. That is the foundation of what data engineers and analysts do at scale every single day.
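If you want to sanity-check what that query computes before loading real data, the same filter, derive, and sort logic can be sketched in plain Python. The rows below are made-up toy values purely to illustrate the transformation, not real statistics:

```python
# Replicate the walkthrough query's logic locally on toy rows.
# These numbers are invented purely to illustrate the transformation.
rows = [
    {"year": 2014, "gdp_growth_rate": 5.0, "inflation_rate": 8.0},
    {"year": 2016, "gdp_growth_rate": -1.5, "inflation_rate": 15.5},
    {"year": 2019, "gdp_growth_rate": 2.0, "inflation_rate": 11.5},
]

result = sorted(
    (
        # SELECT ... ROUND(gdp_growth_rate - inflation_rate, 2)
        {**r, "real_growth_estimate": round(r["gdp_growth_rate"] - r["inflation_rate"], 2)}
        for r in rows
        if r["year"] >= 2015  # WHERE year >= 2015
    ),
    key=lambda r: r["year"],
    reverse=True,  # ORDER BY year DESC
)
for r in result:
    print(r["year"], r["real_growth_estimate"])
```

The point is that BigQuery runs exactly this kind of logic, just across billions of rows instead of three.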
Key Cloud Concepts Every Data Professional Must Understand
Regions and Zones
Cloud providers divide the world into regions — geographic areas containing data centres. Each region is further split into zones: isolated locations within the region that let you build in redundancy. Always deploy your resources in the region closest to your users or data sources to minimise latency.
For African workloads: AWS af-south-1 (Cape Town), GCP africa-south1 (Johannesburg), and Azure South Africa North (Johannesburg) are your primary options.
IAM — Identity and Access Management
Security in the cloud is managed through IAM. Every resource has permissions that control who can read, write, or administer it. Never use your root account for day-to-day work. Create IAM users or service accounts with only the permissions they need — a principle called least privilege.
```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:PutObject"
      ],
      "Resource": "arn:aws:s3:::datafrik-pipeline-demo/*"
    }
  ]
}
```

This AWS IAM policy allows reading and writing to one specific S3 bucket — and nothing else.
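Least privilege is also something you can check mechanically. Below is a toy linter that flags wildcard actions or resources in a policy like the one above. It is only an illustration of the idea; real tools such as AWS IAM Access Analyzer go much deeper:

```python
import json

# Toy least-privilege check: flag statements that grant wildcard actions
# or apply to every resource. This is an illustration only -- real policy
# analysis (e.g. AWS IAM Access Analyzer) is far more thorough.
POLICY = """
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:PutObject"],
      "Resource": "arn:aws:s3:::datafrik-pipeline-demo/*"
    }
  ]
}
"""

def overly_broad(policy_json: str) -> list[str]:
    """Return a list of red flags found in the policy's statements."""
    findings = []
    for stmt in json.loads(policy_json)["Statement"]:
        actions = stmt["Action"] if isinstance(stmt["Action"], list) else [stmt["Action"]]
        if any(a == "*" or a.endswith(":*") for a in actions):
            findings.append("wildcard action")
        if stmt.get("Resource") == "*":
            findings.append("wildcard resource")
    return findings

# The /* suffix scopes access to objects in one bucket, so nothing is flagged:
print(overly_broad(POLICY))
```

Note that an object-path wildcard like `datafrik-pipeline-demo/*` is normal and scoped; the dangerous patterns are `"Action": "*"` or `"Resource": "*"`, which grant far more than a pipeline needs.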
Cost Management
The biggest mistake beginners make in the cloud is accidentally leaving resources running and receiving a surprise bill. Develop these habits from day one:
- Set billing alerts — every major cloud platform lets you set email alerts when your spend crosses a threshold
- Delete resources when done — spinning up a virtual machine and forgetting about it is the most common source of unexpected costs
- Use free tiers wisely — AWS, GCP, and Azure all have generous always-free tiers for certain services
- Use spot or preemptible instances — for non-critical batch workloads, spot instances can be 60 to 90 percent cheaper than standard pricing
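Those habits are easier to build once you see the arithmetic. A minimal sketch, assuming an illustrative on-demand rate of $0.10 per hour and a 70 percent spot discount (real rates vary by provider, machine type, and region):

```python
# Sketch: what a forgotten VM costs, and what spot pricing saves.
# The $0.10/hour rate and 70% spot discount are illustrative assumptions,
# not real prices -- check your provider's pricing calculator.
HOURLY_RATE_USD = 0.10
HOURS_PER_MONTH = 730  # roughly 24 * 365 / 12

def monthly_cost(hourly_rate: float, spot_discount: float = 0.0) -> float:
    """Cost of leaving one VM running all month, optionally at a spot discount."""
    return round(hourly_rate * HOURS_PER_MONTH * (1 - spot_discount), 2)

print(monthly_cost(HOURLY_RATE_USD))                     # on-demand, left running
print(monthly_cost(HOURLY_RATE_USD, spot_discount=0.7))  # same machine as spot
```

Even at a modest hourly rate, one forgotten machine quietly accumulates a meaningful bill over a month, which is exactly why billing alerts and a delete-when-done habit matter from day one.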
Cloud Certifications Worth Pursuing
Certifications will not replace hands-on experience, but they are a credible signal to employers that you have structured knowledge of a platform. The most respected entry-level cloud certifications for data professionals are:
| Certification | Platform | Level |
|---|---|---|
| AWS Certified Cloud Practitioner | AWS | Beginner |
| AWS Certified Data Engineer Associate | AWS | Associate |
| Google Cloud Professional Data Engineer | GCP | Professional |
| Azure Data Fundamentals (DP-900) | Azure | Beginner |
Each of these has official free or low-cost study materials available. The Google Cloud Skills Boost platform offers free labs that let you practise in a real GCP environment without spending your own credits.
The Cloud Is the Default Now
Five years ago, cloud skills were a differentiator. Today they are a baseline expectation for anyone working in data. Almost every modern data stack — whether at a Lagos fintech, a Nairobi healthtech startup, or a South African bank — runs on cloud infrastructure.
The good news is that getting started has never been easier or cheaper. Every major provider offers free tiers, extensive documentation, and hands-on tutorials. You can go from zero to running a real data pipeline in the cloud in a single afternoon.
At DatAfrik, cloud computing is integrated throughout our Data Engineering bootcamp and our self-learning tracks. We use real African datasets and build on the same cloud platforms you will encounter in the industry — so you graduate with practical, portfolio-ready experience, not just theoretical knowledge.
The servers are ready. The only question is whether you are.