Curious mind, clear insights—driven to make data matter. Hi there, I'm Nhlakanipho Ngubo, a Data Analyst.

Profile picture of Nhlakanipho Ngubo

About Me

Click the image to view my CV

Passionate about turning raw data into actionable insights, I specialize in data wrangling, exploratory analysis, and creating structured datasets that drive smarter decisions. Using tools like Python (Pandas, NumPy) and SQL, I extract and clean complex information, identify patterns, and support data-driven storytelling across domains.

Whether analyzing learner behavior, modeling databases for reporting, or building APIs that feed analytics workflows, I enjoy solving meaningful data challenges with structure and clarity. Rooted in Agile methodologies, I bring curiosity, focus, and a problem-solving mindset to every dataset I encounter—because behind every number is a decision waiting to be made.

Latest Projects

ETL Data Insights: Top Banks

Bank building Image

Click the image to view the Top Banks project repository.

Specifications

  • Programming Language:
    • Python
  • Database:
    • SQLite
  • Development Environment:
    • Jupyter Notebook
  • Frameworks / Libraries:
    • Pandas
    • NumPy
    • BeautifulSoup
    • Requests

Description

The Top Banks ETL pipeline automates financial data extraction, transformation, and storage. It scrapes bank rankings, converts market cap values, adds currency conversions (GBP, EUR, INR), and loads the refined data into a CSV file and SQLite database for seamless querying. Designed for accuracy and scalability, it transforms raw data into actionable insights for efficient analysis and informed decision-making.
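The transform and load steps described above might look roughly like the sketch below. It is not the project's code: the exchange rates, column names, table name, and sample rows are hypothetical placeholders, and the scraping step is replaced by a small hand-built table.

```python
import sqlite3

import pandas as pd

# Hypothetical exchange rates (USD -> target currency); a real run would read
# these from a rates file or service.
EXCHANGE_RATES = {"GBP": 0.8, "EUR": 0.93, "INR": 82.95}


def transform(banks: pd.DataFrame, rates: dict) -> pd.DataFrame:
    """Add one market-cap column per target currency, rounded to 2 decimals."""
    result = banks.copy()
    for currency, rate in rates.items():
        result[f"MC_{currency}_Billion"] = (result["MC_USD_Billion"] * rate).round(2)
    return result


def load(banks: pd.DataFrame, conn: sqlite3.Connection, table: str) -> None:
    """Write the transformed table to SQLite, replacing any earlier run."""
    banks.to_sql(table, conn, if_exists="replace", index=False)


# Stand-in for the scraped rankings table (illustrative values, not real data).
extracted = pd.DataFrame(
    {"Name": ["Bank A", "Bank B"], "MC_USD_Billion": [100.0, 50.0]}
)

transformed = transform(extracted, EXCHANGE_RATES)

# CSV output as text; passing a file path to to_csv would write a file instead.
csv_text = transformed.to_csv(index=False)

# In-memory database keeps the sketch free of side effects.
conn = sqlite3.connect(":memory:")
load(transformed, conn, "top_banks")
top = pd.read_sql(
    "SELECT Name, MC_GBP_Billion FROM top_banks ORDER BY MC_USD_Billion DESC",
    conn,
)
```

Keeping the rates in a plain dictionary means a new currency only needs one more entry, and the same `transformed` frame feeds both the CSV and the SQLite outputs.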

Risk Analysis: Data Wrangling | Learner Risk Insights

Data Wrangling Image

Click the image to view the Data Wrangling project repository.

Specifications

  • Programming Language:
    • Python
  • Development Environment:
    • Jupyter Notebook
  • Frameworks / Libraries:
    • Pandas

Description

Data Wrangling transforms messy datasets into clean, structured formats using Pandas, ensuring reliable workflows. By analyzing learners' personality scores and department choices, it identifies "High risk" and "Low risk" learners, enabling performance predictions and guiding proactive actions for mismatches. This process turns raw data into actionable insights, driving smarter decisions.
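As an illustration of the clean-then-flag idea above, here is a small Pandas sketch. The risk rule is an assumption made for the example (each department expects a minimum personality score), and the thresholds, column names, and sample learners are hypothetical; the project's real criteria may differ.

```python
import pandas as pd

# Hypothetical rule: each department expects a minimum personality score.
# Thresholds and column names are illustrative assumptions.
MIN_SCORE_BY_DEPARTMENT = {"Data": 60, "Design": 50}

# Deliberately messy input: stray spaces, mixed case, text scores,
# a missing value, and a duplicate row.
learners = pd.DataFrame(
    {
        "learner": [" Ana ", "ben", "Cara", "ben"],
        "department": ["data", "Design", "Data", "Design"],
        "personality_score": ["72", "41", None, "41"],
    }
)


def clean(raw: pd.DataFrame) -> pd.DataFrame:
    """Normalize text, coerce scores to numbers, drop unusable and duplicate rows."""
    tidy = raw.copy()
    tidy["learner"] = tidy["learner"].str.strip().str.title()
    tidy["department"] = tidy["department"].str.strip().str.title()
    tidy["personality_score"] = pd.to_numeric(tidy["personality_score"], errors="coerce")
    tidy = tidy.dropna(subset=["personality_score"]).drop_duplicates()
    return tidy.reset_index(drop=True)


def flag_risk(tidy: pd.DataFrame, thresholds: dict) -> pd.DataFrame:
    """Label a learner 'High risk' when the score is below the department minimum."""
    flagged = tidy.copy()
    required = flagged["department"].map(thresholds)
    flagged["risk"] = "Low risk"
    flagged.loc[flagged["personality_score"] < required, "risk"] = "High risk"
    return flagged


report = flag_risk(clean(learners), MIN_SCORE_BY_DEPARTMENT)
```

Separating `clean` from `flag_risk` keeps the wrangling reusable if the risk rule changes.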

Data Reporting API: CompTrack API

CompTrack API Image

Click the image to view the CompTrack API project repository.

Specifications

  • Programming Language:
    • Python
  • Database:
    • SQLite
  • Frameworks / Libraries:
    • unittest
    • Flask
    • Flask-SQLAlchemy

Description

CompTrack API streamlines the collection and organization of computer specifications, enabling efficient data management and integration into analytics pipelines. Built for precision and scalability, it transforms raw hardware data into actionable insights, empowering smarter resource management and continuous innovation.

MongoDB Data Storage: Visitor Admin

Visitor Admin Image

Click the image to view the Visitor Admin project repository.

Specifications

  • Programming Language:
    • Python
  • Database:
    • MongoDB
  • Containerization:
    • Docker
  • Frameworks / Libraries:
    • unittest
    • PyMongo
    • BSON
    • Mongomock

Description

Visitor Admin securely captures and stores visitor credentials in a robust MongoDB database. It ensures efficient data collection, organized storage, and seamless retrieval, empowering organizations to manage critical data reliably and confidently. Designed for scalability, it lays the foundation for advanced workflows.

PostgreSQL Database: SQL-Driven Shop Analytics

Shop Database Image

Click the image to view the Shop Database project repository.

Specifications

  • Query Language:
    • SQL
  • Database:
    • PostgreSQL
  • Containerization:
    • Docker

Description

Developed a normalized PostgreSQL database to support potential business analytics use cases in eCommerce, enabling efficient querying and reporting.

Case Study

Personal Tracker for Learning Progress & Standup Planning

Data Entry Image

Click the image to download the Personal Tracker Excel spreadsheet.

Context

During my learnership at Umuzi Academy, consistent performance reporting wasn't readily available—feedback was mostly shared verbally, and digital updates were inconsistent or inaccurate. To maintain momentum and visibility over my own progress, I created a personalized Excel system to track goals, performance, and daily priorities independently.

Goals

  • Structure daily tasks to stay focused and prepare effectively for stand-up meetings.
  • Track Coderbyte test scores and maintain a transparent view of personal performance.
  • Estimate program completion status as a percentage and measure it against my current progress.
  • Supplement missing institutional data with an accurate, self-managed system.

Approach

  • Developed a daily planner in Excel using formulas, conditional formatting, and modular task entries.
  • Logged and visualized Coderbyte scores over time to identify trends and areas for focus.
  • Created a custom formula to calculate where I should be in the program (as a percentage) and compared it to my actual progress.
  • Chose a flexible planning structure over rigid time blocks to better support the unpredictability of remote learning, allowing the system to remain valuable even when the day didn't follow the plan.
  • Iteratively refined the tracker to adapt to evolving needs and sustain consistent usage.
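The target-versus-actual comparison in the approach above could be expressed along these lines. This is a sketch in Python rather than the tracker's Excel formula, which is not shown here. It assumes target progress grows linearly with elapsed calendar days, and the function names, dates, and task counts are hypothetical.

```python
from datetime import date


def target_progress(start: date, end: date, today: date) -> float:
    """Percentage of the program that should be done by `today` (linear assumption)."""
    total_days = (end - start).days
    if total_days <= 0:
        raise ValueError("end must be after start")
    elapsed_days = (today - start).days
    # Clamp so dates before the start or after the end stay within 0-100%.
    fraction = min(max(elapsed_days / total_days, 0.0), 1.0)
    return round(fraction * 100, 1)


def actual_progress(completed_tasks: int, total_tasks: int) -> float:
    """Percentage of tasks completed so far."""
    if total_tasks <= 0:
        raise ValueError("total_tasks must be positive")
    return round(completed_tasks / total_tasks * 100, 1)


# Hypothetical 100-day program, checked on day 45 with 20 of 50 tasks done.
target = target_progress(date(2024, 1, 1), date(2024, 4, 10), date(2024, 2, 15))
actual = actual_progress(20, 50)
gap = round(actual - target, 1)  # negative means behind the target pace
```

A negative `gap` flags falling behind the expected pace, which is the kind of signal conditional formatting can highlight in a spreadsheet.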

Key Features

  • Progress Estimator: Formula-driven tracking of program advancement—target progress vs. actual progress.
  • Visual Clarity: Clustered bar charts, combination charts, and conditional formatting to highlight task progress status and missed goals.
  • Self-Assessment Framework: Enabled proactive learning and reflection in the absence of accurate external feedback.
  • Productivity Support: Simple task planner for daily stand-up readiness and milestone alignment.

Impact

  • Replaced unreliable feedback with clear, data-driven self-reporting.
  • Enhanced my ability to prepare for stand-ups with structured, visible progress.
  • Improved my focus and productivity by providing a clear view of daily priorities.
  • Demonstrated initiative, structured thinking, and the ability to build solutions when existing systems fall short.

Volunteering

Data Entry And Educational Content Management | Umuzi Academy | April 2025 - Present

Data Entry Image

Transferred and organized learning materials from Google Drive to Google Classroom to support remote education delivery. Wrote clear and concise task headings and descriptions to enhance learner comprehension and navigation. Ensured that content uploads were accurate, timely, and aligned with course structure.

Key Contributions

  • Maintained consistent file organization standards to reduce educator workload.
  • Developed descriptive summaries for assignments, improving clarity and learner engagement.
  • Supported educators in streamlining course content distribution across digital platforms.

Technologies I Use

Visual Studio Code

GitHub

HTML

CSS

RabbitMQ

Git

Certificates

National Certificate: Business Analysis Support Practice NQF Level 5

Business Analysis Support Practice Certificate

SETA certificate pending. For a credit confirmation letter, please email me.

Blending analytical thinking with creative problem-solving to uncover insights, translate business needs into data-driven solutions, and support decision-making through structured, actionable analysis.

IBM Certificate: Python for Data Science, AI, and Development

Python for Data Science IBM Certificate

Click the image to view the Python for Data Science IBM badge.

Built a solid foundation in Python for data analysis, using pandas and NumPy to manipulate, explore, and analyze datasets, and applied these skills to develop basic data-driven applications.

IBM Certificate: Python Project for Data Engineering | Top Banks

Python project for Data Engineering IBM Certificate

Click the image to view the Python project for Data Engineering IBM badge.

Developed a data pipeline for banking sector analysis, extracting financial data via APIs and web scraping. Transformed datasets across formats, applied structured logging for ETL tracking, and prepared analysis-ready data for repository loading.
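One way to track ETL stages with timestamped log entries is sketched below using Python's standard `logging` module. The logger name, message format, stage names, sample row, and exchange rate are assumptions for illustration, not details taken from the certificate project.

```python
import logging
from io import StringIO


def build_etl_logger(stream) -> logging.Logger:
    """Create a logger that writes timestamped, pipe-delimited entries to `stream`."""
    logger = logging.getLogger("etl_sketch")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.handlers.clear()  # avoid duplicate handlers on repeated calls
    handler = logging.StreamHandler(stream)
    handler.setFormatter(
        logging.Formatter("%(asctime)s | %(levelname)s | %(message)s")
    )
    logger.addHandler(handler)
    logger.propagate = False
    return logger


def run_stage(logger: logging.Logger, name: str, func, *args):
    """Run one pipeline stage, logging when it starts and finishes."""
    logger.info("%s started", name)
    result = func(*args)
    logger.info("%s finished", name)
    return result


# Log to memory here; a real pipeline would more likely log to a file.
log_output = StringIO()
logger = build_etl_logger(log_output)

rows = run_stage(logger, "extract", lambda: [{"bank": "Bank A", "mc_usd": 100.0}])
rows = run_stage(
    logger,
    "transform",
    # Hypothetical USD -> GBP rate of 0.8.
    lambda data: [{**row, "mc_gbp": round(row["mc_usd"] * 0.8, 2)} for row in data],
    rows,
)
```

Wrapping each stage in `run_stage` gives every step the same start and finish entries, so a stalled or failed stage is easy to spot in the log.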

Additional Skills

Reviewed 111+ Pull Requests, ensuring high coding standards.

Completed 46+ projects, demonstrating expertise in scalable data solutions.

Solved 100+ problems across multiple coding platforms, sharpening problem-solving skills.

Experienced in Agile workflows, leading peer learning through POD sessions, and creating clear documentation for seamless project onboarding.

Contact Me

Looking for a data analyst who can explore, wrangle, and interpret data with clarity and purpose? I'm open to freelance work and volunteering opportunities to gain experience and deliver impact. Let's connect and uncover insights together.

nhlakanipho.ngubo@umuzi.org

mpilongubo07@gmail.com

LinkedIn Profile