Sopuruchi Chisom

Translating Data into Insights

About Me

I am a developer who specializes in building solutions that combine data science, cloud computing, and software engineering principles. My passion lies in understanding data: how we use it, how it shapes our world, and how it sparks innovation.

Technology Stack

  • Programming Languages: Python, SQL, Java
  • Machine Learning: Regression, Classification, Ensemble, Unsupervised Learning
  • Big Data and Cloud Platforms: AWS, Azure, Databricks, Spark, Airflow, Prefect, Hadoop
  • Analysis & Methodologies: A/B Testing, ETL/ELT, Statistics
  • Software Development: OOP, OOD, APIs, Version Control
  • Operating Systems: Windows, Unix/Linux

    Car Price Prediction

    Unleash the Power of Predictive Pricing: An end-to-end Machine Learning Pipeline

    Image of Cars

    In the dynamic automotive market, accurately pricing cars is critical for both buyers and sellers. To address this challenge, this project presents a Car Price Prediction Model, a predictive model that estimates car prices.

    • Exploratory Data Analysis in Jupyter and AWS Cloud9
    • Model Development using ensemble models and Optuna hyperparameter tuning
    • Secure Data Versioning, both Locally and in AWS S3
    • Model Registry using MLflow
    • Seamless Model Deployment to an AWS SageMaker Endpoint

    Tool Stack: Python, Regression, Predictive Modelling, Statistics, AWS, Git, MLflow, AWS SageMaker

    Recipe Prediction

    Food Produce Image

    Because customers can choose between monthly and premium subscriptions, customer engagement is a priority. Displaying popular recipes on the homepage increases traffic to the rest of the website by 40%. This model correctly predicts recipe traffic 72% of the time. Elevate your culinary platform and anticipate user preferences for a flavor-filled success story!

    Tool Stack: Python, Predictive Modelling, Statistics, Exploratory Data Analysis, RESTful API, Git
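
A figure such as "72% of the time" comes from comparing predictions with observed outcomes on held-out data. The project description does not say which metric it reports, so the sketch below shows two common candidates: accuracy (share of all predictions that are correct) and precision (share of recipes predicted as high traffic that really were). The label names and the toy lists are illustrative assumptions, not project data.

```python
def accuracy(actual: list[str], predicted: list[str]) -> float:
    """Share of predictions that match the observed label."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)


def precision(actual: list[str], predicted: list[str], positive: str = "high") -> float:
    """Share of positive predictions that were actually positive."""
    if len(actual) != len(predicted):
        raise ValueError("inputs must be of equal length")
    # Keep the true label of every item the model flagged as positive.
    flagged = [a for a, p in zip(actual, predicted) if p == positive]
    if not flagged:
        return 0.0
    return sum(a == positive for a in flagged) / len(flagged)


# Made-up labels for illustration only ("high" = high-traffic recipe).
toy_actual = ["high", "high", "low", "high", "low"]
toy_predicted = ["high", "low", "low", "high", "high"]

print(f"accuracy:  {accuracy(toy_actual, toy_predicted):.2f}")
print(f"precision: {precision(toy_actual, toy_predicted):.2f}")
```

Precision is often the more relevant of the two when the goal is to choose which recipes to feature, because it measures how often a recipe picked for the homepage turns out to be popular.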

    Demand Forecasting

    Trend Image

    Unlock the future potential of your business with demand forecasting. Gain valuable insights into future inventory needs, energy consumption patterns, and weather forecasts. Join us in forecasting product demand for Favorita Corp. grocery stores.

    Tool Stack: Python, Exploratory Data Analysis, Demand Forecasting, Git
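
The description does not name the forecasting method used in this project. As a point of reference, a minimal sketch of a seasonal-naive baseline is shown below: each future period is forecast as the value observed one season earlier. Any more sophisticated model is usually judged against a baseline like this. The function names, the weekly season length, and the sample numbers are assumptions for illustration.

```python
def seasonal_naive_forecast(history: list[float], season_length: int, horizon: int) -> list[float]:
    """Forecast `horizon` steps ahead by repeating the last observed season."""
    if season_length <= 0 or horizon < 0:
        raise ValueError("season_length must be positive and horizon non-negative")
    if len(history) < season_length:
        raise ValueError("history must cover at least one full season")
    last_season = history[-season_length:]
    # Cycle through the last season as many times as the horizon requires.
    return [last_season[i % season_length] for i in range(horizon)]


def mean_absolute_error(actual: list[float], forecast: list[float]) -> float:
    """Average absolute gap between observed and forecast values."""
    if not actual or len(actual) != len(forecast):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


# Made-up daily unit sales covering two weeks (illustration only).
daily_sales = [10, 12, 14, 20, 25, 30, 28, 11, 13, 15, 21, 26, 31, 29]
next_three_days = seasonal_naive_forecast(daily_sales, season_length=7, horizon=3)
print(next_three_days)
```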

    Pandemic Analytics: Data Platform

    Data Platform Image

    Experience the power of Azure Data Factory in unraveling COVID-19 trends with our end-to-end data platform. Seamlessly orchestrate, automate, and manage the movement and transformation of critical data from authoritative sources to enable informed decision-making.

    Tool Stack: Azure Data Factory, Databricks, HDInsight, Azure Data Flow, Power BI, CI/CD (Azure DevOps)

    S&P 500 ETL Pipeline

    ETL Pipeline Image

    Experience seamless data transformation for S&P 500 companies with this ETL Pipeline powered by Apache Airflow. Automate and elevate the batch processing of vast datasets, ensuring precision and efficiency in every step. Unleash the potential of streamlined data workflows for unparalleled insights and informed decision-making.

    Tool Stack: Python, Bash Scripting, ETL, Airflow
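
The project itself runs on Apache Airflow. To show the shape of the batch work that such a pipeline schedules, here is a minimal sketch of the extract, transform, and load steps in plain Python using only the standard library. It is not the project's code: the column names (`ticker`, `date`, `close`), the SQLite target, and all function names are assumptions. In an Airflow setup, each function would typically become its own task.

```python
import csv
import io
import sqlite3


def extract(raw_csv: str) -> list[dict]:
    """Extract: parse CSV text into one dict per row."""
    return list(csv.DictReader(io.StringIO(raw_csv)))


def transform(rows: list[dict]) -> list[tuple]:
    """Transform: normalise tickers, cast prices, drop rows with a missing ticker or price."""
    cleaned = []
    for row in rows:
        ticker = (row.get("ticker") or "").strip().upper()
        try:
            close = float(row.get("close") or "")
        except ValueError:
            continue  # skip rows whose price is missing or not numeric
        if ticker:
            cleaned.append((ticker, row.get("date") or "", close))
    return cleaned


def load(records: list[tuple], conn: sqlite3.Connection) -> int:
    """Load: upsert records keyed on (ticker, date) and return the table's row count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS prices ("
        "ticker TEXT, date TEXT, close REAL, PRIMARY KEY (ticker, date))"
    )
    conn.executemany("INSERT OR REPLACE INTO prices VALUES (?, ?, ?)", records)
    conn.commit()
    return conn.execute("SELECT COUNT(*) FROM prices").fetchone()[0]


def run_pipeline(raw_csv: str, conn: sqlite3.Connection) -> int:
    """Run the three steps in order, as a scheduler would for each batch."""
    return load(transform(extract(raw_csv)), conn)
```

Keying the table on `(ticker, date)` makes the load step idempotent, so rerunning a batch after a failure does not create duplicate rows.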

    Power Calculator App

    Application System Design

    The Power Calculator Web App is an end-to-end web application created and hosted on AWS. The application allows users to calculate the power of a base number raised to an exponent. It leverages various AWS services for different functionalities, providing a scalable and secure solution.

    Tool Stack: Python, AWS, API
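
The core calculation behind the app can be sketched as follows. The description does not say which AWS services handle the request, so `handle_request`, the JSON field names (`base`, `exponent`), and the response shape are hypothetical; only the idea of raising a base to an exponent comes from the project description.

```python
import json
import math


def calculate_power(base: float, exponent: float) -> float:
    """Return base raised to exponent, rejecting undefined or overflowing results."""
    try:
        return math.pow(base, exponent)
    except (ValueError, OverflowError) as exc:
        raise ValueError(f"cannot compute {base} ** {exponent}: {exc}") from exc


def handle_request(body: str) -> dict:
    """Hypothetical handler: parse a JSON body and return an HTTP-style response."""
    try:
        payload = json.loads(body)
        base = float(payload["base"])
        exponent = float(payload["exponent"])
        result = calculate_power(base, exponent)
    except (KeyError, TypeError, ValueError) as exc:
        # Malformed JSON, missing fields, non-numeric input, or an undefined power.
        return {"statusCode": 400, "body": json.dumps({"error": str(exc)})}
    return {"statusCode": 200, "body": json.dumps({"result": result})}
```

For example, `handle_request('{"base": 2, "exponent": 10}')` returns status 200 with the body `{"result": 1024.0}`, while a request that omits `exponent` returns status 400.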

    Nashville Housing Data Cleaning

    Nashville Housing

    Dive into the Nashville Housing Data Cleaning project, a meticulous endeavor to elevate the quality and structure of the Kaggle-acquired Nashville Housing dataset. Processed with precision in SQL Server Management Studio, this initiative unveils valuable insights into property sales in Nashville.

    Tool Stack: SQL, SSMS, Data Analysis

    COVID Data Exploration

    COVID image

    Embark on a journey of exploration with our COVID Data Exploration project. Delve into the depths of COVID-19 data sourced from Our World in Data. Our objective is to unveil valuable insights into the global impact of the pandemic, meticulously analyzed and presented through captivating Tableau visualizations.

    Tool Stack: SQL, Exploratory Data Analysis, SSMS, Tableau

    Tweet Sentiment Analysis

    Tweets on COVID-19

    Dive into the world of real-time insights with our Tweet Sentiment Analysis project. Powered by Apache Spark, this scalable application streams tweets, focusing on #covid19. Witness sentiment analysis in action and explore tweet locations through captivating visualizations in Kibana. Stay ahead with dynamic analytics in the age of social media.

    Tool Stack: Python, NLP, Spark