# Abstract
This study investigates the temporal dynamics of operational and industrial energy consumption for a manufacturing company (referred to as Company XYZ). Using historical time-stamped energy data, the project develops machine-learning-based time series models capable of forecasting future energy demand over extended horizons.
The analysis integrates exploratory data analysis, feature engineering, model development, and model comparison (specifically Lasso Regression and Random Forest Regressor) to identify consumption patterns and build predictive systems.
The final models are productionized using Streamlit and Gradio, enabling real-time inference and public interaction. The results demonstrate strong predictive performance and reveal actionable insights into daily, weekly, and seasonal consumption behaviors.
Energy consumption forecasting is a critical component of operational planning, cost optimization, and sustainability strategy for industrial organizations. Company XYZ generates large volumes of operational and industrial energy usage data, reflecting diverse activities across shifts, departments, and seasons. Understanding these patterns is essential for anticipating demand spikes, reducing wastage, and improving energy efficiency.
This project explores the historical energy consumption of Company XYZ and develops a robust forecasting pipeline capable of predicting future consumption for up to 1,000 hours. The work combines classical time-series reasoning with machine learning techniques, emphasizing practical deployment and interpretability.
In this project, I explored the energy consumption dataset for Company XYZ and applied various techniques to clean and preprocess the data and to engineer features. This work uncovered insights into which features contribute to the rate of energy consumption, given the type of activities carried out. The dataset comprises the OPERATIONAL and INDUSTRIAL energy consumption of Company XYZ.
To put this in perspective, I applied time series techniques to uncover deeper insights into the dataset, considering its historical patterns over time and how each type of consumption varies with the underlying activities.
The focus of this project is to develop a robust model capable of estimating energy consumption for the near future, in hours, using all underlying attributes of the historical events in the dataset, and to productionize this model using Streamlit and Gradio for public use.
The repository includes exploratory data analysis notebooks that provide broader scope and context, model scripts, and interactive applications for real-world usage.
- Explored each dataset's activity over time and built a model that can estimate future consumption.
- Aggregated the OPERATIONAL and INDUSTRIAL energy consumption to obtain the total energy consumption for Company XYZ.
- Explored the total energy dataset and developed a machine learning model that estimates the total energy in kilowatts for any given number of future hours.
- Built functions that automatically engineer features and preprocess the data for future predictions.
- Imputed missing values using the mean strategy.
- Trained two algorithms on the data: Lasso and RandomForestRegressor.
- Productionized the total energy model with Streamlit and Gradio.
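
The aggregation, mean imputation, and automatic feature engineering steps listed above can be sketched roughly as follows. This is a minimal illustration and not the repository's actual code: the function names, the column names (`operational`, `industrial`, `total_kw`), and the choice of calendar features are all assumptions made for the example.

```python
import numpy as np
import pandas as pd


def build_total_energy(operational: pd.Series, industrial: pd.Series) -> pd.Series:
    """Mean-impute each hourly series, then sum them into total consumption."""
    # Align both series on a shared timestamp index (hypothetical column names).
    frame = pd.concat({"operational": operational, "industrial": industrial}, axis=1)
    # Mean strategy: fill each column's gaps with that column's own mean.
    frame = frame.fillna(frame.mean())
    return frame.sum(axis=1).rename("total_kw")


def make_time_features(index: pd.DatetimeIndex) -> pd.DataFrame:
    """Derive calendar features from timestamps only, so they also exist for future hours."""
    day_of_week = np.asarray(index.dayofweek)
    return pd.DataFrame(
        {
            "hour": np.asarray(index.hour),
            "day_of_week": day_of_week,
            "month": np.asarray(index.month),
            "is_weekend": (day_of_week >= 5).astype(int),
        },
        index=index,
    )


def future_index(last_timestamp: pd.Timestamp, hours: int) -> pd.DatetimeIndex:
    """Hourly timestamps for the next `hours` hours after the last observation."""
    step = pd.Timedelta(hours=1)
    return pd.date_range(last_timestamp + step, periods=hours, freq=step)
```

Because the features depend only on the timestamp, the same `make_time_features` call can be reused on `future_index(last_observed, 1000)` to prepare inputs for a 1,000-hour forecast.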
The methodological framework consists of four major components: data preparation, exploratory analysis, model development, and deployment.
Data Preparation:
Exploratory Data Analysis (EDA):
The EDA phase investigates:
Model Development:
Two machine learning algorithms were trained and evaluated:
The models were trained on historical total energy consumption and evaluated on held-out test data. The Random Forest model demonstrated superior performance, particularly in capturing nonlinear temporal patterns.
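
A comparison of this kind could look like the sketch below. It assumes a chronological hold-out split, mean absolute error as the metric, and illustrative hyperparameters (`alpha=0.1`, `n_estimators=100`); none of these are stated in the project. The data is synthetic and deliberately nonlinear, so the resulting scores say nothing about the project's real results.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import Lasso
from sklearn.metrics import mean_absolute_error
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def compare_models(X: np.ndarray, y: np.ndarray, test_fraction: float = 0.2) -> dict:
    """Fit both models on the earliest rows and score them on the most recent rows."""
    # Chronological split: no shuffling, so the test set lies strictly in the "future".
    split = int(len(X) * (1 - test_fraction))
    X_train, X_test = X[:split], X[split:]
    y_train, y_test = y[:split], y[split:]

    models = {
        # Lasso is scale-sensitive, so it is paired with a scaler (assumed setup).
        "lasso": make_pipeline(StandardScaler(), Lasso(alpha=0.1)),
        "random_forest": RandomForestRegressor(n_estimators=100, random_state=0),
    }
    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        scores[name] = mean_absolute_error(y_test, model.predict(X_test))
    return scores


# Synthetic stand-in data: 60 days of hourly values with a daily cycle plus noise.
rng = np.random.default_rng(0)
hours = np.arange(24 * 60)
X = np.column_stack([hours % 24, (hours // 24) % 7])  # hour of day, day of week
y = 100 + 30 * np.sin(2 * np.pi * X[:, 0] / 24) + rng.normal(0, 1, len(hours))
scores = compare_models(X, y)
```

Keeping the split chronological matters for forecasting: a shuffled split would let the models see observations from after the test period, which tends to overstate accuracy.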
Deployment:
The final forecasting model was integrated into:
These interfaces allow users to input future timestamps and obtain predicted energy consumption in real time.
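
As a rough illustration of such an interface, the sketch below wraps a prediction function in a Gradio app. The `SinusoidStandInModel` class is a hypothetical placeholder for the trained model, and the timestamp format, feature set, and labels are assumptions for the example rather than the project's actual application code.

```python
from datetime import datetime

import numpy as np


class SinusoidStandInModel:
    """Hypothetical placeholder for the trained model; exposes a scikit-learn-style predict()."""

    def predict(self, features: np.ndarray) -> np.ndarray:
        hour = features[:, 0]
        return 100.0 + 30.0 * np.sin(2 * np.pi * hour / 24)


MODEL = SinusoidStandInModel()


def predict_energy(timestamp: str) -> float:
    """Parse 'YYYY-MM-DD HH:MM', build calendar features, and return the prediction."""
    moment = datetime.strptime(timestamp, "%Y-%m-%d %H:%M")
    features = np.array([[moment.hour, moment.weekday(), moment.month]], dtype=float)
    return float(MODEL.predict(features)[0])


def build_demo():
    """Create the Gradio interface; gradio is imported lazily so predict_energy works without it."""
    import gradio as gr

    return gr.Interface(
        fn=predict_energy,
        inputs=gr.Textbox(label="Future timestamp (YYYY-MM-DD HH:MM)"),
        outputs=gr.Number(label="Predicted total energy"),
        title="Energy consumption forecast",
    )
```

Calling `build_demo().launch()` would start a local web server for the form. A Streamlit version would wrap the same `predict_energy` function in its own input and output widgets.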
The experiments were conducted on the aggregated total energy dataset and included the following steps:
Temporal Pattern Analysis:
Model Training:
Model Evaluation:
The analysis produced several key findings:
Consumption Insights:
Model Performance:
Deployment Outcomes:
This project demonstrates the value of combining time-series analysis with machine learning techniques to forecast industrial and operational energy consumption. By analyzing historical patterns and building predictive models, the study provides actionable insights that can support energy planning, cost reduction, and operational efficiency for Company XYZ.
The Random Forest model proved the more effective of the two, capturing nonlinear relationships and delivering accurate forecasts. The deployment of the model through Streamlit and Gradio further enhances accessibility, enabling real-time predictions for stakeholders.
Future work may include: