Abstract

Ever felt like understanding AI requires a PhD and a supercomputer? Throw that idea out. This publication is your invitation to the world of artificial intelligence, stripped of its intimidating jargon and complexity. We’re going back to the basics, not as a history lesson, but as a foundation. We’ll crack open the ‘black box’ of fundamental models, from the straightforward logic of linear regression to the simple yet powerful architecture of a basic neural network. This isn't about chasing the latest hype; it's about building an unshakable, intuitive grasp of how machines actually learn. Consider this your first and most important step: set aside complex theory for now and start building a genuine understanding that will last.

Introduction

Artificial Intelligence (AI) is transforming the world, but for beginners, it often feels complex and difficult to understand. The purpose of this study is to simplify AI by focusing on basic models that are easy to learn and implement.
In this work, we have used openly available datasets such as student performance, salary prediction, and iris flower classification to build simple predictive models. These models provide a foundation for understanding core AI concepts such as supervised learning, regression, and classification.

Methodology

Dataset Selection

Student Performance Dataset (Study hours vs Marks).

Additional datasets like Iris, Salary Prediction, and House Prices.

Data Preprocessing

Cleaned and organized the dataset.

Split data into training and testing sets.
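The two preprocessing steps above can be sketched as follows. This is a minimal illustration, not the exact procedure used in this study: the inline rows are made up to stand in for the student file, and the cleaning rules (dropping missing values and exact duplicates) are assumptions. The column names follow the experiment code in this article.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical rows standing in for student_performance.csv
raw = pd.DataFrame({
    "Hours_Studied": [1.0, 2.0, 2.0, 3.5, 4.0, None, 5.0, 6.0, 7.5, 8.0, 9.0, 9.5, 10.0],
    "Marks_Obtained": [20, 28, 28, 41, 45, 50, 55, 63, 78, 80, None, 93, 97],
})

# Cleaning (assumed rules): drop rows with missing values, then exact duplicates
clean = raw.dropna().drop_duplicates().reset_index(drop=True)

# Features must be 2-D (a DataFrame), the target 1-D (a Series)
X = clean[["Hours_Studied"]]
y = clean["Marks_Obtained"]

# Hold out 20% of the rows; random_state makes the split repeatable
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
print(len(clean), "clean rows:", len(X_train), "train /", len(X_test), "test")
```

Keeping the test rows out of training is what makes the later error figures a fair estimate of how the model behaves on unseen data.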

Model Building

Applied Linear Regression for continuous predictions.

Used Logistic Regression / Decision Trees for classification tasks.

Evaluation

Measured performance using Mean Squared Error (MSE) for regression models.

Calculated accuracy for classification models.
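To make the two metrics concrete, here is a small sketch that computes them by hand in plain Python. The helper names `mse` and `accuracy` and all the numbers are made up for illustration; the experiments themselves rely on scikit-learn's `mean_squared_error` and `accuracy_score`.

```python
def mse(y_true, y_pred):
    """Mean Squared Error: average of the squared prediction errors."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def accuracy(y_true, y_pred):
    """Accuracy: fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up regression example: actual vs predicted marks
actual_marks = [50, 60, 70, 80]
predicted_marks = [52, 58, 73, 79]
print("MSE:", mse(actual_marks, predicted_marks))

# Made-up classification example: actual vs predicted species
actual_species = ["setosa", "versicolor", "virginica", "setosa", "virginica"]
predicted_species = ["setosa", "versicolor", "versicolor", "setosa", "virginica"]
print("Accuracy:", accuracy(actual_species, predicted_species))
```

Because MSE is measured in squared units of the target, whether a value counts as "low" depends on the scale of the data. Reporting R2 or the square root of the MSE alongside it makes the number easier to interpret.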

# Experiments
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score

df = pd.read_csv('student_performance.csv')
X = df[['Hours_Studied']]
y = df['Marks_Obtained']

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
model = LinearRegression().fit(X_train, y_train)
y_pred = model.predict(X_test)

print('MSE:', mean_squared_error(y_test, y_pred))
print('R2:', r2_score(y_test, y_pred))

# Iris classification: Logistic Regression vs Decision Tree
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

iris = load_iris()
X, y = iris.data, iris.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

lr = LogisticRegression(max_iter=200).fit(X_train, y_train)
dt = DecisionTreeClassifier().fit(X_train, y_train)

for name, m in [('LogReg', lr), ('DecisionTree', dt)]:
    pred = m.predict(X_test)
    print(name, 'accuracy:', accuracy_score(y_test, pred))
    print(classification_report(y_test, pred))
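    print(confusion_matrix(y_test, pred))

# 5-fold cross-validation of the regression model on the student data.
# X and y now hold the Iris data, so the student columns are selected again from df.
from sklearn.model_selection import cross_val_score
scores = cross_val_score(LinearRegression(), df[['Hours_Studied']], df['Marks_Obtained'], cv=5, scoring='r2')
print('5-fold R2 scores:', scores, 'mean:', scores.mean())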

Experiment

Experiment 1: Student Performance Prediction

Objective: Predict student marks based on study hours.

Dataset: Student Performance Dataset (Study Hours vs Marks).

Method: Applied Linear Regression model.

Procedure:

Load dataset.
Preprocess and split into training (80%) and testing (20%).
Train Linear Regression model.
Test on unseen data.

Result: The model predicted marks with a low Mean Squared Error (MSE) on the test set, indicating that study hours are a strong predictor of marks in this dataset.
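One way to judge whether an MSE is actually low is to compare it with a baseline that ignores study hours and always predicts the average mark. The sketch below does this on synthetic data, since the student file is not reproduced here. The generating formula (about 9 marks per hour plus noise) is an assumption chosen only for illustration and says nothing about the real dataset.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (assumption): marks = 5 + 9 * hours + noise
rng = np.random.default_rng(0)
hours = rng.uniform(1, 10, size=60)
marks = 5 + 9 * hours + rng.normal(0, 4, size=60)

X = hours.reshape(-1, 1)  # scikit-learn expects a 2-D feature array
X_train, X_test, y_train, y_test = train_test_split(
    X, marks, test_size=0.2, random_state=42
)

model = LinearRegression().fit(X_train, y_train)
model_mse = mean_squared_error(y_test, model.predict(X_test))

# Baseline: always predict the mean mark seen during training
baseline_pred = np.full_like(y_test, y_train.mean())
baseline_mse = mean_squared_error(y_test, baseline_pred)

print("slope (marks per extra hour):", model.coef_[0])
print("intercept:", model.intercept_)
print("model MSE:", model_mse, "| baseline MSE:", baseline_mse)
```

The slope and intercept are everything the model has learned: one number for how much the prediction rises per extra hour, and one for where the line starts.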

Experiment 2: Iris Flower Classification

Objective: Classify iris flowers into 3 species (Setosa, Versicolor, Virginica).

Dataset: Iris dataset (150 samples, 4 features).

Method: Logistic Regression and Decision Tree Classifier.

Procedure:

Load and preprocess the dataset.
Split into training and testing sets.
Train classification models.
Evaluate accuracy.

Result: Both models achieved high accuracy (>90%), showing that even simple classifiers can work well on structured datasets.
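A single 80/20 split of 150 samples leaves only 30 flowers for testing, so the accuracy can shift with the choice of split. As a hedged complement to the procedure above, the sketch below averages accuracy over five folds. The settings `max_iter=1000` and `random_state=42` are choices made for this example, not values taken from the experiment.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

classifiers = {
    "LogReg": LogisticRegression(max_iter=1000),
    "DecisionTree": DecisionTreeClassifier(random_state=42),
}

mean_accuracy = {}
for name, clf in classifiers.items():
    # For classifiers, cv=5 uses stratified folds, so each fold keeps
    # roughly the same share of the three species
    scores = cross_val_score(clf, X, y, cv=5, scoring="accuracy")
    mean_accuracy[name] = scores.mean()
    print(name, "fold accuracies:", scores, "mean:", scores.mean())
```

If the fold accuracies are close to each other, the single-split result is less likely to be a lucky draw.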

Experiment 3: Salary Prediction

Objective: Predict salary based on years of experience.

Dataset: Salary Dataset (Experience vs Salary).

Method: Linear Regression.

Procedure:

Load dataset.
Train model with experience as input and salary as output.
Test on sample data.

Result: The model's predictions were close to the actual salaries, which is consistent with an approximately linear relationship between experience and salary.
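No code for this experiment appears in the article, so the following is only a sketch of how it could look, using NumPy alone. The salary figures are synthetic: the base salary, the raise per year, and the noise level are all assumptions for illustration.

```python
import numpy as np

# Synthetic stand-in data (assumption): salary = 30000 + 4000 * years + noise
rng = np.random.default_rng(1)
years = rng.uniform(0, 15, size=40)
salary = 30000 + 4000 * years + rng.normal(0, 5000, size=40)

# Least-squares straight line; polyfit returns the highest power first,
# so the result is [slope, intercept] for deg=1
slope, intercept = np.polyfit(years, salary, deg=1)

# Pearson correlation between experience and salary
r = np.corrcoef(years, salary)[0, 1]

# Predicted salary for someone with 5 years of experience
predicted_5 = slope * 5 + intercept

print("estimated raise per year:", slope)
print("estimated starting salary:", intercept)
print("correlation:", r)
print("predicted salary at 5 years:", predicted_5)
```

A high correlation is consistent with a linear trend but does not prove one. Plotting the residuals (actual minus predicted salary) against experience is a quick way to spot curvature the line misses.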

Experiment 4: House Price Prediction (Optional)

Objective: Predict house prices based on area, number of rooms, etc.

Dataset: Simple House Price Dataset.

Method: Linear Regression / Decision Tree.

Procedure:

Preprocess dataset.
Train model.
Evaluate results.

Result: The model produced approximate house price predictions, illustrating how regression models apply to a real-life problem.
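The comparison between the two methods named above can be sketched like this. The data is synthetic, and the features (area and number of rooms), their units, the pricing formula, and the tree depth are all assumptions for illustration rather than details of the dataset used here.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in data (assumption): price depends linearly on area and rooms
rng = np.random.default_rng(2)
n = 200
area = rng.uniform(50, 250, size=n)   # assumed unit: square metres
rooms = rng.integers(1, 6, size=n)    # 1 to 5 rooms
price = 20000 + 1500 * area + 10000 * rooms + rng.normal(0, 15000, size=n)

X = np.column_stack([area, rooms])    # two feature columns
X_train, X_test, y_train, y_test = train_test_split(
    X, price, test_size=0.2, random_state=42
)

models = {
    "LinearRegression": LinearRegression(),
    "DecisionTree": DecisionTreeRegressor(max_depth=4, random_state=42),
}

results = {}
for name, m in models.items():
    m.fit(X_train, y_train)
    results[name] = r2_score(y_test, m.predict(X_test))
    print(name, "test R2:", results[name])
```

Because these synthetic prices come from a linear formula, the linear model is expected to do well here. On real data either model may come out ahead, which is why both are worth trying.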

Results

The models gave promising outcomes:

The Linear Regression model on the Student Performance dataset showed a clear positive relationship between study hours and marks obtained. The Mean Squared Error (MSE) was low, indicating that the model's predictions were close to the actual values.

On the Iris dataset, classification models such as Logistic Regression and Decision Tree achieved high accuracy (>90%), demonstrating the effectiveness of even basic models.

The Salary Prediction dataset showed that years of experience are strongly correlated with salary, making it a useful example for beginners.

These results suggest that simple AI models can provide meaningful insights and accurate predictions even with small datasets.

Conclusion

This work demonstrates that learning AI does not always require complex models. By using simple datasets and basic algorithms such as Linear Regression, Logistic Regression, and Decision Trees, beginners can quickly understand the core concepts of Artificial Intelligence.
The study highlights the importance of starting with easy examples before moving on to advanced techniques. This approach builds confidence and provides a strong foundation for further AI learning.

Future Work

Although the models presented here are simple, there are many opportunities for further development:

Use of Larger Datasets – Applying these models to larger, real-world datasets can give more robust insights.
Advanced Algorithms – Future studies can explore deep learning, neural networks, and ensemble methods.
Real-Life Applications – Models can be applied in education (predicting student performance), healthcare (disease prediction), and finance (loan risk prediction).
Automation – Building interactive applications or dashboards to make AI models accessible to non-technical users.

Learning AI The Easy Way: A Look at Basic Models
