Note

Open alphaventures opened this issue 2 years ago • 1 comments

Apr 15 '23 15:04 alphaventures

#Machine learning enables a system to learn from data and improve its performance. The main types are:

1. Supervised learning: learns from labelled data, where both the input and output columns are used. Examples: regression (numerical targets) and classification (responses such as yes/no).

2. Unsupervised learning: finds patterns in unlabelled data, where only input columns are present. Example: clustering, such as finding out how many groups exist in the data.

#Supervised learning algorithms

Linear Regression – predicts continuous values.

Logistic Regression – binary classification.

Decision Trees

Random Forest, etc.
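As a rough illustration of the supervised algorithms listed above, the sketch below fits two of them on a labelled binary dataset. The choice of scikit-learn's bundled breast cancer dataset, the 70/30 split, and the variable names are assumptions made for this example only.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Labelled data: X holds the input columns, y holds the known output (0 or 1).
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

models = {
    "logistic regression": LogisticRegression(max_iter=5000),
    "decision tree": DecisionTreeClassifier(random_state=0),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)                  # learn from labelled training data
    scores[name] = model.score(X_test, y_test)   # accuracy on unseen test data
    print(name, round(scores[name], 3))
```

Both models share the same `fit` and `score` interface, which is why swapping one algorithm for another usually needs very few code changes.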

#Unsupervised learning algorithms

K-Means Clustering

Hierarchical Clustering

Principal Component Analysis (PCA)
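A minimal sketch of two of the unsupervised algorithms above, K-Means and PCA, run on synthetic unlabelled data. The two generated groups, the random seed, and the variable names are assumptions made for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)

# Two synthetic, well-separated groups of points in 3 dimensions (no labels given).
group_a = rng.normal(loc=0.0, scale=0.5, size=(50, 3))
group_b = rng.normal(loc=5.0, scale=0.5, size=(50, 3))
X = np.vstack([group_a, group_b])

# K-Means assigns each point to one of n_clusters groups.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = kmeans.fit_predict(X)

# PCA reduces the 3 input columns to 2 components, e.g. for plotting.
pca = PCA(n_components=2)
X_2d = pca.fit_transform(X)

print(labels[:5], labels[-5:])
print(X_2d.shape)
```

Note that only `X` is passed to `fit_predict` and `fit_transform`; there is no `y`, which is what makes these methods unsupervised.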

#Regression metrics

MSE (Mean Squared Error)

RMSE (Root Mean Squared Error)

R² Score
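The regression metrics above could be computed as in the following sketch. The true and predicted values are made up purely for illustration.

```python
import numpy as np
from sklearn.metrics import mean_squared_error, r2_score

# Made-up true and predicted values, purely for illustration.
y_true = np.array([3.0, 5.0, 7.0, 9.0])
y_pred = np.array([2.5, 5.0, 7.5, 10.0])

mse = mean_squared_error(y_true, y_pred)  # mean of the squared errors
rmse = np.sqrt(mse)                       # square root of MSE, in the target's units
r2 = r2_score(y_true, y_pred)             # 1.0 would be a perfect fit

print(mse, rmse, r2)
```

Lower MSE and RMSE mean smaller errors, while an R² closer to 1 means the predictions explain more of the variation in the target.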

#Classification metrics

Accuracy

Precision

Recall

F1 Score

Confusion Matrix

ROC-AUC
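The classification metrics above could be computed as in the following sketch. The labels, predictions, and predicted probabilities are made up for illustration.

```python
from sklearn.metrics import (
    accuracy_score,
    confusion_matrix,
    f1_score,
    precision_score,
    recall_score,
    roc_auc_score,
)

# Made-up binary labels, hard predictions, and predicted probabilities of class 1.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
y_score = [0.9, 0.2, 0.4, 0.8, 0.1, 0.6, 0.7, 0.3]

accuracy = accuracy_score(y_true, y_pred)    # share of correct predictions
precision = precision_score(y_true, y_pred)  # TP / (TP + FP)
recall = recall_score(y_true, y_pred)        # TP / (TP + FN)
f1 = f1_score(y_true, y_pred)                # harmonic mean of precision and recall
cm = confusion_matrix(y_true, y_pred)        # rows: actual, columns: predicted
auc = roc_auc_score(y_true, y_score)         # uses scores, not hard predictions

print(accuracy, precision, recall, f1, auc)
print(cm)
```

ROC-AUC is the only metric here that takes probabilities (`y_score`) rather than the 0/1 predictions.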

To upload data in Google Colab, use the following code:

from google.colab import files
uploaded = files.upload()

Then choose the file from the files located on your system.

Now, for linear regression or logistic regression (the code is almost the same as for linear regression):

TASK1. Importing the required packages

import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score, accuracy_score, confusion_matrix, classification_report
from sklearn.metrics import mean_squared_error
from sklearn.tree import DecisionTreeClassifier
from sklearn.tree import plot_tree

#Here sklearn is the package (library); linear_model, model_selection, and metrics are submodules inside it that contain prewritten classes and functions, such as the LinearRegression class and the train_test_split function.

TASK2. Reading and exploring the data

df = pd.read_csv('your_actual_file_name.csv')  # replace with your own file name
df.head()  # first five rows
df.shape   # (number of rows, number of columns)

TASK3. Exploratory data analysis (EDA): find the null values, check for duplicates and drop them, check the data type of the values in each column, detect and remove any outliers present in the data, and make the necessary visualisations.

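df.isnull().sum().sum()    # total number of null values
df.dropna(inplace=True)    # drop rows that contain null values
df.duplicated().sum()      # number of duplicate rows
df = df.drop_duplicates()  # with inplace=True the call returns None, so assign the result instead
df.shape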
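The notes mention outlier detection and removal but give no code for it. One common approach is the 1.5 × IQR rule, sketched below; the choice of this rule, the column name `value`, and the sample numbers are all assumptions made for illustration.

```python
import pandas as pd

# Made-up column with one obviously extreme value.
df = pd.DataFrame({"value": [10, 12, 11, 13, 12, 11, 100]})

q1 = df["value"].quantile(0.25)
q3 = df["value"].quantile(0.75)
iqr = q3 - q1                      # interquartile range

lower = q1 - 1.5 * iqr             # anything below this is flagged
upper = q3 + 1.5 * iqr             # anything above this is flagged

is_outlier = (df["value"] < lower) | (df["value"] > upper)
outliers = df[is_outlier]          # detection
df_clean = df[~is_outlier]         # removal

print(outliers)
print(df_clean.shape)
```

A box plot (for example `sns.boxplot(x=df["value"])`) draws its whiskers using the same rule, so it is a quick visual check of the same outliers.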

TASK4. Steps in building a model in the ML process

1. Create the x and y variables.
2. Split the given dataset into training and testing data.
3. Standardise (scale) the data.
4. Apply the algorithm to the data, which is also known as training the ML model.
5. Check the performance of the model on the testing data.
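The five steps above can be sketched for linear regression as follows. The synthetic dataset, the column names (`feature_1`, `feature_2`, `target`), the 80/20 split, and the use of `StandardScaler` are assumptions made for this example, not part of the original notes.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic data standing in for a real CSV file.
rng = np.random.default_rng(42)
df = pd.DataFrame({
    "feature_1": rng.normal(size=200),
    "feature_2": rng.normal(size=200),
})
df["target"] = 3.0 * df["feature_1"] - 2.0 * df["feature_2"] + rng.normal(scale=0.1, size=200)

# 1. Create the x and y variables.
x = df.drop(columns="target")
y = df["target"]

# 2. Split into training and testing data.
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=42)

# 3. Scale: fit the scaler on the training data only, then apply it to both sets.
scaler = StandardScaler()
x_train_scaled = scaler.fit_transform(x_train)
x_test_scaled = scaler.transform(x_test)

# 4. Train the model.
model = LinearRegression()
model.fit(x_train_scaled, y_train)

# 5. Check performance on the testing data.
y_pred = model.predict(x_test_scaled)
score = r2_score(y_test, y_pred)
print(score)
```

Fitting the scaler on the training split only keeps information from the test set out of training, so the test score stays an honest estimate.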

Label encoding changes all data of object type to numeric type so that it can be used for training:

from sklearn.preprocessing import LabelEncoder
le = LabelEncoder()

for i in df.columns:
    if df[i].dtype == 'object':
        df[i] = le.fit_transform(df[i])
df.info()
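To see what label encoding does to a single column, here is a small sketch on a made-up list of responses; the values are assumptions chosen for illustration.

```python
from sklearn.preprocessing import LabelEncoder

le = LabelEncoder()
responses = ["yes", "no", "yes", "maybe"]

codes = le.fit_transform(responses)     # each distinct string gets an integer code
classes = list(le.classes_)             # distinct values, sorted alphabetically
decoded = list(le.inverse_transform(codes))  # map the codes back to the strings

print(classes)
print(list(codes))
print(decoded)
```

The integer codes follow alphabetical order of the values, not any meaningful ranking, which is worth keeping in mind when the encoded column is used as a model input.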

Apr 24 '25 11:04 Mridul-Bhardwaj007