Have you ever noticed that raw numbers are hard to picture, but once they are drawn as a graph you can easily extract information from them?
So today I will be producing graphs using Python.
The library we are using is Matplotlib, whose plotting interface is modelled on MATLAB.
https://matlab.mathworks.com/ You can try out MATLAB through this link first, to get a better understanding before using MATLAB-style plotting in Python! Graphs are commonly used in machine learning, data science, etc., for example to inspect values such as coefficients.
This is a Jupyter notebook, which is well suited to machine learning and similar work.
MATLAB: this is how it looks in MATLAB,
but I will be doing everything in the Jupyter notebook instead.
Basic plot on graph
The syntax of the plot function is plot(x, y).
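Here is a minimal sketch of a basic plot, assuming the library is Matplotlib's pyplot module (imported as plt). The semester numbers and student counts are made-up values for illustration, not the notebook's real data.

```python
import matplotlib.pyplot as plt

# Hypothetical data: students scoring a good grade in each semester
semesters = [1, 2, 3, 4]
good_grades = [12, 30, 18, 9]

# plot(x, y) draws a line through the (x, y) pairs
(line,) = plt.plot(semesters, good_grades)
plt.xlabel("Semester")
plt.ylabel("Students with a good grade")
plt.title("Good grades per semester")
plt.show()
```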
As you can see, you can extract information from the graph just by looking at it.
The peak is at semester 2, where more people scored a better grade.
Increasing the linewidth makes the line thicker.
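As a small sketch (again assuming Matplotlib's pyplot and made-up data), the thickness is controlled by the linewidth keyword of plot, where larger values draw a thicker line:

```python
import matplotlib.pyplot as plt

semesters = [1, 2, 3, 4]        # hypothetical x values
good_grades = [12, 30, 18, 9]   # hypothetical y values

# linewidth is measured in points; 4 is noticeably thicker than the default
(thick_line,) = plt.plot(semesters, good_grades, linewidth=4)
plt.show()
```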
Next is plotting a second line on the same graph, which is commonly used to compare against another dataset.
This is done by adding another call to the plot function.
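A sketch of what this could look like, assuming pyplot. The two classes and their numbers are hypothetical:

```python
import matplotlib.pyplot as plt

semesters = [1, 2, 3, 4]
class_a = [12, 30, 18, 9]    # hypothetical dataset 1
class_b = [8, 22, 25, 14]    # hypothetical dataset 2

# Each call to plot adds one more line to the same axes
(line_a,) = plt.plot(semesters, class_a, label="Class A")
(line_b,) = plt.plot(semesters, class_b, label="Class B")
plt.legend()   # show which line belongs to which dataset
plt.show()
```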
Next is a bar chart, made by calling .bar instead of .plot,
and look how easily we can change the type of graph as well!
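For illustration, here is the same kind of made-up data drawn as bars, assuming pyplot's bar function:

```python
import matplotlib.pyplot as plt

semesters = [1, 2, 3, 4]        # hypothetical categories
good_grades = [12, 30, 18, 9]   # hypothetical bar heights

# bar(x, height) draws one bar per x position
bars = plt.bar(semesters, good_grades)
plt.xticks(semesters)
plt.xlabel("Semester")
plt.ylabel("Students with a good grade")
plt.show()
```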
Same as with the line graph, I can add another bar chart to compare two datasets. I feel that it can be used to display and compare the number, frequency or other measure (e.g. mean) for different discrete categories of data.
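One way to sketch this (assuming pyplot and NumPy, with hypothetical data) is to shift each set of bars by half a bar width so the two datasets sit side by side:

```python
import numpy as np
import matplotlib.pyplot as plt

semesters = np.array([1, 2, 3, 4])
class_a = [12, 30, 18, 9]    # hypothetical dataset 1
class_b = [8, 22, 25, 14]    # hypothetical dataset 2
width = 0.4                  # width of a single bar

# Shift the two sets left and right so they do not overlap
bars_a = plt.bar(semesters - width / 2, class_a, width=width, label="Class A")
bars_b = plt.bar(semesters + width / 2, class_b, width=width, label="Class B")
plt.xticks(semesters)
plt.legend()
plt.show()
```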
We also have scatter plots!! A scatter plot spreads the data points out so that groups of points become visible. It can be used for k-means [1] clustering. It is easy to visualise which
The k-means algorithm identifies k centroids, and then allocates every data point to the nearest centroid's cluster, while keeping each cluster as compact as possible.
In simple words, the aim is to segregate groups with similar traits and assign them to clusters.
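The idea above can be sketched in plain NumPy. This is a simplified illustration and not a production implementation: the kmeans helper, the two blobs of data, the fixed iteration count and the random seeds are all assumptions made for the example.

```python
import numpy as np
import matplotlib.pyplot as plt

def kmeans(points, k, n_iter=20, seed=0):
    """Simplified k-means sketch: returns (centroids, labels)."""
    rng = np.random.default_rng(seed)
    # Start from k distinct data points picked at random
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest centroid
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: move each centroid to the mean of its points
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = points[labels == j].mean(axis=0)
    return centroids, labels

# Hypothetical data: two well-separated blobs of 50 points each
rng = np.random.default_rng(1)
blob_a = rng.normal(loc=[0, 0], scale=0.5, size=(50, 2))
blob_b = rng.normal(loc=[5, 5], scale=0.5, size=(50, 2))
points = np.vstack([blob_a, blob_b])

centroids, labels = kmeans(points, k=2)

# Colour each point by its cluster and mark the centroids
plt.scatter(points[:, 0], points[:, 1], c=labels)
plt.scatter(centroids[:, 0], centroids[:, 1], marker="x", s=100, color="red")
plt.show()
```

On a scatter plot like this, points with the same colour belong to the same cluster.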
Plotting more than one set of scatter points.
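A short sketch, assuming pyplot's scatter function and two made-up groups of points:

```python
import matplotlib.pyplot as plt

# Hypothetical measurements for two groups
group_a_x, group_a_y = [1, 2, 3, 4], [2, 4, 5, 8]
group_b_x, group_b_y = [1, 2, 3, 4], [7, 5, 4, 1]

# Each call to scatter adds one more set of points to the same axes
pts_a = plt.scatter(group_a_x, group_a_y, label="Group A")
pts_b = plt.scatter(group_b_x, group_b_y, marker="x", label="Group B")
plt.legend()
plt.show()
```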
Let's first understand what NumPy is.
- NumPy is used as an efficient multi-dimensional container of generic data. It is commonly used for scientific and mathematical computing.
In this code, np.arange is used to generate numbers from 1 up to (but not including) 4.5, in steps of 0.1.
Syntax: numpy.arange([start, ]stop, [step, ]dtype=None)
The scatter points form a smooth curve when the arange values are raised to the power of n.
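Here is a sketch of how this could look, assuming NumPy and pyplot. The choice of n = 1, 2, 3 is only for illustration:

```python
import numpy as np
import matplotlib.pyplot as plt

# Values 1.0, 1.1, 1.2, ... ; the stop value 4.5 is normally not included
x = np.arange(1, 4.5, 0.1)

# Raising x to the power n gives one curve of closely spaced points per n
for n in (1, 2, 3):
    plt.scatter(x, x ** n, s=10, label=f"x^{n}")
plt.legend()
plt.show()
```

Because the points are only 0.1 apart on the x-axis, each set of scatter points looks like a smooth curve.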
Pie chart
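A minimal sketch of a pie chart, assuming pyplot's pie function. The grade bands and counts are hypothetical:

```python
import matplotlib.pyplot as plt

grades = ["A", "B", "C", "D"]   # hypothetical grade bands
counts = [15, 30, 45, 10]       # hypothetical number of students per band

# autopct prints each slice's share of the total as a percentage
wedges, texts, autotexts = plt.pie(counts, labels=grades, autopct="%1.1f%%")
plt.axis("equal")   # keep the pie circular
plt.show()
```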
Machine Learning TiMe!!!!!
It's really cool how we can use graphs for visual analytics,
but let's jump into machine learning!
Now I am sure you are wondering: what is machine learning?
Machine learning is a data analytics technique that teaches computers to do what comes naturally to humans and animals: learn from experience.
We make use of data to allow the machine to learn.
Machine learning algorithms use computational methods to “learn” information directly from data without relying on a predetermined equation as a model.
The difference between regression and classification:
Regression - predicts numerical values (like predicting a number)
Classification - predicts categories (like predicting which kind of object something is)
Overfitting - happens when a model learns the detail and noise in the training data to the extent that it negatively impacts the performance of the model on new data.
Underfitting - happens when the model cannot capture the underlying trend of the data, i.e. the model or the algorithm does not fit the data well enough. This is also called high bias.
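To make the two terms more concrete, here is a rough sketch using NumPy's polyfit. The noisy sine data, the sample sizes and the polynomial degrees are all assumptions chosen for illustration. Typically the low-degree fit has a high error everywhere (underfitting), while the high-degree fit has a very low training error but does worse on new data (overfitting).

```python
import numpy as np

rng = np.random.default_rng(0)

def make_data(n):
    # Hypothetical ground truth: a sine curve plus random noise
    x = rng.uniform(0, 1, n)
    y = np.sin(2 * np.pi * x) + rng.normal(0, 0.2, n)
    return x, y

x_train, y_train = make_data(15)    # small training set
x_test, y_test = make_data(200)     # unseen data

def errors(degree):
    """Fit a polynomial of the given degree; return (train MSE, test MSE)."""
    coeffs = np.polyfit(x_train, y_train, degree)
    train_mse = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_mse = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    return train_mse, test_mse

for degree in (1, 3, 10):
    train_mse, test_mse = errors(degree)
    print(f"degree {degree:2d}: train MSE {train_mse:.3f}, test MSE {test_mse:.3f}")
```

The training error can only go down as the degree goes up, so the number to watch is the error on the unseen data.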
Over-fitting issue
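Regularization is a way to fix an over-fitting model: it adds a penalty on model complexity (for example on large weights) so that the model follows the noise less closely.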
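As a sketch of one common form, here is ridge (L2) regularization written in plain NumPy. The ridge_fit helper, the noisy sine data, the degree-9 polynomial features and the alpha values are assumptions for illustration, and other forms of regularization exist.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 1, 15)
y = np.sin(2 * np.pi * x) + rng.normal(0, 0.2, 15)   # hypothetical noisy data

# Degree-9 polynomial features: flexible enough to overfit 15 points
X = np.vander(x, 10)

def ridge_fit(X, y, alpha):
    """Minimise ||Xw - y||^2 + alpha * ||w||^2 (closed-form solution).

    For simplicity the penalty is applied to every weight, including
    the constant term.
    """
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ y)

w_plain = ridge_fit(X, y, alpha=1e-8)   # almost no penalty
w_ridge = ridge_fit(X, y, alpha=1e-2)   # stronger penalty shrinks the weights

print("weight norm with almost no penalty:", np.linalg.norm(w_plain))
print("weight norm with penalty:          ", np.linalg.norm(w_ridge))
```

A larger alpha gives smaller weights and a smoother curve, at the cost of fitting the training points less exactly.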
Class imbalance issue
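Class imbalance means one class has far more samples than another, which can bias the model towards the larger class. To solve this problem we can either:
- under-sample the larger class to match the smaller class's size
- over-sample the smaller class to match the larger class's size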
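Both options can be sketched with plain NumPy random sampling. The 90/10 split and the random features below are made up for the example, and dedicated libraries offer more refined resampling methods.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical imbalanced data: 90 samples of class 0, 10 of class 1
y = np.array([0] * 90 + [1] * 10)
X = rng.normal(size=(100, 2))

major_idx = np.flatnonzero(y == 0)   # indices of the larger class
minor_idx = np.flatnonzero(y == 1)   # indices of the smaller class

# Under-sampling: keep only a random subset of the larger class
keep_major = rng.choice(major_idx, size=len(minor_idx), replace=False)
under_idx = np.concatenate([keep_major, minor_idx])
X_under, y_under = X[under_idx], y[under_idx]

# Over-sampling: draw the smaller class with replacement until sizes match
extra_minor = rng.choice(minor_idx, size=len(major_idx), replace=True)
over_idx = np.concatenate([major_idx, extra_minor])
X_over, y_over = X[over_idx], y[over_idx]

print(np.bincount(y_under))   # [10 10]
print(np.bincount(y_over))    # [90 90]
```

Under-sampling throws data away, while over-sampling repeats the same few samples, so which one works better depends on how much data there is.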
Some try-outs on predictive analysis
We are going to predict sales from TV advertising expenditure using scikit-learn.
What is scikit-learn?
Scikit-learn is a package that includes simple and efficient tools for data mining and data analysis. To find out more: https://scikit-learn.org/stable/
First of all, we need a dataset in order to train the model.
We will be using an advertising dataset.
Next I will pull out the TV and sales columns to fit the model (Linear Regression).
- We will be using linear regression model
A quick intro to what a linear regression model is:
- Linear Regression is a machine learning algorithm based on supervised learning. It performs a regression task. Regression models a target prediction value based on independent variables. It is mostly used for finding out the relationship between variables and forecasting.
- Regression models differ in the kind of relationship they assume between the dependent and independent variables, and in the number of independent variables used.
Scikit-learn has a built-in LinearRegression class, so there is no need to write out the linear regression ourselves.
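A minimal sketch of fitting the model with scikit-learn's LinearRegression. The real advertising dataset is not reproduced here, so the TV and sales values below are synthetic stand-ins generated from an assumed straight-line relationship plus noise.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical stand-in for the advertising data: TV spend vs. sales
rng = np.random.default_rng(0)
tv = rng.uniform(0, 300, size=200)
sales = 7.0 + 0.05 * tv + rng.normal(0, 1.0, size=200)

X = tv.reshape(-1, 1)   # scikit-learn expects a 2-D array of features
model = LinearRegression()
model.fit(X, sales)     # learn the intercept and slope from the data

print("intercept:", model.intercept_)
print("slope:", model.coef_[0])
print("predicted sales at TV = 150:", model.predict([[150.0]])[0])
```

With the real dataset, the TV and sales columns would take the place of the tv and sales arrays.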