Quan Sheng

Line, bar, pie and scatter charts with Python and Matplotlib


Have you ever noticed that by looking at raw numbers you can't really visualize anything, but with the aid of a graph you can extract information from them?


So today I will be producing graphs using Python.

The library we are using is Matplotlib, a Python plotting library whose pyplot interface is modelled on MATLAB's plotting commands.

You can try out MATLAB through this link: https://matlab.mathworks.com/ Trying it first can give you a better understanding before using Matplotlib in Python. Graphs are commonly used in machine learning, data science and so on, for example to inspect coefficient values.


This is a Jupyter notebook, which is well suited to machine learning work and similar tasks.


MATLAB: this is how it looks in MATLAB,


but I will be doing everything in a Jupyter notebook instead.


Basic plot on graph

The syntax of the plot function is plot(x, y).

As you can see, you can extract information from the graph just by looking at it.

The peak is at semester 2, where more people scored a better grade.
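Here is a minimal sketch of that basic plot. The grade counts are made up for illustration (they are an assumption, not the original data), chosen so that the peak falls on semester 2.

```python
import matplotlib.pyplot as plt

# Hypothetical data: number of students with a good grade in each semester
semesters = [1, 2, 3, 4]
good_grades = [12, 25, 18, 15]

# plot(x, y) draws a line through the (x, y) points
lines = plt.plot(semesters, good_grades)
plt.xlabel("Semester")
plt.ylabel("Students with a good grade")
plt.xticks(semesters)
plt.show()
```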


Increasing the linewidth makes the line thicker.
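A small sketch of the thicker line, again with made-up grade counts. The linewidth keyword of plot is given in points, and the value 4 is just an example.

```python
import matplotlib.pyplot as plt

semesters = [1, 2, 3, 4]
good_grades = [12, 25, 18, 15]  # hypothetical values

# linewidth is measured in points; a larger number gives a thicker line
lines = plt.plot(semesters, good_grades, linewidth=4)
plt.show()
```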


Next is plotting a second line, which is commonly used to compare against another dataset.

This is done by adding another plot function call.
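As a sketch, two calls to plot on the same figure give two lines. The two classes and their numbers are hypothetical.

```python
import matplotlib.pyplot as plt

semesters = [1, 2, 3, 4]
class_a = [12, 25, 18, 15]  # hypothetical values
class_b = [10, 20, 22, 17]  # hypothetical values

# Each plot call adds one more line to the same axes
line_a = plt.plot(semesters, class_a, label="Class A")
line_b = plt.plot(semesters, class_b, label="Class B")
plt.legend()  # show which line belongs to which dataset
plt.show()
```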




Next is a bar chart, made by calling .bar instead of .plot,

and look how we can change the type of graph as well.
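A minimal sketch of the same kind of data as a bar chart, with made-up numbers. Only the function name changes from plot to bar.

```python
import matplotlib.pyplot as plt

semesters = [1, 2, 3, 4]
good_grades = [12, 25, 18, 15]  # hypothetical values

# bar(x, height) draws one bar per x position
bars = plt.bar(semesters, good_grades)
plt.xticks(semesters)
plt.show()
```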


Same as with the line graph, I can add another bar chart to compare two datasets. I feel that it can be used to display and compare the number, frequency or other measure (e.g. mean) for different discrete categories of data.
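One way to put two bar charts side by side is to shift each set of bars by half a bar width. This is only a sketch with hypothetical numbers; the width of 0.4 is an arbitrary choice.

```python
import matplotlib.pyplot as plt

semesters = [1, 2, 3, 4]
class_a = [12, 25, 18, 15]  # hypothetical values
class_b = [10, 20, 22, 17]  # hypothetical values
width = 0.4

# Shift the two sets of bars left and right so they do not overlap
left_positions = [s - width / 2 for s in semesters]
right_positions = [s + width / 2 for s in semesters]

bars_a = plt.bar(left_positions, class_a, width=width, label="Class A")
bars_b = plt.bar(right_positions, class_b, width=width, label="Class B")
plt.xticks(semesters)
plt.legend()
plt.show()
```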


We also have scatter plots! They scatter the data points across the chart into sections. They can be used for k-means [1] clustering. It is easy to visualise which


The k-means algorithm identifies k centroids and then allocates every data point to the nearest cluster, while keeping the distances between the points and their centroid as small as possible.


In simple words, the aim is to segregate groups with similar traits and assign them to clusters.
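To make the idea concrete, here is a small sketch that clusters a handful of made-up points with scikit-learn's KMeans and colours the scatter plot by cluster. The points, k = 2 and the random_state are all assumptions chosen for illustration.

```python
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans

# Hypothetical points: four near (1, 1) and four near (5, 5)
points = [
    [1.0, 1.2], [1.3, 0.8], [0.9, 1.1], [1.1, 1.4],
    [5.0, 5.2], [5.3, 4.8], [4.9, 5.1], [5.2, 5.4],
]

# Ask for k = 2 clusters; fit_predict returns one cluster label per point
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = kmeans.fit_predict(points)

xs = [p[0] for p in points]
ys = [p[1] for p in points]
plt.scatter(xs, ys, c=labels)  # colour each point by its cluster
# Mark the centroids found by the algorithm
plt.scatter(kmeans.cluster_centers_[:, 0], kmeans.cluster_centers_[:, 1], marker="x")
plt.show()
```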


Plotting more than one set of scatter points.

Let's first understand what NumPy is.

- NumPy is used as an efficient multi-dimensional container of generic data. It is commonly used for scientific and mathematical computing.


In this code, np.arange is used to generate numbers from 1 up to (but not including) 4.5, in increments of 0.1.

Syntax: numpy.arange([start, ]stop, [step, ]dtype=None)
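A quick sketch of that call, to see what it produces. Note that the stop value is excluded.

```python
import numpy as np

# start=1, stop=4.5 (not included), step=0.1
x = np.arange(1, 4.5, 0.1)

print(x[:3])   # the first few values
print(x[-1])   # the last value, which stays below 4.5
print(len(x))  # how many values were generated
```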


The scatter points form a smooth curve because each value from arange is raised to the power of n.
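A sketch of two sets of scatter points on one chart, using the arange values raised to a power. The exponents 2 and 3 are assumptions picked for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.arange(1, 4.5, 0.1)

# Each scatter call adds one more set of points to the same axes
squares = plt.scatter(x, x ** 2, label="x ** 2")
cubes = plt.scatter(x, x ** 3, label="x ** 3")
plt.legend()
plt.show()
```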



Pie chart
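A pie chart shows how a whole is split into parts. A minimal sketch, assuming made-up grade counts (the labels and numbers are hypothetical):

```python
import matplotlib.pyplot as plt

# Hypothetical data: how many students got each grade
grades = ["A", "B", "C", "D"]
counts = [10, 20, 15, 5]

fig, ax = plt.subplots()
# autopct prints each slice's share of the total as a percentage
ax.pie(counts, labels=grades, autopct="%1.1f%%")
ax.set_title("Grade distribution")
plt.show()
```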




Machine Learning TiMe!!!!!


It's really cool how we can turn data into visual analytics (graphs),

but let's jump into machine learning!


Now I am sure you are wondering: what is machine learning?

Machine learning is a data analytics technique that teaches computers to do what comes naturally to humans and animals: learn from experience.


We make use of data to allow the machine to learn.


Machine learning algorithms use computational methods to “learn” information directly from data without relying on a predetermined equation as a model.


The difference between regression and classification:


Regression - predicts numerical values (like predicting a number)

Classification - predicts classes (like predicting which kind of object something is)


Overfitting - happens when a model learns the detail and noise in the training data to the extent that it negatively impacts the performance of the model on new data.

Underfitting - happens when the model or algorithm cannot capture the underlying trend of the data because it does not fit the data well enough. This is also called high bias.


Over-fitting issue

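Regularization is one way to fix an over-fitted model: it adds a penalty on model complexity (for example on large coefficients), so the model fits the noise less.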
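As a rough sketch of the idea, the code below fits a degree-9 polynomial to a few noisy points, once with plain linear regression and once with ridge regression (L2 regularization). Ridge is just one kind of regularization that I am assuming here for illustration, and the data, the degree and alpha=1.0 are all made up.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.preprocessing import PolynomialFeatures

# Hypothetical noisy data around a sine curve
rng = np.random.default_rng(0)
x = np.linspace(0, 1, 15)
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.2, size=x.size)

# A degree-9 polynomial is flexible enough to chase the noise
X_poly = PolynomialFeatures(degree=9, include_bias=False).fit_transform(x.reshape(-1, 1))

plain = LinearRegression().fit(X_poly, y)
ridge = Ridge(alpha=1.0).fit(X_poly, y)  # alpha sets the penalty strength

# The penalty shrinks the coefficients, which gives a smoother curve
print("plain coefficient size:", np.linalg.norm(plain.coef_))
print("ridge coefficient size:", np.linalg.norm(ridge.coef_))
```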


Class imbalance issue

To solve this problem we can either:

- under-sample the larger class so that it matches the smaller class's size.

- over-sample the smaller class so that it matches the larger class's size.
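Both options can be sketched with NumPy by picking row indices at random. The 90/10 label split below is a hypothetical example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical imbalanced labels: 90 of class 0 and 10 of class 1
labels = np.array([0] * 90 + [1] * 10)
major_idx = np.flatnonzero(labels == 0)
minor_idx = np.flatnonzero(labels == 1)

# Under-sampling: keep a random subset of the larger class (no repeats)
kept_major = rng.choice(major_idx, size=len(minor_idx), replace=False)
under_idx = np.concatenate([kept_major, minor_idx])

# Over-sampling: draw from the smaller class with repeats until it matches
drawn_minor = rng.choice(minor_idx, size=len(major_idx), replace=True)
over_idx = np.concatenate([major_idx, drawn_minor])

print(np.bincount(labels[under_idx]))  # [10 10]
print(np.bincount(labels[over_idx]))   # [90 90]
```

The same indices would be used to select the matching rows of the feature table.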



Some try outs on predictive analysis


We are going to predict sales from TV advertising expenditure using scikit-learn.

What is scikit-learn?

Scikit-learn is a package that includes simple and efficient tools for data mining and data analysis. To find out more: https://scikit-learn.org/stable/


First of all, we need a dataset in order to train the model.

We will be using an advertising dataset.


Next I will be pulling out the TV and sales columns to fit the model (linear regression).


- We will be using a linear regression model.


A quick intro to what a linear regression model is:

- Linear Regression is a machine learning algorithm based on supervised learning. It performs a regression task. Regression models a target prediction value based on independent variables. It is mostly used for finding out the relationship between variables and forecasting.

- Regression models differ in the kind of relationship they assume between the dependent and independent variables, and in the number of independent variables being used.


Scikit-learn has a built-in LinearRegression class, so there is no need to write the linear regression by hand.
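A minimal sketch of the fitting step. The five rows below are hypothetical stand-ins for the advertising dataset, and the column names "TV" and "Sales" are assumptions, so the numbers printed here are not real results.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Hypothetical stand-in for the advertising dataset (not the real data)
data = pd.DataFrame({
    "TV": [50.0, 100.0, 150.0, 200.0, 250.0],
    "Sales": [8.0, 11.0, 13.5, 16.0, 19.5],
})

X = data[["TV"]]   # features must be 2-D, hence the double brackets
y = data["Sales"]  # target column

model = LinearRegression()
model.fit(X, y)

print("slope:", model.coef_[0])
print("intercept:", model.intercept_)

# Predict sales for a new, made-up TV budget
predicted = model.predict(pd.DataFrame({"TV": [300.0]}))
print("predicted sales:", predicted[0])
```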



