There are many good books on data analytics. I recently borrowed one from the office library: “Even You Can Learn Statistics and Analytics: An Easy to Understand Guide to Statistics and Analytics” by David M. Levine and David F. Stephan. I think it is a good one for beginners in data analytics.

There are other good books too, such as ‘Python for Data Analysis’, ‘Python: Data Analytics and Visualization’ and ‘Python for Finance’.

One important way of presenting data is in graph form, a visual format also known as data visualization.

A bar chart is useful for presenting categorical data. It has rectangular bars whose lengths are proportional to the values of the categories we want to present.

For example, suppose we want to present a student's marks in several subjects:

Subject | Score out of 100 |
Maths | 94 |
Physics | 85 |
Chemistry | 66 |
French | 55 |
Computers | 89 |
English | 64 |

This data can be shown as a vertical bar graph, which a small piece of Python code can produce:

import matplotlib.pyplot as plot

import numpy as np

subjects = ['Maths', 'Physics', 'Chemistry', 'French', 'Computers', 'English']

marks = [94,85,66,55,89,64]

m = np.arange(len(subjects))

plot.bar(m, marks)

plot.xlabel('Subject')

plot.ylabel('Marks')

plot.xticks(m, subjects)

plot.title('Marks obtained out of 100')

plot.show()

Matplotlib, a Python 2D plotting library, and NumPy, the fundamental package for scientific computing with Python, are two important Python packages.

Here the subjects and scores (marks) are stored in Python lists. len(subjects) returns the length of subjects, in this case 6.

numpy.arange(len(subjects)) generates the evenly spaced integers 0 to 5, which we use as the positions of the bars along the x-axis. plot.xticks then labels those positions with the subject names, and the y-axis shows the marks.
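As a quick check of what np.arange produces for this data, here is a minimal sketch using the same subjects list. The positions dictionary is only an illustration (it is not part of the script above); it shows which label plot.xticks(m, subjects) would place at each position.

```python
import numpy as np

subjects = ['Maths', 'Physics', 'Chemistry', 'French', 'Computers', 'English']

# One integer position per subject: 0, 1, ..., len(subjects) - 1
m = np.arange(len(subjects))

print(len(subjects))  # 6
print(m.tolist())     # [0, 1, 2, 3, 4, 5]

# Illustration only: pair each bar position with the subject name shown there
positions = dict(zip(m.tolist(), subjects))
print(positions[0], positions[5])  # Maths English
```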

plot.bar plots the vertical bar chart.
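To see that each bar's length really is the mark it stands for, the sketch below keeps the value returned by plot.bar and reads the height of every rectangle back. The Agg backend line is an assumption so that the snippet runs without opening a window; drop it and call plot.show() when working interactively.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window needed
import matplotlib.pyplot as plot
import numpy as np

subjects = ['Maths', 'Physics', 'Chemistry', 'French', 'Computers', 'English']
marks = [94, 85, 66, 55, 89, 64]
m = np.arange(len(subjects))

# plot.bar returns a container holding one rectangle per bar
bars = plot.bar(m, marks)

# The height of each rectangle is the mark it was drawn from
heights = [bar.get_height() for bar in bars]
print(heights)

# The tallest bar belongs to the subject with the highest mark
tallest = subjects[int(np.argmax(heights))]
print(tallest)  # Maths
```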

We can also set other parameters for the graph, such as color, font size, font weight and rotation. These lines replace the corresponding calls in the script above:

plot.bar(m, marks,color='indigo')

plot.xlabel('Subject',fontsize=15, fontweight='bold', color='blue')

plot.ylabel('Marks',fontsize=15, fontweight='bold', color='blue')

plot.xticks(m, subjects, fontsize=10, fontweight='bold', rotation=35, color='blue')

plot.title('Marks obtained out of 100',fontsize=15, fontweight='bold', color='blue')

The same data can be shown as a horizontal bar graph. Instead of the bar function, we use the barh function, and the subjects move to the y-axis while the marks move to the x-axis.

import matplotlib.pyplot as plot

import numpy as np

subjects = ['Maths', 'Physics', 'Chemistry', 'French', 'Computers', 'English']

marks = [94,85,66,55,89,64]

m = np.arange(len(subjects))

plot.barh(m, marks,color='indigo')

plot.ylabel('Subject',fontsize=15, fontweight='bold', color='blue')

plot.xlabel('Marks',fontsize=15, fontweight='bold', color='blue')

plot.yticks(m, subjects, fontsize=10, fontweight='bold', rotation=35, color='blue')

plot.title('Marks obtained out of 100',fontsize=15, fontweight='bold', color='blue')

plot.show()

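To add the data values to the graph, we call plot.text once for each bar, before plot.show(). The first two arguments are the x and y position of the text, so for a horizontal bar the text goes at the end of the bar (x = value) on the bar's own row (y = i):

for i, value in enumerate(marks):
    plot.text(value, i, str(value), color='indigo', fontweight='bold')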
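Putting the pieces together, here is a minimal self-contained sketch of the horizontal chart with value labels. A few details are assumptions added for illustration and are not from the script above: the Agg backend and the output file name marks_barh.png (so the code runs without a display), the small offset value + 1 that moves the text just past the end of the bar, and va='center' to line the text up with the middle of the bar.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: save to a file instead of opening a window
import matplotlib.pyplot as plot
import numpy as np

subjects = ['Maths', 'Physics', 'Chemistry', 'French', 'Computers', 'English']
marks = [94, 85, 66, 55, 89, 64]
m = np.arange(len(subjects))

# Horizontal bars: positions go on the y-axis, marks on the x-axis
bars = plot.barh(m, marks, color='indigo')
plot.yticks(m, subjects)
plot.xlabel('Marks')
plot.ylabel('Subject')
plot.title('Marks obtained out of 100')

# Write each mark next to the end of its bar
labels = []
for i, value in enumerate(marks):
    label = plot.text(value + 1, i, str(value), va='center',
                      color='indigo', fontweight='bold')
    labels.append(label)

# Hypothetical file name; use plot.show() instead in an interactive session
plot.savefig('marks_barh.png', bbox_inches='tight')
```

For the vertical chart the two coordinates swap, so the call would be plot.text(i, value, str(value)).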

I am using the Spyder IDE (from Anaconda Navigator) to run this code. It has a handy feature, a variable explorer, that shows the details of the variables used in the code.