Setting up Python Environment-Anaconda Installation

For a newbie like me, it is difficult to keep upgrading Python and its associated packages while resolving package dependencies. This is where Anaconda comes to my rescue. Anaconda is a free Python distribution and package manager. It comes with a lot of pre-installed packages (primarily for data science).

It can be downloaded for Linux from Continuum's site: https://www.continuum.io/downloads#linux . Installation instructions for Linux are available on the same site. I downloaded and installed the 64-bit Python 3.6 version on my Linux Mint.

To update Anaconda and Python to the latest version, run the command below in the terminal.

Screenshot from 2017-06-06 20-39-16

However, older versions of Python remain on my system. As the screenshot below shows, Python 3.5.2, which I installed manually, and Python 2.7.12, which came pre-installed on Linux Mint, are still available.

Screenshot from 2017-06-06 20-47-40

conda update anaconda

Screenshot from 2017-06-06 20-50-45.png

On my Linux Mint, I have already updated to Anaconda version 4.4.0 (the latest available at the time of writing). This way, it is easy to keep upgrading Python and the required packages.

In PyCharm, I can choose Python 3.6 (installed through Anaconda / conda update) as the project interpreter.

Screenshot from 2017-06-06 20-57-17.png
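To double-check which interpreter a project is really running, a small script like the one below may help. It is only a sketch: the function name describe_interpreter is my own, and the path it reports depends on where Anaconda is installed on your machine.

```python
import platform
import sys


def describe_interpreter():
    """Return the version and location of the interpreter running this code."""
    return {
        "version": platform.python_version(),  # full version string
        "major": sys.version_info.major,       # 2 or 3
        "executable": sys.executable,          # path to the python binary in use
    }


if __name__ == "__main__":
    info = describe_interpreter()
    print("Python", info["version"], "running from", info["executable"])
```

If the printed path points inside the Anaconda folder, the project is using the Anaconda interpreter rather than the system Python.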

Anaconda also comes with Anaconda Navigator, a GUI that is useful for launching applications, managing packages, learning Python, and so on.

Screenshot from 2017-06-06 21-08-23

Spyder is an open source cross platform IDE for scientific programming in Python. Spyder integrates NumPy, SciPy, Matplotlib and IPython, as well as other open source software.

Screenshot from 2017-06-06 21-10-01

To conclude, Anaconda is a Python distribution that puts a lot of useful features and learning opportunities in one place.

 

 

Setting up Python Environment-Installing Packages

To build useful applications, we need Python libraries or packages. The majority of such packages can be downloaded from PyPI, the Python Package Index: https://pypi.python.org/pypi

The best way to install packages is with a tool called pip. We can get pip from https://pip.pypa.io/en/latest/installing.html. However, on Linux Mint, pip is already installed along with Python 2.7.12. Similarly, when I installed Python 3.5.2, the pip3 tool was installed with it. To upgrade to the latest pip, run the command below in the terminal:

pip install -U pip

Now let us look at some of the useful packages for analyzing data.

NumPy (http://www.numpy.org/): Provides array processing for numbers, strings, records, and objects.

pip install numpy

Pandas (http://pandas.pydata.org/): Python Data Analysis Library provides various data analysis tools for Python.

pip install pandas

Matplotlib (https://matplotlib.org/): Matplotlib is a Python 2D plotting library to produce publication quality graphs and figures.

pip install matplotlib

OpenPyXL (https://openpyxl.readthedocs.io/en/default/): Openpyxl is a Python library for reading and writing Excel 2010 xlsx/xlsm/xltx/xltm files.

pip install openpyxl

Once these packages are installed, they can be imported and used in your application. E.g.,

import numpy
import pandas
import openpyxl
import matplotlib

from pandas import DataFrame
from pandas import *
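Before importing, it can be handy to check that the packages are actually visible to the interpreter you are using (for example, pip may have installed them for Python 2 while you are running Python 3). The snippet below is a minimal sketch of such a check using only the standard library; the function name find_missing and the PACKAGES list are my own choices for illustration.

```python
import importlib.util

# Import names of the packages installed above. For these four packages
# the import name is the same as the name given to pip.
PACKAGES = ["numpy", "pandas", "matplotlib", "openpyxl"]


def find_missing(names):
    """Return the names that cannot be imported by this interpreter."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing(PACKAGES)
    if missing:
        print("Not installed:", ", ".join(missing))
    else:
        print("All packages are available.")
```

Anything reported as not installed can then be added with pip for that same interpreter.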

 

 

Setting up Python Environment-SQLite

For applications involving data storage and usage, we need a Database. SQLite is a simple yet very useful SQL database engine. It can be downloaded from the website – https://www.sqlite.org.

There is a very nice description of when to consider using SQLite database and when to consider client server databases like MySQL and PostgreSQL here – https://www.sqlite.org/whentouse.html

I installed SQLite on my Linux Mint using the commands below in the terminal:

sudo apt-get update
sudo apt-get install sqlite3

To create databases, tables, views, and so on, you can use a client for SQLite called SQLiteStudio, available from https://sqlitestudio.pl/index.rvt

All you need to do is download, unpack, and run the app.

As you can see from the screenshots below, SQLite offers many good features such as constraints, indexes, triggers, and views.

Data can be inserted using the user interface.

Screenshot from 2017-05-20 20-28-43

Screenshot from 2017-05-20 20-31-17

Screenshot from 2017-05-20 20-33-46

Screenshot from 2017-05-20 20-40-49.png
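The same features can also be used from Python through the built-in sqlite3 module, without any GUI. The following is only a rough sketch under my own assumptions: the table, column, index, trigger, and view names (employee, employee_log, and so on) and the sample rows are made up for illustration and are not the ones shown in the screenshots.

```python
import sqlite3

# An in-memory database keeps the sketch self-contained; pass a file
# path instead of ":memory:" to keep the data on disk.
conn = sqlite3.connect(":memory:")

conn.executescript("""
    -- Constraints: PRIMARY KEY, NOT NULL, UNIQUE and CHECK
    CREATE TABLE employee (
        id     INTEGER PRIMARY KEY,
        name   TEXT NOT NULL,
        email  TEXT UNIQUE,
        salary REAL CHECK (salary >= 0)
    );

    -- Index to speed up look-ups by name
    CREATE INDEX idx_employee_name ON employee (name);

    -- Log table that the trigger below writes to
    CREATE TABLE employee_log (
        employee_id INTEGER,
        action      TEXT
    );

    -- Trigger: record every insert into employee
    CREATE TRIGGER trg_employee_insert AFTER INSERT ON employee
    BEGIN
        INSERT INTO employee_log (employee_id, action)
        VALUES (NEW.id, 'inserted');
    END;

    -- View exposing only the id and name columns
    CREATE VIEW employee_names AS
        SELECT id, name FROM employee;
""")

# Insert sample rows using placeholders rather than string formatting.
rows = [
    ("Asha", "asha@example.com", 52000.0),
    ("Ravi", "ravi@example.com", 48000.0),
]
conn.executemany(
    "INSERT INTO employee (name, email, salary) VALUES (?, ?, ?)", rows
)
conn.commit()

names = conn.execute("SELECT name FROM employee_names ORDER BY id").fetchall()
log_count = conn.execute("SELECT COUNT(*) FROM employee_log").fetchone()[0]
print(names)
print("Rows logged by the trigger:", log_count)
```

Trying to insert a negative salary or a duplicate email into this table should be rejected by the constraints, which is a quick way to see them at work.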

How I got bitten by Python programming

Many years ago, I used to be a Java programmer. In fact, I started my information technology career in 1999 as a software developer in a small company that focussed on application software development. During my engineering days, I learnt Fortran and C programming. When I completed my engineering degree, there was the Y2K problem (https://en.wikipedia.org/wiki/Year_2000_problem), which helped many job aspirants jump into the IT industry irrespective of their educational background.

Around the same time, Java was one of the bleeding-edge technologies. There was a saying: ‘To get into an IT job, all you need to know is the spelling of Java’.

After a few years of programming (mainly in Java, web development, SQL, and database design), like many others, I moved on to project management. With more focus on day-to-day operations, I gradually lost my hold on coding, but not the zeal.

Several years later, in today's digital world, data analytics caught my attention. I am interested in learning data analytics and visualization. For the past few years, I have been using the Linux Mint Cinnamon OS more frequently on my personal laptop, as it is free and open source (FOSS). I was fascinated by the Cinnamon desktop environment. The Wikipedia page https://en.wikipedia.org/wiki/Linux_Mint states that most of Linux Mint is developed in the Python language (https://www.python.org/). I knew that the majority of Unix/Linux development happens in C, so I was surprised to see Python. This was my first encounter with Python, and it is when I started building my understanding of the language from the internet.

Why learn Python?

  • It is free and open source (FOSS)
  • Already available in several Linux distributions
  • Easy to learn for beginners (minimal coding is required)
  • One of the languages widely used for data analytics
  • Popular (http://www.tiobe.com/tiobe-index/), with good community support
  • Availability of code libraries / packages
    • Many web development frameworks: Django, Bottle, Flask, etc.
    • Scientific and numeric computing: NumPy, Matplotlib, Pandas, etc.
    • Rich GUI development: PyQt, wxPython

Python 2.x or 3.x?

Several books and websites debate whether to use Python 2 or Python 3. I noticed that Python 2.7 was installed by default on Linux Mint 18 (Sarah). When I started learning, I felt that future development would focus on Python 3.x, as it is the present and future of the language. Hence I started with the Python 3.5 interpreter. Fortunately, Python 3.5 is also pre-installed on the latest Linux Mint 18.1 (Serena).
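For anyone wondering what actually differs between the two, here is a small sketch of a few well-known Python 3 behaviours. It is not a complete comparison, just the ones I find easiest to remember, and the variable names are my own.

```python
# Written for Python 3. Under Python 2, the first result would be 3.
true_division = 7 / 2      # "/" always gives a float in Python 3
floor_division = 7 // 2    # "//" gives the floored result in both versions

text = "na\u00efve"        # str holds Unicode text in Python 3
encoded = text.encode("utf-8")  # bytes is a separate type from str

# print is a function in Python 3, so the parentheses are required.
print(true_division, floor_division)
print(type(text).__name__, type(encoded).__name__)
print(len(text), len(encoded))
```

The last line shows that the text has fewer characters than its UTF-8 encoding has bytes, which is why Python 3 keeps the two types apart.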

My favourite books for learning Python / Data Analytics

There are several online books and tutorials available. One of my favourites is Tutorials Point: https://www.tutorialspoint.com/python/

I frequently follow the Google+ Python community: https://plus.google.com/u/0/communities/103393744324769547228

Also, Stack Overflow (http://stackoverflow.com/questions/tagged/python) comes to my rescue whenever I run into a hurdle.

Screenshot from 2016-12-31 20-44-32.png

Disclaimer: The opinions and experiences listed on this site are my own. In some cases, my understanding could be incorrect, as I am a beginner-to-intermediate programmer. Please point out any correction that is required so that I can consider editing the blog.