GSoC 2016 blog: wrap-up


This blog post summarizes the main tasks I completed during my 3 months as a GSoC’16 intern, and the things I learned along the way.

I have been working on the EcoData Retriever project, with my mentors Henry Senyondo and Ethan White, under NumFOCUS.

All the commits I have made during this period are listed at this GitHub link:


Note: All the code has been merged into the master branch.


List of tasks:

a) Upgrade scripts to Datapackage.JSON standard.

This was my main GSoC task. I spent most of the last 6 weeks on it, covering both code and documentation.


Summary 1:

  1. The scripts have been updated from .script format to .json using the parse_script_to_json module I wrote.
  2. I added a new CLI (command-line interface) tool that can:
    1. Create new JSON scripts: Takes input for all the relevant fields from the user, validates the input, and stores it in valid JSON format (Datapackage.JSON standard).
    2. Delete JSON scripts: Deletes any script based on the script’s shortname. Searches the list of Python scripts (SCRIPT_LIST) and, after asking for confirmation, deletes the scripts that match the user’s request.
    3. (Experimental) Edit JSON scripts: Allows users to edit existing retriever scripts. This feature has not been completely tested, so it is currently disabled.
  3. Added unit tests and modified the integration tests to cover input validation and JSON script integration (download and installation regression tests).
  4. Added documentation (link) to guide the user through this new tool.


b) Port retriever to Python 3, maintaining backwards compatibility.
Not a cakewalk at all. I already highlighted the various csv and encoding issues (UTF-8 / latin-1) in the previous post. Nevertheless, the library is now fully compatible with both Python 2 and 3 on all major *NIX and Windows platforms (tested on Ubuntu, Mac, Windows 7).

I completed this in the first month of the GSoC period, and have been adding fixes related to all the bugs that came up during the rest of the coding period. I refactored the code so that there is no more need for explicit OS checks, thanks to help from my mentors Henry and Ethan.


Summary 2:

  1. retriever can now be installed on either Python 2 or Python 3 without any difficulties.
  2. Cross-platform compatibility (with both Python 2 and 3).
  3. Updated documentation (link) to reflect Python 3 support.



Things I have learned:

1. Python idioms

2. Unit testing (with pytest)

3. Different text encodings (UTF-8 and ISO 8859-1)

4. The Sphinx documentation system

5. git-fu!

6. Python 2 vs Python 3 syntax and package-support differences.


In closing, it was an immensely rewarding learning experience, and I look forward to remaining associated with the retriever project 😀

Thanks for reading!


GSoC Blog – Part II

This blog post marks the end of the first 4 weeks of my GSoC internship with NumFOCUS. As I have mentioned, I am working on the EcoData Retriever project (an awesome tool to download and examine ecological datasets) and it’s been a great learning experience so far.


Python 3

First things first – EcoData Retriever now completely supports Python 2 and Python 3 natively. That isn’t to say that there aren’t bugs, but the build passes all tests on Python 2 and 3 on *nix and Windows systems. If you run into compatibility bugs, I would appreciate it if you filed them on the issue tracker.

For this task, I used the future package from pip, which made a lot of these changes very easy. It’s a wonderful piece of software, and if you are looking to port your library to Python 3 and maintain backwards compatibility, then you should look into it as well.

Even after using future though, there were a lot of issues, mainly involving:

  1. Unicode (especially UTF-8) and
  2. The csv module (which is difficult to backport).

The Unicode changes were not that hard. All I did was call decode() or encode() on strings wherever a Unicode or bytes value was needed (strings are bytes by default on Python 2 and Unicode on Python 3). Unless a bytes value was specifically required, I converted all strings to Unicode (decoding as UTF-8 by default).
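The decode/encode approach can be sketched roughly as below. This is a minimal illustration written for Python 3, not the code that went into retriever: the helper names to_text and to_bytes are hypothetical, and UTF-8 is used as the default only because that is what the paragraph above describes.

```python
def to_text(value, encoding="utf-8"):
    """Return value as Unicode text, decoding it first if it is bytes."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


def to_bytes(value, encoding="utf-8"):
    """Return value as bytes, encoding it first if it is Unicode text."""
    if isinstance(value, str):
        return value.encode(encoding)
    return value


raw = "café".encode("utf-8")   # bytes, as they might arrive from a file
text = to_text(raw)            # Unicode text for use inside the program
assert to_bytes(text) == raw   # the round trip returns the same bytes
```

Keeping the conversion in two small helpers means the rest of the code only has to deal with one string type, and bytes appear only at the boundaries where they are really needed.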

The csv module, though, was a lot of pain. It took me a while to realise that csv doesn’t work that well cross-platform (it adds an extra \r when the file is opened in text mode on Windows). Plus, it doesn’t play nice with the str type from the future.builtins module. I had to insert Python version checks ( sys.version_info ) and OS checks (nt vs posix) to make it compatible with both Python 2 and 3 across all platforms.
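To make the extra \r problem concrete, here is a small sketch of the kind of version check described above. The helper name open_csv_for_write is hypothetical and this is not the code from retriever; it only shows the commonly documented pattern of passing newline="" on Python 3 and using binary mode on Python 2.

```python
import csv
import os
import sys
import tempfile


def open_csv_for_write(path):
    """Open path so that csv.writer alone decides the line endings."""
    if sys.version_info[0] >= 3:
        # Python 3: text mode with newline translation switched off.
        return open(path, "w", newline="", encoding="utf-8")
    # Python 2: binary mode avoids the extra \r on Windows.
    return open(path, "wb")


path = os.path.join(tempfile.mkdtemp(), "example.csv")
with open_csv_for_write(path) as handle:
    writer = csv.writer(handle)
    writer.writerow(["name", "count"])
    writer.writerow(["example", 1])

# Read the file back as raw bytes to inspect the line endings.
with open(path, "rb") as handle:
    raw = handle.read()
```

With the file opened this way, each row ends with the single \r\n that csv.writer emits by default, on every platform.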

Datapackage standard

Next is my main GSoC task – upgrading the dataset scripts to the datapackage.json standard. This, thankfully, is proving to be much easier than the former task. It has three main parts:

  1. Upgrade existing scripts to JSON
  2. Add CLI tool to create new JSON scripts
  3. Extend the CLI tool to edit the existing ones.

I had already done the first part during my community bonding period, and thus did not have to spend a lot of time on that.

I have completed the major portions of the second task by creating a new function that collects the script fields from the user through Python input() prompts. It was fairly easy, as I had already discussed with Henry the major changes that needed to be incorporated into the tool. Based on the datapackage.json specification, I also came up with a nice format for porting the current YAML-like scripts to JSON.
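For readers who want a feel for this, here is a rough sketch of prompting for a few fields and saving them as JSON. It is not the tool’s actual code: the function names build_script and save_script are hypothetical, and the keys (name, title, resources) are only loosely modelled on the Data Package specification, so the fields used in retriever’s own scripts may differ.

```python
import json
import re


def build_script(ask=input):
    """Prompt for a few fields and return them as a dictionary."""
    name = ask("name (lowercase letters, digits, '.', '-', '_'): ").strip()
    # Reject names that contain anything outside the allowed characters.
    if not re.match(r"^[a-z0-9._-]+$", name):
        raise ValueError("invalid name: %r" % name)
    title = ask("title: ").strip()
    url = ask("data url: ").strip()
    return {
        "name": name,
        "title": title,
        "resources": [{"name": name, "url": url}],
    }


def save_script(script, path):
    """Write the script to path as pretty-printed JSON."""
    with open(path, "w") as handle:
        json.dump(script, handle, indent=2, sort_keys=True)
```

Passing the prompt function in as the ask argument keeps the sketch easy to test: a test can supply canned answers instead of waiting on the keyboard.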

We are currently reviewing these changes. It’s a work in progress, and the final release will only come by the end of this month (or by August-end).

I hope to add the changes for the third sub-task in the next week. I’ll keep updating and posting as I go along.

Raspbian is overrated (or Why Lubuntu rocks)

Hello all.

One of my very old posts was based on my nascent interaction with Raspbian, the most popular distro that people generally install on their Raspberry Pi the first time around.

Now it’s no secret that Raspbian is very kludgy. The version that I was running was based on Debian Wheezy, so I can’t vouch for what Raspbian-related upgrades and fixes Jessie might have brought to the table.

I guess it’s more of a personal preference, but Raspbian (Wheezy) felt very outdated and slow for my needs. And it came installed with Scratch and Mathematica, but not the Python GPIO libraries. Go figure.

But after experimenting on the Pi2 for a while, it suddenly dawned on me: why was a long-time Ubuntu user like me not using Lubuntu on this board instead? One of my friends runs it on his old Pentium 4, and it’s pretty smooth. I didn’t even hate using his desktop for some Arduino programming, until its UPS died and I had to switch to another system :p

So I decided to wipe and install Lubuntu 16.04 on my microSD for Pi2, and it didn’t disappoint me at all. The performance is good, browsers work without any hacks or lags, and best of all, I get an environment that I am very familiar with.

I would personally recommend it for all Pi2 hobbyists out there.




GSoC blog : The beginning

It’s been a long time since I posted here. Thankfully, I will have something useful to talk about this time.

I am very excited to have been selected as one of the interns for the Google Summer of Code (GSoC) program for the year 2016. Thanks to GSoC, I have become very interested in open source, and have even become one of the mentors for an open-source org on GitHub ( Link ).

The organisation that I have been selected under is NumFOCUS. The project that I have chosen is the EcoData Retriever project on GitHub. My mentors are Ethan White and Henry Senyondo. They have been very helpful and encouraging during the GSoC application and community bonding phases.

My project’s main goals are as follows:

  1. Convert data scripts to datapackage.json standard.
  2. Add Python 3 support.
  3. Resolve important issues to reach Retriever 2.0 milestone.

Thanks to working on this project, I have improved my Python programming skills and also learned how to use git properly.

Hoping to have a great time this summer!



Trials and tribulations of Theano+Jupyter

An ongoing post about how I set up my machine to use Theano for development.


    1.  Activate GPU support:
      • Install the latest CUDA
      • Get cuDNN libs from NVIDIA
      • Copy the cuDNN libs to the CUDA folder (/usr/local/cuda-7.5/ for example)
    2. Set up a default editor:
      • Make a python file with name in ~/.jupyter
      • Add the following code (replace ‘subl’ with your editor of choice):

        c = get_config()
        c.JupyterWidget.editor = 'subl'

Adding Google Calendar to desktop (Ubuntu 14.04.3)

What an idea Sirji!

Linux is meant to make you productive. And keeping track of your calendar might be the best way to do that.

This short blog post summarises the procedure that I followed to add a Google Calendar view to my desktop. Surprisingly, it’s very easy to do.


Step 1: Get conky!

Install conky using your distro’s repos. In Ubuntu, this can be done by:

sudo apt-get install conky conky-all


Step 2: Install the calendar backend.

We need a Python utility known as gcalcli, along with its dependencies.

sudo apt-get install python-pip python-dateutil python-gflags
pip install --upgrade gcalcli google-api-python-client vobject parsedatetime


Step 3: Authenticate gcalcli

You need to allow gcalcli to pull your calendar automatically. For this, you can run the command:

gcalcli list

This will open an authorization page in a web browser. Grant gcalcli all the privileges it asks for.


Step 4: Add the output configuration file

We need the output of gcalcli in conky-friendly format. Copy this file as in your .config folder.


Step 5: Edit conky settings

Now, we just need to set up our conky window and let it know which commands to run. Here’s my config file. Just add it to your Home (~/) folder.

You can edit it as per your requirement using the documentation.


Step 6: Add conky to startup programs.

You don’t want to start it up manually each time you log in, do you?

Just add conky in the startup applications. (Alt key -> Startup applications)


That’s it! Log out and log in again to see your new desktop 🙂

Known Issues:

The calendar window is superimposed over active windows

For this, I suggest adding this line to the ~/.config/autostart/conky.desktop file



Edits and suggestions are welcome.


Qualcomm Wifi driver issues (Acer Aspire E5-573)

(Edited) UPDATE – Works on Ubuntu 14.04.3 and upwards.

So I struggled to find a decent wifi driver for my Acer Aspire E5-573 on Ubuntu Mate 15.10 (kernel version 4.2.0). The particulars of the wifi card are as follows:

03:00.0 Network controller: Qualcomm Atheros Device 0042 (rev 30)
Subsystem: Lite-On Communications Inc Device 0806

Unfortunately, until now, I had been making do with dirty hacks such as these. Needless to say, I needed something more stable.

But thanks to my tinkering with a parallel Arch installation on the same machine, I stumbled upon this – Qualcomm-Atheros-QCA9377-Wifi-Linux.

The README is enough to guide you through.

Thank god for Arch forums!