[Image: Mascot visiting the Python booth at OSCON 2018]

Another Pathway Through Python

In an earlier story, I exulted about Sphinx, a documentation generator that turns reStructuredText (plain text with a smattering of punctuation) into a handsome website.

Turning your Python project into something with official-looking, state-of-the-art documentation is an ego boost for anyone.

Sometimes a lonely slog through some project needs a shot in the arm, and boosting one’s ego is just the ticket. “Maybe this project will have more lasting value, now that it’s documented,” thinks the little cogito.

I’m still a big Sphinx fan, bigger than ever, but have since learned not to dwell on it too much at first. One must sometimes temper one’s enthusiasm.

Show it off as a valuable asset in the ecosystem, but start in with Jupyter Notebooks instead, as the skills are similar.

Instead of reStructuredText, we use Markdown. The cheat sheet is pretty short.

You get handsome HTML / CSS out the other end, and the code stays interactive.

Put it on GitHub. The ego gratification is more immediate. Then use Git to keep making it better, whatever it is.

Once in the context of a Jupyter Notebook stash, how best to learn Python?

I approach this task in terms of levels, which I spiral through, adding more to each level in turn, in contrast to providing any exhaustive treatment of one before advancing to the next. By “levels” I mean:

  1. keywords and punctuation, basic syntax, dot notation, brackets and colon.

I’m tempted to add a sixth dimension: ways to learn all of the above, curriculum materials, videos, “meta-Python” if you will.

What I’ve neglected entirely is going into the machinery of the interpreter itself, perhaps implemented in C, C# or some other language.

The Python virtual machine may be coded in any other Turing-complete language, in theory. But I wouldn’t call learning the guts of a Python virtual machine the same thing as learning Python, which is our focus here.

So yes, I spiral through these five levels, sharing a bit more from each with each turn of the spiral.

Then comes the need for overview, combined with zoomed-in treatments of the nitty-gritty, and relating these two. That’s where my most recent Jupyter Notebook fits in (the one that inspires me to write this story).

The title is pretty ordinary: Data Structures: Keeping Data Organized.

The concept of “organizing” is fairly complex.

[Image: common shapes in a volumes hierarchy]

In Intro to OOP: Organizing Polyhedrons, a separate notebook, the topic is sorting shapes by name, or by volume.

Given (('VE', 20), ('O', 4), ('RD', 6), ('RT', 5), ('C', 3), ('T', 1)), how does one sort these into:

  1. Name order: (('C', 3), ('O', 4), ('RD', 6), ('RT', 5), ('T', 1), ('VE', 20))
  2. Volume order: (('T', 1), ('C', 3), ('O', 4), ('RT', 5), ('RD', 6), ('VE', 20))

[Image: defining polyhedrons]

Hint: the sorted function takes a named argument, key=, a function you may use to tell which element in the pair serves as the sort key (by default, pairs compare by the leftmost element, with tie-breaking to the right).
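Here is a minimal sketch of that hint, using the shape data given above. The variable names (shapes, by_name, by_volume) are my own choices for illustration, and itemgetter is just one way to write the key function; a lambda would work as well.

```python
from operator import itemgetter

# (name, volume) pairs, as given in the text
shapes = (('VE', 20), ('O', 4), ('RD', 6), ('RT', 5), ('C', 3), ('T', 1))

# Default behavior: tuples compare by their leftmost element first,
# so no key= is needed to sort by name.
by_name = tuple(sorted(shapes))

# key= names a function applied to each pair; itemgetter(1) picks the volume.
by_volume = tuple(sorted(shapes, key=itemgetter(1)))

print(by_name)
print(by_volume)
```

Note that sorted always returns a list, which is why the results are wrapped in tuple() to match the shape of the input.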

However, sorting requires first getting the data into a structure, in the above case a “tuple”. But that’s pretty nitty-gritty. We could use a bigger picture going in.

So I start with the whole idea of a website as a data store. Readers relate to websites, and to the fact that they:

(a) synthesize web pages on demand and
(b) use stored data to do so.

Classic MVC web frameworks, such as web2py, Django or Flask, help us set the stage.
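To make points (a) and (b) concrete, here is a toy sketch using only the standard library. It is not how web2py, Django or Flask work internally; the names SHAPES, PAGE and render_page are hypothetical, chosen just to show a page being synthesized on demand from stored data.

```python
from string import Template

# (b) the stored data: a stand-in for a real database table
SHAPES = {'T': 1, 'C': 3, 'O': 4, 'RT': 5, 'RD': 6, 'VE': 20}

# A page template with placeholders to fill in at request time
PAGE = Template("<html><body><h1>$name</h1><p>Volume: $volume</p></body></html>")


def render_page(name):
    """(a) synthesize a web page on demand from the stored data."""
    return PAGE.substitute(name=name, volume=SHAPES[name])


print(render_page('VE'))
```

A real framework adds URL routing, a database layer and a richer template language, but the division of labor is roughly the same.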

More generally, and mythologically, I find it useful to transform our Python into a Dragon at this point (by way of serpent if necessary).

[Image: YouTube about DjangoCon etc.]

In fairy tales, such as The Hobbit, a dragon guards a hoard of treasure. That’s Python managing (organizing) our data.

Only after looking at the anatomy of a website do I then get nitty-gritty and dive into the built-in data structures.

At this point, I’ll likely use some emoji as string elements, mainly to emphasize that the string type long ago stopped being limited to the American Standard Code for Information Interchange (ASCII). As any grade schooler knows, we’re in the Age of Unicode these days.

### BUILT-IN DATA STRUCTURES
# TUPLE
the_tuple = ('🐙', '🐳', '🐯') # <-- emoji are Unicode
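As a sketch of how the same emoji might populate the other built-in structures, consider the following. Apart from the_tuple, the names here (the_list, the_dict, the_set, code_points) are my own illustrations and are not taken from the notebook.

```python
the_tuple = ('🐙', '🐳', '🐯')   # immutable sequence

# LIST: mutable sequence, so we can append to it
the_list = list(the_tuple)
the_list.append('🐍')

# DICT: key-to-value mapping
the_dict = {'octopus': '🐙', 'whale': '🐳', 'tiger': '🐯'}

# SET: unordered, no duplicates
the_set = set(the_list)

# Each of these emoji is a single character whose code point
# lies far beyond the ASCII range of 0 through 127.
code_points = [ord(ch) for ch in the_tuple]
print(code_points)
```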

Then finally, for my finishing segment, I move from built-in data structures to 3rd party.

The NumPy n-dimensional array is the bread and butter, the meat and potatoes, of computational Python.

The built-in list is great for “orchestration” (big picture organizing of program flow) but when you need to do solid number crunching, like inverting a matrix, that’s where your NumPy array enters the picture.

All elements need to be the same type. Slice notation is on steroids (because of the multiple dimensions).
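A short sketch of these points, assuming NumPy is installed. The particular arrays (matrix, grid) are made-up examples of mine, chosen so the arithmetic is easy to check by hand.

```python
import numpy as np

# Every element shares one type; here they are all float64.
matrix = np.array([[2.0, 1.0],
                   [1.0, 1.0]])
print(matrix.dtype)

# Number crunching: invert the matrix, then verify against the identity.
inverse = np.linalg.inv(matrix)
print(np.allclose(matrix @ inverse, np.eye(2)))

# Slicing across more than one dimension at once.
grid = np.arange(12).reshape(3, 4)   # 3 rows, 4 columns, values 0..11
column = grid[:, 1]                  # every row, column 1
block = grid[1:, ::2]                # rows 1 onward, every other column
print(column)
print(block)
```

The comma inside the square brackets separates one slice per axis, which is what a plain Python list cannot do.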

The NumPy array is a data structure for grownups.

We need more big picture though, like: “so what about NumPy arrays, what are they used for?”

Enter Machine Learning.

Here, my innovation is to borrow a pattern from the piano keyboard and from learning to play piano.

Have you ever played Chopsticks? Certain pairs of notes start the melody, namely: (F,G), (E,G), (D,B), (C,C).

Make each of the eight keys, C to C, a slot and fill it with 1 or 0 depending on whether that key is pressed. Generate random sequences of eight 1s and 0s, like 01001000 or 10010001. Neither of these is a “chopstick pattern”. Those would be:

00011000
00101000
01000010
10000001
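The encoding just described can be sketched in a few lines. The label "C'" for the upper C is my own convention, needed only to tell the two Cs apart, and the function name encode is likewise hypothetical.

```python
# Eight white keys from C up to the next C; "C'" marks the upper C.
KEYS = ("C", "D", "E", "F", "G", "A", "B", "C'")

# The opening pairs of notes named in the text
CHOPSTICK_PAIRS = [("F", "G"), ("E", "G"), ("D", "B"), ("C", "C'")]


def encode(pair):
    """Return an 8-character string with a 1 in each pressed key's slot."""
    pressed = set(pair)
    return ''.join('1' if key in pressed else '0' for key in KEYS)


CHOPSTICKS = [encode(pair) for pair in CHOPSTICK_PAIRS]
print(CHOPSTICKS)
```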

Add a ninth column, with a 1 if the pattern is a chopstick, with a 0 if it’s not, and feed the whole data set to a machine learning algorithm using the scikit-learn API.
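One way to build such a data set with only the standard library is sketched below. The helper names (label, make_rows) and the fixed seed are assumptions of mine, not the notebook's actual code.

```python
import random

CHOPSTICKS = {'00011000', '00101000', '01000010', '10000001'}


def label(bits):
    """Return 1 if the eight bits form a chopstick pattern, else 0."""
    return 1 if ''.join(str(b) for b in bits) in CHOPSTICKS else 0


def make_rows(n, seed=0):
    """Generate n random rows: eight key slots plus the ninth label column."""
    rng = random.Random(seed)   # seeded so runs are repeatable
    rows = []
    for _ in range(n):
        bits = [rng.randint(0, 1) for _ in range(8)]
        rows.append(bits + [label(bits)])
    return rows


rows = make_rows(1000)
print(len(rows), sum(row[8] for row in rows))
```

Only 4 of the 256 possible patterns are chopsticks, so purely random rows contain very few positive examples. Mixing in extra copies of the four patterns is one possible remedy.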

I compare two such algorithms: K Nearest Neighbors (KNN), and a Multi-layer Perceptron Classifier (MLPC).
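A hedged sketch of such a comparison follows, assuming NumPy and scikit-learn are installed. The hyperparameters (n_neighbors=1, one hidden layer of 16 units, max_iter=2000), the sample sizes, and the choice to mix in ten extra copies of each chopstick pattern are all assumptions of mine rather than the notebook's settings.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier

chopsticks = np.array([[0, 0, 0, 1, 1, 0, 0, 0],
                       [0, 0, 1, 0, 1, 0, 0, 0],
                       [0, 1, 0, 0, 0, 0, 1, 0],
                       [1, 0, 0, 0, 0, 0, 0, 1]])

# Random 8-bit rows, plus repeated chopstick rows so the positive
# class is not vanishingly rare (an assumption, see above).
rng = np.random.default_rng(0)
random_rows = rng.integers(0, 2, size=(200, 8))
X = np.vstack([random_rows, np.tile(chopsticks, (10, 1))])

# The "ninth column": 1 for a chopstick pattern, 0 otherwise.
y = np.array([int(any((row == c).all() for c in chopsticks)) for row in X])

knn = KNeighborsClassifier(n_neighbors=1).fit(X, y)
mlp = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000,
                    random_state=0).fit(X, y)

knn_predictions = knn.predict(chopsticks)
mlp_predictions = mlp.predict(chopsticks)
print("KNN training accuracy:", knn.score(X, y))
print("MLP training accuracy:", mlp.score(X, y))
```

Scoring on the training data, as done here, says nothing about generalization. A fairer comparison would hold out a test set, for example with train_test_split from sklearn.model_selection.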

Because this is all in a Jupyter Notebook, the student is encouraged to grab it, trust it (a button push), and make changes. Try fine-tuning the hyperparameters of the MLPC model, why not?

This is a notebook to keep coming back to, as one’s knowledge expands. I’ve got a whole maze of notebooks to wander within. Students discover their own pathways, and are inspired to make new ones. Make your own maze, why not?

My final remarks mention “big data” and some of the tools one might use there.

I don’t get into PyTorch or TensorFlow in this notebook, as it’s the concepts that matter and scikit-learn does a fine job with those.
