Code or Die Welcome to my kitchen

Yak shaving tips, Infosec, Astronomy, Gardening

This is my blog. Proud guardian of three cats and two dogs, I currently reside in San Antonio, Texas.

I'm interested in gardening, astronomy and infosec, among other pursuits.

Message of the Day: Party on dudes!

What's The Deal With Closure?

Although closure is easy to understand in relatively simple terms, it was not until recently in my coding education that I discovered a few subtleties about the subject that clarified my understanding of it immensely.

Since I always love to demystify computer science and coding, let’s pull aside the curtain to take a look at what really is happening with this phenomenon.

History

A quick aside to cover some background on closure:

The term closure was coined in 1964 by Peter J. Landin, who defined it as a lambda expression whose open bindings have been closed by (or bound in) the lexical environment, resulting in a closed expression, or closure. [1] The concept was later adopted by Sussman and Steele in Scheme, a lexically scoped variant of LISP, [2] and the rest is history.

In plain English, the definition of closure consists of two important concepts:

  1. A closure consists of a function or reference to a function.
  2. A closure also consists of the referencing environment, also known as the function’s lexical scope.

Although the definition of closure varies slightly between programming languages, this is the accepted general definition for languages that employ first-class functions, in other words languages that treat functions as first-class citizens.

A language that supports passing functions as arguments to other functions, returning them from other functions, and assigning them to variables or storing them in data structures is said to employ first-class functions. [3] Some examples of programming languages that fit this description are Haskell, Ruby, Python and JavaScript.
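To make those abilities concrete, here is a minimal Ruby sketch. The names apply_twice, make_adder and operations are made up for illustration and do not come from any library.

```ruby
# A function (lambda) assigned to a variable.
square = ->(x) { x * x }

# A method that takes a function as an argument and calls it twice.
def apply_twice(fn, value)
  fn.call(fn.call(value))
end

# A method that returns a brand new function.
def make_adder(n)
  ->(x) { x + n }
end

# Functions stored in a data structure.
operations = { square: square, add_three: make_adder(3) }

puts apply_twice(square, 3)          # prints 81
puts operations[:add_three].call(4)  # prints 7
```

Note that the lambda returned by make_adder still remembers n after make_adder has returned, which is already a small closure.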

What does this all mean, really?

All Ruby code blocks are really closures. What?? I know, it sounded strange to me too. Let’s get into examples.

Consider the following Ruby code snippet:

str = "Hello"
5.times do
  str2 = "world!"
  puts "#{str} #{str2}"
end

Output:

Hello world!
Hello world!
Hello world!
Hello world!
Hello world!

Nothing too groundbreaking here since we are just exploring Ruby code blocks and how they act as closures.

We see that the code within the block between the do and end was somehow aware of the value of the local variable outside the scope of the block, str. The opposite is not true; if we try to show str2 outside the block, it is an undefined local variable.
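Here is a small sketch of that second point. The helper name read_block_local_outside is hypothetical; it simply wraps the experiment in a method so the error can be caught and shown instead of crashing the script.

```ruby
# Tries to read a block-local variable from outside the block and
# returns the resulting error message.
def read_block_local_outside
  str = "Hello"
  1.times do
    str2 = "world!"        # str2 is created inside the block
    puts "#{str} #{str2}"  # str from the enclosing scope is visible here
  end
  str2                     # out here, str2 does not exist
rescue NameError => e
  e.message
end

puts read_block_local_outside
```

The exact wording of the message differs between Ruby versions, but it should complain about an undefined local variable or method named str2.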

Why does this happen? It is because a Ruby block holds a reference to the environment in which it was created, that is, the local variables of the enclosing scope. In the most common version of Ruby (known interchangeably as Matz’s Ruby Interpreter, Ruby MRI or CRuby), each call to the block gets its own stack frame, and that frame carries an environment pointer back to the frame where the block was defined. Note that this is a live link to the variables themselves rather than a copy of their values at the time the block was created.

Let’s see what happens if we change str inside the block.

str = "Hello"
5.times do
  str2 = "world!"
  puts "#{str} #{str2}"
  str = "Just kidding, goodbye"
end

Output:

Hello world!
Just kidding, goodbye world!
Just kidding, goodbye world!
Just kidding, goodbye world!
Just kidding, goodbye world!

We see that the first iteration prints the original value of the local variable str, but every later iteration prints the new one. The block does not work on a private copy of str: the assignment inside the block changes the very same variable that lives in the enclosing scope, so the change is visible to all subsequent iterations, and to any code that reads str after the block finishes.
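One quick way to check what the block is really working with is to read str after the block has finished. This is only a sketch, and the method name greeting_after_block is made up for the example.

```ruby
# Reassigns str inside a block, then returns str as seen by the
# enclosing scope once the block is done.
def greeting_after_block
  str = "Hello"
  2.times do
    str = "Just kidding, goodbye"  # assigns to the outer str
  end
  str
end

puts greeting_after_block  # prints "Just kidding, goodbye"
```

The outer scope sees the new value, so the block and the enclosing code are sharing one and the same variable.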

Stack and heap memory

It sounds a bit complicated, but we can better visualize this if we learn about the concept of heap and stack memory, both of which live in the computer’s RAM (Random Access Memory). Programs use RAM to hold the values that their variables refer to. A variable is nothing more than a pointer to a specific address in memory that contains the corresponding value.

Stack memory is used for running your program, and from what I understand, each stack frame corresponds to a single function, method or block call that is currently in progress, and holds that call’s local variables and bookkeeping. In the case of Ruby MRI, it is the underlying C code that manages the stack frames. Stack memory is fast but limited in size, so it is liable to overflow, for example if the program goes into infinite recursion. Stack memory is nice in that as soon as a particular function call is done, its frame is popped and its local variables disappear, freeing up the space automatically.
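To see the stack run out in practice, here is a minimal sketch that recurses with no base case and rescues the SystemStackError Ruby raises. The method names are hypothetical, and the depth reached depends on your Ruby version and settings, so no particular number should be expected.

```ruby
# Calls itself forever, recording the deepest level reached so far.
def recurse_forever(depth, deepest)
  deepest[0] = depth
  recurse_forever(depth + 1, deepest)
end

# Returns roughly how many nested calls fit before the stack overflowed.
def depth_at_overflow
  deepest = [0]
  recurse_forever(1, deepest)
rescue SystemStackError
  deepest[0]
end

puts "Stack overflowed after about #{depth_at_overflow} nested calls"
```

Every nested call pushes one more frame onto the stack, and since none of them ever returns, nothing is popped until Ruby gives up.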

Heap memory on the other hand is used to allocate longer-lived blocks of data that must outlive the current stack frame. It also matters for multithreaded applications, where several threads need to access the same data: each thread has its own stack, but all threads share one heap. Excessive fragmentation is a common danger for heap memory, since it makes it hard to find a large enough contiguous block for new allocations. Luckily in garbage collection (GC) enabled languages such as Ruby, data on the heap persists only as long as something still refers to it; once the last reference disappears (for example when a block ends and nothing else holds on to the data), the garbage collector is free to reclaim it. In languages without GC, the data persists until the programmer expressly frees it.

It is these details, as well as many other great reasons, that make C and C++ an asset to learn since many high level modern programming languages tend to abstract away these vital computer science concepts.

Bringing it home

As we saw in the above code examples, Ruby blocks refer to a specific environment (the lexical scope) as well as the functions and variables contained therein. This fulfills the requirements of closure as defined above.

Since Procs and lambdas in Ruby are objects that wrap blocks, these also act as closures. Closures, as it turns out, are everywhere and underpin some of the most important concepts in programming.
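As a small illustration, here is a sketch of a counter built from a lambda and a proc. The method name make_counter is made up for the example. The point is that count keeps living after make_counter has returned, because the two closures still refer to it.

```ruby
# Returns two closures that share the same local variable, count.
def make_counter
  count = 0
  increment = -> { count += 1 }  # a lambda that changes count
  current   = proc { count }     # a proc that reads count
  [increment, current]
end

increment, current = make_counter
increment.call
increment.call
puts current.call        # prints 2

# A second call to make_counter gets its own, separate count.
_, other_current = make_counter
puts other_current.call  # prints 0
```

Each call to make_counter creates a fresh environment, so the two counters never interfere with each other.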

In JavaScript, a callback is a function passed as an argument to another function, to be called later. Callback functions are really, you guessed it, closures. They contain a function and the lexical scope of said function at the time of its creation, including every enclosing scope up the chain.

Something to watch out for in JavaScript is the fact that creating closures unnecessarily can negatively impact processing speed and memory consumption. [4] For this reason it is advisable to define methods on an object’s prototype, so that they are not recreated every time an object is constructed.

I love learning new (to me) underlying concepts and this fascinating concept of closure has many facets. If you spot any glaring errors or wish to give some feedback, please feel free to shoot me an email.

It's OK To Be Wrong

Happy new year! Welcome to 2016.

Welcome to the world of tomorrow

As I was coming up with a cool little blog topic to start the year, I kept stopping and scrapping this and that idea. The reason I was experiencing difficulty formulating my thoughts properly was not a lack of material. The real cause for my hangup was that I felt I needed to be original, non-trivial and most of all, I just had to be correct.

This last thing was fatal, I now realize. It prevented me from making any sort of rational investigation. I needed to permit myself the necessary latitude to try something new and fail. As the adage goes, “Nothing ventured, nothing gained.” Once I got around this hangup, I could properly begin to write again.

I feel that this need to always be right is a very dangerous tendency. It takes courage to make something new. It takes more courage to admit that something you made or thought was wrong and correct that mistake.

Luckily the scientific method makes it natural to revise opinions based on new evidence. As programmers, developers, engineers, we are nothing if not scientists, and we should always be using the scientific method. Always.

It’s OK to be wrong sometimes

As an example of this, I recently came across a blog post written a couple of years ago that stated that JavaScript is a bubble. Gritting my teeth, I proceeded to read the article and then the predictably contentious comments.

The article was of course wrong. Reports of JavaScript’s demise had been greatly exaggerated. What I found interesting was that the article also linked to a follow-up by the author which offered a retraction. I thought this was cool, because it showed that the author was human, and that they were capable of recognizing their errors.

I imagine there are many blog posts, articles and lots of other types of original content that probably never see publication because the author was fearful of not being somehow 100% right. My feeling is that for personal blog posts, we should be embracing the unknown and have the courage to expose our own ignorance. I would hope to be corrected in a kind but firm manner if and when I make a technical mistake or have an incorrect concept. It’s what makes feedback a valuable gift, and I would hope to do the same for any colleague or fellow blogger. It’s the only way we can learn and grow.

Here’s to making lots of mistakes in the new year. Cheers!

A Word on Google Analytics

I spent a nice couple of weeks on holiday in Texas visiting Austin, San Antonio, and Del Rio. In between catching up, eating, and general merrymaking, I learned a neat little trick to better interpret and protect Google Analytics (GA) data and traffic.

Most of the time you want your site to only record outside hits, in other words traffic that is not from your own (possibly numerous) visits. Naturally this applies to businesses as well as blogs. Secondly you also want to make sure your site is the only one actually generating the hits to your tracking code.

The solution to both of these issues is to set up custom filters [1] in GA.

Exclude internal traffic [2]

  1. Find your IP address. (Google “My IP” for your public IP address).
  2. Log into your GA dashboard.
  3. Select the Admin tab and select the account you wish to manage.
  4. Under the Account column, select “All Filters”.
  5. Press the “+ Add Filter” button.
  6. Name your new filter “Own IP” or something similar.
  7. Filter type should be left as “Predefined”.
  8. Select filter type “Exclude” from the dropdown.
  9. From the “Select source or destination” dropdown menu, select “traffic from the IP addresses”.
  10. From the “Select expression” dropdown, select “that are equal to”.
  11. Finally, enter your IP address and save.

If you have a larger company or website, you would naturally want to set up a domain-based filter instead. [3]

Include only real traffic

A second type of filter that may be advisable in order to prevent possible tomfoolery is one that makes sure only actual traffic from your site is being recorded.

It is well known that you can easily find and read the GA tracking code using the developer console of any modern browser, unless, you know, it is hidden as an environment variable. Although there is little real danger from having this code publicly available, it is possible that some Internet ne’er-do-wells may use it to send fake traffic your way. It sounds outlandish, but this fake traffic could skew your reporting and possibly induce the webmaster (i.e. you) to investigate the sites that are generating the hits (a very poor advertising tactic indeed). [4]

I actually saw this type of activity on a previous blog of mine, mostly from Russian and Chinese sites, so there you have it.

To make sure your traffic data is clean and accurate, simply create a new custom filter that only accepts hits from your desired domain.

  1. Again, from Admin on the GA dashboard, select your site.
  2. Click “+ New Filter”.
  3. Name it “Defence against the Dark Arts” or something similar.
  4. Select “Custom filter”.
  5. Select “Include”.
  6. Under the “Filter Field” dropdown, select “Hostname”.
  7. Enter your desired domain name. Use a backslash to escape special characters, like so: nchristiny\.github\.io
  8. Save your awesome new filter.
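If you are wondering why the backslashes matter, here is a rough Ruby sketch. It does not talk to GA at all; it only shows how an unescaped dot in a pattern matches any character. The helper hostname_match? is made up for the example, and anchoring the pattern to the whole string is my own choice for the demonstration, not necessarily how GA applies its filters.

```ruby
# The hostname from the step above, escaped for use as a pattern.
hostname = "nchristiny.github.io"
escaped  = Regexp.escape(hostname)
puts escaped  # prints nchristiny\.github\.io

# Hypothetical helper: does the whole candidate match the pattern?
def hostname_match?(pattern_source, candidate)
  Regexp.new("\\A#{pattern_source}\\z").match?(candidate)
end

# Unescaped, "." matches any character, so a lookalike slips through.
puts hostname_match?(hostname, "nchristinyXgithub.io")  # prints true

# Escaped, only a literal dot is accepted.
puts hostname_match?(escaped, "nchristinyXgithub.io")   # prints false
puts hostname_match?(escaped, "nchristiny.github.io")   # prints true
```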

Enjoy your clean GA data.

As always, any feedback or questions are most welcome. Feel free to contact me via email. Thanks for reading.