Thursday, December 29, 2011

JavaScript: Closures and Dynamic typing

"JS had to look like Java only less so, it had to be Java’s dumb kid brother or boy-hostage sidekick. Plus, I had to be done in ten days or something worse than JS would have happened"
-Brendan Eich (creator of JavaScript)

Javascript has been here for ages. Working on the background elevating our browser experience, since the time of memorial... and now suddenly it has been pushed to the central stage? What is going on? And why should we care?

A short answer to the first of the question; "why now?" that I can think of are Ajax and the Dead of Flash (yes, it's true, Adobe told me so!). Recently web has become more mobile and web design more light-weigh & responsive. Javascript and HTML5 were the ones taking over the challenge and quite successfully at that.

Once the first question has been answered, the second should become obvious; Javascript and HTML5 are here to rule the web front-end, before we find something better, and we as developers must lear the ways of our new masters.

In terms of Javascript, I would like to focus only couple of it's treats: Closures and Dynamic Typing, mainly because neither one of these are available currently in Java ;(

Closures in Javascript.

What are closures and how do they work?

Closures are a functional programming concept. In the language where they are available, they can be used drastically simplify complex operations. However everything that can be achieved with closure in Javascript, can be achieved without them. This applies for any language, which satisfy the prove of Turing completeness[1]

I think best to describe a closure is with an example:

var outerValue = 'OuterValue';
var later;

function innerMethod() { 
     var innerValue = 'InnerValue';
     
     function innerMethod() { 
          assert(outerValue,"OuterValue accessible.");
          assert(innerValue,"InnerValue accessible.");
     }

     inner = innerMethod;
} 

innerMethod();
inner();

When executed this will print out:

OuterValue accessible.
InnerValue accessible.

Now the question, you should be asking yourself is: "How can the inner function access the InnerValue?" (OuterValue is defined in the global scope, so no surprises there).
Here's where the Closure step in to the picture; The closure, inner, had visibility to all variable to the scope at the time of the declaration.

This is what Closures are, mostly they are used in callback functions as any JQuery or NodeJs developer knows.

Here's another example using a simple callback function:

var element$ = jQuery("div");
element$.html("Content Loading...");

jQuery.ajax({
     url: "index.html", 
     success: function(html){
         element$.html(html);
     } 
});

Within the jQuery .ajax() method, the anonymous function serves as the response callback. Within this callback, there is a reference to the element$ via the closure, which then updates the content of the div. Presence of the closure makes the functionality short and simple.

Dynamic Typing in Javascript.


The blessing and a curse of JS.

Dynamically type languages usually have:
  • Bad IDEs: no autocomplete, no type for a function parameter
  • No method overload
  • Impossible to refactor
However:
  • You don’t have to initialize variables
  • You write less code

UI development often demands better productivity and high flexibility, so dynamic typing is often preferred. And here's something you cannot do in static type language (as simply). Code is adding behavior to the object instance after creation.


empty_object = new Object();
   empty_object.color = function() { return 'red'}
   alert('Color: ' + empty_object.color())

Summary:

Here's but a tiny flash of JavaScript's great flexibility and power of the higher-order language. In the UI layer, it's here to stay for a while, so we might as well, dive in and get familiar with it.


References:

Saturday, October 8, 2011

No match for Scala

A computer once beat me at chess, but it was no match for me at kick boxing.
-Emo Philips

For fun and profit, I was drafting an extreme-startup[1] Dojo challenge to my company; Houston-Inc[2]. I started up by downloading the project from GitHub[3], installed a Ruby1.9 with various Gems using MacPorts and finally started up the server.
The server kept crashing under use however, so I applied couple of quick patches with my rudimentary Ruby skills and contacted the developer (Johannes Brodwall). Johannes pushed a proper fix to the GitHub, but a second later ;) Probably I had been running the challenge in a different way thus exposing the bug. Anyway... back to the issue with Scala.

I wanted a powerful language and a build tool with quick turn around time. Java with JRebel might have been sufficient, but Java unfortunately has a clumsy RegEX functionality and JRebel is a commercial product... not ideal for a coding Dojo.

Scala with SBT[4] was a better choice, though optimally (should I have the skill) I could have taken a language with literal RegEX, such as Ruby or Perl. Final item on my technology stack was the a Web-Framework, for which I chose light-weight Scalatra[5]; a simple Sinatra inspired framework, since Lift[6] would have been shooting mosquito with a cannon.

Finally with problem; After a while of answering the server's challenges the build terminated with an exception: "java.lang.Error: ch.epfl.lamp.fjbg.JCode$OffsetTooBigException: offset too big to fit in 16 bits: 46686" The final code block that caused the exception was the one below:

Code with too much matching.


def parseParamValue(value: String): String = value match {
      case TheBondQuestion() => { "Sean Connery" }
      case EuropesoQuestion() => { "peseta" }
      case BananaQuestion() => { "yellow" }
      case TwitterQuestion() => { "jhannes" }
      case PlusCalculation(first, second) => { (first.toInt + second.toInt).toString }
      case Multiplication(first, second) => { (first.toInt * second.toInt).toString }
      case LargestInAList(list) => { list.split(", ").map(_.toInt).max.toString }
      case SquareAndCube(list) => {
        val numbers = list.split(", ").map(_.toInt)
        val result = for (x <- numbers; if isSquareOrCube(x)) yield x

        if (result.isEmpty) ""
        else result.head.toString
      }
      case DetectPrimes(list) => {
        val numbers = list.split(", ").map(_.toInt)
        val result = for (x <- numbers; if isAPrime(x)) yield x

        if (result.isEmpty) ""
        else result.map(_.toString).reduceLeft(_ + ", " + _).toString
      }
      case _ => "default"
  }

There is nothing wrong with code per say, but there is a problem with Scala Compiler and Java JVM. Bottom line is; that it's a bug[7] in compiler and it won't be fixed anytime soon ;) The cause of the bug is the overwhelming size of the methods and the solution to drop the size under the maximum method size, like so:

Quick and Dirty fix.


def parseParamValue(value: String): String = value match {
      case PlusCalculation(first, second) => { (first.toInt + second.toInt).toString }
      case Multiplication(first, second) => { (first.toInt * second.toInt).toString }
      case LargestInAList(list) => { list.split(", ").map(_.toInt).max.toString }
      case SquareAndCube(list) => {
        val numbers = list.split(", ").map(_.toInt)
        val result = for (x <- numbers; if isSquareOrCube(x)) yield x

        if (result.isEmpty) ""
        else result.head.toString
      }
      case DetectPrimes(list) => {
        val numbers = list.split(", ").map(_.toInt)
        val result = for (x <- numbers; if isAPrime(x)) yield x

        if (result.isEmpty) ""
        else result.map(_.toString).reduceLeft(_ + ", " + _).toString
      }
      case _ => value match {
        case TheBondQuestion() => { "Sean Connery" }
        case EuropesoQuestion() => { "peseta" }
        case BananaQuestion() => { "yellow" }
        case TwitterQuestion() => { "jhannes" }
        case _ => { "default" }
      }
  }

References:

Sunday, August 14, 2011

Scala - Can you read this?

"Eschew obfuscation, espouse elucidation"
-Fumblerule used by academics.

A month ago our company executed it's first Scala coding Dojo[1]. As excepted; a dozen of Java programmers, given a hybrid language, did produce a Java style mutable solution and were *damn* proud of their creation. After the Dojo, I teamed up with a colleague and sculpted a real function style solution, with a tail recursion instead of iteration and with zero mutable variables. The central part of the algorithm is presented below.

def countScoresFromRounds.

def countScoresFromRounds(throws: List[Int]): Int = throws match {
   case first :: second :: 10 :: rest =>
   (2 * first) + second + countScoresFromRounds(throws.tail)
   case first :: second :: third :: fourth :: rest if third + fourth == 10 =>
   first + (2 * second) + countScoresFromRounds(third :: fourth :: rest)
   case first :: second :: rest => first + second + countScoresFromRounds(rest)
   case listTail :List[Int] => listTail.foldLeft(0)(_ + _)
}

A first comment from my peers? "I was afraid to change anything, cause I couldn't understand what it does." As an evangelista of the Clean Code[2], I should consider this as a worst possible kind of personal insult. Or perhaps Scala is just not readable?

As I sat on this for a while, I came to the conclusion that Scala really is not readable... if you are looking at it through a pair of Java goggles that is.

The mental leap.

Patter matcher or a tail recursion are bread and butter of any self respecting university Java course, but both of them together are exotic catering indeed. An iteratively implementation with mutable objects means you can think the algorithm progressively, step by step. In a function/recursive style you have to skip ahead and think of the executing from the end down to the beginning. Once you are familiar with the pattern, understanding the method functionality becomes easy.

For example with an integer list argument [10, 10, 5, 5, 10, 8, 10] the execution of the method becomes as follows:

Other side of the coin.

So the major part of Scala readability is to understand the functional patterns. However even the scala community is slowly starting to demystify some of the subtleties of the lambda calculus. Take the array sum function for example, which one would you consider more readable?

Scala 2.7 and before
val sum = Array(1,2,3,4).reduceLeft(_+_)
Scala 2.8
val sum = Array(1,2,3,4).sum

The principle of clean code still applies. Always choose the readability over obfuscated oneliners. The latter might win you a perl competition, but the former makes you friends.

Final thoughts.

To make Scala readable, you need to compromise a little.
1. Learn to recognize the function programming style patterns such as: higher-order- and pure functions as well as recursion from the code.
2. Avoid of non-descriptive scala idioms and operator-overloading-mystery-syntax, where possible and further describe the code functionality with help of concise method- and value naming.

// looks cool, but not readable
(0 /: listOfInts) {_+_}

References:

Sunday, July 17, 2011

Getting exited about google+

"Noli turbare circulos meos." (Do not disturb my circles)
-The last words attributed to Archimedes.

Few weeks ago, I was a part of the first wave charging the exclusive club of Google+.
It took four "invites" to get me inside of the gates, two of which, I received by begging on Twitter =D
Thank you: Victor Bello and Michael Alden.

To summarize my "Why I love Google+" tantrum in just a four points:

    You write an status update and...
  • If you address it to "Public," it's a blog post.
  • If you address it to "Your Circles" it's a tweet.
  • If you address it to "My Company" it's a newsletter.
  • If you address it to a single person, it's an email.
... pretty revolutionary. And this is without mentioning all the other services of the Google cloud.

To relieve myself of the Facebook package (which I am not naive enough to image is going to be completed anytime soon) I extracted the Pictures, Contacts and Events from the clutches of the Social Giant with the methods described below.

*Disclaimer* the Facebook TOS includes a section "You will not collect users’ content or information, or otherwise access Facebook, using automated means (such as harvesting bots, robots, spiders, or scrapers) without our permission." I am not sure, how the following methods agree with this policy, I'll be looking for their contact atm. use 'em with your own regressions.


Pictures

For extracting picture from Facebook, I used a free online application from picknzip, which packages all the picture from your FB account to a nice zip-file your convenience.

Contacts

Apparently Facebook has implemented a throttling mechanism that, if you visit your ~5 friends in a short period of time, it will remove the email field. So the extraction mechanism has to be slowed down some what to circle around this security measure feature.

Here is a simple python Script (using Mechanize and Facebook Graph API) to extract e-mail addresses from your friends accounts. 60-second delay has been added between the profile scrapes. A shorter delay might do, but I since this was a one time operation, I rather take no chances here (the whole scrape of 300 friends took around 3-hours)

#!/usr/bin/python

import sys
import re
import json
import csv
import time
from mechanize import Browser

LOGIN_ADDRESS = "http://www.facebook.com/"
FORM_NAME = "email"
FORM_PASSWD = "pass"

def getEmails():
    
    if len(sys.argv) != 3:
        return usage()

    br = Browser()
    br.set_handle_robots(False)
    br.open(LOGIN_ADDRESS)
    br.select_form(nr=0)
    br[FORM_NAME] = sys.argv[1]
    br[FORM_PASSWD] = sys.argv[2]

    response = br.submit()

    if "Redirecting" in br.title():
        response = br.follow_link(text_regex='click here')

    response = br.open('http://developers.facebook.com/docs/reference/api')
    conft = response.read()
    mat = re.search('access_token=(.*?)"', conft)
    acct = mat.group(1)
 
    response = br.open('https://graph.facebook.com/me/friends?access_token=%s' % acct)
    eventRes = response.read()
    jdata = json.loads(eventRes)
    
    fbwriter = csv.writer(open('%s.csv' % sys.argv[1], 'ab'), delimiter=',', quotechar='"', quoting=csv.QUOTE_MINIMAL)
 
    for acc in jdata['data']:
        time.sleep(60)
        fid = acc['id']
        fname = acc['name']
 
        res = br.open('http://m.facebook.com/profile.php?id=%s&v=info&refid=17' % fid)
        xma = re.search('mailto:(.*?)"', res.read())
        if xma:
 
            email = xma.group(1).replace('@', '@')
 
            try:
                print fname, email
            except:
                print repr(fname), repr(email)
 
            try:
                fbwriter.writerow([fname, email])
            except:
                fbwriter.writerow([repr(fname), repr(email)])
    
def usage():
    print 'Usage: ' + sys.argv[0] + ' user@domainname password'
    return 1

if __name__ == "__main__":
    sys.exit(getEmails())



Events

Instead of extracting the Facebook events, I merely published 'em as a Google calendar, which ensures that my poor friends left behind can still envite for a Salsa-raves, using Facebook.
Instructions how to achive this can be found here.


As a bonus

I added an visible announcement to my friends at FB, that I have moved on a greener grass by changing my Facebook profile picture to:


See you at the circles!

Sunday, July 10, 2011

And Bloom Goes The Dynamite

Another month, another code Kata. This time, Dave [1] is challenging the programming community with Bloom Filters [2]. A bloom filter is a simple hash-based lookup mechanism, introduced as early as the 70s, but they still used by the industry today eg. BigTable [3] to reduce the costly disk lookups, which considerably increases the performance of a database query operations.

I am still in the midst of wrapping my head around functional programming paradigm, so naturally my weapon of choice for the task, is yet again, Scalable language I.e SCALA.


The BloomFilter class.

The BloomFilter class contains data-storage for the hash-code array and the functions of adding- and checking of existence of a word in the storage.

I consider one central criteria of determining the quality of a functional programming style, the absence of mutable variables I.e var's. However in some cases (like a function scope variable with no side effects), they shouldn't be avoided. In this case, I chose clarity over recursion and higher order functions.

package com.panda.kata5

import java.security.MessageDigest
import collection.mutable.BitSet

class BloomFilter(val numberOfHashes: Int, val numberOfBits: Int) {
  val hasher = new HashGenerator(numberOfHashes, numberOfBits)
  val bitstore = new BitSet()

  def contains(word: String): Boolean = {
    var found = true;
    hasher.generateEach(word) {
      s =>
        found = found && bitstore(s)
    }
    return found;
  }

  def add(word: String): BloomFilter = {
    hasher.generateEach(word.trim){
       s =>
         bitstore += s
    }
    return this
  }
}

The BloomFilter class.

The BloomFilter object is a singleton class for building the BloomFilter from a word-list file.

object BloomFilter {

  def readFromFile(fileName: String, numberOfHashes: Int, numberOfBits: Int): BloomFilter = {
    val filter = new BloomFilter(numberOfHashes, numberOfBits)
    scala.io.Source.fromFile(fileName).getLines.foreach {
      line =>
        filter.add(line)
    }
    return filter
  }
}

The hash generator class.

The hash generator class generates hash values for the individual words.
I am using the SHA hash function [4], which is part of java.security standart library, to parse out the hash-bits to the storage. This poses some limitations for the number of hashes for sure, but this is a just a simple algorithm exercise, so no harm done...


class HashGenerator(val numberOfHashes: Int, val numberOfBits: Int) {
  val sha = MessageDigest.getInstance("SHA")

  def generateEach(value: String)(f: Int => Unit) {

    def toInt(in: Byte): Option[Int] =
    try {
      Some(in.toInt)
    } catch {
      case e: NumberFormatException => None
    }

    def byteArrayToIntSum(bytes: Seq[Byte]): Int = {
      val ints = bytes.flatMap(b => toInt(b))
      ints.foldLeft(0)((a, b) => a + b)
    }

    val shaDigest = sha.digest(value.map(_.toByte).toArray)
    val increment = shaDigest.length / numberOfHashes
    val countTo = (numberOfHashes * increment) - increment

    for (x <- 0 to countTo by increment) {
      f(( byteArrayToIntSum(shaDigest.slice(x, x + increment)) % numberOfBits).abs)
    }

  }
}

Functional tests.

Couple of tests to verify the basic functionality.

package com.panda.kata5

import org.scalatest.FunSuite
import org.scalatest.matchers.ShouldMatchers

class BoomFilterTest extends FunSuite with ShouldMatchers {

  test("Should return true for each 5 words tested") {
    val boom = BloomFilter.readFromFile("src/resources/wordlist.txt", 15, 15)

    boom.contains("abased") should equal (true);
    boom.contains("Abraham") should equal (true);
    boom.contains("investigating") should equal (true);
    boom.contains("slicing") should equal (true);
    boom.contains("unalterably") should equal (true);

  }

  test("Should generate 5 hash values to String DIE-WEATON-DIE") {
    val boomhash = new HashGenerator(5, 150)
    var hashes = List[Int]()

    boomhash.generateEach("DIE-WEATON-DIE")(
      s =>
      hashes = (s :: hashes)
    )
    hashes.length should be(5)
  }
}

Retrospective: Sometimes scala will cause your grey hairs with the inconsistencies related to dot and brackets. List :: i (i is an Int) is a quite a different matter, than i :: List. Also, I hit the flow-stopper when converting the ByteArray to an Int. I ended up wrapping 'em in to functions, without exploring possible function chaining. Unfortunate procedural paradigm approach... room for improvement, but that's what coding Katas are all about.

I spent two thrilling, company [5] paid, days of Scala-brain-training merged in the functional world, after which returning to Java feels like a dull swim back to the Object Orientated shores =P Still I feel like something positive in the Scala world has rubbed on to me along the way.

  1. My Java variables get declared final (at first).
  2. Object with multiple constructors get a builder pattern [6].
  3. Google collections [7] and LambdaJ [8] are part my standart imported jar-libraries
  4. ...

References: