SAW Development Blog: 2012

22.11.12

HTTP DELETE failing with 411

In my configuration I'm using the fabulous node-based http-proxy to multiplex HTTP and WebSocket traffic from a wildcard domain (*.saw.sonyx.net, pointing to a host with a public IP) to the VMs hosting application instances (nginx + passenger).
Recently, to my terror, I realized that my application fails to delete resources. Persistence in SAW is provided by RESTful Backbone models and collections, which in turn talk to the server through jQuery.
A quick debugging session with the developer console showed that my app was getting HTTP status 411 in response to the DELETE request:

To my surprise the same call issued directly to the web server succeeded without any problem. Apparently the issue had to be connected with the proxy itself.
HTTP status 411 (Length Required) simply says that the Content-Length header is missing from the request, which is in fact the case in the example above. It is not clear to me why a DELETE request should have a payload, but that's another story. The general advice for this issue is to:

add Content-Length to the header - in the general case a perfect idea, but a quick jQuery hacking experiment showed that the browser (Chrome in my case) refuses to set Content-Length. agrrr.
use the nginx chunking module - which should make nginx happy with chunked data streamed from the browser (nginx doesn't support chunked requests itself). This didn't work for me, as there is very little that can be "streamed" for a DELETE request.

The third alternative is to add the missing header in http-proxy. Thanks to the expertise of my colleague Daniele, we managed to pinpoint the function that serves incoming requests and hack it accordingly. At the end of the day it actually works :)
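
The hack itself is not reproduced in this post, so the following is only a minimal sketch of the idea, assuming a Node-style request object with a method and lower-cased header names. The helper name ensureContentLength is hypothetical, and so is the place where it would be called inside http-proxy; it is not part of the proxy's own API.

```javascript
// Hypothetical helper (not part of http-proxy): give body-less DELETE
// requests an explicit zero length so the backend does not answer 411.
function ensureContentLength(req) {
  var headers = req.headers || (req.headers = {});
  var hasLength = headers['content-length'] !== undefined;
  var isChunked = headers['transfer-encoding'] !== undefined;

  // Only touch DELETE requests that declare neither a length nor chunking.
  if (req.method === 'DELETE' && !hasLength && !isChunked) {
    headers['content-length'] = '0';
  }
  return req;
}

// Usage with a plain object standing in for an incoming request:
var patched = ensureContentLength({ method: 'DELETE', headers: {} });
console.log(patched.headers['content-length']); // prints 0
```

Requests that already carry a length, or that use chunked transfer encoding, pass through untouched, so the patch stays limited to the failing case.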

19.11.12

Chrome autoSave is dead, long live tincr!

The Chrome autosave extension was a huge boost to the quality of web-application development. It allowed editing code in the Chrome developer tools and then, with the help of a small node.js application, syncing the changes back to the source files.

Tincr takes it to the next level by providing synchronisation in the opposite direction as well. It can reload the assets you have touched without refreshing the complete browser session!

Due to my two-server configuration, in which my assets (JavaScript & CSS) are loaded from the node-static server and only the RoR-generated JSON comes from Ruby, tincr required a non-trivial configuration separating assets from generated content.

For testing purposes I have made a pet project that loads assets from localhost:4444

jQuery and HTTP acceptable content-type request header

As SAW turns into a single-page application, the user authentication process had to be adapted to XHR requests. The default behavior of devise, whenever the user is not authenticated, is to use HTTP status 302 to redirect the web browser to the authentication page. For a RESTful client application this is not desirable, as it confuses Backbone about where the particular resource can be found. The proper behavior is to return HTTP status 401, so that it can be caught and the client can decide how to proceed.

While implementing this I found that my back-end (devise & RoR) supports multiple content types for the response, and that it is happiest to deliver HTML (rather than JSON). To my surprise, specifying JSON as the desired content type in jQuery still resulted in a 302. Careful investigation showed that no matter what the desired content type is, jQuery always appends */* to the list of accepted types (through the allTypes variable). I can imagine that */* should serve as a fall-back for situations when the desired content type is not available, but in that case */* should carry a different weight (the q parameter). Specifying both in a single header line without weights makes the server treat the desired content type and the fall-back as equal.
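
To illustrate what a weighted fall-back would look like, here is a small sketch that builds such an Accept header value. The helper buildAcceptHeader and its parameters are hypothetical names made up for this example; this is not how jQuery assembles the header internally.

```javascript
// Hypothetical helper: builds an Accept header value in which the preferred
// type keeps the default weight (q=1) and the catch-all */* gets a lower one.
function buildAcceptHeader(preferredType, fallbackWeight) {
  if (!(fallbackWeight > 0 && fallbackWeight < 1)) {
    throw new RangeError('fallbackWeight must be between 0 and 1 (exclusive)');
  }
  return preferredType + ', */*; q=' + fallbackWeight;
}

console.log(buildAcceptHeader('application/json', 0.1));
// prints: application/json, */*; q=0.1
```

With a header like this, a server that honors q values would prefer JSON and only fall back to other types when JSON is not available.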

Indeed, removing "allTypes" fixed the problem. I'm now getting a 401, which can be caught globally to save the requested URL in localStorage and perform the right redirection:

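jQuery(function() {
  // As of jQuery 1.8, global Ajax handlers should be attached to document
  jQuery(document).ajaxError(function(event, request, settings) {
    if (request.status === 401) {
      // remember where the user was, so it can be restored after sign-in
      localStorage.setItem('SAWurl', window.location.href);
      alert("You are not logged in!");
      window.location.href = "/users/sign_in";
    }
  });
});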

I would be eager to hear your opinion on whether this is a bug or a feature of jQuery.
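
The other half of the redirection, going back to the saved URL after a successful sign-in, is not shown above. A minimal sketch could look like the following, assuming the same 'SAWurl' key; the helper name takeSavedUrl is hypothetical, and the storage object is passed in so the function can also be tried outside the browser.

```javascript
// Hypothetical helper: reads the URL stored under 'SAWurl' (the key used in
// the handler above), removes it, and returns it, or null if nothing is stored.
function takeSavedUrl(storage) {
  var url = storage.getItem('SAWurl');
  if (url !== null) {
    storage.removeItem('SAWurl');
  }
  return url;
}

// In the browser, after a successful sign-in, this could be used as:
//   var target = takeSavedUrl(window.localStorage);
//   if (target) { window.location.href = target; }
```

Removing the entry once it has been read keeps a stale URL from triggering another redirect on a later visit.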

GIT submodules overkill

Git has this fabulous feature of (sub-)modularization of repositories. Over more than two years of development SAW has grown considerably in terms of sub-modules. Sub-modularization is beneficial mainly because it makes it easy to pull updates without touching the source code of the system under implementation.

The practice of managing sub-modules that I found particularly successful is to always fork the official github repositories and refer to the forks as sub-modules.

This technique allows a clear separation from the official repository, as well as the customization that is sometimes necessary.

The gotcha of sub-modularization is that libraries often come in a half-baked condition that requires some degree of building (with make, grunt, or ant). For my purposes I invested in one central "deploy" bash script that:

performs pull from central repository,
performs submodule initialization and updating
builds libraries (in particular jquery and jquery-ui)
compiles JavaScript assets (with jammit)
instructs RoR application server (passenger) to reload application sources

Full source of the script can be found here. The screenshot was made with SourceTree, a fabulous OSX GUI for git (and not only git).

6.11.12

Ruby on Rails - hack of the day

I'm developing a fairly massive JavaScript application interacting with a Ruby on Rails (RoR) back-end. As the number of my JavaScript files grew into the hundreds, I noticed that serving them to the browser (after a full-page refresh) using the single-threaded RoR server takes forever.
There is an easy way around this issue. One can configure the RoR development environment (config/environments/development.rb) to use an asset server (config.action_controller.asset_host) and point it to an instance of some efficient static HTTP server such as node-static, configured on another port number (3030 or so). An example of the configuration can be found in the RoR production environment setup (config/environments/production.rb).
In my case the environment configuration line looks like:
config.action_controller.asset_host = "http://localhost:3030"

30.10.12

Cache is your enemy

Browser cache is a phenomenal feature that reduces HTTP traffic by 2 or 3 orders of magnitude, but during development it can cause serious trouble. I already mentioned some caching problems in my previous post. This time I spent most of the day hacking on some nasty HTTP handshake problems, just to figure out that Chrome was not properly refreshing my jQuery source. I arrived at the point where V8 had two different versions of the same library in memory (sic!). I have no idea how this is possible.
Apparently the so-called "incognito" mode doesn't really solve the problem. The solution that worked for me was to disable the cache in the Chrome developer tools.

Now that this works I can proceed with doing something more useful.

26.10.12

REST chattiness

Over the last few days I have tried out a very fine-grained implementation of graph-data access over REST. As a result, the communication between my client (backbone+marionette) and server (nginx+passenger+RoR) turned out to be rather chatty, which is a potential performance threat. Here's what my Chrome reported about it:

I think that handling over 110 requests in about 3 seconds sounds pretty good, especially taking into account that rendering happens after fetching the list of issues related to the project, about 500ms after the start.

The tested set-up is as follows:

Server-side: ubuntu 11.04 VM running inside VirtualBox

4 cores of core 2 (6Mbyte of cache)
1GB of ram
8 nginx worker processes
100Mbit wired ethernet

Proxy - CentOS VM running on some infrastructure

node.js based http-proxy forwarding http and web-sockets
single core, 1GB ram, 100 or 1000Mbit ethernet

Client-side - OSX with Google Chrome (24.0 - Canary)

8 Cores of i7
8GB of ram (ton of applications running)
100Mbit, wired ethernet

I don't know about you, but this sounds fairly good to me.

19.10.12

RESTful communication for web applications

In my previous post I rambled about business logic in web applications. This topic surfaced because, during my implementation, I came across a problem with delivering aggregated data structures to the client side of SAW.
The implementation of Backbone.Model encourages CRUD interaction with a RESTful server-side interface, and doesn't put constraints on the nature of the JSON data delivered. In particular it is happy to receive nested JSON trees. In my case the representation of the model was enriched with two lists of elements: those referenced by and those referring to the element in question, roughly as in the example below:
{
  "name": "Exercise 3",
  "id": "4faa69ef924ff86933000001",
  "type": "Project",
  "related_from": [
    {
      "_id": "4daff753798e1c6dec000027",
      "_type": "Taggable",
      "created_at": "2011-04-21T09:22:27+00:00",
      "name": "local bank logging information",
      "type": "Issue",
      "updated_at": "2012-05-10T14:35:40+00:00"
    },
    {
      "_id": "4fb0c54b924ff84e1a000002",
      "_type": "Taggable",
      "created_at": "2012-05-14T08:41:47+00:00",
      "name": "Banking Interface Adapter location",
      "type": "Issue",
      "updated_at": "2012-05-14T08:41:48+00:00"
    }
  ],
  "related_to": []
}

What seemed like a good idea that could save 2-3 HTTP calls ended up causing trouble when synchronizing the Backbone.Model back to the server after applying some changes to it. The server side could have filtered related_to and related_from out of the PUT request parameters, but a substantial amount of data unrelated to the update would still be traveling back and forth.
Another option would be to interpret a PUT/POST of these lists as valid operations on the related item lists. But this implies implementing quite some model-related logic on the server side. I decided against this and added simple, generic routes serving the aforementioned lists on separate requests:

(Ruby on Rails 3.1 routes.rb)

[...]
get 'r/:id/related_to' => 'tag#dotag'
get 'r/:id/related_from' => 'tag#untag'

get 'r/:id/related_to/:type' => 'tag#dotag'
get 'r/:id/related_from/:type' => 'tag#untag'

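# the literal dotag/untag routes have to precede the generic :attribute route,
# because Rails matches routes in the order in which they are defined
get 'r/:item_id/dotag' => 'tag#dotag'
get 'r/:item_id/untag' => 'tag#untag'

get 'r/:id/:attribute' => 'r#attribute'
put 'r/:id/:attribute' => 'r#setAttribute'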
resources :r
[...]

In general it looks to me like RESTful data access promotes small, atomic data structures with an explicit division between read-write resources and read-only aggregations. I would be eager to hear your opinion about it.
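
On the client side, fetching these lists separately boils down to building the right URL for the routes listed above. The following is only a sketch under that assumption; the helper name relatedListUrl and its parameters are hypothetical and not taken from the SAW code.

```javascript
// Hypothetical client-side helper: maps an item id, a direction and an
// optional type onto the 'r/:id/related_to(/:type)' style routes.
function relatedListUrl(id, direction, type) {
  if (direction !== 'related_to' && direction !== 'related_from') {
    throw new Error('direction must be "related_to" or "related_from"');
  }
  var url = '/r/' + encodeURIComponent(id) + '/' + direction;
  if (type) {
    // the type segment is optional and narrows the list to one kind of item
    url += '/' + encodeURIComponent(type);
  }
  return url;
}

console.log(relatedListUrl('4faa69ef924ff86933000001', 'related_from', 'Issue'));
// prints: /r/4faa69ef924ff86933000001/related_from/Issue
```

A read-only Backbone collection could use such a URL for fetching, while the model itself keeps syncing only its own attributes.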

Business logic in web-oriented application

Software Architecture Warehouse started as a web-based client-server application with the major part implemented on the server side. The Ruby on Rails based back-end hosted persistence, business logic, and view generation. The client (web browser) was merely responsible for rendering views generated on the server side. Over time it became clear that this set-up cannot fulfil the requirements of highly interactive, collaborative usage.
Today the weight of SAW has shifted dramatically towards the client side. Thanks to frameworks such as Backbone and Marionette, user interface rendering moved completely to the client side. One of the questions I hesitated over recently was where to position the application's business logic.
In fact I find it useful to speak of two kinds of business logic:

presentation/view oriented logic - can and should be implemented on the client-side, because this way it offers very good responsiveness (low latency) and thus good user-experience
data/process oriented logic - can remain on the server side, because of the data access security and consistency concerns.

12.10.12

Web application development culture

I started developing SAW as a web application based on the backbone.js framework. It appeared to be minimal and small enough to be bulletproof. It wasn't. During 1.5 years of development it improved dramatically and grew to have interesting extensions such as marionette.js or geppetto. I topped it up with backbone.subroute magic, so that it appeared to be good to go.
It isn't. I ended up relying solely on my own forks of the aforementioned projects. I understand that github is an inherently social development place, but being required to continuously hack the libraries on which my project is based is really frustrating.

Just to name a few:

The Marionette module implementation uses a very awkward initialization order - it starts with the child modules and finishes with the parent module. Correct me if I'm wrong, but this is both counter-intuitive and useless. Reference is here.
The Subroute implementation fires a routing event every time a sub-router is instantiated. No idea what for. In my application I have one sub-router per module - this caused every url to be routed n times. grrrr...

I would be very eager to hear your comments about my fixes/frustrations with Backbone/Marionette/Github coding.

5.3.12

from JavaScript to Linux Kernel hacking

Today's hacking was particularly frustrating.

An attempt to deploy SAW into a production environment hosting multiprocess passenger (in nginx) failed badly, reporting syntax errors in JavaScript that was operating just fine in development mode.

My first suspicion was that the JavaScript asset concatenation and compression done by Jammit produced some nasty output which the browser failed to interpret. Chrome developer tools and Safari development mode proved totally useless, just reporting that there is a syntax error but failing to point it out. Only Firebug managed to point out that one of my composed JavaScript assets contained garbled content. So far so good.

Debugging jammit was never easy, but this time it appeared that changes in its output were being completely ignored by the nginx/passenger duo. Long hours of hacking showed that disabling nginx caching made no difference at all.

To my surprise, changing the asset files with vi did not make a difference either. That was the moment I realized this might have something to do with the (lame) VBoxSF that I use to share the project source between my Mac and the Ubuntu server machine that I use for development. Eureka. Apparently the VBoxSF implementation is totally lame and fails to notify the kernel that a given file has changed. So at the end of the day asking nginx not to use sendfile (sendfile off;) did the trick.

aggrrr..

22.2.12

UDP forwarding

The goal was to redirect log4r messages from multiple sources, going into a single sink hosted on the server, to my workstation.
UDP forwarding is apparently a very confusing topic.
Tools like netcat or ncat apparently fix the client ip/port coordinates after receiving the first data and ignore subsequent datagrams.
This sucks.
Luckily this came to help. FTW.

log4r and chainsaw v2

After spending long hours trying to get log4j-style logging working with my RoR application, I figured out that Chainsaw does not accept any changes in configuration after the start-up configuration is loaded. Dear Java community, is it me, or is it some frigging lame coding?

20.2.12

note to myself

bleeding edge vs established ruby is a choice between

"days of hacking and nothing works because it is too new and buggy" and
"days of hacking and nothing works because it is too old and buggy"

To be more precise: ruby-debug simply doesn't work with 1.9.3p125.