Browsers Blog Archive

Using nginx to transparently modify/debug third-party content

In tracking down a recent front-end bug for one of our client sites, I needed to use the browser's JavaScript debugger to step through code that lived in a mix of domains: a third-party framework as well as locally-hosted code which interfaced with -- and potentially interfered with -- that third-party code. (We'll call the third-party code foo.min.js for the purposes of this article.) The third-party code was a feature integrated into the client site using a custom domain name; it was hosted and controlled by the third-party service, and we had no ability to change it directly. The custom domain name was part of a chain of CNAMEs which eventually pointed to the underlying *actual* IP of the third-party service, so their infrastructure obviously relied on receiving the correct Host header in the request to select which among many clients was being served.

It appeared as if there was a conflict between code on our site and that imported by the third-party service. As part of the debugging process, I was stepping through the JavaScript in order to determine what conflicts there were, if any, as well as their nature (e.g., conflicting library definitions). Stepping through our code was fine; however, the third-party's JS code was (a) unfamiliar and (b) minified, which put all of the JavaScript more-or-less on one line and made tracing through the code in the debugger much less useful than I had hoped.

My first instinct was to use a JavaScript beautifier to reverse the minification process, but since I had no control over the code being included from the third-party service, this did not seem to be directly feasible. The third-party code was deployed only on our production site and relied on hard-coded domains which would make integrating it into one of our development instances challenging since we had no control over the contents of the returned resources. Since the relevant feature (and subsequent bugs) was only on the production site, making extensive modifications to how things were done and potentially breaking that or other features for users while I was debugging was obviously out as an option.

Enter nginx. I've been doing a lot with nginx lately, using it as a reverse proxy cache, so it's been on my mind. I came up with this technique:

  1. Look up the IP address for the third-party's domain name (used for later purposes).
  2. Install nginx on localhost, listening on port 80.
  3. Modify /etc/hosts to point the third-party's domain name to the nginx server's IP (also localhost in this case).
  4. Configure a new virtual host with the following logical constraints:
    • We want to serve specific files (the beautified JavaScript) from our local server.
    • We want any other request going through that domain to be passed-through transparently, so neither the browser nor the third-party server treat it differently.

Given these constraints, this is the minimal configuration that I came up with (the interesting parts are located in the server block):

/etc/hosts:

127.0.0.1 example.domain.com

nginx.conf:

worker_processes 1;

events {
    worker_connections 10;
}

http {
    include       mime.types;
    default_type  application/octet-stream;
    
    server {
        server_name example.domain.com;
        root /path/to/local_root;

        try_files $uri @proxied;

        location @proxied {
            proxy_set_header Host $http_host;
            proxy_pass http://1.2.3.4;
        }
    }
}

Once I had the above configured, I downloaded the foo.min.js file from the third-party service, ran it through a JS beautifier, and saved it in the local nginx root so it would be served instead of the actual file from the third-party service. Any other requests for static resources (images, other scripts, etc.) would pass through to the third-party server. So I had my nicely-formatted JavaScript code to step through, and the production site worked as normal for anyone else despite potential local changes to the file on my end (e.g., adding JavaScript alert() calls to the file); no one was the wiser.

A few notes

The try_files directive instructs nginx to first look for a file named after the current URI (foo.min.js in our example) in our local cache, and if this is not found, then fallback to the proxied location block; i.e., relay the request to the original upstream server. We explicitly set the Host header on the proxy request because we want the request to behave normally with respect to name-based hosting, and provide the saved IP address to contact the server in question.
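
The fallback behavior can be pictured as a small decision function. This is only a sketch of the idea in JavaScript, not how nginx implements try_files; the function name resolveRequest and the localFiles list are illustrative assumptions, and the IP and host name are the placeholders from the configuration above.

```javascript
// Hypothetical model of the try_files fallback; nginx does this internally.
function resolveRequest(uri, localFiles, upstreamIp, hostHeader) {
  // Serve the local (beautified) copy when one exists under the local root.
  if (localFiles.indexOf(uri) !== -1) {
    return { action: 'local', path: uri };
  }
  // Otherwise relay to the saved upstream IP, keeping the original Host header.
  return {
    action: 'proxy',
    target: 'http://' + upstreamIp + uri,
    host: hostHeader
  };
}

var localFiles = ['/foo.min.js'];
var hit = resolveRequest('/foo.min.js', localFiles, '1.2.3.4', 'example.domain.com');
var miss = resolveRequest('/logo.png', localFiles, '1.2.3.4', 'example.domain.com');
console.log(hit.action, miss.action, miss.target);
```

Only the URIs that have a local copy are intercepted; everything else reaches the third-party server with the Host header it expects.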

We only needed to preserve/lookup the upstream server's IP address because we're running the nginx server on localhost, so if we used a domain name the lookup would return the same IP defined in /etc/hosts; if the nginx server was running on a different machine, you would be able to just use the domain name as both the server_name and the proxy_pass parameters and set the entry for the host in your local /etc/hosts file to the IP of the nginx server.

A possible extension would be to detect when an upstream request matched a minified URL (via a location ~ \.min\.js$ block) and automatically beautify/cache the content in our local cache. This could be accomplished via the use of an external FastCGI script to retrieve, post-process, and cache the content.

This technique can also be used when dealing with testing changes to a production site on which you are unable or unwilling to make potentially disruptive changes for the purposes of testing static resources. (JavaScript seems the most obvious application here, but this could apply to serving up images or other static content which would be resolvable by the local cache.)

I always need to remind myself to undo changes to /etc/hosts as soon as I'm done testing when using tricks like these. Because a setup like this is more-or-less transparent, the behavior would be functionally identical as long as code/scripts on the third-party site stayed the same, but it could easily introduce subtle bugs if the third-party service made changes to its codebase. Since our local copies would mask any remote changes for those non-proxied resources, this could be very confusing if you forget that things are set up this way.

Browser popularity

It's no secret that Internet Explorer has been steadily losing market share, while Chrome and Safari have been gaining.

But in the last couple of years I've been surprised to see how strong IE has remained among visitors to our website -- it's usually been #2 after Firefox.

Recently this has changed and IE has dropped to 4th place among our visitors, and Chrome now has more than double the users that Safari does, as reported by Google Analytics:

1. Firefox 43.61%
2. Chrome 30.64%
3. Safari 11.49%
4. Internet Explorer 11.02%
5. Opera 2.00%

That's heartening. :)

Cross Browser Development: A Few CSS and JS Issues

Coding cross browser friendly JavaScript and CSS got you down? In a recent project, Ron, David, and I worked through some painful cross browser issues. Ron noted that he even banged his head against the wall over a couple of them :) Three of these issues come up frequently in my other projects full of CSS and JS development, so I wanted to share.

Variable Declaration in JS

In several cases, I noticed that omitting the variable declaration ("var") resulted in broken JavaScript-based functionality in IE only. I typically include variable declarations when I'm writing JavaScript. In our project, we were working with legacy code, and conflicting variable names may have been introduced, resulting in broken functionality. Examples of before and after:

Bad:

var display_cart_popup = function() {
    popup_id = '#addNewCartbox';
    left = (parseInt($(window).width()) - 772) / 2;
    ...
};

Better:

var display_cart_popup = function() {
    var popup_id = '#addNewCartbox';
    var left = (parseInt($(window).width()) - 772) / 2;
    ...
};

Bad:

...
address_display = '';

country = $(type+'_country').value;
address = $(type+'_address').value;
address2 = $(type+'_address2').value;
city = $(type+'_city').value;
state = $(type+'_state').value;
zip = $(type+'_zip').value;
...

Better:

...
var address_display = '';

var country = $(type+'_country').value;
var address = $(type+'_address').value;
var address2 = $(type+'_address2').value;
var city = $(type+'_city').value;
var state = $(type+'_state').value;
var zip = $(type+'_zip').value;
...

I researched this to gain more insight, but I didn't find much except a reiteration that when you create variables without the "var" declaration, they become global variables which may have resulted in conflicts. However, all the "learning JavaScript" documentation I browsed through includes variable declaration and there's no reason to leave it out for these lexically scoped variables.
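
To make the global-variable point concrete, here is a minimal sketch. The variable names popup_left and popup_top are made up for the demonstration, and the Function constructor is used only so the snippet behaves the same whether or not the surrounding file runs in strict mode (where assigning to an undeclared name throws instead).

```javascript
// Functions built with the Function constructor are always non-strict code.
var setWithoutVar = new Function('popup_left = 94; return popup_left;');
var setWithVar = new Function('var popup_top = 20; return popup_top;');

setWithoutVar();
setWithVar();

// popup_left leaked onto the global object; popup_top stayed local to its function.
var leaked = typeof globalThis.popup_left;
var contained = typeof globalThis.popup_top;
console.log(leaked, contained); // number undefined
```

Any other script on the page that also uses a global named popup_left would now read or overwrite the same value, which is the kind of conflict legacy code can run into.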

Trailing Commas in JSON objects

According to the JSON specification, trailing commas are not permitted (e.g., obj = { "1" : 2, }). In my experience, JSON objects with trailing commas might work in Firefox and WebKit browsers, but they die silently in IE. Some recent examples:

Bad:

//JSON response from an ajax call
// if $add_taxes is not true, the carttotal element will be the last element of the list and it will end with a comma

{
  "response_message"    : '<?= $response_message ?>',
  "subtotal"            : <?= $subtotal ?>,
  "shipping_cost"       : <?= $shipping ?>,
  "carttotal"           : <?= $carttotal ?>,
<?php if($add_taxes) { ?>
  "taxes"               : <?= $taxes ?>
<?php } ?>
}

Better:

//JSON response from an ajax call
//No matter the value of $add_taxes, the carttotal element is the last element and it does not end in a comma

{
  "response_message"    : '<?= $response_message ?>',
  "subtotal"            : <?= $subtotal ?>,
  "shipping_cost"       : <?= $shipping ?>,
<?php if($add_taxes) { ?>
  "taxes"               : <?= $taxes ?>,
<?php } ?>
  "carttotal"           : <?= $carttotal ?>
}

Bad:

//Page load JSON object defined
//Last element in array will end in a comma

var fonts = {
[loop list=`$Scratch->{fonts}`]
    '[loop-param name]' : {
      'bold' : "[loop-param bold]",
      'italic' : "[loop-param italic]"
    },[/loop]
};

Better:

//Page load JSON object defined
//A dummy object is appended to the fonts JSON object
//Additional logic is added elsewhere to determine if the object is a "dummy" or not

var fonts = {
[loop list=`$Scratch->{fonts}`]
    '[loop-param name]' : {
      'bold' : "[loop-param bold]",
      'italic' : "[loop-param italic]"
    },[/loop]
    'dummy' : {}
};

Additional solutions to avoid the trailing comma include using join (Perl, Ruby) or implode (PHP), conditionally excluding the comma on the last element of the array, or using library methods to serialize data to JSON.
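
As an illustration of the "let a library serialize it" option, here is a minimal sketch in JavaScript. The original responses were generated server-side in PHP, so this is an analogy rather than the code we used; the function name buildCartResponse and the sample totals are assumptions, while the field names mirror the example above.

```javascript
// Hypothetical helper: build the object first, then let JSON.stringify place the commas.
function buildCartResponse(totals, addTaxes) {
  var response = {
    response_message: totals.message,
    subtotal: totals.subtotal,
    shipping_cost: totals.shipping,
    carttotal: totals.carttotal
  };
  // Optional fields are added as properties, so no comma bookkeeping is needed.
  if (addTaxes) {
    response.taxes = totals.taxes;
  }
  return JSON.stringify(response);
}

var totals = { message: 'Cart updated', subtotal: 20, shipping: 5, carttotal: 26.5, taxes: 1.5 };
var withTaxes = buildCartResponse(totals, true);
var withoutTaxes = buildCartResponse(totals, false);
console.log(withTaxes);
console.log(withoutTaxes);
```

Because the serializer emits separators only between members, the output is valid JSON whether or not the optional field is present.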

Floating Elements in IE

Oftentimes, you'll get a design like the one shown below. There will be a static width and repeating components that span the entire width. You may programmatically determine how many repeating elements will be displayed, but using CSS floating elements yields the cleanest code.


Example of a given design with repeating elements to span a static width.

You start working in Chrome or Firefox and apply the following CSS rules:


CSS rules for repeating floating elements.

When you think you're finished, you load the page in IE and see the following. Bummer!


Floating elements wrap incorrectly in IE.

This is a pretty common scenario. In IE, if the combined widths of consecutive floating elements are greater than or equal to 100% of the available width, the last floating element will jump down, based on the IE float model. Instead of using floating elements, you might consider using tables or CSS position rules, but my preference is to use tables only for elements that need vertical align settings and to stay away from absolute positioning in general.

The simplest, most minimal change I've found to work can be described in a few steps. Let's say your floating elements are <div>s inside a <div> with an id of "products":

<div id="products">
  <div class="product">product 1</div>
  <div class="product">product 2</div>
  <div class="product last">product 3</div>
  <div class="product">product 4</div>
  <div class="product">product 5</div>
  <div class="product last">product 6</div>
</div>

And let's assume we have the following CSS:

<style>
div#products { width: 960px; }
div.product { float: left; width: 310px; margin-right: 15px; height: 100px; }
div.last { margin-right: 0px; }
</style>

Complete these steps:

  • First, add another div to wrap around the #products div, with an id of "outer_products"
  • Next, update the 'div#products' width to be greater than 960 pixels by several pixels.
  • Next, add a style rule for 'div#outer_products' to have a width of "960px" and overflow equal to "hidden".

Yielding:

<div id="outer_products">
  <div id="products">
    <div class="product">product 1</div>
    <div class="product">product 2</div>
    <div class="product last">product 3</div>
    <div class="product">product 4</div>
    <div class="product">product 5</div>
    <div class="product last">product 6</div>
  </div>
</div>

And:

<style>
div#outer_products { width: 960px; overflow: hidden; }
div#products { width: 980px; }
div.product { float: left; width: 310px; margin-right: 15px; height: 100px; }
div.last { margin-right: 0px; }
</style>

The solution is essentially creating a "display window" (outer_products), where overflow is hidden, but the contents are allowed to span a greater width in the inside <div> (products).
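
The arithmetic behind the fix can be checked with a few lines. This is a sketch using the numbers from the CSS above; the helper name rowWidth is made up, and it assumes the last item in each row really does get margin-right: 0.

```javascript
// Width needed by one row: every item's width plus a right margin on all but the last.
function rowWidth(itemsPerRow, itemWidth, marginRight) {
  return itemsPerRow * itemWidth + (itemsPerRow - 1) * marginRight;
}

var needed = rowWidth(3, 310, 15);   // 3 * 310 + 2 * 15
var slackBefore = 960 - needed;      // room left in the original 960px container
var slackAfter = 980 - needed;       // room left once #products is widened to 980px
console.log(needed, slackBefore, slackAfter); // 960 0 20
```

With zero slack, the row exactly fills the container, which is the case where IE drops the last float; the widened inner div gives the row room, and the 960px outer div hides the extra width.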


The white border outlines the outer_products "display window".

Some other issues that I see less frequently include the double-margin IE6 bug, chaining CSS in IE, and using '#' vs. 'javascript:void(0);'.

jQuery UI Sortable Tips

I was recently tasked with developing a sorting tool to allow Paper Source to manage the order in which their categories are displayed. They had been updating a sort column in a database table but wanted a more visual way to do so. Because the feature developed by Steph was well received, it was decided to adapt the upsell interface to manage the categories. See here for the post using jQuery UI Drag Drop.

The only backend requirements were that the same sort column was used to drive the order. The front end required the ability to drag and drop positions within the same container. The upsell feature provided a great starting point to begin the development. After a quick review I determined that the jQuery UI Sortable function would be more favorable to use for the application.

Visual feedback was used to display the sorting in action with:

// on page load
$('tr.the_items td').sortable({
  opacity: 0.7,
  helper: 'clone'   // no trailing comma after the last option (it breaks in IE)
});
// end on page load

Secondly, I reiterate: "jQuery UI Event Functionality = Cool"

I only needed one event handler for this application, to arrange the sorting values once the thumbnail had been dropped. This code calls a function which loops through all hidden input variables on the page and updates the sorting order.

// on page load
$('tr.the_items td').sortable({
  stop: function(event, ui) { do_drop(this); }   // no trailing comma (it breaks in IE)
});
// end on page load

Validating the sorting fields was a little different from the previously developed feature in that the number of available items could change depending on the category. The number of items could easily be 3 or 30. Therefore I needed a quick way to check the ever changing number. I decided to use a nested loop using the each function.

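// declared here so the flag does not become an implicit global
var error = false;
$('input.new_sku').each(function(intIndex, obj) {
    // compare each input against every other input
    $('input.new_sku').each(function(secIndex, secObj) {
        if ((intIndex != secIndex) && ($(obj).val() == $(secObj).val())) {
            error = true;
        }
    });
});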
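
For 3 to 30 items the nested loop is perfectly adequate. If the list were ever much longer, a single pass that remembers the values already seen would do the same duplicate check. The following is a standalone sketch without jQuery, not the code used in the feature; the function name hasDuplicates and the sample SKU values are made up.

```javascript
// Returns true as soon as any value appears a second time.
function hasDuplicates(values) {
  var seen = {};
  for (var i = 0; i < values.length; i++) {
    var key = String(values[i]);
    if (Object.prototype.hasOwnProperty.call(seen, key)) {
      return true;
    }
    seen[key] = true;
  }
  return false;
}

var unique = hasDuplicates(['SKU1', 'SKU2', 'SKU3']);
var repeated = hasDuplicates(['SKU1', 'SKU2', 'SKU1']);
console.log(unique, repeated); // false true
```

In the page itself, the values array could be collected from the input.new_sku fields before calling the helper.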

The rest of the feature uses some of the same logic previously documented here.

All in all I learned that the jQuery UI is very versatile and a pleasure to work with. I hope to be using more of its features in the near future.

Safari 4 Top Sites feature skews analytics

Safari version 4 has a new "Top Sites" feature that shows thumbnail images of the sites the user most frequently visits (or, until enough history is collected, just generally popular sites).

Martin Sutherland describes this feature in detail and shows how to detect these requests, which set the X-Purpose HTTP header to "preview".

The reason this matters is that Safari uses its normal browsing engine to fetch not just the HTML, but all embedded JavaScript and images, and runs in-page client JavaScript code. And these preview thumbnails are refreshed fairly frequently -- possibly several times per day per user.

Thus every preview request looks just like a regular user visit, and this skews analytics which see a much higher than average number of views from Safari 4 users, with lower time-on-site averages and higher bounce rates since no subsequent visits are registered (at least as part of the preview function).

The solution is to simply not output any analytics code when the X-Purpose header is set to "preview". In Interchange this is easily done if you have an include file for your analytics code, by wrapping the file with an [if] block such as this:

[tmp x_purpose][env HTTP_X_PURPOSE][/tmp]
[if scratch x_purpose eq 'preview']
<!-- skip analytics for browser previews -->
[else]
(normal Google Analytics, Omniture SiteCatalyst, or other analytics code)
[/else]
[/if]

In Ruby on Rails you'd check request.env["HTTP_X_PURPOSE"].

In PHP you'd check $_SERVER["HTTP_X_PURPOSE"].

In Django you'd check request.META["HTTP_X_PURPOSE"] or the equivalent request.META.get("HTTP_X_PURPOSE") (from the HttpRequest class).

And so on.
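
The same check in JavaScript might look like the sketch below. It assumes the request headers are available as a plain object keyed by header name; the function name isPreviewRequest is made up, and comparing the value case-insensitively is a defensive assumption on my part rather than something the header requires.

```javascript
// Hypothetical helper: true when the request carries "X-Purpose: preview".
function isPreviewRequest(headers) {
  for (var name in headers) {
    // Header names are case-insensitive, so normalize before comparing.
    if (Object.prototype.hasOwnProperty.call(headers, name) &&
        name.toLowerCase() === 'x-purpose') {
      return String(headers[name]).toLowerCase() === 'preview';
    }
  }
  return false;
}

var previewHit = isPreviewRequest({ 'X-Purpose': 'preview', 'User-Agent': 'Safari' });
var normalHit = isPreviewRequest({ 'User-Agent': 'Safari' });
console.log(previewHit, normalHit); // true false
```

The template would then emit the analytics snippet only when isPreviewRequest returns false.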

I confirmed the analytics tracking code was omitted by waiting for Safari to make its preview request and inspecting the response with the Fiddler proxy, on Windows. The same can be done for Safari on Mac OS X with a suitable Mac OS X HTTP proxy.

JPEG compression: quality or quantity?

There are many aspects of JPEG files that are interesting to web site developers, such as:

  • The optimal trade off between quality and file size for any encoder and uncompressed source image.
  • Reducing size of an existing JPEG image when the uncompressed source is unavailable, but still finding the same optimal trade-off.
  • Comparison of different encoders and/or settings for quality at a given file size.

Two essential factors are file size and image quality. Bytes are objectively measurable, but image quality is much more nebulous. What to one person is a perfectly acceptable image is to another a grotesque abomination of artifacts. So the quality factor is subjective. For example, Steph sent me some images to compare compression artifacts. Here is the first one with three different settings in ImageMagick: 95, 50, and 8:

Compare the subtle (or otherwise) differences in the following images (mouseover shows the filesize and compression setting):

Additional comparisons are below. Each image can be opened in a separate browser tab for easy A/B comparison. I think many would find the setting of 8 to have too many artifacts, even though it's 10 times smaller than the image compressed at a setting of 95. Some would find the setting of 50 to be an acceptable tradeoff between quality and size, since it sends 3.4 times fewer bytes.

Here is the code I wrote to make the comparison (shell script is great for this stuff):

#!/bin/bash
HTML_OUTFILE=comparison.html
echo '<html>' > "$HTML_OUTFILE"

# Append a linked image whose tooltip shows the file size and, if given, the quality setting.
write_img_html () {
    local size qual=""
    # --apparent-size is a GNU du option
    size=$(du -h --apparent-size "$1" | cut -f 1)
    if [ -n "$2" ]; then
       qual="setting: $2"
    fi
    cat <<EOF >>"$HTML_OUTFILE"
<a href="$1"><img src="$1" title="size: $size $qual"></a>
EOF
}

for name in image1 image2; do
    orig=$name-original.jpg
    resized=$name-300.png

    echo "Resizing $orig to 300 on longest side: $resized..."
    convert "$orig" -resize 300x300 "$resized"
    write_img_html "$resized" "lossless"

    for quality in 100 95 85 50 20 8 1; do
        echo "Creating JPEG quality $quality..."
        jpeg=$name-300-q-$quality.jpg
        convert "$resized" -strip -quality "$quality" "$jpeg"
        write_img_html "$jpeg" "$quality"
    done
done

echo '</html>' >> "$HTML_OUTFILE"

Another factor that often comes into play is how artifacts in the image (e.g. aliasing, ringing, noise) combine with JPEG compression artifacts to exacerbate quality problems. So one way to get smaller file sizes is to reduce the other types of artifacts in the image, thereby allowing higher JPEG compression.

The most common source of artifacts is image resizing. If you are resizing the images, I strongly recommend using a program that has a high quality filter. Irfanview and ImageMagick are two good choices.

The ideal situation is this:

  • Uncompressed source image
  • Full-resolution if you will be handling the resize
  • Absent artifacts such as aliasing
  • Resize performed with good software like ImageMagick
  • JPEG compression chosen based on subjective quality assessment.

Choosing the trade-off between quality and file size is difficult in part because it varies by image content. Images with lots of small color details (e.g. bright fabric threads; AKA high spatial frequency chroma) stand less compression than images that only have medium sized details that do not have important and minute color information.

One of the settings that is important for small web images is removal of the color space profile (e.g. sRGB). The only time it is needed is when there is a good reason for using non-sRGB JPEG, such as when you are certain that your users will have color managed browsers. Removing it can shave off 5KB or so; software will assume images without profiles have an sRGB profile. It can be removed with the -strip parameter of ImageMagick.

As for choosing the specific compression settings, keep in mind that there are over 30 different types of options/techniques that can be used in compressing the image. Most image programs simplify that to a sliding scale from 0 to 100, 1 to 12, or something else. Keep in mind that even when programs use the same scale (e.g. 0 to 100), they probably have different ideas of what the numbers mean. 95 in one program may be very different than 95 in another.

If bandwidth is not an issue, then I use a setting of 95 in ImageMagick, because in normal images I can't tell the difference between 95 and 100. But when file size is an important concern, I consider 85 to be the optimal setting. In this image, the difference should be clear, but I generally find that cutting the file size in half is worth it. Below 85, the artifacts are too onerous for my taste.

You don't often hear about web site visitors' dissatisfaction with compression artifacts, so you might be tempted to just reduce file sizes even beyond the point where it gets noticeable. But I think there is a subliminal effect from the reduced image quality. Visitors may not stop visiting the site immediately, but my gut feeling is it leaves them with a certain impression in their mind or taste in their mouth. I would guess that user testing might result in comments such as "the X web site is not the same high-grade quality as the Y web site", even if they don't put it into words as specific as "the compression artifacts make X look uglier than Y". Even if that pet theory is true, it still has to be balanced against the benefit of faster page loading times.

Ideally, the tradeoff between quality and page loading time would be a choice left to the user. Those who prefer fewer artifacts could set their browser to download larger, less-compressed image files than the default, while users with low bandwidth could set it for more compressed images to get a faster page load at the expense of quality. I could imagine an Apache module and corresponding Firefox add-on some day.

Regarding the situation where you want to reduce the file size of existing JPEGs, my advice is to first try (hard) to get the original source files. You can do better (for any given quality/size tradeoff) from those than you can by just manipulating the existing files. If that's not possible, then suboptimal workflows such as jpegtran, jpegoptim, or a full decompress/recompress are the only alternatives.

As far as comparing different encoders, I haven't really looked into that except to compare ImageMagick and Photoshop, where I (subjectively) determined they had roughly similar quality for a given file size (and vice versa).

Steph also made a video to show the range of compression from 1 to 100:

Here are all the comparison images. The file size and ImageMagick quality setting are in the rollover. I suggest opening images in browser tabs for easy A/B comparison.

jQuery UI Drag Drop Tips and an Ecommerce Example

This week, I implemented functionality for Paper Source to allow them to manage the upsell products, or product recommendations. They wanted a better way to visualize, organize, and select the three upsell products for every product. The backend requirements of this functionality were relatively simple. A new table was created to manage the product upsells.

The frontend requirements were more complex: They wanted to be able to drag and drop products into the desired upsell position (1, 2, or 3). I was allowed a bit of leeway on the interactivity level of the functionality, but the main requirement was to have drag and drop functionality working to provide a more efficient way to manage upsells. A mockup similar to the image shown below was provided at the onset of the project.


The mockup provided did not demonstrate the "interactiveness" of the drag and drop functionality. Items below the current upsells were ordered by cross sell revenue, or the revenue of each related item purchased with the current item.

Since I was familiar with jQuery, I knew that the jQuery UI included drag and drop functionality. I also had heard of several other jQuery drag and drop plugins, but since the jQuery UI is well supported, I was hopeful that the UI would have the functionality that I envisioned needing. Throughout the project, I learned a few valuable tips to consider with drag and drop implementation. To begin development, I downloaded the latest jQuery and UI Core in addition to the draggable and droppable UI components.

Visual Feedback = Helpful

The first thing I learned from working on the drag and drop functionality, was that visual feedback is very helpful in interactive design and that the jQuery UI has functionality built in to provide visual feedback. The first bit of visual feedback I included was to use a "clone" helper with semi-opaque styling to provide visual feedback that the object was being dragged. This was accomplished using the following code:

// on page load
$('div.common_item').draggable({
  opacity: 0.7,
  helper: 'clone'
});
// end on page load

And is shown here as the Lake Peace 1.25" Circle Stickers product is dragged:

The second bit of visual feedback I included was adding a class to the droppable item when a draggable item hovered over it. I added the "hoveringover" class to the droppable item, which was defined in the stylesheet to have a different colored background. This was accomplished using the following code:

// on page load
$('tr.upsells td').droppable({
  hoverClass: 'hoveringover'
});
// end on page load

And is shown here as the Shimmer Silver A7 Envelope product hovers above the Quilt on Night with Curry A2 Stationers in upsell position #2:

jQuery UI Event Functionality = Useful

The second tip I learned from working on the drag and drop functionality was that the jQuery drag and drop UI includes valuable event functionality to manage events during the drag and drop process.

By adding the code shown below, at the initiation of dragging, I set a hidden input variable to track which element was being dragged. This value was later used to populate the product upsell form.

// on page load
$('div.common_item').draggable({
  start: function(event, ui) { $('input#is_dragging').val($(this).attr('id')); }
  });
// end on page load

By adding the code shown below, at the conclusion of dragging, I cleared the hidden input variable that indicated which item was being dragged.

// on page load
$('div.common_item').draggable({      
  stop: function(event, ui) { $('input#is_dragging').val(''); }
});
// end on page load

A final event response was added to be called when an item is dropped on a droppable item. The function do_drop is called at this drop time. The do_drop function replaces the html of the current upsells if the dropped sku is different than the current upsell sku, updates the hidden form element, adds visual feedback by adding a class to show that the item had been replaced, and displays the "Save" and "Revert" options to save to database or revert the upsell items.

// on page load
$('tr.upsells td').droppable({
  drop: function(event, ui) { do_drop(this); }
});
// end on page load

var do_drop = function(obj) {
  var current_sku = $('input#is_dragging').val();
  if(current_sku != $(obj).find('img').attr('class')) {
    //show "Save" and "Revert" options
    show_drag_form();

    //update hidden form element
    $('input#' + $(obj).attr('id').replace('td_', '')).val(current_sku);

    //replace html and add visual feedback by adding a class to show that the item was replaced
    $(obj).html($('div#' + current_sku).html()).addClass('replaced');     
  }
};

Shown below, the Curry Dots A9 Printable Party Invitations have been replaced with the Olive Natsuki Gel Roller and the background color change signifies the item has been modified.

jQuery UI Documentation and Examples = Awesome

I found the jQuery UI documentation and examples to be very helpful. Another jQuery UI draggable component that was used was to force draggable items to be contained to a region on the page. I contained the elements to the entire parent table using the following code.

$('div.common_item').draggable({
  containment: 'table#drag_table'
});

The Envelope Liners product is shown below to be confined to the table that contained potential and current upsell products. I could not drag the Envelope Liners any further to the right.

Because the functionality was a backend admin tool, the client requested that the functionality not be over-engineered to work across browsers. I did, however, verify that the drag and drop functionality worked in Firefox, Internet Explorer 7 and 8, Chrome, and Safari with a small amount of styling tweaking.

The final drag-drop JavaScript initiation is similar to the following code:

$(function() {
  $('div.common_item').draggable({
    opacity: 0.7,
    helper: 'clone',
    start: function(event, ui) { $('input#is_dragging').val($(this).attr('id')); },
    stop: function(event, ui) { $('input#is_dragging').val(''); },
    containment: 'table#drag_table'
  });
  $('tr.upsells td').droppable({
    hoverClass: 'hoveringover',
    drop: function(event, ui) { do_drop(this); }
  });
})

Shown below is an example of the product upsell in action for the Chrysanthemum Letterpress Thank You Notes.

New End Point Site Launched: Rails, jQuery, Flot, Blogger

This week we launched a new website for End Point. Not only did the site get a facelift, but the backend content management system was entirely redesigned.

Goodbye Old Site:

Hello New Site:

Our old site was a Rails app with a Postgres database running on Apache and Passenger. It used a custom CMS to manage dynamic content for the bio, articles, and service pages. The old site was essentially JavaScript-less, with the exception of Google Analytics.

Although the new site is still a Rails application, it no longer uses the Postgres database. As developers, we found that it is more efficient to use Git as our "CMS" rather than developing and maintaining a custom CMS to meet our ever-changing needs. We also trimmed down the content significantly, which further justified the design; the entire site and its content now consist of Rails views and partial views. The new site also includes jQuery and flot functionality that works across browsers. Some of the interesting implementation challenges are discussed below.

jQuery Flot Integration

The first interesting JavaScript component I worked on was using flot to make the site more interactive and to cut down on the excessive text that End Pointers are known for [for example, this article]. Flot is a jQuery data plotting tool that contains functionality for plot zooming, data interactivity, and various display configuration settings (see more flot examples). I've used flot before in several experiments but had yet to use it on a live site. For the implementation, we chose to plot our consultant locations over a map of the US to present our locations in an interactive and fun-to-use way. The tedious part of this implementation was actually creating the datapoints to align with cities. Check out the images below for examples.

Flot has built in functionality for on hover events. When a point on the plot is hovered over, correlating employees are highlighted using jQuery and their information is presented to the right of the map.

When a bio picture is hovered over, the correlating location is highlighted using jQuery and flot data point highlighting.

We also implemented a timeline using flot to map End Point's history. Check out the images below.

When a point on the plot is hovered over, the history details are revealed in the section below.

The triangle image CSS position is adjusted when a point on the plot is activated.

Dynamic Rails Partial Generation

One component of the old site that was generated dynamically sans-CMS was Blogger article integration into the site. A cron job ran daily to import new Blogger article titles, links, and content snippets into the Postgres database. We opted to remove the database dependency from the new site, so we investigated creative ways to include the dynamic Blogger content. We developed a rake task that is run via cron job to dynamically generate partial Rails views containing Blogger content. Below is an example and explanation of how the Blogger RSS feed is retrieved and a partial is generated:

Open URI and REXML are used to retrieve and parse the XML feed.

require 'open-uri'
require 'rexml/document'
...

The feed is retrieved and a REXML object created from the feed in the rake task:

data = open('http://blog.endpoint.com/feeds/posts/default?alt=rss&max-results=10', 'User-Agent' => 'Ruby-Wget').read
doc = REXML::Document.new(data)

The REXML object is iterated through. An array containing the Blogger links and titles is created.

 
results = [] 
doc.root.each_element('//item') do |item|
  author = item.elements['author'].text.match(/\(.+/).to_s.gsub(/\.|\(|\)/,'')
  results << '<a href="' + item.elements['link'].text + '">' + item.elements['title'].text + '</a>'
end 

Finally, a Rails dynamic partial is written containing the contents of the results array:

  File.open("#{RAILS_ROOT}/app/views/blog/_index.rhtml", 'w') { |f| f.write(results.inject('') { |s, v| s = s + '<p>' + v  + '</p>'}) }

A similar process was applied for bio and tag dynamic partials. The partials are included on pages such as the End Point service pages, End Point bio pages, and End Point home page.

jQuery Carousel Functionality

Another interesting JavaScript component I worked on for the new site was the carousel functionality for the home page and client page. Carousels are a common "web 2.0" JavaScript component where visible items slide one direction out of view and new items slide into view from the other direction. I initially planned on implementing a simple carousel with a jQuery plugin, such as jCarousel. Other JavaScript frameworks also include carousel functionality such as the YUI Carousel Control or the Prototype UI. I went along planning to implement the existing jQuery carousel functionality, but then was asked, "Can you make it a circular carousel where the left and right buttons are always clickable?" In many of the existing carousel plugins and widgets, the carousel is not circular, so this request required custom jQuery. After much cross-browser debugging, I implemented the following (shown in images for a better explanation):

Step 1: The page loads with visible bios surrounded by empty divs with preset width. The visibility of the bios is determined by CSS use of the overflow, position, and left attributes.

Step 2: Upon right carousel button click, new bios populate the right div via jQuery.

Step 3: To produce the carousel or slider effect, the left div uses jQuery animation functionality and shrinks to a width of 0px.

Step 4: Upon completion of the animation, the empty left div is removed, and a new empty div is created to the right of the new visible bios.

Step 5: Finally, the left div's contents are emptied and the carousel is in its default position ready for action!
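
The wrap-around behaviour behind the steps above can be modelled separately from the DOM animation. Below is a minimal sketch in plain JavaScript; the function names rotate and visibleWindow, the sample bios array, and the window size of 3 are all hypothetical and not taken from the site's actual code:

```javascript
// Hypothetical model of a circular carousel: the DOM work (empty divs,
// width animation) is left out, only the wrap-around indexing is shown.
function rotate(start, step, total) {
  // Add total before the final modulus so that negative steps wrap as well.
  return (((start + step) % total) + total) % total;
}

function visibleWindow(items, start, size) {
  var shown = [];
  for (var i = 0; i < size; i++) {
    // The modulus lets the window run off the end and continue at index 0.
    shown.push(items[(start + i) % items.length]);
  }
  return shown;
}

var bios = ['adam', 'beth', 'carl', 'dana', 'evan'];
var start = rotate(0, -1, bios.length);     // left button pressed on the first item
var shown = visibleWindow(bios, start, 3);  // items to display after the click
```

Because the index always wraps, neither button ever needs to be disabled, which is the property the "circular" request asked for.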

Another request for functionality came from Jon. He asked that we create and use "web 2.0" URLs to load specific content on page load for the dynamic content throughout our site, such as http://www.endpoint.com/clients#citypass, http://www.endpoint.com/clients#backcountry.

Upon page load, JavaScript is used to detect if a relative link exists:

if(document.location.href.match('#.+')) {
    var id = document.location.href.match('#.*').toString().replace('#', '');
}   

The id retrieved from the code snippet above is used to populate the dynamic page content. Then, JavaScript is used during dynamic page functionality, such as carousel navigation, to update the relative link:

document.location.href = document.location.href.split('#')[0] + '#' + anchor;
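
The two snippets above can also be written as small functions that take the URL as an argument, which makes them easy to try outside a browser. This is only a sketch; the helper names anchorFromUrl and urlWithAnchor are hypothetical:

```javascript
// Hypothetical helpers; the names are illustrative, not from the site's code.
function anchorFromUrl(href) {
  var pos = href.indexOf('#');
  // No '#' at all, or a bare trailing '#', means there is nothing to load.
  if (pos === -1 || pos === href.length - 1) { return null; }
  return href.slice(pos + 1);
}

function urlWithAnchor(href, anchor) {
  // Drop any existing fragment before appending the new one.
  return href.split('#')[0] + '#' + anchor;
}

var id = anchorFromUrl('http://www.endpoint.com/clients#citypass');
var next = urlWithAnchor('http://www.endpoint.com/clients#citypass', 'backcountry');
```

In the browser, the first function would be called with document.location.href on page load, and the result of the second would be assigned back to document.location.href during carousel navigation.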

Twitter Integration

Another change in the new site was importing existing functionality previously written in Python to update End Point's Twitter feed automagically. The rake task uses the Twitter4R gem to update the Twitter feed and is run via cron job every 30 minutes. See the explanation below:

The public twitter feed is retrieved using Open URI and REXML.

    data = open('http://twitter.com/statuses/user_timeline/endpoint.xml', 'User-Agent' => 'Ruby-Wget').read
    doc = REXML::Document.new(data)

An array containing all the titles of all tweets is created.

    doc.each_element('statuses/status/text') do |item|
      twitter << item.text.gsub(/ http:\/\/j\.mp.*/, '')
    end

The blogger RSS feed is retrieved and parsed. An array of hashes is created to track the un-tweeted blog articles.

    data = open('http://blog.endpoint.com/feeds/posts/default?alt=rss&max-results=10000', 'User-Agent' => 'Ruby-Wget').read
    doc = REXML::Document.new(data)
    found_recent = false
    doc.root.each_element('//item') do |item|
      found_recent = true if twitter.include?(item.elements['title'].text)
      blog << { 'title' => item.elements['title'].text, 'link' => item.elements['link'].text } if !found_recent
    end

Using the j.mp api, a short url is generated. A Twitter message is created from the short URL.

      data = open('http://api.j.mp/shorten?version=2.0.1&longUrl=' + blog.last['link'] + '&login=**&apiKey=*****&format=xml')
      ...
      twitter_msg = blog.last['title'] + ' ' + short_url

The twitter4r gem is used to login and update the Twitter status message.

      client = Twitter::Client.new(:login => **, :password => *****)
      begin
        status = client.status(:post, twitter_msg)
      rescue
      end

Google Event Tracking

Finally, since we implemented dynamic content throughout the site, we decided to use Google Event Tracking to track user interactivity. We followed the standard Google Analytics event tracking approach to record interactions such as use of the slider carousel and hovering over bios and history points on the team page:

//pageTracker._trackEvent(category, action, optional_label, optional_value);
pageTracker._trackEvent('Team Page Interaction', 'Map Hover', bio);

We are happy with the new site and we hope that it presents our skillz including Interchange Development, Hosting Expertise and Support, and Database Wizardry!

Learn more about End Point's Rails development.

Rejecting SSLv2 politely or brusquely

Once upon a time there were still people using browsers that only supported SSLv2. It's been a long time since those browsers were current, but when running an ecommerce site you typically want to support as many users as you possibly can, so you support old stuff much longer than most people still need it.

At least 4 years ago, people began to discuss disabling SSLv2 entirely due to fundamental security flaws. See the Debian and GnuTLS discussions, and this blog post about PCI's stance on SSLv2, for example.

To politely alert people using those older browsers, while still refusing to transport confidential information over the insecure SSLv2 or with ciphers weaker than 128 bits, we used an Apache configuration such as this:

# Require SSLv3 or TLSv1 with at least 128-bit cipher
<Directory "/">
    SSLRequireSSL
    # Make an exception for the error document itself
    SSLRequire (%{SSL_PROTOCOL} != "SSLv2" and %{SSL_CIPHER_USEKEYSIZE} >= 128) or %{REQUEST_URI} =~ m:^/errors/:
    ErrorDocument 403 /errors/403-weak-ssl.html
</Directory>

That accepts their SSLv2 connection, but displays an error page explaining the problem and suggesting some links to free modern browsers they can upgrade to in order to use the secure part of the website in question.

Recently we've decided to drop that extra fuss and block SSLv2 entirely with Apache configuration such as this:

SSLProtocol all -SSLv2
SSLCipherSuite ALL:!ADH:!EXPORT56:RC4+RSA:+HIGH:+MEDIUM:-LOW:-SSLv2:-EXP

The downside of that is that the SSL connection won't be allowed at all, and the browser doesn't give any indication of why or what the user should do. They would simply stare at a blank screen and presumably go away frustrated. Because of that we long considered the more polite handling shown above to be superior.

But recently, after having completely disabled SSLv2 on several sites we manage, we have gotten zero complaints from customers. Doing this also makes PCI and other security audits much simpler because SSLv2 and weak ciphers are simply not allowed at all and don't raise audit warnings.

So at long last I think we can consider SSLv2 dead, at least in our corner of the Internet!

JavaScript fun with IE 8

I ran into, and found solutions for, two major gotchas targeting IE 8 with a jQuery-based (and rather JavaScript-heavy) web application.

First is to specify the 'IE 8 Standard' rendering mode by adding the following meta tag: <meta http-equiv="X-UA-Compatible" content="IE=8">

The default rendering mode is rather glitchy and tends to produce all sorts of garbage from 'clean' HTML and JavaScript. The result renders slightly different sizes, reports incorrect values from common jQuery calls, etc.

The default rendering also caused various layout issues (CSS handling looked more like IE 6 than IE 7). Also, minor errors (an extra '' tag on one panel) caused the entire panel to not render.

Another issue is that the browser is overly lazy about invalidating the cache for AJAX-pulled content, especially (X)HTML. This means that though you think you're pulling current data, in reality it keeps feeding you the same old data. This also means that if you use the same exact URL for HTML & JSON data, you must add a parameter to avoid running into cache collisions. IE 8 only seemed to honor 'Cache-control: no-cache' in the header to cause it to behave properly.
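
As an illustration of the extra parameter mentioned above, here is a minimal sketch. The helper name urlForFormat and the parameter name format are assumptions, not what the application actually used, and the server still needs to send the no-cache header:

```javascript
// Hypothetical helper: tags each request with the format it expects so that
// HTML and JSON responses for the same path get distinct cache entries.
function urlForFormat(url, format) {
  // Use '&' when the URL already carries a query string, '?' otherwise.
  var sep = url.indexOf('?') === -1 ? '?' : '&';
  return url + sep + 'format=' + encodeURIComponent(format);
}

var jsonUrl = urlForFormat('/orders', 'json');
var htmlUrl = urlForFormat('/orders?page=2', 'html');
```

The resulting URLs can then be passed to whichever jQuery AJAX call fetches the content.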

On the other side, I've got a big thumbs up for jQuery. I was able to produce a skinned, fairly 'heavy' client-side application that works equally well (and looks almost the same) on Firefox, Chrome, Safari, and now IE 8.

SDCH: Shared Dictionary Compression over HTTP

Here's something new in HTTP land to play with: Shared Dictionary Compression over HTTP (SDCH, apparently pronounced "sandwich") is a new HTTP 1.1 extension announced by Wei-Hsin Lee of Google last September. Lee explains that with it "a user agent obtains a site-specific dictionary that then allows pages on the site that have many common elements to be transmitted much more quickly." SDCH is applied before gzip or deflate compression, and Lee notes 40% better compression than gzip alone in their tests. Access to the dictionaries stored in the client is scoped by site and path just as cookies are.

The first client support was in the Google Toolbar for Internet Explorer, but it is now going to be much more widely used because it is supported in the Google Chrome browser for Windows. (It's still not in the latest Chrome developer build for Linux, or at any rate not enabled by default if the code is there.)

Only Google's web servers support it to date, as far as I know. Someone intended to start a mod_sdch project for Apache, but there's no code at all yet and no activity since September 2008.

It is interesting to consider the challenge this will pose for HTTP proxies that filter content, since the entire content would not be available to the proxy to scan during a single HTTP conversation. Sneakily-split malicious payloads would then be reassembled by the browser or other client, not requiring JavaScript or other active reassembly methods. This forum thread discusses this threat and gives an example of stripping the Accept-encoding: sdch request headers to prevent SDCH from being used at all. Though the threat is real, it's hard to escape the obvious analogy with TCP filtering, which had to grow from stateless to more difficult stateful TCP packet inspection. New features mean not just new benefits but also new complexity, but that's no reason to reflexively reject them.
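
To make the header-stripping idea concrete, here is a minimal sketch of the string handling a filtering proxy might apply to an Accept-Encoding value. It is not the example from the forum thread; the function name stripSdch is hypothetical, and a real proxy would do this in its own configuration language:

```javascript
// Hypothetical filter for an Accept-Encoding header value: removes the sdch
// coding and leaves every other coding (and its q-value) untouched.
function stripSdch(value) {
  return value
    .split(',')
    .map(function (token) { return token.trim(); })
    .filter(function (token) {
      // Compare only the coding name, ignoring any ";q=..." weight.
      return token !== '' && token.split(';')[0].trim().toLowerCase() !== 'sdch';
    })
    .join(', ');
}

var cleaned = stripSdch('gzip, deflate, sdch');
```

With sdch no longer advertised by the client, the server has no reason to offer a dictionary, so the proxy sees each response in full.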


CSS @font-face in Firefox 3.5

This has been frequently mentioned around the web already, but it's important enough that I'll bring it up again anyway. Firefox 3.5 adds the CSS @font-face rule, which makes it possible to reference fonts not installed in the operating system of the browser, just as is done with images or other embedded content.

Technically this is not a complicated matter, but font foundries (almost all of whom have a proprietary software business model) have tried to hold it back hoping for magical DRM to keep people from using fonts without paying for them, which of course isn't possible. As one of the original Netscape developers mentioned, if they had waited for such a thing for images, the web would still be plain-text only.

The quickest way to get a feel for the impact this change can have is to look at Ian Lynam & Craig Mod's article demonstrating @font-face in Firefox 3.5 side-by-side with any of the other current browsers. It is exciting to finally see this ability in a mainstream browser after all these years.

MTU tweak: a fix for upload pain

While traveling and staying at Hostel Tyn in Prague's city center, I ran into a strange problem with my laptop on their wireless network.

When many people were using the network (either on the hostel's public computers or on the wireless network), sometimes things bogged down a bit. That wasn't a big deal and required merely a little patience.

But after a while I noticed that absolutely no "uploads" worked. Not via ssh, not via browser POST, nothing. They always hung. Even when only a file upload of 10 KB or so was involved. So I started to wonder what was going on.

As I considered trying some kind of rate limiting via iptables, I remembered hearing somewhere that occasionally you can run into mismatched MTU settings between the Ethernet LAN you're on and your operating system's network settings.

I checked my setup and saw something like this:

ifconfig wlan0
wlan0     Link encap:Ethernet  HWaddr xx:xx:xx:xx:xx:xx
          inet addr:10.x.x.x  Bcast:10.x.x.x  Mask:255.255.255.0
          inet6 addr: fe80::xxx:xxxx:xxxx:xxxx/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:1239 errors:0 dropped:0 overruns:0 frame:0
          TX packets:20 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:191529 (191.5 KB)  TX bytes:4543 (4.5 KB)

The MTU 1500 stood out as being worthy of tweaking. So I tried a completely unscientific change:

sudo ifconfig wlan0 mtu 1400

Then tried the same HTTP POST that had been consistently failing, and poof! It worked fine. Every time.

I think most likely something more than 1400 bytes would've been possible, perhaps just a few short of 1500. The number 1492 rings familiar. I'll be old-fashioned and not look it up on the web. But this 1400-byte MTU worked fine and solved the problem. To my delight.

As an interesting aside, before making the change, I found one web application where uploads did work fine anyway: Google's Picasa. I'm not sure why, but maybe it sliced & diced the upload stream into smaller chunks on its own? A mystery for another day.

Testing in the Web Environment

Introduction

Testing is an important part of good software engineering practices. In fact, it can be said that it is at once the most important, and yet most neglected part of software engineering. Testing methodology for software engineering developed out of its hardware engineering roots: software was defined in terms of its inputs and outputs, and testing was similarly defined in terms of applied inputs and expected outputs.

However, software testing is more complex than that: this is because software almost always incorporates "state" or memory that affects subsequent operations. For instance, the following pseudocode:

if (VALUE is not defined)
then
VALUE := 1.0
fi
FRACTION := 1.0 / VALUE

In this simple case, the code fragment will always operate correctly on the first execution, but subsequent executions may fail if VALUE is zero.
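
The same effect can be reproduced in a few lines of runnable JavaScript. This is only a sketch: the state object and the fraction function are hypothetical stand-ins for whatever memory a real application keeps between operations. Note that JavaScript returns Infinity for division by zero rather than raising an error, so the failure here is a silently wrong result:

```javascript
var state = {};   // persists between calls, like state kept by a web application

function fraction() {
  // Mirrors the pseudocode: only assign a default when nothing is defined yet.
  if (state.value === undefined) {
    state.value = 1.0;
  }
  return 1.0 / state.value;
}

var first = fraction();   // state.value was undefined, so it becomes 1.0
state.value = 0;          // some other code path changes the remembered value
var second = fraction();  // same call, same inputs, different result
```

A test that only exercises the first call would pass, which is why tests for stateful code need to control or vary the state they start from.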

Testing web applications involves planning for this kind of memory, because in essence a web application runs within a larger program (the web server and perhaps the application server) and may inherit state from the environment, or indeed may preserve its own state from one page reference to the next.

In addition, web applications involve human factors.

  • Does the application "display correctly" (whatever that means)?
  • Does the page load "quickly enough"?
  • Do dynamic elements (e.g. Ajax) respond appropriately?

Such factors are harder to measure than verifying that a sales tax calculation returns an accurate number.

For these reasons, we turn to web application testing frameworks. Loosely defined these frameworks provide either a substitute for, or an interface to, a web browser that is under programmatic control. So for instance, a test script can invoke the web application via URL just as a browser would. Then it can test for page content or metadata (title, etc.), and even in some cases access embedded media such as image files. The framework provides a way to operate the web application: through it, the test script can submit forms, click on objects, respond to dynamic events such as JavaScript alerts, and even operate the browser in other ways: navigating via the "Back" button, saving files, etc.

Using such frameworks, the software engineer can automate the testing process. The application's performance can be defined in terms of the test scripts that it passes, so that modifications to the application (new functionality or bug repairs) can be validated against the existing tests (regression testing).

In this article, I'll briefly survey several approaches to web application testing frameworks that are in use or under study at End Point.

WWW::Mechanize

The first framework is a Perl module called "WWW::Mechanize", and its associated extension "Test::WWW::Mechanize". This framework provides an object-oriented interface to an HTTP connection which allows a test script, written in Perl, to perform operations on a web site much like a browser, and to test the results in various ways. By way of example, here is a script that operates on the End Point website:

use strict;
use Test::WWW::Mechanize;
use Test::More tests => 4;
my $mech = Test::WWW::Mechanize->new();
$mech->get_ok('http://www.endpoint.com', 'Home page fetched');
$mech->title_like(qr/End Point/, 'Page mentions us');
$mech->follow_link_ok({ text_regex => qr/Team Bios/ }, 'Found team bios');
$mech->content_contains('Jeff Boes', 'Author was mentioned');

This test declares that we will run four tests. It initializes the test framework with a call to the "new" method. Then it executes the four tests, annotating each one with a message that lets us identify which test failed by a human-friendly string rather than a bare number.

The first test just checks that the framework can retrieve the home page; failure would be caused by a server problem, DNS failure, etc. The second test just verifies that the page title contains a particular text pattern (the name of our company). The third test finds a link (in this case, based on a pattern of text in the link; we could also locate a link by URL, for example) and verifies that the framework can follow the link. The fourth and final test verifies that the author's name appears on the page.

From simple building blocks like this, more and more complex tests can be built up. Through the underlying framework, a test script can:

  • set and retrieve form field values, including checkboxes and selectors
  • submit forms
  • set and retrieve cookie values
  • analyze images
  • provide credentials for HTTP Basic Authentication (for password-protected sites)

End Point has used this approach with success. For example, the order and checkout process on CityPass uses a sequence of tests designed to place orders for every product offered, in various combinations. The test script makes a connection to the site's PostgreSQL database allowing it to compare the resulting order receipts with the matching database entries.

The major failing of the Mechanize family is that JavaScript is not supported. Thus, this framework is not suitable for testing pages for which major parts of the functionality are provided through JavaScript.

HTTP::Recorder

This framework, another Perl module, is really a system for constructing test scripts for use with WWW::Mechanize. It doesn't offer any testing facility on its own; instead, it is designed to line up between a browser and a web application, recording the mouse clicks and keystrokes made, and emitting a test script that is then fed through WWW::Mechanize (perhaps after suitable manual adjustment).

Again, this system doesn't recognize, operate on, or record JavaScript events, so it's not as useful for testing sites with large amounts or critical sections of JavaScript.

Selenium

Selenium is a framework rather unlike the previous entries, although from the view of the programmer developing a test script or suite, it doesn't seem that much different. Selenium has several components; the one that interests us most for this particular survey is "Selenium RC" (Remote Control). This component services requests from a test script written much like the WWW::Mechanize scripts. The Selenium RC server will start up a browser and translate test script requests into actual mouse and keyboard events on the controlled browser.

Selenium works with several different browsers, such as Firefox and Microsoft Internet Explorer. For the vast majority of test scripts, the only change required to switch from testing one browser platform to another is to change a single line in the initial server request.

Selenium works with JavaScript events and functionality. You can, for instance, test JavaScript "onmouseover" events, or field validation through "onchange" or "onsubmit". Your test scripts can check for JavaScript alerts and respond to them, and behave in nearly every way just as a real user would, sitting in front of a real browser.

Selenium RC is implemented as a Java application, which means that its environment must include a Java installation (JVM).

The drawback of Selenium is that since it must be run in an environment that includes a browser and window display system, you'll almost certainly need to run your test script on a workstation, or a server with all the windowing software installed.

Other approaches

  • OpenSTA (Open System Testing Architecture) is more of a heavy-load testing framework, although it does provide a scripted setup.
  • Usability testing environments such as WAUTER are designed to observe and record end-user actions (such as scrolling and mouse clicks) for later analysis.

Using YSlow to analyze website performance

While attending OSCON '08 I listened to Steve Souders discuss some topics from his O'Reilly book, High Performance Web Sites, and a new book that should drop in early 2009. Steve made the comment that 80%-90% of the performance of a site is in the delivery and rendering of the front end content. Many engineers tend to immediately look at the back end when optimizing and forget about the rendering of the page and how performance there affects the user's experience.

During the talk he demonstrated the Firebug plugin, YSlow, which he built to illustrate 13 of the 14 rules from his book. The tool shows where performance might be an issue and gives suggestions on which resources can be changed to improve performance. Some of the suggestions may not apply to all sites, but they can be used as a guide for the engineer to make an informed decision.

On a related note, Jon Jensen brought to our attention a blog posting reporting that Google is planning to incorporate landing page load time into its quality score for AdWords landing pages. Given that, front-end website performance will become even more important, and there may come a point where load times come into play when determining natural rank in addition to landing page scores.

Major rumblings in the browser world

Wow. There's a lot going on in the browser world again all of a sudden.

I recently came across a new open source browser, Midori, still in alpha status. It's based on Apple's WebKit (used in Safari) and is very fast. Surprisingly fast. Of course, it's not done, and it shows. It crashes, many features aren't yet implemented, etc. But it's promising and worth keeping an eye on. It's nice to have another KHTML/WebKit-based browser on free operating systems, too.

Now today news has come out about Google's foray into the browser area, with a browser also based on WebKit called Chrome. It'll be open source, include a new fast JavaScript engine, and feature compartmentalized JavaScript for each page, so memory and processor usage will be easy to monitor per application, and individual pages can be killed without bringing the whole browser down. Code's supposed to become available tomorrow.

A new generation of JavaScript engine for Mozilla is now in testing, called TraceMonkey. It has a just-in-time (JIT) compiler, and looks like it makes many complex JavaScript sites very fast. It sounds like this will appear formally in Firefox 3.1. Information on how to test it now is at John Resig's blog.

And finally, Microsoft is adding a new "InPrivate" browsing mode to Internet Explorer 8, which now has a public beta. Unlike all of the above, it will ... not be open source. :)

Nice to see so much movement.