TeamCo AntHill: July 2010

Saturday, July 31, 2010

Speed Tracer Server-side Tracing with Rack

As anyone who has ever developed an interactive web app will know, Firebug (Firefox) and Inspector (WebKit) are your best friends. In fact, in many cases these tools are the "IDEs of choice" for manipulating the DOM, debugging JavaScript, and even working with CSS. However, have you ever wondered how many CPU cycles that script really took, or how much time the browser spent in reflow, as compared to just parsing the data? That's where Google's Speed Tracer comes in. Provided as a Chrome extension, it instruments the V8 VM and the rest of the browser to provide detailed data about GC cycles, painting, parsing, network resource loading, and more.
The low-level data provided by Speed Tracer finally allows us to peek under the covers to understand what the browser is actually doing - think strace, but for your browser. However, what if we could also bridge the gap between client-side and server-side tools? Wouldn't it be nice if we could go beyond the simple latency and response time reporting for network resources to viewing a full server-side log of what happened, all in one tool? Well, that's exactly what the Speed Tracer team launched at Google I/O, so let's take a look at how it works!

Server-side tracing with SpeedTracer

The actual mechanics of getting server-side performance data into Speed Tracer are clever and simple. Whenever the browser receives a network response, it looks for the X-TraceUrl header, which specifies the relative URL and the unique ID of the trace for that specific request. From there, if the developer expands the network resource which provided the X-TraceUrl, Speed Tracer makes a request for the server-side trace, parses the JSON, and surfaces it in the UI. This means that the server-side data is brought in on demand and does not affect the actual load times of your resources, and also that you need a mechanism to record, store, and serve these traces later.
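
The record, store, and serve cycle described above can be sketched as a tiny Rack-style middleware. This is only an illustration under stated assumptions: the class name TraceHeaderSketch, the /trace_sketch path, the in-memory Hash store, and the fields in the stored record are all hypothetical, and the JSON layout that Speed Tracer actually expects is not reproduced here.

```ruby
require 'json'
require 'securerandom'

# Hypothetical middleware: tags each response with X-TraceUrl and serves
# the stored record later, when (and only if) it is requested.
class TraceHeaderSketch
  TRACE_PATH = '/trace_sketch' # assumed path, not a Speed Tracer requirement

  def initialize(app, store = {})
    @app = app
    @store = store # in-memory only; a real setup needs persistent storage
  end

  def call(env)
    return serve_trace(env) if env['PATH_INFO'] == TRACE_PATH

    id = SecureRandom.hex(8)
    started = Time.now
    status, headers, body = @app.call(env)

    # Placeholder record; real traces carry much richer data.
    @store[id] = {
      'id' => id,
      'path' => env['PATH_INFO'],
      'duration_ms' => ((Time.now - started) * 1000).round(2)
    }

    [status, headers.merge('X-TraceUrl' => "#{TRACE_PATH}?id=#{id}"), body]
  end

  private

  # Looks up the trace named by the id query parameter.
  def serve_trace(env)
    id = env['QUERY_STRING'].to_s[/(?:\A|&)id=([^&]+)/, 1]
    trace = @store[id]
    return [404, { 'Content-Type' => 'text/plain' }, ['trace not found']] unless trace

    [200, { 'Content-Type' => 'application/json' }, [JSON.generate(trace)]]
  end
end
```

Because the trace is only fetched through the URL in the header, the cost of serializing it is paid on demand rather than on every page load.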

Server side tracing with Rack & Ruby

The original announcement at Google I/O demoed this new functionality on top of GWT and the Spring TC server. However, this same functionality is also easily extracted into a Rack middleware - which is what I did. In fact, here is a preview of a sample trace from a Rails 3 application using rack-speedtracer:

The middleware takes care of providing the headers, persisting the traces, and then serving all the data back to Speed Tracer in the format it expects and understands under the hood. As a developer, you simply need to require the middleware, and then instrument your code where you want performance data to be recorded. Let's take a look at a simple configuration for a Rails 3 application:

# in your Gemfile
gem 'rack-speedtracer', :require => 'rack/speedtracer'
 
# in development.rb environment
config.middleware.use Rack::SpeedTracer
 
# define a widgets controller
class WidgetsController < ApplicationController
  def index
    env['st.tracer'].run('Widgets#index') do
      env['st.tracer'].run("ActiveRecord: Widgets.all") do
        Widget.all
      end
 
      env['st.tracer'].run('Render') { render :text => 'oh hai' }
    end
  end
end
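
To make the nesting in the controller above more concrete, here is a minimal sketch of a tracer whose run method records nested, timed events. SketchTracer and its event hash layout are hypothetical names for illustration; this is not the rack-speedtracer implementation.

```ruby
# Hypothetical tracer: each run call becomes an event, and events started
# inside another run block are stored as its children.
class SketchTracer
  attr_reader :root_events

  def initialize
    @root_events = []
    @stack = [] # events whose blocks are currently executing
  end

  def run(name)
    event = { :name => name, :children => [] }
    parent = @stack.last
    (parent ? parent[:children] : @root_events) << event
    @stack.push(event)
    started = Time.now
    yield
  ensure
    # Record the duration even if the block raised.
    event[:duration_ms] = ((Time.now - started) * 1000).round(2)
    @stack.pop
  end
end

tracer = SketchTracer.new
tracer.run('Widgets#index') do
  tracer.run('ActiveRecord: Widgets.all') { sleep 0.01 }
  tracer.run('Render') { 'oh hai' }
end
```

After the call, tracer.root_events holds one top-level event with two children, which mirrors the tree view a trace viewer would draw.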

http://www.igvita.com/2010/07/19/speed-tracer-server-side-tracing-with-rack/

Friday, July 30, 2010

RDoc opts

All Rails apps contain some pre-baked RDoc integration in the form of a doc directory, a doc/README_FOR_APP “main” doc file, and some Rake tasks, which you can see below:


$ rake -T doc
rake doc:app              # Build the app HTML Files
rake doc:clobber_app      # Remove rdoc products
rake doc:clobber_plugins  # Remove plugin documentation
rake doc:clobber_rails    # Remove rdoc products
rake doc:plugins          # Generate documentation for all installed plugins
rake doc:rails            # Build the rails HTML Files
rake doc:reapp            # Force a rebuild of the RDOC files
rake doc:rerails          # Force a rebuild of the RDOC files
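
RDoc options like the ones behind these tasks can also be set in a task of your own. Below is a minimal sketch using RDoc::Task from the rdoc library; the task name app_docs_sketch, the output directory, the title, and the file globs are assumptions chosen for illustration, not what Rails itself generates.

```ruby
require 'rdoc/task'

# Hypothetical task name and paths, for illustration only.
RDoc::Task.new(:app_docs_sketch) do |rdoc|
  rdoc.rdoc_dir = 'doc/app_sketch'          # where the HTML is written
  rdoc.title    = 'Application Documentation'
  rdoc.main     = 'doc/README_FOR_APP'      # page shown first
  rdoc.options << '--line-numbers' << '--charset' << 'utf-8'
  rdoc.rdoc_files.include('doc/README_FOR_APP', 'app/**/*.rb', 'lib/**/*.rb')
end
```

Placed in a Rakefile, this would define an app_docs_sketch task alongside matching clobber and rebuild tasks.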

Scraping with style: scrAPI toolkit for Ruby

There are a lot of ways to scrape HTML.

There are regular expressions; they deal well with text. But HTML is not just text, it’s markup. So you have to deal with elements that are implicitly closed, or out of balance. Attributes are sometimes quoted, sometimes not. Nested lists and tables are a challenge. Good regular expressions take a lot of time to write, and are impossible to read.
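
As a small illustration of the quoting problem, the sketch below runs two patterns over a made-up HTML fragment. The fragment and both pattern names are assumptions for the example.

```ruby
html = '<a href="/one">One</a> <a href=/two>Two</a>'

# A first attempt that assumes every attribute value is double-quoted.
STRICT_HREF = /<a\s+href="([^"]*)"/

# A more tolerant pattern: optional quote, then everything up to a quote,
# whitespace, or the closing bracket. Already harder to read.
LOOSE_HREF = /<a\s+href=["']?([^"'\s>]+)/

strict_links = html.scan(STRICT_HREF).flatten
loose_links  = html.scan(LOOSE_HREF).flatten

puts strict_links.inspect # the unquoted link is silently missed
puts loose_links.inspect
```

Neither pattern copes with extra attributes before href, which is the kind of case that keeps making such expressions longer.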

Or you can clean up the HTML with Tidy, get a DOM and walk the tree. The DOM is much easier to work with: it’s clean markup with a nice API. But you have to do a lot of walking to find the few elements you’re scraping. That’s still too much work.

http://blog.labnotes.org/2006/07/11/scraping-with-style-scrapi-toolkit-for-ruby/

http://railscasts.com/episodes/173-screen-scraping-with-scrapi

Tuesday, July 20, 2010

Dive Into Dijit

One huge feature that sets the Dojo Toolkit apart from other JavaScript libraries is its UI component system: Dijit. A flexible, comprehensive collection of Dojo classes (complemented by corresponding assets like images, CSS files, etc.), Dijit allows you to create flexible, extensible, stylish widgets. To learn how to install, configure, and use basic Dijits within your web application, keep reading!

http://www.sitepen.com/blog/2010/07/12/dive-into-dijit/

Aloha Editor - The HTML5 Editor

The world's most advanced browser-based editor lets you experience a whole new way of editing. It’s faster than existing technologies and offers unprecedented functionality.

No reload. No popup. No need to preview...

Click and edit. This is all you need. When you start editing, you do not need to reload the website as you would with an old-style rich text editor (RTE). Just click and edit. Other editors need to reload the website or come with a popup. Both actions are annoying and take time, time you would better invest in crafting your text. While editing you want to see the results immediately without previews or reloads. With Aloha Editor you work on the final document. You see what you get - with every keystroke! Only Aloha Editor is "What you see is what you get" (WYSIWYG).

The floating menu. A brand new lightweight context menu.

The floating menu offers you the right options matching your editing context. The menu floats to the paragraph, table or content element you are editing. Thus the menu is as near as it can be. Aloha Editor is designed not to show more than 15 icons at the same time. This offers you a clear overview of the options at hand. You only see those icons which are useful to you, as determined by your selection or cursor position. Still, you may have other options available just one mouse click away!

http://aloha-editor.com

Monday, July 19, 2010

Whenever

Whenever is a Ruby gem that provides a clear syntax for defining cron jobs. It outputs valid cron syntax and can even write your crontab file for you. It is designed to work well with Rails applications and can be deployed with Capistrano. Whenever works fine independently as well.
Ryan Bates created a great Railscast about Whenever: railscasts.com/episodes/164-cron-in-ruby
Discussion: groups.google.com/group/whenever-gem
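
For a sense of the syntax, here is a short sketch of a config/schedule.rb file. The job targets (Widget.expire_stale, reports:nightly, and the script path) are hypothetical names used for illustration, and the file is meant to be read by the whenever command rather than run directly.

```ruby
# config/schedule.rb -- evaluated by the `whenever` command.
# All job targets below are hypothetical.

every 2.hours do
  runner 'Widget.expire_stale'   # runs Ruby code inside the Rails app
end

every 1.day, :at => '4:30 am' do
  rake 'reports:nightly'         # runs a Rake task
end

every :sunday, :at => '12pm' do
  command '/usr/local/bin/rotate_logs_sketch' # runs a shell command
end
```

Running whenever in the application directory prints the resulting cron entries, and whenever --update-crontab writes them to your crontab.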

http://github.com/javan/whenever

Sunday, July 18, 2010

Cucumber Screenshot

Cucumber Screenshot makes it easy to capture HTML snapshots and PNG screenshots of the pages generated by your Rails application as it runs your Cucumber/Webrat features.
It uses WebKit to generate the PNG screenshots and so they are only available for OS X.
If you want to take bitmap screenshots on any other platform then take a look at this example from Cucumber.

http://github.com/mocoso/cucumber-screenshot

Make a browser screenshot in Ruby with Selenium

First you need to have Selenium RC installed and launched. It's pretty simple. Download it, go to the selenium-server-1.0 directory and enter on the command line:

java -jar selenium-server.jar

Your Selenium server is started on port 4444, ready to be used! You also need the selenium-client gem installed.

sudo gem install selenium-client

Your hard drive is now a bit less empty. We can start having fun with code!

require 'rubygems'
require 'selenium'

# We load Selenium
@selenium = Selenium::SeleniumDriver.new("localhost", 4444, "*firefox", "http://42.dmathieu.com/", 10000);
@selenium.start

# We go to the main page and take the screenshot
@selenium.open "/"
@selenium.capture_entire_page_screenshot(File.join(File.expand_path(File.dirname(__FILE__)), 'screenshot.png'), '')

# We unload Selenium
@selenium.stop

We load the required libraries. Not complicated. We only need Selenium.

require 'rubygems'
require 'selenium'

Then we load Selenium, indicating the URL we wish to visit and the browser with which we want to visit it.

@selenium = Selenium::SeleniumDriver.new("localhost", 4444, "*firefox", "http://42.dmathieu.com/", 10000);
@selenium.start

We load the page, take the screenshot and save the created image.

@selenium.open "/"
@selenium.capture_entire_page_screenshot(File.join(File.expand_path(File.dirname(__FILE__)), 'screenshot.png'), '')

And we don't forget to stop the session, which frees its resources.

@selenium.stop

And then the magic happens. Our beautiful screenshot (of the entire page, not only the screen) is then generated.

http://dmathieu.com/en/ruby/make-a-browser-screenshot-in-ruby-with-selenium

Win32::Screenshot

Capture Screenshots on Windows with Ruby. This library captures screenshots in bmp format, but you may use RMagick to convert these to some other formats like png.

http://github.com/jarmo/win32screenshot

JRuby Toolkit Robot example - How to take a screenshot of your desktop

require 'java'

# java_import is used in place of include_class, which newer JRuby
# releases no longer provide. Unused imports have been dropped.
java_import 'java.awt.Rectangle'
java_import 'java.awt.Robot'
java_import 'java.awt.Toolkit'
java_import 'javax.imageio.ImageIO'

# Capture the whole primary screen and save it as a PNG file
screen_size = Toolkit.getDefaultToolkit.getScreenSize
rect = Rectangle.new(screen_size)
image = Robot.new.createScreenCapture(rect)
ImageIO.write(image, 'png', java.io.File.new('test.png'))

http://www.devdaily.com/blog/post/ruby/jruby-code-that-takes-screenshot-of-your-desktop-saves-it-as-fi

Friday, July 16, 2010

HTML to PDF Conversion

Once a business web application reaches a certain size, the need often arises to generate PDFs from HTML/CSS.

Up until recently, the story around this for an MRI Rails application was not good. You could either use tools like Prawn, which require a description of the layout in a specific DSL, or pay for a tool like Prince XML which can convert from HTML, but which costs quite a bit. Those using JRuby were in a stronger position as they could use the Java PDF library called Flying Saucer.

The good news is that PDF generation for MRI Ruby is now easy and free, thanks to WebKit, the open source WebKit wrapper called wkhtmltopdf, and mileszs's wicked_pdf plugin. I was really excited to come across this plugin and started to use it right away. However, it had a couple of issues:

Temp file handling caused errors when two PDFs were being generated within the same second (eg, 2 requests at almost the same time)
Problems generating PDF were not reported

Galdomedia forked the code and updated it to use standard Ruby temp files. This was great for Ruby 1.8.7, but not good for Ruby 1.8.6, which does not allow you to set the extension on temp files (wkhtmltopdf relies on having a .html extension).

As my production servers run Ruby 1.8.6, I needed a different approach. My fork uses streams rather than temporary files, and adds some basic error handling and basic integration tests.
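
The two approaches can be sketched with the standard library alone. This is an illustration and not the plugin's code: the helper names html_tempfile and pipe_through are hypothetical, and the wkhtmltopdf invocation in the closing note assumes the binary is installed and reads stdin and writes stdout when given - as the file name.

```ruby
require 'open3'
require 'tempfile'

# Approach 1: a temp file with an explicit extension. Passing an array
# of [basename, extension] is what older Rubies did not support.
def html_tempfile(html)
  file = Tempfile.new(['wicked_pdf_sketch', '.html'])
  file.write(html)
  file.flush
  file
end

# Approach 2: stream the input through a command, with no temp files.
# A non-zero exit status is raised instead of being silently ignored.
def pipe_through(command, input)
  output, error, status = Open3.capture3(*command, :stdin_data => input, :binmode => true)
  raise "#{command.first} failed: #{error}" unless status.success?
  output
end
```

Under those assumptions, a conversion would look like pdf = pipe_through(['wkhtmltopdf', '-', '-'], html). Because nothing is written to disk, two requests arriving in the same second cannot collide on a file name.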

To install in a Rails app:

script/plugin install git://github.com/jcrisp/wicked_pdf.git

Or clone the code from GitHub.

http://jamescrisp.org/2009/11/12/html-to-pdf-conversion-plugin-for-rails-a-fork-of-wicked-pdf/

An XML/XHTML/CSS 2.1 Renderer

Release 8 – Final (R8)

April 18, 2009: We are proud to present our R8 release. This release includes no changes since R8 release candidate 2.

Our R8 release includes a number of improvements over R7, as well as bug fixes. You can read the complete list of changes on our news page; here are some highlights about what you'll find in R8:

PDF
- Upgrade to use iText 2.0.8
- Support adding custom header properties on PDF output
- Add ability to set PDF version programmatically
- Add ability to manipulate PDF output document before it's closed
- Add ability to have different starting page number for first document in multiple documents
- Add API to retrieve PDF page and coordinates for boxes with an ID attribute
- Implement CMYK color support for PDF output
- Support encryption of PDF output
Swing
- Basic support for selection, highlighting and copying
General
- Expose copy of parsed entities from catalog.
- Preliminary support for data URLs
- Support True Type Collection (.ttc) files
- Preliminary support for Type 1 fonts
- Support logging via Log4J as an alternative to JDK logging (requires recompile)
- Handle hidden form elements
- Rudimentary support for JavaScript links (from Dan Kaplan)
- Support for callback on form submission
CSS
- Preliminary support for @font-face rules
- Implement partial support for leader and target-counter (patch from Karl Tauber)
- Table pagination. When turned on (by setting the -fs-table-pagination property to paginate vs. a default of auto), tables and table cells will be closed (with appropriate borders and padding) when a page ends and reopened when a page starts. Additionally, a table's thead and tfoot will be repeated on each page.
- CSS3 margin boxes
- Named pages
- Running elements
- Namespace-aware CSS matching, for example, with attributes [although this also applies to elements]
- Pseudo-elements may now be specified with a double colon
- Substring selectors
- The background property can now be used in a @page context – CSS 3 spec
- Custom property to limit the scope of the page and pages counters to a portion of the document
- Custom property that instructs FS to try to avoid breaking a box so that only borders and padding appear on a page

https://xhtmlrenderer.dev.java.net/

Wednesday, July 14, 2010

Making urls look memorable

Bluga.net WebThumb provides a white-label web service API for generating web thumbnails and full size snapshots of websites.

WebThumb offers more features and quicker response times than any other service.

Real-time thumbnails
Flash 10 Support
PDF Support
Quick response times
REST API
API clients for PHP, Ruby, Python
Cache the thumbnails on your server or on WebThumb's
Browser windows from 75x75 to 1280x2048

http://snippets.dzone.com/posts/show/3621
http://www.paulhammond.org/webkit2png/
http://stackoverflow.com/questions/726660/how-do-i-make-beautiful-screenshots-of-web-pages-using-ruby-and-a-unix-server
rbwebkitgtk: http://github.com/danlucraft/rbwebkitgtk/tree/master

Moz snap shooter: http://www.lilik.it/~mirko/Ruby-GNOME2/moz-snapshooter.rb
http://www.hackdiary.com/2004/06/13/taking-automated-webpage-screenshots-with-embedded-mozilla/

Selenium RC has a Ruby interface and can grab a screenshot using: http://release.seleniumhq.org/selenium-remote-control/1.0-beta-2/doc/ruby/classes/Selenium/Client/GeneratedDriver.html#M000220

PageGlimpse is a service providing developers with programmatic access to thumbnails of any web page. The thumbnails can be used in virtually any kind of application that requires the display of website screenshots: web sites, Windows/Linux/Mac applications, iPhone/mobile utilities, browser plugins, etc.
Including web site thumbnails in your application will dramatically improve the user experience. The service is easy to use, fast and reliable, with no restriction on thumbnail sizes or number of hits.
http://www.pageglimpse.com/
