Using Human Computation to answer web performance questions

Recently I had lunch with Drit Suljoti (@dritans) of Catchpoint, which sponsors the NY Web Performance Meetup, and as always with Drit, we had a great talk about emerging technologies, product ideas and ways to make the web faster.

At some point, in a discussion about casual gaming, I remembered a great series of projects by Luis von Ahn (of CAPTCHA and reCAPTCHA fame), who has been working on so-called “human computation”.

One of the ideas he worked on was developing games that humans can play while producing, as a side product, data that helps computers with tasks they are not very good at. He created the ESP Game, which was later licensed to Google to become Google Image Labeler, and Peekaboom (now unavailable).

These two games tackled the problem of object identification in images – a task that computers might never be able to master, at least during our lifetime. I really recommend his awesome presentation on human computation, CAPTCHAs and games – watch it all, it’s worth it.

So, talking to Drit about performance, I realized that human computation could be the answer to one of the problems the web performance field has been struggling with for a while: how to automatically identify the “enough” point of a web page – the moment when it becomes reasonable for a user to engage with the page – a metric we can use to optimize web page performance.

This metric is notoriously hard to measure: browser events only reveal the technical aspects of page load and tell us little about the user’s perception. Even render events are not good enough – having something drawn on the page dramatically reduces the feeling that “nothing is happening”, but it still doesn’t mark the “enough” moment.
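
For illustration, here is a rough sketch (my own, not any particular library) of the kind of purely technical timings a browser can report – none of them capture that “enough” moment:

// Technical milestones only: DOMContentLoaded, onload and, where supported,
// the Navigation Timing API. None of these reflect the user's perception.
var navStart = (window.performance && performance.timing)
    ? performance.timing.navigationStart
    : Date.now(); // fallback: approximately when this script started running

document.addEventListener('DOMContentLoaded', function () {
    console.log('DOM ready at', Date.now() - navStart, 'ms');
});

window.addEventListener('load', function () {
    console.log('onload fired at', Date.now() - navStart, 'ms');
});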

There are some algorithms that try to define “enough” by analyzing the drawing activity in the visible area of the page (the so-called “above the fold” area, a term we got from newspapers), but they are still not very precise and run into the same constraints that any object recognition algorithm will hit.
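
To make the idea concrete, here is a deliberately simplified sketch of my own (not how any real tool implements it): such heuristics boil down to comparing above-the-fold screenshots against the final frame and picking the first one that looks “complete enough” – which says nothing about whether the matching pixels are the ones the user actually cares about.

// Simplified sketch: given a filmstrip of above-the-fold screenshots (each
// frame reduced to a flat array of pixel values) and their timestamps,
// return the time of the first frame that is "close enough" to the final one.
function visuallyReadyTime(frames, timestamps, threshold) {
    var finalFrame = frames[frames.length - 1];
    for (var i = 0; i < frames.length; i++) {
        var matching = 0;
        for (var p = 0; p < finalFrame.length; p++) {
            if (frames[i][p] === finalFrame[p]) {
                matching++;
            }
        }
        if (matching / finalFrame.length >= threshold) {
            return timestamps[i];
        }
    }
    return null;
}

// Example: visuallyReadyTime(filmstrip, frameTimes, 0.9) – first moment the
// viewport was 90% identical to its final state.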

Having had this problem in my head for quite a while – looking for an ideal metric that Show Slow can track, one that can be put on weekly reports and even t-shirts – I realized that we might not be able to give a good answer by computerized means alone. It is very possible that the only way to achieve reasonable results is to use human brains.

So the solution could look like a game, or at least a game-like environment, in which people are given a simple task that, as a side product, produces enough data for us to identify performance metrics using statistical analysis.

Right now I see a few important metrics that people can provide, voluntarily or involuntarily: the time at which the page rendered to its “enough” state, and the points on the screen that represent the most important visual information.

Both of these metrics can probably be collected simultaneously, using the same process: players might be identifying the page’s content or providing timing directly.
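
As a rough illustration (a hypothetical sketch – the collection endpoint and field names are made up), the in-page part of such a game only needs to record when the player clicked and where:

// Hypothetical collector for the two metrics: when the player felt the page
// was "enough" and which points on the screen they considered important.
var gameStart = (window.performance && performance.timing)
    ? performance.timing.navigationStart
    : Date.now();

document.addEventListener('click', function (e) {
    var observation = {
        url: location.href,
        enoughTime: Date.now() - gameStart, // perceived "enough" moment, in ms
        x: e.pageX,                         // a point the player considered important
        y: e.pageY
    };
    // Ship it off for later statistical analysis (hypothetical endpoint)
    new Image().src = '/collect?d=' + encodeURIComponent(JSON.stringify(observation));
});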

Luckily, many tools – including the free and open source WebPageTest.org and Catchpoint’s commercial monitor – can already capture images of page rendering activity, so getting the content is not a problem.

So, what it all means is that people can do very well where computers can’t, and human computation can come to the rescue for some of the web performance industry’s most complex problems.


14 Responses to Using Human Computation to answer web performance questions

  1. How about this…

    A JavaScript event API. Here’s how it would work – you create your own events and use them as timers for anything you want to measure. These events would span all HTML documents, CSS and JavaScript include files. You could create events for starting and stopping, get the elapsed time for an event (without removing it), and even reset a timer to zero without removing the event.

    All of this would run inside a mobile browser along with all the other standard measurement stuff (measuring how long it takes to get the images, etc.). What you would now have is a way to interact with the customer and use the timers as triggers.

    Cheers,

    Peter
    5o9 Inc.

  2. I’m not 100% sure it should be done in real time with real users.

    Measuring stuff in general definitely needs to happen – things like Jiffy (not sure if it is actively maintained), Boomerang and maybe other frameworks that allow measurement instrumentation are going to provide you with some metrics.

    The problem is that some of these measurements, like that generic “enough” moment, are not technically possible without modifying the page to include API calls. That’s why I think human computation can solve some of these problems.

  3. The problem is the “feedback” mechanism. The event is the trigger which allows us to measure – so IMO it has to be inside the page, otherwise you can’t measure the “enough” point. What you want is enough “events” to define a usable matrix of “enough”.

    Something like this…

    js.event( “MyEvent1”, “start” );
    js.event( “MyEvent1”, “show” );
    js.event( “MyEvent1”, “stop” );

    With the show event you can tie it to a real user interaction – the timer doesn’t have to stop until later.
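
    A minimal sketch of what could sit behind that interface (just my interpretation of the pseudocode above, not an existing library):

    // Minimal sketch of the js.event(...) interface from the pseudocode above.
    var js = (function () {
        var events = {};
        return {
            event: function (name, action) {
                var now = Date.now();
                if (action === 'start' || action === 'reset') {
                    events[name] = { start: now, shown: null, stop: null };
                } else if (action === 'show') {
                    events[name].shown = now; // e.g. tied to a real user interaction
                } else if (action === 'stop') {
                    events[name].stop = now;
                }
                return events[name];
            },
            elapsed: function (name) {
                var e = events[name];
                return (e.stop || Date.now()) - e.start; // elapsed without removing the event
            }
        };
    })();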

    Cheers,

    Peter

  4. Drit says:

    Sergey,

    One way to do it in real time is to implement something similar to Fastsoft’s Which Site Loads Faster http://whichloadsfaster.com/

    The concept would be to frame the URL under test and, on the parent page, track its progress (through the Web Timing API?) and provide controls for the user to give feedback on when the page loaded. The main concern would be cross-frame security limitations (I have not tested the Web Timing API in such a scenario). The biggest pro is that it is based on real performance data… and the user could take into account whether the page is interactive or not.
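
    Something like this, as a very rough sketch (the button id is made up, and reading the frame’s own timing data would run into the cross-frame limits mentioned above unless the pages are same-origin):

    var frame = document.createElement('iframe');
    var frameStart;

    // "I consider this page loaded" control for the user (hypothetical button id)
    document.getElementById('loadedButton').addEventListener('click', function () {
        console.log('User felt the page was loaded after', Date.now() - frameStart, 'ms');
    });

    // Technical milestone, for comparison with the user's perception
    frame.addEventListener('load', function () {
        console.log('Frame onload fired after', Date.now() - frameStart, 'ms');
    });

    frameStart = Date.now(); // start the clock and point the frame at the URL under test
    frame.src = 'http://www.example.com/page-under-test';
    document.body.appendChild(frame);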

    In any case, I think we should pursue both options and try them out!

    drit

  5. Yeah, I showed it to Ryan and he was excited too – he mentioned that he had plans for a “which did you feel was faster” button: https://github.com/ryanwitt/whichloadsfaster/issues#issue/2

  6. Interesting idea. I’m not sure how it works in practice though. You still have to define a way to measure the “feel”. That requires that there’s something inside the browser page where you can take timing measurements.

    Secondly – what about mobile? I just tested whichloadsfaster.com on Android, and in a Google vs. Bing test only Bing would load (probably a cross-site scripting issue).

    Whichever way you go, I think it’s important to A) define “feel” and B) include mobile.

    Peter

  7. I think the whole point of using human computation is to derive “feel” from users’ actions, and that’s what will make it possible – otherwise solving this with high-quality results is very unlikely.

  8. Ryan Witt says:

    Had this page open yesterday and was about to comment but got distracted. :)

    There are really three topics here:
    (1) Discovering what parts of the page matter
    (2) Discovering relative page load speed
    (3) Using a game to measure either of these things

    Sergey, if I understand him correctly, is trying to do (1) and (3), which I now agree are much more important than (2).

    We probably all saw the study and realize that it’s important for humans to feel that the machine is not the bottleneck, but identifying the right metric is tough. It’s not as simple as “above the fold”, and the metric is always tweaked manually, as you can gather from many of the “we made our homepage faster” talks (like @slicknet’s Velocity presentation on Y! homepage performance).

    This is probably one of the most important questions in web performance because it tells you what to actually optimize (rather than the easy to measure but somewhat inaccurate metrics we go for now). Trading accuracy for ease of measurement is not something we can do forever if we really want to make the web integrate into the lives of each human. Having a tool to help find this metric would be awesome.

    This definitely applies to whichloadsfaster.com: Originally I wanted users just to (2) relatively rank the page speeds (“which seemed faster?”), but I also had the idea to have the user click the button right when each page loads, trying to race to get an accurate button click right when they notice it loaded (1). You could provide feedback based on when other people had clicked and thereby make it into a game. I did not think this would be practical, but after talking to Sergey, I think this would be much more valuable and fun. I’m sure once we have a good goal, we’ll figure out how to engineer it. :)
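
    For instance, the statistics side could be as simple as trimming the outliers of everyone’s “I noticed it loaded” clicks and taking the median (just a sketch, not a worked-out design):

    // Turn many noisy "I noticed it loaded" clicks (ms from navigation start)
    // into one robust estimate of perceived load time.
    function perceivedLoadTime(clickTimes) {
        var sorted = clickTimes.slice().sort(function (a, b) { return a - b; });
        var trim = Math.floor(sorted.length * 0.1);     // drop the fastest/slowest 10%
        var trimmed = sorted.slice(trim, sorted.length - trim);
        return trimmed[Math.floor(trimmed.length / 2)]; // median of what is left
    }

    // perceivedLoadTime([2100, 2300, 2250, 9000, 2200, 1900]) === 2250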

    However we gather it, once we have some perception data on when a page “loaded”, we could use this to train a machine learning algorithm to do the same. Something like recognizing the properties of a DOM element that “counts” as part of the initial page load.

    I think there’s a lot of potential here to help web performance. Maybe we should meet up to discuss more. Would be easy for the NYC people to get together for coffee. Drit, Sergey, you guys up for it?

  9. Ryan Witt says:

    Look what presentation is coming up on the online Velocity conf: http://en.oreilly.com/velocity-mar2011/public/schedule/detail/18692

  10. Pingback: 22 not-so-short links about web performance — Web Performance Today

  11. @Ryan, I’m all for it – let’s discuss. I’m not sure we’re ready for the machine learning case, but collecting some statistical data to get at the real time might be doable.

    I saw the AFT (above-the-fold time) presentation on the schedule – there are some docs about it on WebPerfCentral here: http://www.webperformancecentral.com/wiki/WebPagetest#AFT_Documentation – I will definitely be asking questions ;)

  12. Steve Thair says:

    “Above the fold” is a term taken from the publishing industry for a reason… and I wonder if that terminology only applies TO the publishing industry, i.e. pages that are meant to be “read”, e.g. newspaper articles.

    Unfortunately, that is only a subset of the pages on the internet.

    For example, there are pages that are designed to be interacted with (applications), pages that are designed to be “viewed” as opposed to “read” (e.g. Flickr), and pages that try to match my “intent” (e.g. search pages, where the accuracy of the search algorithm determines how much of the page is relevant).

    So there is a problem prior to the “when is the page loaded” problem, which is the “what sort of page is it” problem. The answers are interdependent.

    Let’s take some concrete examples:

    (1) I click a link on my favourite gadget site “www.trustedreviews.com” to read a new review of the Xoom. Before the page has completely loaded I can happily start reading the first paragraphs of the review once the main content pane has started to load, whilst the rest of the navigation furniture and the “below the fold” elements load. “Ready time is less than load time.”

    (2) After reading the review I decide to check out some prices… so I nip over to Google shopper. I want to see all the prices and then sort by price… so I have to wait for the entire page to load before I can sort. “Ready time = Load time”.

    (3) I then click on a product link and, as the page starts to load, I click on the BUY button as soon as it appears. I don’t want to read more about the product, I’ve already done my research. The Buy button may or may not work depending on the JavaScript loading order and how the onclick/submit function is handled. “Ready time??? Load time???”

    (4) Once into the basket and checkout process I am probably in a workflow where “Ready time = Page load” or even “Ready time greater than Page load”, whilst onload-event JavaScript needs to fire to set up validation routines etc. before I can interact with the page.

    And so on…

    So if we look at Ryan’s list above, we need to add:

    (0) What sort of page is it?
    (1) Discovering what parts of the page matter for this type of page
    (2) Discovering relative page load speed
    (3) Using a game to measure either of these things

    I hope that makes sense?

  13. Peter Booth says:

    I read the article with interest and started turning it over in my head. I think that the “enough time” is a hard problem, intrinsically indeterminate other than in a probabilistic fashion. As a performance guy, one of the nicest things to do with hard problems is to find a way to avoid having to solve them (e.g. don’t recalculate this).

    So what if we didn’t have to solve this? What if our metric were load time + all JavaScript executed? Is the issue that right now many sites perform poorly and load times are unacceptably high? I think that as performance guys (and girls) we can often avoid wrestling with absolute numbers and fudge things by focusing on the relative. Well, as an opinionated technologist I can confidently assert that if a web site downloads, renders and is fully initialized within 100 ms, then I don’t care about above-the-fold time – load time is just fine, thank you.

  14. @Peter, 100 ms is an ambitious goal and can’t always be achieved – look at the Yahoo home page, where they have more content than can fit into 100 ms with the current value of the speed of light ;)
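
    Back-of-the-envelope, with deliberately rough numbers of my own:

    // Light in fiber travels at roughly 200,000 km/s, so a single US
    // coast-to-coast round trip already eats about half of a 100 ms budget
    // before any server or rendering time is spent.
    var kmPerMs = 200;     // ~200,000 km/s in fiber
    var distanceKm = 4700; // roughly New York to San Francisco along the route
    console.log(2 * distanceKm / kmPerMs + ' ms per round trip'); // ~47 ms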

    We definitely should optimize, but we need a realistic measurement.
