Lazy loading and the SEO problem, solved!

Author: Igor De Ruitz

Published on 5/19/2014

Last update 5/22/2014

Categories: SEO

Tags: ASP.NET, JavaScript, HTML, Tutorials, Google, Microsoft, Infrared CMS

Lazy loading is an excellent way to speed up page rendering when not all of a page's content is immediately visible.
Image galleries with many items are a perfect candidate: only the first image is visible when the page loads, and the other images become visible only if the user decides to browse them. At the bottom of this article there's an image gallery that uses lazy loading:

when the page first renders, only the first image and caption are loaded and displayed;
since users usually browse images in order, the second image and caption are cached via JavaScript, so they can be quickly rendered;
if the user jumps to an image other than the next one, that image is pulled from the server along with the one that follows it.
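
The three rules above can be sketched as one small function that decides which images to request. This is only an illustration, not the gallery's actual code: the name imagesToFetch, the zero-based indexes and the plain-object cache are all assumptions made for the example.

```javascript
// Decide which images must be requested when the user asks for `requested`.
// `total` is the number of images in the gallery; `cache` is a plain object
// whose keys are the indexes already downloaded. All names are hypothetical.
function imagesToFetch(requested, total, cache) {
  var wanted = [requested, requested + 1]; // the image itself plus the next one
  var missing = [];
  for (var i = 0; i < wanted.length; i++) {
    var index = wanted[i];
    if (index >= 0 && index < total && !cache[index]) {
      missing.push(index);
    }
  }
  return missing;
}

function markCached(cache, indexes) {
  indexes.forEach(function (index) { cache[index] = true; });
}

var cache = {};

// First render: nothing is cached, so images 0 and 1 are requested.
var firstLoad = imagesToFetch(0, 10, cache);  // [0, 1]
markCached(cache, firstLoad);

// Clicking "next": image 1 is already cached, only image 2 is prefetched.
var nextClick = imagesToFetch(1, 10, cache);  // [2]
markCached(cache, nextClick);

// Jumping ahead: image 7 is pulled together with image 8.
var jump = imagesToFetch(7, 10, cache);       // [7, 8]
```

Keeping the decision in a pure function makes it easy to check the behaviour at the edges, for example that asking for the last image never requests one past the end of the gallery.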

This technique sounds great, but it has a big SEO problem: search engine crawlers do not execute scripts; they only analyze the HTML markup, so all the code that renders images dynamically is ignored, along with the images and captions.

The first SEO solution comes straight from the Google documentation. It's very simple, but unfortunately it does not work.

False solution #1: image sitemaps
Partial solution #2: noscript tags
Very good solution #3: escaped fragments
Best solution #4: hijax links
Image gallery
Comments

False solution #1: image sitemaps

The official Google webmaster documentation states that if you load images dynamically using JavaScript, you can make Google aware of them using sitemaps.
I assume that you know what a sitemap is, but you may not know that for every page you can list the images attached to it.
Here's an example of a sitemap for a single page that dynamically displays three images.

<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:xhtml="http://www.w3.org/1999/xhtml" xmlns:image="http://www.google.com/schemas/sitemap-image/1.1">
  <url>
    <loc>http://www.mydomain.com/default.aspx</loc>
    <image:image>
      <image:loc>http://www.mydomain.com/images/myimage01.jpg</image:loc>
      <image:title>My first title</image:title>
    </image:image>
    <image:image>
      <image:loc>http://www.mydomain.com/images/myimage02.jpg</image:loc>
      <image:title>My second title</image:title>
    </image:image>
    <image:image>
      <image:loc>http://www.mydomain.com/images/myimage03.jpg</image:loc>
      <image:title>My third title</image:title>
      <image:caption>My image caption.</image:caption>
    </image:image>
  </url>
</urlset>

I have a travel blog that uses the same gallery you can see at the bottom of this article. Every post has many images, which adds up to thousands across the whole website.

I did everything by the book:

every image has a well-formed name;
every image has an alt attribute;
every image was placed in the sitemap along with its title and description;
the sitemap was referenced in robots.txt;
the sitemap was registered in Google Webmaster Tools;
the Google Webmaster Tools parser correctly reported the image contents.

After two years, do you know how many images were indexed? ZERO!
I tried to discuss this with experts on the Google forums, but nobody was able to give a meaningful answer.
My opinion is that, despite what is written in the Google webmaster documentation, image sitemaps are ignored if the images are not placed inside the HTML markup.

Partial solution #2: noscript tags

The <noscript></noscript> tags tell the browser that if JavaScript is not enabled, the enclosed markup should be rendered instead.
That's a perfect opportunity for us: inside them we can place a static gallery that loads all the images on the first rendering.
There are some drawbacks:

every gallery has to be duplicated, with a lazy loading version and a static version (longer HTML code);
spiders view noscript tags with suspicion, because they are frequently used for cloaking (stuffing keywords not really related to the page).

Last but not least, there's a user experience problem: the user cannot bookmark the page with the gallery positioned on a specific image, because every bookmark will always open the page showing the first image.
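
To give an idea of what the static version looks like, here is a minimal sketch that builds the noscript markup from a list of images. It has to run on the server or at build time, since by definition the browser is not executing scripts; it assumes a JavaScript runtime such as Node.js, but the same idea ports directly to ASP.NET. The function names and the shape of the image objects (src, alt, caption) are hypothetical.

```javascript
// Escape the characters that are special in HTML text and attribute values.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Build a static gallery wrapped in <noscript>.
// `images` is a hypothetical array of { src, alt, caption } objects.
function buildNoscriptGallery(images) {
  var items = images.map(function (image) {
    return '  <figure>\n' +
      '    <img src="' + escapeHtml(image.src) + '" alt="' + escapeHtml(image.alt) + '">\n' +
      '    <figcaption>' + escapeHtml(image.caption) + '</figcaption>\n' +
      '  </figure>';
  });
  return '<noscript>\n' + items.join('\n') + '\n</noscript>';
}

var markup = buildNoscriptGallery([
  { src: '/images/myimage01.jpg', alt: 'My first title', caption: 'My first caption' },
  { src: '/images/myimage02.jpg', alt: 'My second title', caption: 'My second caption' }
]);
```

Generating both versions from the same image list at least keeps the duplicated markup from drifting out of sync, even though the page still gets longer.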

Very good solution #3: escaped fragments

[UPDATE: in October 2015 this method was officially deprecated by Google]

Google developed this solution not only for lazy loading, but for indexing AJAX content in general.
Here's the official page: Making AJAX Applications Crawlable.

The documentation is simple and clear; in short, the solution is to use slightly modified URL fragments.
A fragment is the last part of the URL, prefixed by #. Fragments are not sent to the server; they are used only on the client side to tell the browser to show something, usually to move to an in-page bookmark.
If you use #! instead of # as the prefix, Google asks the server for a special version of your page using an ugly URL. When the server receives this ugly request, it's your responsibility to send back a static version of the page that renders an HTML snapshot (in our case, the image that would otherwise not be indexed).

It seems complicated but it is not; let's use our gallery as an example.

Every gallery thumbnail needs a hyperlink like: /...#!blogimage=<image-number>
When the crawler finds this markup, it changes it to
/...?_escaped_fragment_=blogimage=<image-number>
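
The rewrite performed by the crawler can be sketched in a few lines of JavaScript. This only illustrates the mapping described above and is not Google's code: the function name toEscapedFragmentUrl is hypothetical, and the escaping is simplified to the few characters likely to show up in a gallery fragment.

```javascript
// Turn a "pretty" URL containing #! into the "ugly" URL the crawler requests.
// URLs without #! are returned unchanged.
function toEscapedFragmentUrl(prettyUrl) {
  var marker = prettyUrl.indexOf('#!');
  if (marker === -1) {
    return prettyUrl;
  }
  var base = prettyUrl.substring(0, marker);
  var fragment = prettyUrl.substring(marker + 2);

  // Simplified escaping (an assumption of this sketch): only %, #, &, +
  // and the space character are percent-encoded.
  var escaped = fragment.replace(/[%#&+ ]/g, function (ch) {
    return '%' + ch.charCodeAt(0).toString(16).toUpperCase();
  });

  // Append to an existing query string with &, otherwise start one with ?.
  var separator = base.indexOf('?') === -1 ? '?' : '&';
  return base + separator + '_escaped_fragment_=' + escaped;
}

var ugly = toEscapedFragmentUrl('/gallery#!blogimage=4');
// ugly is '/gallery?_escaped_fragment_=blogimage=4'

var uglyWithQuery = toEscapedFragmentUrl('/gallery?lang=en#!blogimage=4');
// uglyWithQuery is '/gallery?lang=en&_escaped_fragment_=blogimage=4'
```

Note that the = sign inside the fragment is left as-is, which is why the server ends up reading a query string value of the form blogimage=<image-number>.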

Let's take a look at how to respond on the server side to provide a valid HTML snapshot.
My implementation uses ASP.NET, but any server technology will do.

var fragment = Request.QueryString["_escaped_fragment_"];
if (!String.IsNullOrEmpty(fragment))
{
    // The fragment has the form "blogimage=<image-number>"
    var escapedParams = fragment.Split(new[] { '=' });
    if (escapedParams.Length == 2)
    {
        var imageToDisplay = escapedParams[1];
        // Render the page with the gallery showing
        // the requested image (statically!)
        ...
    }
}

What's rendered is an HTML snapshot, that is, a static version of the gallery already positioned on the requested image (server side).
To make it perfect, we have to give the user a chance to bookmark the current gallery image.
90% of this comes for free: we only have to parse the fragment on the client side and show the requested image.

if (window.location.hash)
{
    // NOTE: remove initial #
    var fragmentParams = window.location.hash.substring(1).split('=');
    var imageToDisplay = fragmentParams[1];
    // Render the page with the gallery showing the requested image (dynamically!)
    ...
}
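
The snippet above trusts whatever follows the first = sign. A slightly more defensive variant, shown here only as a sketch, checks the parameter name and accepts nothing but digits, so a hand-edited or unrelated fragment is simply ignored. The function name parseImageFromHash is hypothetical.

```javascript
// Return the image number encoded in a fragment such as "#blogimage=4"
// or "#!blogimage=4", or null when the fragment is missing or malformed.
function parseImageFromHash(hash) {
  var match = /^#!?blogimage=(\d+)$/.exec(hash || '');
  return match ? parseInt(match[1], 10) : null;
}

// Browser-only usage, guarded so the sketch also runs outside a browser.
if (typeof window !== 'undefined') {
  var imageToDisplay = parseImageFromHash(window.location.hash);
  if (imageToDisplay !== null) {
    // Render the page with the gallery showing the requested image (dynamically!)
  }
}
```

Returning null rather than a default number lets the caller decide what to do with a bad fragment, which in a gallery usually means falling back to the first image.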

Best solution #4: hijax links

Making AJAX Applications Crawlable says that if you're starting from scratch, one good approach is to build your site's structure and navigation using only HTML. Then, once you have content in place, you can spice up the appearance and interface with JavaScript (and AJAX). Googlebot will use your fallback links, while users will enjoy your dynamic content.
This is the solution we adopted in our gallery. If you inspect the HTML link of any thumbnail, you'll see something like:

<a href='//www.idea-r.it/...?blogimage=<image-number>' onclick='...show(<image-number>);return false;'>

From the user's perspective, the return false at the end of the onclick handler makes the browser ignore the href and execute only the JavaScript code.
From the spider's perspective, the href is followed to index the content and the JavaScript code is ignored.
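
The same behaviour can also be obtained without an inline onclick on every thumbnail. The sketch below is an alternative and not the code used by our gallery: it attaches a single click listener to the gallery container, reads the image number back from the fallback href and calls preventDefault. The names imageNumberFromHref, enhanceGallery, container and show are all hypothetical.

```javascript
// Extract the image number from a fallback href such as
// "/post?blogimage=4"; returns null if the parameter is absent or not numeric.
function imageNumberFromHref(href) {
  var match = /[?&]blogimage=(\d+)(?:[&#]|$)/.exec(href || '');
  return match ? parseInt(match[1], 10) : null;
}

// Attach one click listener to the element that contains all thumbnails.
// `show` is the function that renders the requested image dynamically.
function enhanceGallery(container, show) {
  container.addEventListener('click', function (event) {
    var link = event.target.closest('a');
    var imageNumber = link ? imageNumberFromHref(link.getAttribute('href')) : null;
    if (imageNumber === null) {
      return; // not a gallery link: let the browser follow it normally
    }
    event.preventDefault(); // stops navigation, like "return false" inline
    show(imageNumber);
  });
}
```

With this approach the markup contains only plain links, so the crawler and a browser without JavaScript see exactly the same thing, and the href stays the single source of truth for the image number.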

Let's take a look at how to respond on the server side to provide a valid HTML snapshot.

var parameter = Request.QueryString["blogimage"];
int imageToDisplay;
// TryParse returns false for a missing or non-numeric value instead of throwing
if (int.TryParse(parameter, out imageToDisplay))
{
    // Render the page with the gallery showing the requested image (statically!)
    ...
}

You can try it yourself.
For example, open the following URL to see what the Google crawler will index:
www.idea-r.it/blog/110/en/lazy-loading-seo-problem?blogimage=4

Since we still want the user to be able to return to a bookmarked image, keep the following code (seen in the previous solution):

if (window.location.hash)
{
    // NOTE: remove initial #
    var fragmentParams = window.location.hash.substring(1).split('=');
    var imageToDisplay = fragmentParams[1];
    // Render the page with the gallery showing the requested image (dynamically!)
    ...
}

To make bookmarking possible, we have to change the URL in the address bar every time the user clicks on a new image. Add the following code at the end of the JavaScript rendering method:

window.location.hash = 'blogimage=' + displayedImageNumber;
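
Every change to the hash also adds an entry to the browser history, so the Back and Forward buttons will change the fragment without the gallery noticing. One possible way to keep the two in sync, shown here only as a sketch, is to listen for the hashchange event. The function bindGalleryToHash and its parameters are hypothetical: win is the window object (passed in so the code can be tried outside a browser) and render is the gallery's own rendering method.

```javascript
// Keep the displayed image and the URL fragment in sync.
// Returns a function to call whenever the user picks an image.
function bindGalleryToHash(win, render) {
  var displayed = null;

  function showImage(imageNumber) {
    if (imageNumber === displayed) {
      return; // already on screen: avoids a second render after hashchange
    }
    displayed = imageNumber;
    render(imageNumber);
    win.location.hash = 'blogimage=' + imageNumber;
  }

  function applyHash() {
    var match = /^#!?blogimage=(\d+)$/.exec(win.location.hash);
    if (match) {
      showImage(parseInt(match[1], 10));
    }
  }

  win.addEventListener('hashchange', applyHash); // Back/Forward, edited URL
  applyHash();                                   // initial load from a bookmark
  return showImage;
}
```

In the browser this would be wired up with something like var show = bindGalleryToHash(window, renderImage), with the thumbnails calling show(<image-number>) instead of the rendering method directly.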

You can try it yourself with the gallery:

copy the following URL to the clipboard:
www.idea-r.it/blog/110/en/lazy-loading-seo-problem#blogimage=4
navigate to another page;
paste the URL into your browser's address bar.

You'll see this page again, but with the initial image set to the 5th one.
All the galleries you see in this blog are implemented using our Infrared CMS, so you can test their optimization.
