Hands-On Guide: Verifying FIFA World Cup Web Site against Performance Best Practices
Whether you call it Football, Futbol, Fussball, Futebol, Calcio or Soccer – if you are a fan of this game I am sure you are looking forward to the upcoming FIFA World Cup in South Africa. The tournament’s web site is http://www.fifa.com/worldcup and allows fans to follow their teams and get the latest updates on scores, standings, schedule, ticketing or hospitality. Only the best performing teams in the qualification matches made it to the tournament, and only the best performing team will end this tournament as the new world champion.
As I’ve done with other sporting events such as the Winter Olympics in Vancouver or the Golf Masters, I want to take you through a step-by-step analysis of different pages on the FIFA site based on Web Performance Best Practices that have been established over the last couple of years – such as the ones from Google and Yahoo. Free tools such as the dynaTrace AJAX Edition, Yahoo’s YSlow and Google’s PageSpeed make it easy to perform these analysis steps and identify issues that could easily turn into real performance problems once the web site is hit by many users.
My analysis of the FIFA site shows that – once the World Cup starts next week and the site gets hit by millions of users around the globe – there is a big chance the site will run into performance and scalability issues because it does not follow several best practices. As a result, the initial page takes more than 8 seconds to load and requires downloading more than 200 elements. These problems can easily be fixed by following the recommendations I highlight in this blog. Let’s get started:
What needs to be analyzed to identify a slow page?
Key Performance Indicators (KPI’s)
Let’s start by analyzing the start page of the FIFA World Cup Web Site. Before I capture the performance information I make sure to clear the Browser Cache to experience the page as a first time visitor (dynaTrace AJAX provides an option to clear the cache for you – configurable in the Run Configuration Setting). The following image shows the Summary View of the initial page:
Now – let’s have a look at the Network View to check the detailed download graph. This view shows us which domains serve which resources and how the browser actually downloads the individual resources. We can spot redirects (HTTP 3xx), client errors such as authentication issues (HTTP 4xx) and server errors (HTTP 5xx). The DNS and Connect Times tell us whether some domains are expensive in terms of establishing a physical connection. The Wait Time tells us whether individual resources have to wait a long time before they are actually downloaded because the browser only opens a limited number of physical connections per domain. The Server Time tells us whether the server takes a long time to respond to a request – indicating a server-side processing problem. Finally, the Size and Transfer Time tell us whether we have a problem with large content and latency:
Here is a summary of all KPI’s that we can read from the previous three views – and let me explain what they mean to me and what values I consider to be good or acceptable or not acceptable:
- Time to First Impression/Drawing: 3.74s
- Analysis: so it takes almost 4s until the user sees a visual indication of the page load – that is definitely too long and should be improved
- Recommendation: < 1s is great. <2.5s is acceptable
- Time to onLoad: 8.25s
- Recommendation: < 2s is great. <4s is acceptable
- Time to Fully Loaded: 8.6s
- Recommendations: < 2s is great. <5s is acceptable
- Number of HTTP Requests: 201
- Analysis: 201 – that’s a lot of elements for a single page. Images are the main contributor to this load. My first thought on this -> let’s see how we can reduce this number, e.g. by merging files (more details later)
- Recommendations: < 20 is great. < 100 is acceptable (This one is a hard recommendation as it really depends on the type of website – but – it is a good start to measure this KPI)
- Number and Impact of HTTP Redirects: 1/1.44s
- Analysis: This is a very expensive and seemingly unnecessary redirect from http://www.fifa.com/worldcup to http://www.fifa.com/worldcup/
- Recommendations: 0. Avoid Redirects whenever possible
- Number and Impact of HTTP 400’s: 1/0.71s
- Recommendations: 0. Avoid any 400’s and 500’s
- Recommendations: It is hard to give a definite threshold value. Keep in mind that these files need to be downloaded and parsed by the browser. The more content there is the more work on the browser. The goal must be to remove all information that is not needed for the current page. I often see developers packing everything in a huge global .js file. That might be a good practice but too often only a fraction of this code is actually used by the end-user. It is better to load what needs to be loaded in the beginning and delay load additional content when really needed
- Max/Average Wait Time: 4.31s/1.9s
- Analysis: this means that resources have to wait up to 4.3s to be downloaded and that they have to wait 1.9s on average. This is way too much and can be reduced either by reducing the number of resources or by spreading them across multiple domains (Domain Sharding) so that the browser can use more physical connections.
- Recommendations: < 20ms is good. < 50ms is acceptable (as you can see – we are FAR OFF these numbers in this example)
- Single Resource Domains: 1
- Analysis: from the timeline we can also see that there is one domain that only serves a single resource. In this particular case it seems to be serving an ad. We can assume that this might not be changeable but this KPI is a good indicator on whether it is worth paying the cost of a DNS Lookup and Connect if we only download a single resource from a domain
- Recommendations: 0. Try to avoid single resource domains. It is not always possible – but do it if you can
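The thresholds from the list above can be turned into a small rating helper. This is only a sketch: the dictionary keys and the `rate` function are names made up for illustration, and only the KPIs with explicit "great/acceptable" limits in the list are covered.

```python
# Thresholds copied from the KPI list above: (great_below, acceptable_below).
# Times are in seconds; the request count is a plain number.
# The key names are hypothetical and only used for this sketch.
THRESHOLDS = {
    "time_to_first_impression": (1.0, 2.5),
    "time_to_onload": (2.0, 4.0),
    "time_to_fully_loaded": (2.0, 5.0),
    "http_requests": (20, 100),
}


def rate(kpi, value):
    """Rate one measured KPI value against the thresholds above."""
    great, acceptable = THRESHOLDS[kpi]
    if value < great:
        return "great"
    if value < acceptable:
        return "acceptable"
    return "not acceptable"


# Values measured on the FIFA start page (taken from the list above).
measured = {
    "time_to_first_impression": 3.74,
    "time_to_onload": 8.25,
    "time_to_fully_loaded": 8.6,
    "http_requests": 201,
}

ratings = {kpi: rate(kpi, value) for kpi, value in measured.items()}
for kpi, rating in ratings.items():
    print(f"{kpi}: {measured[kpi]} -> {rating}")
```

With the measured values, every one of these four KPIs lands in the "not acceptable" bucket.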
The KPI’s tell me that the page is way too slow – especially the Full Page Load Time of 8.6s needs to be optimized. With the KPI’s we can already think about certain areas to focus on, e.g.: reducing the network roundtrips or minimizing content size. But there is much more. Let’s have a closer look into 4 different areas.
Usage of Browser Caching
Even though the objects are taken from the Cache (as indicated in the Cached column), the browser has to send a request to the web server to check whether the cached object is still valid. Why is that? Because the Expires Header only sets a date/time that is roughly 30s in the future. So a returning user has to send the same number of HTTP Requests to the server, each asking whether the content is still valid (If-Modified-Since). Even though these requests only return that the content is still valid, we end up with large wait times because so many resources are served by the same domain.
Besides very short Expires Headers, the page also contains a few resources whose Expires Header is set in the past. This might be on purpose to prevent any caching of these resources – but it also often happens due to misconfiguration of the web server.
Summarizing the Browser Caching Analysis – we have
- 175 resources that have an Expires Header no longer than 48 hours in the future. I took the 48 hours from the Best Practices of Yahoo and Google
- Solution: Analyze these resources and set Far-Future Expires headers where it makes sense, e.g.: all the flags of the participating countries
- Save Potential: 175 unnecessary roundtrips to the server, lots of network time and transfer size
- 4 resources that expired in the past
- Solution: Look into those objects and verify if they really shouldn’t be cached at all
- Save Potential: 4 unnecessary roundtrips to the server, lots of network time and transfer size
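The three buckets used in this summary (expired in the past, shorter than 48 hours, far-future) can be checked with a few lines of Python. This is a minimal sketch using only the standard library; the function name and the sample header values are assumptions for illustration and are not taken from the FIFA site.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime

# 48 hours is the limit used in the summary above.
MIN_LIFETIME = timedelta(hours=48)


def classify_expires(expires_header, now):
    """Return 'missing', 'past', 'short' or 'far-future' for one Expires value."""
    if not expires_header:
        return "missing"
    try:
        expires = parsedate_to_datetime(expires_header)
    except (TypeError, ValueError):
        # HTTP caches treat an unparsable Expires value as already expired.
        return "past"
    if expires.tzinfo is None:
        # Treat a date without zone information as UTC.
        expires = expires.replace(tzinfo=timezone.utc)
    if expires <= now:
        return "past"
    if expires - now < MIN_LIFETIME:
        return "short"
    return "far-future"


# Hypothetical sample values, with a fixed "now" so the result is repeatable.
now = datetime(2010, 6, 7, 12, 0, 0, tzinfo=timezone.utc)
samples = {
    "30 seconds ahead": "Mon, 07 Jun 2010 12:00:30 GMT",
    "already expired": "Thu, 01 Jan 2009 00:00:00 GMT",
    "one year ahead": "Tue, 07 Jun 2011 12:00:00 GMT",
    "no header": None,
}
for label, value in samples.items():
    print(label, "->", classify_expires(value, now))
```

Resources that end up as "short" or "past" are the ones worth reviewing for Far-Future Expires headers.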
Network Resources and Transfers
While analyzing the flags I discovered an interesting “flaw” of the website. Maybe you already noticed it too. Why do we have 68 flags? There are only 32 countries playing in the tournament. Well – maybe they have different sizes of flags – that was my first assumption – and yes, that is part of the discovery, but it is not the real “flaw” I identified. The real reason is that the small flag images are hosted on two domains (img.fifa.com and www.fifa.com). The initial HTML page references the images with an absolute path from the img.fifa.com domain as well as with a relative path from www.fifa.com. The following illustration shows parts of the HTML Document from www.fifa.com that uses two different ways of referencing those flags. The illustration also shows the actual requests that are sent to the web server – it is easy to spot that the SAME country flags are downloaded twice, once from each domain:
Fixing this problem saves 32 roundtrips, as the browser can just reuse the already downloaded images. If you also follow the best practice of merging the images into a single image using CSS Sprites, we end up downloading only 1 image instead of 64 (a nice saving, isn’t it?). In case you wonder why there are 68 flag requests in total: depending on which page you are on, some flags are also downloaded in medium and large size, which is why I had 4 additional flag downloads.
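Duplicates like these can be found automatically from a list of requested URLs. The sketch below groups URLs by path and reports every path that was requested from more than one host. It assumes that an identical path on two hosts means the same image, and the sample URLs are hypothetical, not the real FIFA paths.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def find_cross_domain_duplicates(urls):
    """Map each path requested from more than one host to its sorted hosts."""
    hosts_by_path = defaultdict(set)
    for url in urls:
        parts = urlsplit(url)
        hosts_by_path[parts.path].add(parts.netloc)
    return {
        path: sorted(hosts)
        for path, hosts in hosts_by_path.items()
        if len(hosts) > 1
    }


# Hypothetical request list; the paths are made up for illustration.
requested = [
    "http://img.fifa.com/flags/s/bra.gif",
    "http://www.fifa.com/flags/s/bra.gif",
    "http://img.fifa.com/flags/s/ger.gif",
]
duplicates = find_cross_domain_duplicates(requested)
print(duplicates)
```

In this sample only the first flag is reported, because it is the only path requested from both hosts.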
Summarizing the Network Resources and Transfers – we have
- One Redirect and one HTTP 403
- Solution: figure out a way to get rid of them – especially the 403
- Potential Savings: more than 2s of total network time + speeding up the initial download of the page by 1.4s when we get rid of the redirect. This will bring down the Time for First Impression, Time to onLoad and Time to Fully Loaded KPI’s
- 32 duplicated downloads of flag images
- Solution: change the src location of these flag images to be only taken from the img.fifa.com domain
- 32 flag images for CSS Sprite use
- Solution: Merge the 32 flags into a single image and use CSS Sprites
- Potential Savings: reducing 32 requests to 1 -> saves 31 requests
- ~100 additional images potential candidates for CSS Sprite
- There are a total of 175 images on that page. So – besides the 68 flag images we have 100 more that are potential candidates for merging like the 19 sponsor logos or 10 organization logos
- Potential Savings: we can probably get rid of another 50-70 requests on these images
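To illustrate the CSS Sprite recommendation, here is a rough sketch that generates the CSS rules for flags stacked vertically in one sprite image. The flag size, the class names, the country codes and the file name flags.png are all assumptions for illustration. The real sprite layout would depend on how the images are merged.

```python
def sprite_rules(names, width, height, sprite_url):
    """Build CSS rules for equally sized images stacked top to bottom."""
    rules = [
        f".flag {{ width: {width}px; height: {height}px; "
        f"background-image: url({sprite_url}); background-repeat: no-repeat; }}"
    ]
    for index, name in enumerate(names):
        # Each image sits one image height below the previous one, so the
        # background is shifted up by index * height pixels.
        offset = index * height
        y = f"-{offset}px" if offset else "0"
        rules.append(f".flag-{name} {{ background-position: 0 {y}; }}")
    return rules


# Hypothetical flag size (19x13 pixels) and country codes.
rules = sprite_rules(["bra", "ger", "rsa"], 19, 13, "flags.png")
print("\n".join(rules))
```

An element such as `<span class="flag flag-ger"></span>` would then show the second flag without a separate image request.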
Application Server-Side Processing Time
Once we solve all the deployment issues, like making correct use of browser caching and optimizing network resources, we are ready for some real load. Unless you are serving static content only, increasing load usually has a negative impact on application server-side processing time. Why is that? Because that is when the application code actually needs to perform some work, such as getting information from the database (who scored the goals in the opening match) or querying external services (e.g.: how many tickets are still available for a certain game). The more requests the server has to handle, the more pressure is put on the actual implementation of the code, and this often reveals problems on the server side. These are problems that really hurt your business when critical transactions such as buying a ticket don’t finish fast enough or actually fail under heavy load.
That is why we have to look at requests that actually cause the application to do work. How do we identify those requests? If you know the application, I am sure you have a good understanding of which requests are served by the app server, your web server or your CDN. We can also check the HTTP Response Headers to see whether the application adds some app-specific headers. Here’s a quick way – I look at requests that show one of the following characteristics:
- First request on the page -> usually returns the initial HTML
- Requests that return HTML -> generated content (this also may include static HTML pages)
- Requests on URL’s ending with aspx, jsp, php
- Requests that send GET or POST parameters to the server
- All XHR/AJAX Requests
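The criteria above can be sketched as a simple filter over captured requests. The `Request` class, its field names and the sample data are assumptions made for this example. A real capture would come from a tool export, and sending any POST is treated here as "sends parameters".

```python
from dataclasses import dataclass
from urllib.parse import urlsplit

DYNAMIC_EXTENSIONS = (".aspx", ".jsp", ".php")


@dataclass
class Request:
    """Hypothetical record of one captured request."""
    url: str
    method: str = "GET"
    content_type: str = ""
    is_xhr: bool = False
    server_time: float = 0.0  # time to first byte, in seconds


def looks_dynamic(request, is_first=False):
    """Apply the criteria from the list above to one request."""
    parts = urlsplit(request.url)
    return (
        is_first                                               # first request on the page
        or request.content_type.lower().startswith("text/html")  # returns HTML
        or parts.path.lower().endswith(DYNAMIC_EXTENSIONS)     # aspx, jsp, php
        or bool(parts.query)                                   # GET parameters
        or request.method.upper() == "POST"                    # POST data
        or request.is_xhr                                      # XHR/AJAX
    )


def server_side_total(requests):
    """Return the matching requests and their summed server time."""
    dynamic = [
        r for i, r in enumerate(requests) if looks_dynamic(r, is_first=(i == 0))
    ]
    return dynamic, sum(r.server_time for r in dynamic)


# Hypothetical sample capture.
captured = [
    Request("http://www.fifa.com/worldcup/", content_type="text/html", server_time=0.5),
    Request("http://img.fifa.com/flags/s/bra.gif", content_type="image/gif", server_time=0.1),
    Request("http://www.fifa.com/live/scores.aspx?match=1",
            content_type="application/json", is_xhr=True, server_time=0.25),
]
dynamic, total = server_side_total(captured)
print(len(dynamic), "requests,", total, "s server time")
```

The filter is deliberately generous: as noted in the list, it may also catch static HTML pages, so the result is a starting point for a closer look rather than an exact count.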
The following image shows the Network View with all those requests that meet my criteria showing me that a total of ~3.6s is spent in Server-Side Processing:
The Server column shows the Time to First Byte. This is as close to server-side processing time as we can get by analyzing the network requests that are sent by the browser. It is the time from the last byte sent of the HTTP Request until the first byte of the response is received. This also includes some network latency – but as I said, it is very close to the actual server-side processing time. When we want more accurate numbers, we have to analyze the actual processing time on the application server itself: either analyze server log files or use an APM Solution such as dynaTrace that allows us to get a full end-to-end view of each individual request.
Summarizing the Application Server-Side Processing Time – we have
- 10 Requests that seem to hit an application server consuming a total of 3.6s on the server and return ~800kb of data
- Solution: Analyze server-side processing and tweak performance by following best practices such as reducing roundtrips to the database, reducing remoting calls, and optimizing synchronization and memory usage. There are plenty of articles on this blog – I definitely recommend reading the 2010 Performance Almanac from Alois
- Potential Savings: based on our experience with our clients, performance can be increased 3-fold by following server-side performance best practices. You need the appropriate tools for a detailed analysis, and you should start your performance optimization efforts early in the project rather than in production. Listen in to some Best Practices of our clients such as Zappos, Insight, Monster or SmithMicro
When double clicking on a method the HotSpot View shows us the Back Traces (the reversed call tree). Doing this on all these $(<xy>) showed me that these calls are calls made by the implementation of getElementsByClassName. By fixing that problem we also get rid of all these. More interesting on this page is the lookup that is highlighted in the screenshot above. The method currMenuItem executes an expensive lookup of the same element 4 times resulting in 87ms execution time. 3 of these 4 calls can be saved by caching the lookup result.
- 1 lookup by class name taking 2s execution time
- Solution: instead of looking elements up by class name alone, use a lookup by id (#) or at least add a tag name to the class name lookup. Also – make sure to use the latest version of your lookup framework such as jQuery – they constantly make performance improvements
- Potential Savings: I would say we can save 99% of the execution time when switching to a lookup by ID -> that’s a great save
- Redundant lookups
- Solution: cache the lookup result of the first lookup call. Then reuse this value for additional operations
- Potential Savings: In this case we can save 60ms
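The caching advice for redundant lookups is independent of the framework. On the actual page the fix would be made in JavaScript, for example by storing the lookup result in a local variable. The sketch below shows the same idea in Python with a stand-in document object that counts lookups. All class, function and selector names are hypothetical.

```python
class CountingDocument:
    """Stand-in for a DOM that counts how often the expensive lookup runs."""

    def __init__(self, elements):
        self._elements = elements
        self.lookups = 0

    def find(self, selector):
        self.lookups += 1
        return self._elements[selector]


def update_menu_uncached(doc):
    # Four separate lookups of the same element.
    doc.find("#menu")["class"] = "active"
    doc.find("#menu")["title"] = "Current page"
    doc.find("#menu")["tabindex"] = "0"
    return doc.find("#menu")


def update_menu_cached(doc):
    item = doc.find("#menu")  # look the element up once ...
    item["class"] = "active"  # ... and reuse the reference afterwards
    item["title"] = "Current page"
    item["tabindex"] = "0"
    return item


uncached_doc = CountingDocument({"#menu": {}})
cached_doc = CountingDocument({"#menu": {}})
uncached_result = update_menu_uncached(uncached_doc)
cached_result = update_menu_cached(cached_doc)
print(uncached_doc.lookups, "lookups vs", cached_doc.lookups)
```

Both versions leave the element in the same state, but the cached version performs one lookup instead of four.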
Overall Performance Analysis Results – What’s my Rank?
If I had to rank this web site, similar to what YSlow and PageSpeed do, I would come up with the following result:
Overall Ranking: F
- Browser Caching: F – 175 images have a short expires header, 4 have a header in the past
- Network: F – 201 Requests in total, 1 Redirect, 1 HTTP 400, duplicated image requests on different domains
- Server-Side: C- 10 App-Server Requests with a total of 3.6s -> analyze server-side processing
The good news is that there is lots of potential to speed up this web site by following these Best Practices
Follow up readings …
Throughout this blog I linked to different blogs and websites that you should look into. Here are some MUST READS:
- Best Practices from Google and Yahoo
- Blog from Steve Souders and John Resig (jQuery)
- dynaTrace Blogs on AJAX and Performance Almanac
Additionally you should check out How to Get Started with dynaTrace AJAX Edition and the Webinar we had with Monster.com to better understand how dynaTrace AJAX Edition can help you analyze your website.