ETags

ETags Robbo 6/30/12 5:34 PM

Hi

I am using GTMetrix to analyse a site and identify ways of speeding it up.

GTMetrix reports that the ETags are "misconfigured".  Disabling ETags altogether ( FileETag none in .htaccess ) produces an improvement in YSlow score and a worsening in PageSpeed score.

Reading the GTMetrix notes about ETags, it appears that they are produced automatically by Apache (unless disabled).


So my questions are:

--- is GTMetrix reliable regarding messages that the "ETags are misconfigured" or is this a known issue?

--- as the ETags are generated automatically by Apache, in what way can they be "misconfigured"?

--- how can I ensure that the ETags generated are valid?



Cheers

Robbo





Re: ETags webado 6/30/12 6:08 PM
Might help if I even knew what purpose they are supposed to serve....
I know I've occasionally seen them mentioned when checking headers in web-sniffer but have never known what they are.
Re: ETags webado 6/30/12 6:10 PM
Aha...


Something to do with the cache....
Re: ETags cristina 7/1/12 2:10 AM
Put simply, an Apache server in its default configuration returns an Etag field in the server header for the URL of a static file. The value of the Etag stays the same as long as the content is not modified and the Last-Modified value stays the same. Dynamically generated content that has no Last-Modified server header field gets no Etag.
It also helps with the 304 (Not Modified) server status code for conditional GET requests, which is quite important: a 304 response does not return content, so it reduces bandwidth.
Google and some other search engines and browsers send conditional GET requests (with If-Modified-Since) and need a correct 304 response code when the content is the same as for a previous request.
I suppose some Etag misconfiguration can happen when the default configuration is changed to add an Etag in the server header of URLs with dynamically generated content.
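
To make the conditional GET idea concrete, here is a minimal Python sketch of the decision a server makes when a request carries If-Modified-Since. The function name and its arguments are hypothetical and only illustrate the logic; this is not how Apache implements it internally.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def conditional_get_status(if_modified_since, last_modified):
    """Return 304 if the client's copy is still current, else 200.

    if_modified_since: raw request header value, or None if absent.
    last_modified: timezone-aware datetime of the resource's last change.
    """
    if if_modified_since is None:
        return 200  # unconditional request: send the full body
    try:
        client_time = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return 200  # unparseable date: ignore the condition
    if client_time.tzinfo is None:
        # Treat a date without a usable zone as UTC (an assumption).
        client_time = client_time.replace(tzinfo=timezone.utc)
    # HTTP dates have one-second resolution, so drop microseconds.
    if last_modified.replace(microsecond=0) <= client_time:
        return 304  # nothing changed: no body needs to be sent
    return 200
```

A crawler that sends back the Last-Modified value it saw earlier would get 304 from this function until the file changes.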
Re: ETags Robbo 7/1/12 3:04 AM
Hi Webado & Cristina

Thanks for the responses. The ETags I am concerned about are for static images. My understanding is that the ETag is generated based on certain things (the file's inode on the server, date, file size, etc.) that should make the ETag unique for that specific file. Indeed, even if you make an identical copy of the file but store it on a different server, it will have a different ETag.

My puzzlement is how can an ETag be "misconfigured" if it is generated automatically by Apache and allows no user-specified configuration?

Contrast that with, for example, manually configuring the .htaccess with [ ExpiresByType image/png A2592000 ] -  Clearly the expiration might be specified wrongly e.g. with an extremely short life.

So GTMetrix says my ETags are "misconfigured" but it gives no clue in what way they are misconfigured, or how to "reconfigure" them.

That's what stumps me.


Re: ETags Robbo 7/1/12 5:29 AM


I think I may have figured this out.

A recognized limitation of standard ETags is that they are derived from three attributes: inode (the file's index number on that server's file system), MTime, and Size. As a result, the same resource (e.g. the same static graphic) held on different servers in a multi-server system would have different ETags, even though for caching purposes the resource is the same.

Apparently, GTMetrix treats the standard default form of ETags as "misconfigured"; however, there is a way to specify a different configuration format, which is to use MTime and Size but NOT inode.
 
According to http://www.websiteoptimization.com/secrets/advanced/configure-etags.html/  this reconfiguration can be implemented in the Apache httpd.conf file as follows:

<Directory /usr/local/httpd/htdocs>
    FileETag MTime Size
</Directory>


Instead of updating the httpd.conf, I experimented by adding this line to my .htaccess:

FileETag MTime Size


It seems to work just fine:

--- the ETags are reconfigured (verifiable with web-sniffer)

--- PageSpeed is happy that I am providing ETags (i.e. I have not disabled them);

--- YSlow is happy that my ETags are not "misconfigured" (i.e. the ETags are based on MTime and Size but not server inode)


I am happy to see the site's PageSpeed and YSlow scores improve and am also pleased that I understand it better now.


Cheers,

Robbo


Re: ETags JohnMu 7/2/12 7:28 AM
This is really interesting, thanks for digging into the details, Robbo & Cristina! 

I imagine these settings only apply to static content (HTML pages / images, etc), right? or would they also apply to dynamic content (like PHP scripts)? 

Cheers
John
Re: ETags cristina 7/2/12 7:43 AM
Look also at the other fields in the return header, like Last-Modified and Date, do they look OK?


Re: ETags cristina 7/2/12 8:01 AM
By default an Apache server would send an Etag in the server header of an HTTP response for static files, not dynamically generated content.
But I think it is possible to add an Etag in the server header with a script (as for any other header field), or to change it by modifying the .htaccess file as Robbo did, but it has to be done correctly. The value of the Etag has to reflect the content of the page, the cache control server settings, etc.
And it is possible that a server or cluster of servers has various possible levels of generating the Etag (or not) that might conflict with each other, if the default settings are superseded at different levels.
It is possible that GTMetrix gave an Etag misconfiguration warning because Robbo's servers returned different Etags when GTMetrix requested the URL with If-None-Match in the header of their HTTP request (but I am not familiar with GTMetrix).
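
As a rough illustration of that last scenario, the Python sketch below shows what happens when a client revalidates with If-None-Match against two servers that report different Etags for the same file. The function name, the server names, and the Etag values are all made up for the example, and the comma split is a simplification that assumes the tags themselves contain no commas.

```python
def revalidate(if_none_match, current_etag):
    """Return 304 when the client's Etag matches this server's, else 200."""
    if if_none_match is None:
        return 200  # no condition sent: full response

    def opaque(tag):
        # If-None-Match uses weak comparison, so ignore a leading W/.
        tag = tag.strip()
        return tag[2:] if tag.startswith("W/") else tag

    # The header may carry several comma-separated Etags, or "*".
    candidates = [opaque(tag) for tag in if_none_match.split(",")]
    if "*" in candidates or opaque(current_etag) in candidates:
        return 304
    return 200


# Hypothetical Etags that two backends might report for the same file.
backend_etags = {
    "server-a": '"1a2b-4f0-5c3"',
    "server-b": '"9f8e-4f0-5c3"',
}
```

A client that cached the file from server-a gets 304 when it lands on server-a again, but 200 with the full body when the load balancer sends it to server-b, even though the file is identical.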



Re: ETags pierrefar 7/2/12 8:52 AM
Fun fun topic from years past :) Some links to dig into:



Now the details: The ETag was meant to be a unique identifier of the resource (CSS, JS, image) such that a browser can check if this identifier was changed and the server would reply with the new content if it has changed or 304 Not Modified if it hasn't.

For a single server serving these resources, this is fine. It gets complicated though if the same resource is hosted on multiple different machines that can respond to the request (think a big site with load balancers). The problem here is that, by default, Apache uses the inode of the resource in the ETag. The inode is something the underlying file system exposes (it's a property of the file on that file system), and so if you have multiple machines serving the same file, they get different inodes and the ETag is rendered useless. (If you're keen: http://en.wikipedia.org/wiki/Inode ).

Which brings us to the different ways of fixing this. One popular option is to disable ETags completely and properly configure cache headers (basically well into the future, but there are a few important details so read up on those!) to get browsers to do the right thing. I personally like this option a lot because it saves the browser the ETag check - the cache headers suffice and the browser doesn't need the extra HTTP connection and server response to know it has the correct version (which is faster). In the future when you change the resource (e.g. edit the CSS file), you change its URL, browsers discover this, fetch it, see the cache headers, and we're back to a well-cached page asset. A popular way to alter the URL is to use a "fake" URL query string that functions as a versioning system, something like:

/assets/site.css?v2

And when you edit it, you make that ?v3, then ?v4, etc. If you want to get super fancy, you can have the CMS or framework check the file's modified time, convert it to a Unix timestamp (a 10-digit number), and embed that automatically into your pages. 
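
A minimal Python sketch of that "super fancy" variant, assuming the page template can call a helper when it writes out asset URLs. The helper name and the paths are hypothetical; a real CMS or framework would have its own hook for this.

```python
import os


def versioned_url(url_path, file_path):
    """Append the file's modification time (Unix seconds) as a query string."""
    mtime = int(os.stat(file_path).st_mtime)
    return f"{url_path}?v{mtime}"
```

Calling versioned_url("/assets/site.css", "/var/www/assets/site.css") would yield a URL ending in ?v followed by the timestamp, so the URL changes by itself whenever the file is edited.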

The other way to fix this is to not disable ETags but configure them to not use inodes and use properties that depend on the file's contents rather than the underlying file system. Modified time and file size are such properties, which is what the websiteoptimization.com page suggests.
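
To see why dropping the inode helps, here is a small Python sketch that builds an ETag-like string from a chosen set of file attributes. It only loosely imitates Apache's hex-field style and is not its exact format; the FileInfo type, the function name, and all attribute values are invented for the example.

```python
from collections import namedtuple

# Only the three attributes discussed in this thread; values are made up.
FileInfo = namedtuple("FileInfo", ["inode", "mtime", "size"])


def make_etag(info, parts=("inode", "mtime", "size")):
    """Join the chosen attributes as hex fields inside double quotes."""
    fields = [format(getattr(info, name), "x") for name in parts]
    return '"' + "-".join(fields) + '"'


# The same file deployed to two machines: same mtime and size, different inode.
on_server_a = FileInfo(inode=131077, mtime=1341230000, size=2048)
on_server_b = FileInfo(inode=786433, mtime=1341230000, size=2048)
```

With the default parts the two servers disagree; with parts=("mtime", "size") they produce the same value. Note that this relies on the deployment preserving modification times when files are copied to each server.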

Does that help?

Cheers,
P
Re: ETags webado 7/2/12 9:04 AM
It could help .. it should help...

But what about Chrome that seems to simply not want to even request a page it's already seen before (and cached)  from the server natively? It keeps displaying it from the local cache.

I find I have to constantly hard refresh Chrome to see changed pages despite my sending through headers with last modified date set correctly.

At least in IE I have settings that tell it to get me a fresh page every time. 
Re: ETags Brian Ussery 7/2/12 2:49 PM
I use http://redbot.org/ to check for ETag issues but in most cases don't find any real problems.  When ETag config issues come up, they usually tend to be one of these:

1. Load Balance Issues (like Pierre mentioned) - For load balanced sites on multiple Apache servers, fields used to create ETags should not be server dependent.  For IIS configure ETAG_CHANGENUMBER to avoid multiple ETags for the same resource.

2. Compression Issues - Instances where compressed and non-compressed version of the same file return the same ETag.
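
For the compression case, a quick check could look like the Python sketch below. It assumes you have already collected the ETag returned for each Content-Encoding of the same URL (for instance with a header checker); the function name and the sample values in the note are hypothetical.

```python
def encodings_sharing_etag(etag_by_encoding):
    """Return the encodings whose ETag is also used by another encoding."""
    seen = {}
    for encoding, etag in etag_by_encoding.items():
        seen.setdefault(etag, []).append(encoding)
    # Any ETag listed under more than one encoding is a collision.
    return sorted(
        enc for encs in seen.values() if len(encs) > 1 for enc in encs
    )
```

An empty result means each variant has its own ETag, e.g. when the compressed variant carries a suffix such as -gzip; a non-empty result lists the variants that collide.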

I think Bing has stated that they use ETags but not Google.





Re: ETags cristina 7/2/12 2:56 PM
Google correctly gets HTTP status 304 (Not Modified) for static files with Etags. I do not know whether it would still get the 304 if the Etags were missing (everything else being the same).