PANVEGA’s Blog

DotNet Development, SharePoint Customizing, Silverlight, MS Infrastructure and other tips and tricks

IIS HTTP Compression

Posted by PANVEGA on September 22, 2008

There’s a finite amount of bandwidth on most Internet connections, and anything administrators can do to speed up the process is worthwhile. One way to do this is via HTTP compression, a capability built into both browsers and servers that can dramatically improve site performance by reducing the amount of time required to transfer data between the server and the client. The principles are nothing new — the data is simply compressed. What is unique is that compression is done on the fly, straight from the server to the client, and often without users knowing.

In the days of IIS 5 and earlier, the compression built into IIS had various issues and was really not worth implementing; to enable compression you would need a third-party solution such as http://www.port80software.com or http://www.xcompress.com. This has all changed in IIS 6!

Compression allows faster page serving to clients and lowers server costs through reduced bandwidth. In this post I am going to explain how to implement HTTP compression in Internet Information Services (IIS) 6.0.

Note:

HTTP compression uses standards-based gzip and deflate compression algorithms to compress your XHTML, CSS, and JavaScript to speed up web page downloads and save bandwidth.

Compression typically reduces plaintext size by about 75 percent, which roughly quadruples your throughput. Every website should be serving HTTP-compressed pages to clients that can accept them. The client indicates its ability to accept compressed content in the request headers:

GET /blog/index.xml HTTP/1.1
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, application/x-shockwave-flash, application/vnd.ms-excel, application/vnd.ms-powerpoint, application/msword, */*
Accept-Language: en-us
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322)
Host: www.codinghorror.com
Connection: Keep-Alive
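
If the server can honor one of the offered encodings, it compresses the response body and announces the chosen scheme in the response headers. The following is only an illustrative sketch, not output captured from a real server; the exact headers depend on the server and the content being served:

HTTP/1.1 200 OK
Content-Type: text/html; charset=utf-8
Content-Encoding: gzip
Vary: Accept-Encoding

The Vary: Accept-Encoding header tells intermediate caches to keep separate compressed and uncompressed copies, so clients that did not ask for gzip still receive a response they can read.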

Which files are compressed?

Not all files are suitable for compression. For obvious reasons, files that are already compressed, such as JPEGs, GIFs, PNGs, movies, and bundled content (e.g., Zip, Gzip, and bzip2 archives), are not going to compress appreciably further with a simple HTTP compression filter. Therefore, you are not going to get much benefit from compressing these files or from a site that relies heavily on them.

However, sites that have a lot of plain text content, including the main HTML files, XML, CSS, and RSS, may benefit from the compression.

It will still depend largely on the content of the file; most standard HTML text files will compress by about half, sometimes more. Heavily formatted pages, for example those that make heavy use of tables (and therefore contain repetitive markup), may compress even further, sometimes to as little as one-third of the original size.

Fortunately, with most HTTP servers it is possible to select which types of files are compressed, so the effect of trying to compress non-compressible data is limited.

Note:

To find the compression cache temporary directory, open IIS Manager, right-click the Web Sites node, choose Properties, and go to the Service tab. There is a text box labeled Temporary directory, although it may not yet be enabled.

When a static file is requested, IIS looks for a compressed version in the temporary directory. If none is found, IIS sends the uncompressed file to the client and places a compressed copy in the temporary directory (IIS only serves compressed static content from that directory). If a compressed version is found, IIS sends it directly to the requesting client. If the requested file is dynamic, such as an ASP.NET Web form, the response is compressed on the fly and sent to the client; the temporary directory is never touched. In other words, the temporary compression directory is only used for static pages. Dynamic responses aren't saved to disk and are recreated every time, so there is some CPU overhead on every request for dynamic content.

Enabling HTTP Compression on your Windows 2003 Server

You must be a member of the Administrators group on the local computer to perform the following procedures. As a security best practice, log on to your computer using an account that is not in the Administrators group, and then use the runas command to run IIS Manager as an administrator. At a command prompt, type:

runas /user:Administrative_AccountName "mmc %systemroot%\system32\inetsrv\iis.msc"

There are quite a few steps to enabling HTTP Compression on your server. If you follow the steps in this article, you shouldn’t have any issues.

First, open IIS Manager, right-click the Web Sites node, and go to Properties. Click the Service tab. You'll see two sections: Isolation mode and HTTP compression. If the Run WWW service in IIS 5.0 isolation mode check box is checked, IIS will run almost exactly like IIS 5.0. This means you won't be able to take advantage of features such as application pools, which in my opinion are worth the upgrade to IIS 6.0 by themselves.

We'll use the options in the HTTP compression section in this article (a command-line equivalent is sketched after the list):

  • Compress application files. Check this to compress application files. If you do select this, you must also have Compress static files checked, although you won’t be warned of this need.
  • Compress static files. Check this to compress static files. After you do so, the Temporary directory text box is active.
  • Temporary directory. You can leave this at the default, which is %windir%\IIS Temporary Compressed Files, or set it to a custom folder. This is where temporary compressed static files will be stored.
  • Maximum temporary directory size. This option lets you set the maximum size of the temporary directory. Once the limit is reached, the oldest files are removed to make room for new ones.
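
If you prefer to script these settings instead of clicking through IIS Manager, the same switches can be flipped with adsutil.vbs. The sketch below is an assumption-laden example rather than a canonical recipe: it assumes adsutil.vbs lives in its default location under C:\Inetpub\AdminScripts and that your metabase exposes the global flags under W3SVC/Filters/Compression/Parameters, so verify the paths against your own server before relying on it.

REM Assumes the default AdminScripts location (adjust if yours differs)
cd C:\Inetpub\AdminScripts
REM Enable static and dynamic compression globally (assumed property paths)
cscript adsutil.vbs set W3SVC/Filters/Compression/Parameters/HcDoStaticCompression TRUE
cscript adsutil.vbs set W3SVC/Filters/Compression/Parameters/HcDoDynamicCompression TRUE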

Allow the compression ISAPI to run

IIS 6’s new security system prohibits ISAPI DLLs from running by default, so you need to tell IIS 6 that it’s okay to let the compression ISAPI DLL run.

  1. Open the IIS admin tool (inetmgr); drill into your server, and right-click on “Web Service Extensions”
  2. Choose “Add a new web service extension”. For the extension name, use whatever you want to identify it in the list (I used “HTTP Compression Extension”).
  3. You need to add a single required file, which is \Windows\System32\inetsrv\gzip.dll, the ISAPI responsible for doing gzip and deflate compression.
  4. Check the “Set extension status to allowed”, then click OK.
  5. You should now have a new web service extension in your list with the name you chose (for example, “HTTP Compression Extension”), and it should have a status of “Allowed”.

Select compressible content: Metabase Configuration – MetaBase.xml

Out of the box, IIS 6’s compression system only compresses a very limited set of content. You need to enable compression for the appropriate file extensions (specifically .aspx for your ASP.NET pages, plus any static content you want compressed as well).

  1. You’re going to edit the Metabase. To do this, you first need to shut down IIS.
  2. In the IIS admin tool, right click on your server name in the left panel, and choose All Tasks -> Restart IIS.
  3. On the restart dialog, choose “Stop internet services” and click OK. When IIS is shut down, you’ll need to edit \Windows\System32\inetsrv\MetaBase.xml.
  4. Search for “IIsCompressionScheme”. There will be two XML elements, one for deflate and one for gzip. Both elements have properties called HcFileExtensions and HcScriptFileExtensions. These contain a space-delimited list of file extensions for compressible content.
  5. At a bare minimum, you'll need to add “aspx” and “ascx” to the HcScriptFileExtensions list (an illustrative fragment follows this list). Note that if these properties are left blank, all content, regardless of file extension, will be compressed.
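
For orientation, the gzip element ends up looking roughly like the fragment below. This is a trimmed, illustrative sketch only: your MetaBase.xml carries additional attributes on this element, and the extension lists shown here are example values to adapt to your own content.

<!-- Trimmed, illustrative fragment: the real element has more Hc* attributes,
     and these extension lists are examples only -->
<IIsCompressionScheme Location="/LM/W3SVC/Filters/Compression/gzip"
    HcCompressionDll="%windir%\system32\inetsrv\gzip.dll"
    HcDoStaticCompression="TRUE"
    HcDoDynamicCompression="TRUE"
    HcDoOnDemandCompression="TRUE"
    HcDynamicCompressionLevel="9"
    HcFileExtensions="htm html txt css js xml"
    HcScriptFileExtensions="asp aspx ascx"
/>

The deflate element next to it takes the same properties, so keep the two extension lists in sync.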

Note:

Make sure that the IIS process account has the required privileges on the compression temp folder.
Never try to compress image types (jpg, gif, png, tiff, etc.).

  • HcDoDynamicCompression. Specifies whether dynamic content should be compressed. This is important because dynamic content is by definition always changing, and IIS does not cache compressed versions of dynamic output. Thus, if dynamic compression is enabled, each request for dynamic content causes the content to be compressed. Dynamic compression consumes considerable CPU time and memory resources, and should only be used on servers that have slow network connections, but CPU time to spare.
  • HcDoStaticCompression. Specifies whether static content should be compressed.
  • HcDoOnDemandCompression. Specifies whether static files, such as .htm and .txt files, are compressed if a compressed version of the file does not exist. If set to True and no compressed version of a requested file exists, the user is sent the uncompressed file while a background thread creates a compressed version for the next request.
  • HcDynamicCompressionLevel. Specifies the compression level (a value from 0 to 10) used by the scheme when it compresses dynamic content. Low compression levels produce slightly larger compressed files but have a lower overall impact on CPU and memory; higher levels generally produce smaller files at the cost of more CPU and memory (see the command-line sketch after this list).
  • HcFileExtensions. Indicates which file name extensions are supported by the compression scheme. Only static files with the specified file extensions are compressed by IIS. If this setting is empty, no static files are compressed.
  • HcScriptFileExtensions. Indicates which file name extensions are supported by the compression scheme. The output from dynamic files with the file extensions specified in this property are compressed by IIS.
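
Most of these properties can also be set with adsutil.vbs rather than hand-editing MetaBase.xml. As before, this is only a sketch: it assumes the gzip and deflate scheme keys sit under W3SVC/Filters/Compression, and the compression level and extension lists are example values.

REM Raise the dynamic compression level for both schemes (example value: 9)
cscript adsutil.vbs set W3SVC/Filters/Compression/GZIP/HcDynamicCompressionLevel 9
cscript adsutil.vbs set W3SVC/Filters/Compression/DEFLATE/HcDynamicCompressionLevel 9

REM Example extension lists; adsutil takes each list entry as a separate argument
cscript adsutil.vbs set W3SVC/Filters/Compression/GZIP/HcScriptFileExtensions "asp" "aspx" "ascx"
cscript adsutil.vbs set W3SVC/Filters/Compression/DEFLATE/HcScriptFileExtensions "asp" "aspx" "ascx"

One advantage of this route is that you do not have to stop IIS just to edit the XML file by hand.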

The final step is to shut down and restart IIS by right-clicking the Internet Information Services node and then clicking All Tasks, Restart IIS.
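
The restart can also be triggered from a command prompt with the built-in iisreset utility:

REM Stop and restart the IIS services
iisreset /restart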

More information

Types of compression

First, let's examine the various types and attributes of compression:

  • HTTP compression. Compressing content from a Web server
  • Gzip compression. A lossless compressed-data format
  • Static compression. Pre-compression, used when static pages are being delivered
  • Content and transfer encoding. IETF’s two-level standard for compressing HTTP contents

HTTP compression

HTTP compression is the technology used to compress contents from a Web server (also known as an HTTP server). The Web server content may be in the form of any of the many available MIME types: HTML, plain text, image formats, PDF files, and more. HTML and image formats are the most widely used MIME types in a Web application.

Most images used in Web applications (for example, GIF and JPEG) are already compressed and do not compress much further; certainly no discernible performance gain comes from compressing these files again. However, static or dynamically generated HTML content is plain text and is ideal for compression.

The focus of HTTP compression is to enable the Web site to serve fewer bytes of data. For this to work effectively, a couple of things are required:

  • The Web server should compress the data
  • The browser should decompress the data and display the pages in the usual manner

Obvious as this sounds, the process of compression and decompression should not consume a significant amount of time or resources.

So what's the hold-up in this seemingly simple process? The recommendations for HTTP compression were laid out by the IETF (Internet Engineering Task Force) as part of the HTTP 1.1 protocol specification, with the publicly available gzip format intended as the compression algorithm. Popular browsers implemented the decompression feature early on and were ready to receive the encoded data (as per the HTTP 1.1 specification), but HTTP compression on the Web server side was not implemented as quickly or as seriously.

Gzip compression

Gzip is a lossless compressed-data format. The deflation algorithm used by gzip (also zip and zlib) is an open-source, patent-free variation of the LZ77 (Lempel-Ziv 1977) algorithm.

The algorithm finds duplicated strings in the input data. The second occurrence of a string is replaced by a pointer (in the form of a pair — distance and length) to the previous string. Distances are limited to 32 KB and lengths are limited to 258 bytes. When a string does not occur anywhere in the previous 32 KB, it is emitted as a sequence of literal bytes. (In this description, string is defined as an arbitrary sequence of bytes and is not restricted to printable characters.)

Static compression

If the Web content is pre-generated and requires no server-side dynamic interaction with other systems, the content can be pre-compressed and placed on the Web server, with these compressed pages being delivered to the user. Publicly available compression tools (gzip, Unix compress) can be used to compress the static files.
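
As a quick sketch of what that pre-compression step looks like on the command line (index.html is just a placeholder name, and the ratio you get depends entirely on the content):

# Pre-compress a static page at the highest compression level,
# writing index.html.gz and keeping the original via stdout redirection
gzip -c -9 index.html > index.html.gz

# Compare the two file sizes to see the actual savings
ls -l index.html index.html.gz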

Static compression, though, is not useful when the content has to be generated dynamically, such as on e-commerce sites or on sites which are driven by applications and databases. The better solution is to compress the data on the fly.


Content and transfer encoding

The IETF's standard for compressing HTTP contents includes two levels of encoding: content encoding and transfer encoding. Content encoding applies to methods of encoding and compression that have already been applied to documents before the Web user requests them. This is also known as pre-compressing pages, or static compression. The concept never really caught on because of the complex file-maintenance burden it represents, and few Internet sites use pre-compressed pages.

On the other hand, transfer encoding applies to methods of encoding during the actual transmission of the data.

In modern practice the difference between content and transfer encoding is blurred, since the requested pages often do not exist until after they are requested (they are created in real time). Therefore, the encoding almost always has to happen in real time as well.

The browsers, taking their cue from the IETF recommendations, implemented the Accept-Encoding feature by 1998-99. This allows browsers to receive and decompress files compressed using the public algorithms. In this case, the HTTP request header fields sent from the browser indicate that the browser is capable of receiving encoded information. When the Web server receives such a request, it can:

  1. Send pre-compressed files if they are available. If they are not, it can:
  2. Compress the requested static files, send the compressed data, and keep the compressed file in a temporary directory for further requests; or
  3. If transfer encoding is implemented, compress the Web server output on the fly.

As I mentioned, pre-compressing files and real-time compression of static files by the Web server (the first two options above) never really caught on because of the complexities of file maintenance, though some Web servers supported these functions to an extent.

Compressing a Web server's dynamic output on the fly wasn't seriously considered until recently, since its importance is only now being realized. As a result, sending dynamically compressed HTTP data over the network long remained a dream, even though most browsers were ready to receive the compressed formats.

Other references:

http://www.microsoft.com/technet/prodtechnol/WindowsServer2003/Library/IIS/502ef631-3695-4616-b268-cbe7cf1351ce.mspx?mfr=true

http://www.ibm.com/developerworks/web/library/wa-httpcomp
