Thursday, March 29, 2012

Adventures in Coding: Web Caching and C#

Continuing on my "lame-man's" journey down the road of application development on the .NET platform with C#, I thought I would discuss one of the more interesting lessons I have learned thus far: System.Web.Caching.

I was developing a custom Corporate Directory application for a customer using ASP.NET and C#. This application pulls data from a remote repository and presents it to the Cisco IP Phone as XML. One issue that was immediately apparent was that I needed to cache results or drive the users mad with the lag in rendering the data. In steps System.Web.Caching and all of the goodies that come with it.

Background

As noted in the intro, I recently finished developing "version 1.0" of a custom corporate directory application. This application is for use with Cisco Unified Communications Manager (CUCM) and Cisco IP Phones. If you have a Cisco UC deployment then you know what I am talking about.

You can read more about the application here. What I wanted to focus on this week was a routine used for caching (and managing) the raw data used for sending corporate directory information to the Cisco IP Phone.

The Problem

Since the topic is "caching", the problem is probably obvious: poor application performance. My application retrieves its information from an external source, specifically the CUCM server, through the AXL/SOAP API. The application sends a SQL query to the CUCM server, then processes, filters, and sorts the response. The resulting data is then rendered as XML for the Cisco IP Phone client.

I discovered pretty quickly that the transaction time involved with running the query remotely was just too long. In reality, we were dealing with only a handful of seconds. However, in today's computer world a second is a very long time. So, I needed a solution and caching seemed to be the right answer.

As Microsoft eloquently puts it, "Caching is a technique widely used in computing to increase performance by keeping frequently accessed or expensive data in memory. In the context of a Web application, caching is used to retain pages or data across HTTP requests and reuse them without the expense of recreating them."

The Solution

With some quick Google searches I was turned on to the Cache class in System.Web.Caching. After doing some research I determined that I was on the right track. I was able to incorporate the appropriate methods into my program logic and it was relatively painless. So, let's get to it.

First and foremost, I put some config parameters in my web.config file to control the caching. I decided I wanted to have a variable that can toggle whether caching is enabled and another variable to control the expiration timer for the cache. Since I can have more than one source feeding my directory information, I wanted to cache the data sources independently (as opposed to caching the aggregate).
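To make those settings concrete, here is a minimal sketch of how the per-source keys could be read with safe defaults. It is not the code from my application. The key suffixes (".cache", ".cache.name", ".cache.duration") match the ones used in the example further down, but the CacheSettings class, the "cucm" source name, and the fallback cache name are assumptions made up for illustration. A NameValueCollection stands in for ConfigurationManager.AppSettings so the sketch can run without a web.config file.

```csharp
using System;
using System.Collections.Specialized;
using System.Globalization;

// Hypothetical helper: holds the caching settings for one data source.
public sealed class CacheSettings
{
    // Fallback duration, matching the 720-minute default used in the post.
    public const double DefaultDurationMinutes = 720;

    public bool Enabled { get; private set; }
    public string CacheName { get; private set; }
    public double DurationMinutes { get; private set; }

    // 'settings' stands in for ConfigurationManager.AppSettings.
    // 'source' is the data source prefix, for example "cucm" (an assumed name).
    public static CacheSettings Read(NameValueCollection settings, string source)
    {
        bool enabled;
        if (!bool.TryParse(settings[source + ".cache"], out enabled))
        {
            enabled = false; // missing or malformed value: leave caching off
        }

        double minutes;
        bool parsed = double.TryParse(settings[source + ".cache.duration"],
                                      NumberStyles.Float,
                                      CultureInfo.InvariantCulture,
                                      out minutes);
        if (!parsed || minutes <= 0)
        {
            minutes = DefaultDurationMinutes; // missing, malformed, or non-positive
        }

        return new CacheSettings
        {
            Enabled = enabled,
            // Assumed fallback name when no cache name is configured.
            CacheName = settings[source + ".cache.name"] ?? source + ".cache",
            DurationMinutes = minutes
        };
    }
}
```

In a real ASP.NET application you would pass ConfigurationManager.AppSettings as the first argument. Using TryParse instead of Convert means a missing key falls back to the default rather than silently producing a zero-minute timeout.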

In my application, there is a point in time where the client provides everything that is needed to do the heavy lifting. This is where we'll start our example:

protected string getDirectoryList(string mysrc)
{
   //some pre-processing code not shown
   string cacheName = ConfigurationManager.AppSettings[mysrc + ".cache.name"];
   bool blnCache = System.Convert.ToBoolean(ConfigurationManager.AppSettings[mysrc + ".cache"]);
   double cacheTimeOut = 0;
   string foo="";
   try
   {
      cacheTimeOut = System.Convert.ToDouble(ConfigurationManager.AppSettings[mysrc + ".cache.duration"]);
   }
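   catch (Exception)
   {
      cacheTimeOut = 720;  //default to 720 minutes in case the configured value cannot be parsed
      //do some error handling
   }
   //Convert.ToDouble() returns 0 (it does not throw) when the key is missing, so apply the default here too
   if (cacheTimeOut <= 0) cacheTimeOut = 720;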
   
   //some special sauce stuff not shown here
   if (blnCache)
   {
      foo = getDirCache(mysrc, cacheName, cacheTimeOut);
      //more special sauce stuff
   } else {
      //non-caching method. not shown.
   }
   // do whatever is necessary with foo.
   return foo;
}


So, what we have thus far is a little bit of code to read control variables from web.config. We set the cache name, determine whether we should cache or not and then set the cache timeout. All of the caching "magic" is handled by getDirCache().

protected string getDirCache(string mysrc, string mycachename, double myTimeout)
{
  // we need to retrieve the current cache object from memory
  string cacheItem = Cache[mycachename] as string;
  
  if (cacheItem == null)
  { // our cache has expired or needs to be initialized
    string myqry = ConfigurationManager.AppSettings[mysrc + ".query"]; //this is the SQL or LDAP query
    if (SendQueryRequest(myqry))
    {  //our query request was submitted and processed successfully.
       //SendQueryRequest() puts data in a global variable. This variable is processed
       // Skipping secret sauce stuff. The resulting data set is a string of XML data
       // stored in cacheItem. Once we have the data we want to cache, let's cache it
       // (Cache.Insert() throws if the value is null, so guard against an empty result)
       if (cacheItem != null)
       {
          Cache.Insert(mycachename, cacheItem, null, DateTime.UtcNow.AddMinutes(myTimeout), System.Web.Caching.Cache.NoSlidingExpiration);
       }
    }
  }
  return cacheItem;
}

The getDirCache() function checks whether the cache object is null and, if it is, initializes the cache. The SendQueryRequest() function does all of the heavy lifting needed to get the data. We process the data as needed and then insert it using the Cache.Insert() method.
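One thing the routine above does not address is what happens when several phones hit an empty cache at the same moment: each request would run the slow query. Here is a minimal sketch of the same check-then-load pattern with a lock around the rebuild. It is an illustration, not my production code. The DirectoryCache class, the GetOrLoad name, and the loader delegate (a stand-in for SendQueryRequest() plus the processing I skipped) are all hypothetical. It uses HttpRuntime.Cache, which exposes the same application cache outside of a page class, and it needs a reference to System.Web on the .NET Framework.

```csharp
using System;
using System.Web;          // requires a reference to System.Web (.NET Framework)
using System.Web.Caching;

// Hypothetical helper illustrating a check-then-load cache lookup.
public static class DirectoryCache
{
    private static readonly object RebuildLock = new object();

    // 'loader' is an assumed stand-in for the query-and-process step.
    public static string GetOrLoad(string cacheKey, double timeoutMinutes, Func<string> loader)
    {
        Cache cache = HttpRuntime.Cache;

        string cached = cache[cacheKey] as string;
        if (cached != null)
        {
            return cached; // cache hit: no remote query needed
        }

        lock (RebuildLock)
        {
            // Re-check: another request may have rebuilt the entry while we waited.
            cached = cache[cacheKey] as string;
            if (cached != null)
            {
                return cached;
            }

            string fresh = loader();
            if (fresh != null)
            {
                // Cache.Insert throws on a null value, so only cache real results.
                cache.Insert(cacheKey, fresh, null,
                             DateTime.UtcNow.AddMinutes(timeoutMinutes),
                             Cache.NoSlidingExpiration);
            }
            return fresh;
        }
    }
}
```

A single lock is the simplest choice and serializes rebuilds across all data sources. With only a handful of sources and a rebuild every few hours that seems like a fair trade, but a lock per cache key would be the next step if it ever became a bottleneck.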

The Cache.Insert() method has several overloads. Across them, the parameters are:

  • key  [in our sample this is mycachename]
    • Type: System.String
    • The cache key used to reference the object.
  • value   [in our sample this is cacheItem]
    • Type: System.Object
    • The object to be inserted in the cache.
  • dependencies   [in our sample this is null]
    • Type: System.Web.Caching.CacheDependency
    • The file or cache key dependencies for the item. When any dependency changes, the object becomes invalid and is removed from the cache. If there are no dependencies, this parameter contains null (Nothing in Visual Basic).
  • absoluteExpiration  [in our sample this is the current time plus myTimeout minutes]
    • Type: System.DateTime
    • The time at which the inserted object expires and is removed from the cache. To avoid possible issues with local time such as changes from standard time to daylight saving time, use UtcNow rather than Now for this parameter value. If you are using absolute expiration, the slidingExpiration parameter must be NoSlidingExpiration.
  • slidingExpiration   [in our sample this is flagged to not be used]
    • Type: System.TimeSpan
    • The interval between the time the inserted object was last accessed and the time at which that object expires. If this value is the equivalent of 20 minutes, the object will expire and be removed from the cache 20 minutes after it was last accessed. If you are using sliding expiration, the absoluteExpiration parameter must be NoAbsoluteExpiration.
  • priority   [not used in our sample]
    • Type: System.Web.Caching.CacheItemPriority
    • The cost of the object relative to other items stored in the cache, as expressed by the CacheItemPriority enumeration. This value is used by the cache when it evicts objects; objects with a lower cost are removed from the cache before objects with a higher cost.
  • onRemoveCallback   [not used in our sample]
    • Type: System.Web.Caching.CacheItemRemovedCallback
    • A delegate that, if provided, will be called when an object is removed from the cache. You can use this to notify applications when their objects are deleted from the cache.

I am currently testing the onRemoveCallback delegate but will save that for another post.
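The difference between absoluteExpiration and slidingExpiration in the list above is easier to see side by side. Here is a minimal sketch of both forms using the same five-parameter overload as my sample. The CacheInsertExamples class and its method names are made up for illustration, and the code assumes a reference to System.Web on the .NET Framework.

```csharp
using System;
using System.Web;          // requires a reference to System.Web (.NET Framework)
using System.Web.Caching;

// Hypothetical examples of the two expiration styles.
public static class CacheInsertExamples
{
    // Absolute expiration: the entry is removed a fixed time after it is inserted,
    // no matter how often it is read. This is the style used in the post.
    public static void InsertAbsolute(string key, string xml, double minutes)
    {
        HttpRuntime.Cache.Insert(key, xml, null,
                                 DateTime.UtcNow.AddMinutes(minutes),
                                 Cache.NoSlidingExpiration);
    }

    // Sliding expiration: the clock restarts every time the entry is accessed,
    // so a frequently used entry can stay cached indefinitely.
    public static void InsertSliding(string key, string xml, double minutes)
    {
        HttpRuntime.Cache.Insert(key, xml, null,
                                 Cache.NoAbsoluteExpiration,
                                 TimeSpan.FromMinutes(minutes));
    }
}
```

For directory data that changes on the source system, absolute expiration is the safer choice because it guarantees a refresh at a known interval. With sliding expiration a busy directory could keep serving stale entries for as long as phones keep asking for them.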

IIS Configuration Notes

The caching examples I found on the web were pretty straightforward and basically covered what I have done here, though there are some more in-depth discussions out there to be sure. What was missing from all of them were the steps needed in IIS to get the caching to behave the way you want.

In my scenario, I was all proud of myself for getting the caching worked out. It was behaving as desired and netted a marked performance gain. So, concept works and next comes validation. To validate, I ran a series of tests where IP phones would query the application. I inserted a delay between tests and incremented this delay by 5 minutes with each successive test. So, the time elapsed between test_1 and _2 was 5 minutes, the time elapsed between test_2 and _3 was 10 minutes, between test_3 and _4 was 15 minutes, and so on. The assumption I made is that there could be up to two hours between successive queries from real users in production. 

What I found out was that at 20 minutes, the application performance dropped dramatically and the getDirCache() routine was reporting the cache object was null and, therefore, rebuilding the data. 

Sidebar: Now, this is one of those times where experience comes in handy. When you are new at something (as I am with this C#/ASP.NET thing) the tendency is to spin around like some jazz fusion drummer. All linear rhythm goes out the window. Fortunately, I am used to doing new things so I don't jump to any conclusions because I assume there is a repeatable and logical explanation. So, I repeat the test and find that 20 minutes is the magic number. Now what?


So, we know that the web application itself isn't introducing any 20 minute expiration (on purpose or by accident). We also know that the application is running on IIS and IIS has all sorts of nifty parameters at all levels of the environment. To keep some sanity in my troubleshooting exercise, I decided to check the Application Pool first (you know, working from the top down, as it were).

Checking the Application Pool was a good first choice because I found the solution in the Advanced Settings of the application pool. Specifically, I found that the Application Pool has several parameters that control behavior of worker processes. See the image below.


The parameter of interest is highlighted in the above figure: Idle Time-out (minutes). This parameter defines how long a worker process stays resident while idle before IIS shuts it down. The default is 20 minutes, which matches the magic number from my testing.

As it turns out, the memory pool used by our app to cache data is tied to a worker process. So, in the event the idle time-out expires the worker process is shut down and any cached memory is returned to the available memory pool. I did some more testing and found that increasing this timer was the solution. I read that setting it to zero essentially disables the timer. I didn't want to go that route. Instead, I picked a reasonable time based on known quantities and increased the timer accordingly. 
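I made the change in IIS Manager, but the same setting can also be changed from code. Here is a minimal sketch using the Microsoft.Web.Administration API that ships with IIS 7 and later. I did not use this in my deployment. The AppPoolTuning class and the pool name in the usage note are hypothetical, and the code has to run with administrative rights on the IIS server.

```csharp
using System;
using Microsoft.Web.Administration; // reference Microsoft.Web.Administration.dll (IIS 7+)

// Hypothetical helper for adjusting an application pool's idle time-out.
public static class AppPoolTuning
{
    // poolName is the application pool that hosts the directory application.
    public static void SetIdleTimeout(string poolName, TimeSpan idleTimeout)
    {
        using (ServerManager manager = new ServerManager())
        {
            ApplicationPool pool = manager.ApplicationPools[poolName];
            if (pool == null)
            {
                throw new ArgumentException("No application pool named '" + poolName + "'.");
            }

            // TimeSpan.Zero disables the idle timer entirely.
            pool.ProcessModel.IdleTimeout = idleTimeout;

            // Nothing is written to the IIS configuration until CommitChanges() runs.
            manager.CommitChanges();
        }
    }
}
```

For example, AppPoolTuning.SetIdleTimeout("CorpDirectoryPool", TimeSpan.FromMinutes(180)) would keep an idle worker process around for three hours, where "CorpDirectoryPool" is a made-up pool name. Pick a value longer than the largest gap you expect between queries.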

Conclusion

So, the lessons I learned from this process:
  1. Caching data dramatically improves performance
  2. System.Web.Caching is pretty painless to use and works as advertised
  3. Store your cache in a non-volatile location or, at the very least, make sure IIS is tuned to work with your caching strategy and not against it



Thanks for reading. If you have time, post a comment!
