The Problem with Old Cache

This was the question our development team had to answer to resolve a fairly significant problem we had discovered with Oracle’s Access Manager.

Picture this.  You are a bank customer with an online account.  You log on to the site and are prompted to change your password.

You change your password successfully, but you need to step away from your computer for a few minutes, so you log off.  You come back within 10 minutes and decide to resume your online activities.  Upon providing your user id and new password, you receive an error message that your credentials are invalid.  You changed the password only minutes ago and you clearly recall the new value.  This must be a mistake, you think; let’s try again.  And again, the same error: “We are sorry, your information is incorrect, please call our customer service department and wait approximately 45 minutes so we can help you.”

So, while waiting to speak with customer service, ready to answer all kinds of security questions and hand over blood and tissue samples, you decide to try again.

You enter your user id and your new password and BINGO!!! You are logged on successfully.  Pretending you didn’t just waste 45 minutes of your life, you resume your online session activities.

What happened?  A few minutes ago it didn’t work, and now it does.

Our team was puzzled.  The release had passed System Integration and Capacity Testing, and no one had noticed this?  Now we were in Production, handling 30 logons per second and an average of 15K concurrent sessions, and calls were flooding our call centers.

THE PROBLEM

Access Manager’s default cache flush interval is 15 minutes, so cached values can lag behind the Directory by up to that long.  The web page performed an IDXML call, which initiated the change request and updated the user’s password in the Directory.  However, the Access Manager’s cache still contained the original password.  When the user logged back on, the Access Manager checked the credentials against the stale OAM cache and rejected the logon request.
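The stale-cache behavior described above can be illustrated with a toy model.  This is only a sketch under simplifying assumptions, not how Access Manager is implemented internally: the class name CachedAuthenticator, the dictionary standing in for the Directory, and the flush_on_change switch are all hypothetical names chosen for illustration.  Time is passed in explicitly (in seconds) so the example runs instantly.

```python
class CachedAuthenticator:
    """Toy credential cache with a fixed time-to-live (hypothetical model)."""

    def __init__(self, directory, ttl_seconds=15 * 60, flush_on_change=False):
        self.directory = directory          # dict standing in for the Directory
        self.ttl_seconds = ttl_seconds      # 15 minutes, as in the article
        self.flush_on_change = flush_on_change
        self._cache = {}                    # user -> (password, time cached)

    def _lookup(self, user, now):
        entry = self._cache.get(user)
        # Go back to the directory only if nothing is cached or the entry expired.
        if entry is None or now - entry[1] >= self.ttl_seconds:
            entry = (self.directory[user], now)
            self._cache[user] = entry
        return entry[0]

    def login(self, user, password, now):
        return self._lookup(user, now) == password

    def change_password(self, user, new_password):
        self.directory[user] = new_password
        # Without this flush, the old password stays cached until the TTL expires.
        if self.flush_on_change:
            self._cache.pop(user, None)


def simulate(flush_on_change):
    """Replay the bank-customer scenario and return the three logon results."""
    auth = CachedAuthenticator({"alice": "old-secret"},
                               flush_on_change=flush_on_change)
    results = [auth.login("alice", "old-secret", now=0)]        # first logon
    auth.change_password("alice", "new-secret")
    results.append(auth.login("alice", "new-secret", now=120))  # 2 minutes later
    results.append(auth.login("alice", "new-secret", now=900))  # 15 minutes later
    return results
```

In this model, simulate(False) fails the second logon because the cached entry still holds the old password, while simulate(True) succeeds every time because the password change evicts the cached entry.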

One way to determine whether this is in fact a cache issue is to manually flush the user’s cache using the Access System Console.  If the user can log in successfully once the cache is flushed, the stale cache was more than likely the cause.

THE FIX

Assuming this is a cache issue, one method to correct it is to reduce the cache size in the Web Gates from 100,000 to 100.  This approach forces more frequent trips to the Directory; however, in a high-traffic site environment, going to the Directory for information can significantly degrade server performance.

The better solution is to modify the doAccessServerFlush configuration from “FALSE” to “TRUE”.  This signals that the AccessGate client has been configured on the OIS server and it can now begin to send user flush requests to the Access System, using the Access Manager API.

To do this, navigate to the Identity_server_installation_directory/oblix/data/common directory and update the file named: basedbparams.xml

Below is an example of the file with doAccessServerFlush set to “true”:

<?xml version="1.0"?>
<ParamsCtlg CtlgName="basedbparams">
  <CompoundList ListName="">
    <SimpleList>
      <NameValPair ParamName="default_policy" Value="false"/>
      <NameValPair ParamName="doAccessServerFlush" Value="true"/>
      <NameValPair ParamName="enableAllowAccessCache" Value="true"/>
      <NameValPair ParamName="SelfRegGeneratesSSOCookie" Value="false"/>
      <NameValPair ParamName="SR_SSOCookieMethod" Value="GET"/>
      <NameValPair ParamName="SR_SSOCookieURL" Value="/identity/oblix"/>
      <NameValPair ParamName="SR_SSOCookiePath" Value="/"/>
      <NameValPair ParamName="criticalReadForPostModify" Value="false"/>
    </SimpleList>
  </CompoundList>
</ParamsCtlg>
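
If the same change has to be made on several Identity Servers, the edit could be scripted rather than done by hand.  The following is a minimal sketch using Python’s standard xml.etree.ElementTree module; it assumes the file has the structure shown above.  The helper names set_param and update_file are hypothetical, and the script is not part of the Oracle product.  Note that ElementTree does not preserve XML comments or the original whitespace exactly, so keep the backup copy until the change has been verified.

```python
import shutil
import xml.etree.ElementTree as ET


def set_param(xml_text, name, value):
    """Return xml_text with every NameValPair called `name` set to `value`."""
    root = ET.fromstring(xml_text)
    matches = [el for el in root.iter("NameValPair")
               if el.get("ParamName") == name]
    if not matches:
        raise KeyError(f"parameter {name!r} not found")
    for el in matches:
        el.set("Value", value)
    # tostring() omits the XML declaration; update_file() adds it back.
    return ET.tostring(root, encoding="unicode")


def update_file(path, name="doAccessServerFlush", value="true"):
    """Back up the file at `path`, then rewrite it with the parameter changed."""
    shutil.copy2(path, str(path) + ".bak")   # keep the original alongside
    with open(path, encoding="utf-8") as fh:
        text = fh.read()
    with open(path, "w", encoding="utf-8") as fh:
        fh.write('<?xml version="1.0"?>\n' + set_param(text, name, value))


# Small in-memory sample for trying set_param without touching a real file.
SAMPLE = """<?xml version="1.0"?>
<ParamsCtlg CtlgName="basedbparams">
  <CompoundList ListName="">
    <SimpleList>
      <NameValPair ParamName="default_policy" Value="false"/>
      <NameValPair ParamName="doAccessServerFlush" Value="false"/>
    </SimpleList>
  </CompoundList>
</ParamsCtlg>
"""
```

Calling set_param(SAMPLE, "doAccessServerFlush", "true") returns the document with only that parameter changed; update_file would be pointed at basedbparams.xml in the directory named above.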