eDirectory Indexes: More Than Just A Good Idea

For those new to NetIQ/Micro Focus eDirectory, an LDAP-compliant directory, eDirectory uses “indexes” to boost performance. Basically, an index is a cached collection of the values, or of the presence of values, for an attribute in the data store. Administrators can create an index for any attribute in the directory schema and can create more than one index for the same attribute. By default, eDirectory automatically indexes a set of common or core attributes in the default schema, but administrators can extend both that schema and the list of indexed attributes.

So how do the indexes help eDirectory?

Essentially, each index that is defined is cached in the server’s memory based on the index settings (value vs. presence). Having these indexed attributes cached in system memory allows eDirectory and other LDAP processes to read that data more quickly and efficiently.

Think of it like someone asking you a question when you know the answer off the top of your head (cached index) versus needing to look up the answer (no index). The cached answer will come much more quickly than the non-indexed answer.
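To make the analogy concrete, here is a minimal sketch of the idea in Python. It is not how eDirectory implements indexes internally; the entries, attribute names, and helper functions are all made up for illustration. It contrasts a full scan with a lookup in a value index, and shows what a presence index would record.

```python
from collections import defaultdict

# Hypothetical directory entries; names and attributes are illustrative only.
entries = [
    {"cn": "asmith", "sn": "Smith", "mail": "asmith@example.com"},
    {"cn": "bjones", "sn": "Jones"},
    {"cn": "csmith", "sn": "Smith", "mail": "csmith@example.com"},
]

def scan(entries, attr, value):
    """No index: inspect every entry to find the matches."""
    return [e["cn"] for e in entries if e.get(attr) == value]

def build_value_index(entries, attr):
    """Value index: map each attribute value to the entries that hold it."""
    index = defaultdict(list)
    for e in entries:
        if attr in e:
            index[e[attr]].append(e["cn"])
    return index

def build_presence_index(entries, attr):
    """Presence index: record only which entries have the attribute populated."""
    return {e["cn"] for e in entries if attr in e}

sn_index = build_value_index(entries, "sn")
mail_present = build_presence_index(entries, "mail")

print(scan(entries, "sn", "Smith"))  # walks every entry
print(sn_index["Smith"])             # one lookup, same answer
print(sorted(mail_present))          # who has a mail value at all
```

The scan has to touch every entry on every request, while the index is built once and then answers each request with a single lookup.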

So how do I create an index in eDirectory?

The standard approach is to use the NetIQ/Micro Focus administration tool, iManager, that comes with eDirectory. This is a web-based application running through Tomcat that can be installed as part of the eDirectory installation.

Note: There is also a stand-alone version of iManager, a self-contained Java application that can be run without installing Tomcat or the full iManager. Typically, iManager is installed on the eDirectory server(s), but such an installation is not required. The stand-alone iManager is generally available for download through the NetIQ/Micro Focus website.

After logging into iManager, go to the “Roles and Tasks” section of the interface and expand the “eDirectory Maintenance” node in the menu running down the left side of the screen. Next, click the “Indexes” link. If there is more than one eDirectory server in the tree, click on a server to manage the indexes for that server and that server only.

Note: A lot of information replicates between eDirectory servers, but index definitions do not. Each eDirectory server has its own list of indexes, so if you want the same attribute indexed on each server, you will have to add it manually on each server in the tree using the steps listed here (or in the eDirectory administration documentation from the vendor).

After you select your eDirectory server (or if only one server is in the tree), iManager will display a list of the indexes currently defined for that server. To create a new index, just click the “Create” button found at the top of that list. From there you will be prompted to enter a name for the index, select the attribute to be indexed from the list of attributes in that server’s schema, and choose the “Rule” for the index, which indicates whether you want the value indexed or just an acknowledgment of whether the attribute is populated or not (presence).

Click the “OK” button on that screen, then click the “Apply” button at the bottom of the window, and the index will be added to that server. To add more indexes, just repeat the steps as necessary. Since indexes do not replicate, remember to add each index to the other servers in the tree as well by selecting the other server(s) when prompted.
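Because every server keeps its own index list, it is easy to lose track of which server is missing what. The following is a minimal sketch of a gap report, assuming you have written down each server's current indexes by hand from the iManager list. The server names, index names, and the `missing_indexes` helper are hypothetical, and nothing here talks to eDirectory.

```python
# Hypothetical data: (index name, attribute, rule) tuples you want on every server.
desired = {
    ("mail-value", "mail", "value"),
    ("employeeID-value", "employeeID", "value"),
    ("manager-presence", "manager", "presence"),
}

# What each server currently has, copied by hand from iManager's index list.
current = {
    "server1": {
        ("mail-value", "mail", "value"),
        ("employeeID-value", "employeeID", "value"),
    },
    "server2": {
        ("mail-value", "mail", "value"),
    },
}

def missing_indexes(desired, current):
    """Return, per server, the desired indexes that are not yet defined there."""
    return {
        server: sorted(desired - have)
        for server, have in current.items()
        if desired - have
    }

for server, gaps in sorted(missing_indexes(desired, current).items()):
    for name, attr, rule in gaps:
        print(f"{server}: create index {name!r} on attribute {attr!r} (rule: {rule})")
```

Each line of output is one pass through the “Create” steps above on the named server.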

How long does it take for a new index before it is cached?

That varies based on a number of factors. Firstly, server resources play a part. A server with more resources can compile that information faster than a server with fewer resources. Secondly, it depends on the amount of data being indexed. Logically, it takes less time to index 500 values than it does 500,000. Generally, though, a newly indexed attribute can be expected to be available in the cache 15 to 30 minutes after the index is added.

Note: Index caches are flushed when a server is rebooted since the cache is held in system memory. Depending on the number of indexes defined and the amount of data being scanned in the directory, it can take several minutes for the caches to be fully rebuilt. I have seen some indexes take 30 to 45 minutes to fully rebuild after a server update.

Should I index every attribute in my directory?

Typically not. Indexes should be reserved for attributes that are commonly queried from the directory. Indexing every attribute is likely to require the server(s) to have large amounts of RAM to hold that level of cached data, while many of the indexed values would go unused. This would be a waste of server resources and could potentially overload those machines, which could lead to a crash of the services.

What should be indexed?

There are some obvious attributes that should always be indexed, like CN. Almost every process involving eDirectory leverages an object’s CN so by indexing it, the system can respond to requests for and with that data more quickly. Of course, you won’t have to index the CN attribute since that is one of the many attributes indexed by default when eDirectory is installed.

But, beyond the defaults, it is up to the administrators of your system to determine what needs to be indexed since processes and interactions will vary from implementation to implementation. The main criterion is the volume of requests for the data. Simply put, any attribute in your eDirectory that will be queried on a regular or constant basis, whether for authentication, authorization, or even display, should be indexed.

This consideration goes beyond the single directory object-level. If you have processes that query individual objects a lot, then those attributes should be indexed. And, if you have processes that do large queries to show all users with a specific last name, email address domain, telephone area code, etc. then those attributes should be indexed as well.
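One way to find those attributes is to count how often each one appears in the search filters your applications send. The sketch below assumes you have already collected filter strings from somewhere (application configs, LDAP trace output, and so on); the sample filters and the `attribute_counts` helper are hypothetical. The regular expression only handles simple filter items and ignores extensible-match filters, so treat the result as a rough ranking rather than a complete parse.

```python
import re
from collections import Counter

# Matches the attribute name at the start of a simple filter item such as
# (sn=Smith), (mail=*@example.com), or (telephoneNumber>=555).
ATTR_RE = re.compile(r"\(\s*([A-Za-z][A-Za-z0-9;-]*)\s*(?:=|~=|>=|<=)")

def attribute_counts(filters):
    """Count how many times each attribute appears across the given filters."""
    counts = Counter()
    for f in filters:
        # LDAP attribute names are case-insensitive, so normalize to lower case.
        counts.update(name.lower() for name in ATTR_RE.findall(f))
    return counts

# Hypothetical filters gathered from applications that query the directory.
filters = [
    "(&(objectClass=inetOrgPerson)(sn=Smith))",
    "(mail=*@example.com)",
    "(&(objectClass=inetOrgPerson)(mail=asmith@example.com))",
    "(|(sn=Jones)(telephoneNumber=555*))",
]

counts = attribute_counts(filters)
for attr, n in counts.most_common():
    print(attr, n)
```

Attributes near the top of the list that are not already indexed on a server are the first candidates to review.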

What happens if I don’t index an attribute that is commonly queried?

Depending on the attribute and the queries related to it, the outcome could vary.

If the query only happens a few times in a short period and the server isn’t under any type of load, then nothing may happen and you may not see any sort of noticeable impact.

Note: This is common during testing cycles where smaller amounts of data are being used or a smaller group of testers is involved. During these testing cycles, everything may appear to work fine, so no indexes are thought to be needed. Then, when the solution is deployed to a bigger audience, it may experience issues or even cause issues for the whole environment.

If the server is under load or multiple queries are running simultaneously against unindexed attributes, then some of the queries may experience long delays on responses while others may time out. Some queries may return quickly because they get processed before the other queries are executed. The results could be a hodgepodge from one query to the next.

If you have a large number of queries (say, 3,000 or more) hitting at the same time using unindexed attributes, then it will likely cause a crash of the eDirectory service. When a request comes in for an unindexed value, eDirectory has to search for that value and return it, which is a longer, more resource-intensive process. If there are many such requests, the eDirectory service will likely consume all of the CPU on the server, which in turn will lead to eDirectory becoming non-responsive.
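A rough back-of-the-envelope calculation shows why the load climbs so fast. This is an illustration, not a measurement of eDirectory: it assumes a directory of 500,000 entries, 3,000 simultaneous queries, a full scan for every unindexed query, and an ordered index where a lookup costs about log2(n) comparisons.

```python
import math

entries = 500_000  # assumed directory size, for illustration only
queries = 3_000    # simultaneous queries, the figure used in the text

# Unindexed: every query may have to check every entry.
scan_checks = queries * entries

# Indexed (assumption): roughly log2(n) comparisons per lookup.
indexed_checks = queries * math.ceil(math.log2(entries))

print(f"unindexed: {scan_checks:,} checks")
print(f"indexed:   {indexed_checks:,} checks")
```

Under these assumptions, the unindexed case works out to 1.5 billion checks against 57,000 for the indexed case, which is the gap between a server that keeps up and one that pins its CPU.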

Indexes are not just nice to have; they are a necessity. But because indexes are often forgotten, thanks to their behind-the-scenes nature, they get missed as environments grow and expand. Developers will extend schemas, create new processes, and/or define additional criteria for common things like logging into an application that calls upon new data or old data in new ways. In some instances, everything works fine but could work better if indexes were properly leveraged. In others, the new processes fail when deployed to the masses because indexes were not added when they were needed.

Any time a process that interacts with eDirectory is being defined or modified, it is a good idea to know what is indexed and what isn’t so that indexes can be added for any new attributes, avoiding unwanted delays, failures, or even outages.