Tuning Performance
Penrose can exhibit wildly different performance characteristics depending on the mapping and the data source configurations. Please take the time to read this page as TODO
Penrose Internals Overview
Before you start diagnosing performance problems, you need to know the basic components of Penrose and how they interact to realize a particular use case.
TODO: This section needs verification and cleaning up.
Query
- A client binds to an LDAP listener, passing its credentials.
- Something happens and the authentication gets delegated to org.safehaus.penrose.apacheds.PenroseAuthenticator which compares the credentials with some data from the Penrose data-store.
- A client issues a search command, specifying base DN, filter and attribute list.
- ApacheDS looks for a partition containing this base DN. If the query is to be handled by Penrose, the partition is of type org.safehaus.penrose.apacheds.PenrosePartition.
- ApacheDS delegates the search to the partition it found:
- Penrose finds the entry and, if it is a static entry, immediately returns the data. If the entry is not a static entry, then it is a dynamic entry and has to be evaluated on the fly.
- Penrose checks the query cache - the query cache maps search parameters to results. This means that if you perform the same search twice, the second time Penrose will bypass all the next steps.
- Penrose reads and parses the dynamic entry definition and queries the data
- Penrose checks the data-source cache - the datasource cache maps rows of data to their primary key. If the needed data can be found here, the next step is skipped.
- Penrose uses the configured connection to load the data from the underlying database or directory.
- The loaded data is pushed into the data-source cache.
- The data is passed to the join engine which builds the actual result entries
- The join engine copies the values of the constant attributes directly from the schema.
- The join engine copies the values of the variable attributes from the corresponding attribute of the matched source object.
- For each expression attribute, the join engine creates a new Beanshell interpreter, binds the source objects to beanshell variables as configured in the 'alias' attribute in the mapping descriptor, and evaluates the script.
- The result entries are cached in the query cache
- ApacheDS sends the results to the client
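The cache-related steps above can be sketched as a two-level lookup for dynamic entries. This is a simplified illustration under stated assumptions, not Penrose's implementation: the class, field, and method names are hypothetical, the backend is a plain function standing in for the configured connection, and the join engine is reduced to string concatenation.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical illustration; none of these names come from Penrose.
class TwoLevelLookupSketch {
    // Query cache: search parameters (here a single key string) -> result entries.
    private final Map<String, List<String>> queryCache = new HashMap<>();
    // Data-source cache: primary key -> row of data.
    private final Map<String, String> sourceCache = new HashMap<>();
    // Stand-in for the configured connection to the database or directory.
    private final Function<String, String> backend;
    int backendLoads = 0; // round trips to the underlying source
    int joins = 0;        // how many result entries were built

    TwoLevelLookupSketch(Function<String, String> backend) {
        this.backend = backend;
    }

    List<String> search(String searchKey, List<String> primaryKeys) {
        List<String> cached = queryCache.get(searchKey);
        if (cached != null) {
            return cached; // same search again: all later steps are bypassed
        }
        List<String> entries = new ArrayList<>();
        for (String pk : primaryKeys) {
            String row = sourceCache.get(pk);
            if (row == null) { // miss: load from the source and remember the row
                row = backend.apply(pk);
                backendLoads++;
                sourceCache.put(pk, row);
            }
            entries.add("uid=" + pk + ": " + row); // stand-in for the join engine
            joins++;
        }
        queryCache.put(searchKey, entries);
        return entries;
    }
}
```

A repeated search is answered from the query cache alone, while a different search over the same rows skips the backend but still pays for building the entries.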
Update
TODO describe the update operation
Modules (intercepting operations)
A module can be registered to perform custom actions before and after each LDAP operation. The custom actions can manipulate the data (both input and output) or send notifications to third-party applications. Keep your modules light, because they run on every intercepted operation.
Tips
- Always configure the reverse mappings for each attribute that you might use in a search filter. If you don't, you will suffer a severe performance penalty.
- When determining the cache sizes and timeouts, keep in mind the following factors:
- TODO
- If you can afford to create a new view and do the joins in the database, you'll get a noticeable performance boost. (The rationale is the same as for the next item.)
- If you have a choice between normalizing the data in the source and in the Penrose mapping, prefer to do it in the source. This way, your caches contain normalized data and Penrose doesn't have to re-evaluate it for every search. For example, if your database uses CHAR(50) to keep the user names and you need to strip the spaces, you would get better performance by creating a view in your database where you rtrim() the column instead of using 'user.userId.trim()' as a Beanshell expression.
A real story
Our company needed to expose parts of our user database via LDAP so that we could integrate with a third-party product. Our user database was kept on a Sybase ASE server and the particular data was held in 2 tables (actually a table and a view). We needed to expose only the users with the value true in the user_active column and join them with the LastUserPassword view. There would be two kinds of searches: one would fetch the RDNs of all active users, the other would fetch all attributes of a single user given its userId.
First try
My initial solution was to define a separate data source for each table and join them in the dynamic entry. That worked fine in the sense that all the data was there. The only problem was that listing 64 entries took around 20 seconds on a 2 CPU Athlon 2.6 GHz, and one of the CPUs was constantly maxed out. Fetching the attributes of a single user was taking about 5 seconds. On our QA box (4 CPU SPARC 450 MHz, 4 GB RAM, Solaris 8.5), looking up ~1000 users was taking well over a minute and fetching the attributes for a single user took around 20 seconds.
Datasource denormalization
My next step was to create a new view in the database that contained only the data I needed. The concatenation of givenName and sn (to form the cn) was also done in the database. This gave us a 30% to 40% performance boost. Listing 64 RDNs on the 2.6 GHz Athlon still took roughly 12 to 14 seconds (down from 20), and accessing a single user with the same configuration took roughly 3 seconds (down from 5). With a larger database, the situation got progressively worse.
Then I did some profiling. It turned out that half of the time was spent creating and setting variables in the Beanshell interpreter, which was a waste given that I wasn't using any of the mapping features. Another big chunk of the time was spent evaluating debug logging output, which wasn't even printed.
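The logging cost described above is a common pattern: the message string is built before the logger decides to discard it. The following is a minimal sketch of that effect and two ways around it. It uses java.util.logging purely as a stand-in (the logging library Penrose used is not stated here), and the class and method names are hypothetical.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical illustration; not Penrose code.
class LazyDebugLoggingSketch {
    // Counts how often the expensive message text was actually built.
    static int expensiveCalls = 0;

    // Stand-in for an expensive dump of an entry or row.
    static String describe(Object entry) {
        expensiveCalls++;
        return "entry=" + entry;
    }

    // Eager: the argument is evaluated even when FINE is disabled.
    static void eager(Logger log, Object entry) {
        log.fine("Loaded " + describe(entry));
    }

    // Guarded: the message is built only if it would be printed.
    static void guarded(Logger log, Object entry) {
        if (log.isLoggable(Level.FINE)) {
            log.fine("Loaded " + describe(entry));
        }
    }

    // Lazy: the supplier is invoked only if FINE is enabled (Java 8+).
    static void lazy(Logger log, Object entry) {
        log.fine(() -> "Loaded " + describe(entry));
    }
}
```

With the logger level set to INFO, only the eager variant pays for building the message.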
Enter Penrose team
This was the time when I contacted the Penrose team. I'm happy to say that they were very quick to respond and took their jobs seriously. The next morning, I had a nightly build which had the Beanshell interpreter issues fixed. Unfortunately, it didn't give us the expected performance boost.
In this optimized build, getting 64 entries took 13 seconds, and from the log files we could see that we got the data from JDBC by the 3rd second, so the remaining 10 seconds were obviously spent somewhere in Penrose. Fetching the attributes for a single user took ~3 seconds (it should have been instantaneous). From the log files, it seemed that even when querying by DN, Penrose was doing a linear scan, which was definitely a bad smell.
Revelation
Two days later, Jim called me again to try the new nightly build. He said that he had made some optimizations which would skip the joining part of the query, along with some other changes, and in passing he reminded me to make sure that I had configured my reverse mappings.
First I tried the build with my old config and it was somewhat faster, perhaps 20-30%. The real shock came when I configured the reverse mappings. Now, reverse mappings are something that you use when you want to specify how to update your database through the LDAP interface provided by Penrose. It turned out that they were used by the query optimizer as well and made the query by DN a near-constant-time operation.
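To see why a reverse mapping can turn a linear scan into a direct lookup, consider the following minimal sketch. It shows only the general idea and is not Penrose's actual optimizer. All names are hypothetical, and the forward and reverse mappings (lowercasing and uppercasing a key) are made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical illustration; not Penrose code.
class ReverseMappingSketch {
    // Source rows keyed by their primary key.
    final Map<String, String> rowsByKey = new LinkedHashMap<>();
    // Counts how many rows a lookup had to touch.
    int rowsExamined = 0;

    // Forward mapping only: compute the LDAP value for every row and compare.
    String findByScan(String uid, Function<String, String> forward) {
        for (Map.Entry<String, String> e : rowsByKey.entrySet()) {
            rowsExamined++;
            if (forward.apply(e.getKey()).equals(uid)) {
                return e.getValue();
            }
        }
        return null;
    }

    // With a reverse mapping: translate the LDAP value back into the
    // source key and fetch the row directly.
    String findByReverse(String uid, Function<String, String> reverse) {
        rowsExamined++;
        return rowsByKey.get(reverse.apply(uid));
    }
}
```

With 1000 rows, finding the last one by scan touches all 1000 rows, while the reverse-mapped lookup touches one regardless of the table size.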
Epilogue
Here are some final performance stats, measured on 11/17/2005 on a 4 CPU 450 MHz SPARC Ultra 4500 server running Solaris 8.5. As a baseline we used the iPlanet directory server, which stored our email addresses.
We measured the time from the moment we sent the request until we got the first response, and the time from the moment we sent the request until we got the whole response. Both times were measured with a cold cache and with a warmed-up cache.
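The two measurements described above could also be taken programmatically rather than with a stopwatch. The sketch below is one way to do that, assuming the search results are available as an Iterator. The class name and the Supplier-based interface are hypothetical and not part of any LDAP client API.

```java
import java.util.Iterator;
import java.util.function.Supplier;

// Hypothetical illustration; not tied to any particular LDAP client.
class SearchTimingSketch {
    final long firstResultNanos; // request sent -> first entry received (-1 if none)
    final long allResultsNanos;  // request sent -> last entry received
    final int count;             // number of entries received

    SearchTimingSketch(long firstResultNanos, long allResultsNanos, int count) {
        this.firstResultNanos = firstResultNanos;
        this.allResultsNanos = allResultsNanos;
        this.count = count;
    }

    // Runs the search and records both timings with a monotonic clock.
    static <T> SearchTimingSketch measure(Supplier<Iterator<T>> search) {
        long start = System.nanoTime();
        Iterator<T> results = search.get();
        long first = -1;
        int n = 0;
        while (results.hasNext()) {
            results.next();
            if (n == 0) {
                first = System.nanoTime() - start;
            }
            n++;
        }
        long all = System.nanoTime() - start;
        return new SearchTimingSketch(first, all, n);
    }
}
```

Running the same search twice in a row gives the cold-cache and warm-cache figures without GUI overhead in the numbers.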
List all entries
| Database | First Result - Cold Cache | First Result - Warm Cache | All Results - Cold Cache | All Results - Warm Cache |
|---|---|---|---|---|
| 738 entries | 11 sec. | immediately | 20 sec. | 4 sec. |
| 1161 entries | 6 sec. | immediately | 20 sec. | 5 sec. |
| iPlanet - 78 entries | immediately | 2-3 sec. |
Query All Attributes for an Entry
| Database | Cold | Warm |
|---|---|---|
| 738 entries | 2 sec. | 1 sec. |
| 1161 entries | 1 sec. | 1 sec. |
| iPlanet - 78 entries | 1 sec. |
NOTE: All times were measured with a stopwatch and Softerra LDAP Browser. For the warm cache, I believe that the time taken by the LDAP browser GUI skews the results.
