Sources

Defining a Source

A source describes the structure of the data held in a data source. Like a database table, a source has fields and primary keys. Sources are configured in PENROSE_HOME/conf/sources.xml.

<sources>

  <source name="u">

    <connection-name>MySQL</connection-name>

    <field name="username" primaryKey="true" />
    <field name="firstName" />
    <field name="lastName" />
    <field name="password" />

    <parameter>
      <param-name>tableName</param-name>
      <param-value>users</param-value>
    </parameter>

  </source>

</sources>

To define a source, specify the following:

Name

Specify the source name. Mappings refer to the source by this name.

Connection Name

Specify the connection name used by this source.

Fields

Specify the fields that will be accessed through this source. For JDBC sources, these are the table columns. For JNDI sources, these are the attribute types.

Parameters

Specify the connection-specific parameters. The available parameters are listed in the JDBC and LDAP sections below.

JDBC Source Parameters

Parameter   Description                           Required   Example
tableName   Database table name (Penrose 1.0)     Yes        users
catalog     Database catalog name (Penrose 1.1)   No         example
schema      Database schema name (Penrose 1.1)    No         system
table       Database table name (Penrose 1.1)     Yes        users
filter      Search filter                         No         lastName = 'Smith'
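
As an illustration, a Penrose 1.1 JDBC source that combines these parameters might look like the sketch below. It reuses the connection name, fields, and example values shown above; the source name "smiths" is hypothetical, and placing each parameter in its own parameter element is assumed from the tableName example earlier on this page.

```xml
<sources>

  <!-- Hypothetical source name; fields and connection reuse the earlier example -->
  <source name="smiths">

    <connection-name>MySQL</connection-name>

    <field name="username" primaryKey="true" />
    <field name="firstName" />
    <field name="lastName" />

    <!-- Penrose 1.1 uses "table" where Penrose 1.0 uses "tableName" -->
    <parameter>
      <param-name>catalog</param-name>
      <param-value>example</param-value>
    </parameter>
    <parameter>
      <param-name>table</param-name>
      <param-value>users</param-value>
    </parameter>
    <!-- Optional: restrict the source to matching rows -->
    <parameter>
      <param-name>filter</param-name>
      <param-value>lastName = 'Smith'</param-value>
    </parameter>

  </source>

</sources>
```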

LDAP Source Parameters

Parameter       Description                                                      Example
baseDn          Search base DN                                                   dc=penrose,dc=safehaus,dc=org
scope           Search scope                                                     OBJECT, ONELEVEL, or SUBTREE
filter          Search filter                                                    (objectClass=*)
objectClasses   Comma-separated list of object classes for newly added entries   person,organizationalPerson,inetOrgPerson
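
A minimal sketch of an LDAP source using these parameters follows. The source name "people", the connection name "LDAP", and the field names (uid, cn, sn) are assumptions made for illustration, as is the use of primaryKey on an attribute; only the parameter names and example values come from the table above.

```xml
<sources>

  <!-- Hypothetical source and connection names -->
  <source name="people">

    <connection-name>LDAP</connection-name>

    <!-- For LDAP sources, fields are attribute types (names assumed here) -->
    <field name="uid" primaryKey="true" />
    <field name="cn" />
    <field name="sn" />

    <parameter>
      <param-name>baseDn</param-name>
      <param-value>dc=penrose,dc=safehaus,dc=org</param-value>
    </parameter>
    <parameter>
      <param-name>scope</param-name>
      <param-value>ONELEVEL</param-value>
    </parameter>
    <parameter>
      <param-name>filter</param-name>
      <param-value>(objectClass=*)</param-value>
    </parameter>
    <!-- Object classes applied to newly added entries -->
    <parameter>
      <param-name>objectClasses</param-name>
      <param-value>person,organizationalPerson,inetOrgPerson</param-value>
    </parameter>

  </source>

</sources>
```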

Data Loading

There are two data loading mechanisms:

  • Load everything at once (default)
    This is faster for small databases, where the data can be loaded quickly into memory. The full data, including the primary keys, is loaded in one operation.
  • Search the primary keys first, then load as needed
    This is more scalable for larger databases. The data source is first queried for the primary keys; the full data is then loaded only for entries that are not already in the cache.

Data loading can be configured by adding the following parameters:

Parameter       Description      Valid Values             Default
sizeLimit       Size limit       integer                  100
loadingMethod   Loading method   loadAll, searchAndLoad   loadAll
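
For example, the "u" source from the beginning of this page could be switched to the second mechanism roughly as follows. This sketch assumes the loading parameters are set as ordinary source parameters, alongside tableName; the sizeLimit value of 500 is an arbitrary illustration.

```xml
<sources>

  <source name="u">

    <connection-name>MySQL</connection-name>

    <field name="username" primaryKey="true" />
    <field name="firstName" />
    <field name="lastName" />
    <field name="password" />

    <parameter>
      <param-name>tableName</param-name>
      <param-value>users</param-value>
    </parameter>

    <!-- Search the primary keys first, then load entries as needed -->
    <parameter>
      <param-name>loadingMethod</param-name>
      <param-value>searchAndLoad</param-value>
    </parameter>
    <!-- Illustrative value; the default is 100 -->
    <parameter>
      <param-name>sizeLimit</param-name>
      <param-value>500</param-value>
    </parameter>

  </source>

</sources>
```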

Cache

Each source has two caches:

  • Filter cache
    It stores the primary keys resulting from search operations.
  • Data cache
    It stores the data resulting from load operations.

When Penrose is about to search the data source using a search filter, it first checks the filter cache. If the filter is not in the cache, Penrose performs the search operation and stores the resulting primary keys in the cache.

When Penrose is about to load data from the data source using a set of primary keys, it first checks the data cache. If any of the requested entries have not been loaded, Penrose performs the load operation for the missing entries only and stores the results in the cache.

To configure the cache, add the following parameters:

Parameter               Description                            Valid Values   Default
filterCacheSize         Filter cache size                      integer > 0    100
filterCacheExpiration   Filter cache expiration (in minutes)   integer >= 0   5
dataCacheSize           Data cache size                        integer > 0    100
dataCacheExpiration     Data cache expiration (in minutes)     integer >= 0   5

To disable a cache, set its expiration to 0. All requests are then performed against the data source.
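
As a sketch, the cache parameters could be added to the "u" source as shown below, again assuming they are set as ordinary source parameters. The values are illustrative only: the filter cache is enlarged and kept for 10 minutes, while the data cache is disabled by setting its expiration to 0.

```xml
<sources>

  <source name="u">

    <connection-name>MySQL</connection-name>

    <field name="username" primaryKey="true" />
    <field name="firstName" />
    <field name="lastName" />
    <field name="password" />

    <parameter>
      <param-name>tableName</param-name>
      <param-value>users</param-value>
    </parameter>

    <!-- Filter cache: up to 500 entries, expiring after 10 minutes -->
    <parameter>
      <param-name>filterCacheSize</param-name>
      <param-value>500</param-value>
    </parameter>
    <parameter>
      <param-name>filterCacheExpiration</param-name>
      <param-value>10</param-value>
    </parameter>

    <!-- Data cache disabled: every load goes to the data source -->
    <parameter>
      <param-name>dataCacheExpiration</param-name>
      <param-value>0</param-value>
    </parameter>

  </source>

</sources>
```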