LothianProductionsCommon

Last updated 2nd December, 2008.

Overview

The LothianProductionsCommon (LPC) library is a simple and easy-to-master framework for solving common enterprise data-access issues that crop up when scaling .NET and J2EE applications.

The project has contributors across the world and is in use in a variety of .NET and Mono production environments. It has been used in both NASDAQ companies and small non-technical organisations.

It offers many valuable facilities, including functionality in these key areas:

Services | Description | Namespace
Database portability and abstraction | The ConnectionProvider provides a great many wrappers for core data retrieval calls, as well as a database-independent means of instantiating connections. It also provides a number of static SQL and command manipulation functions. | LothianProductions.Data
Auto numbering | The Autonumber class exposes an interface for incrementing a custom numeric index field in a given table. | LothianProductions.Data
Tiered abstraction layers | The Factory and related Broker classes provide for easy abstraction of an application across multiple tiers. | LothianProductions.Data
String, date, numeric manipulation | The StringHelper, DateHelper and MathHelper classes provide various helper functions. | LothianProductions.Util
Caching and pooling | The Cache and RelationshipCache classes provide for TTL-enabled detached caching of objects and relationships. | LothianProductions.Util.Cache
Daemon | The DaemonScheduler and EagerCacheDaemon implementation provide for the execution of regular, scheduled tasks and pre-emptive caching. | LothianProductions.Util.Daemon
Configuration | The AppSettingsHelper provides a mechanism for easily obtaining complex configuration data. | LothianProductions.Util.Settings
Templating | | LothianProductions.Util.Templating
Action framework | | LothianProductions.Actions

Currently only the C# .NET version of the LPC is supported.

Scope of this document

This document describes and explains the LothianProductionsCommon framework.

Glossary of terms

Technical terminology

Term | Definition
LPC | Abbreviation for the LothianProductionsCommon project.
Façade | A façade is an engineering pattern similar to an adapter. It allows components to be transparently switched, for instance to allow a service to use multiple data sources.
Model 1 | Simple "flat" template modelling. Does not afford distinct roles to components which direct, provide or transform the data being used.
MVC | Stands for model-view-controller, a factoring pattern that gives an application great flexibility. The model represents the application's underlying data model, the view represents the application's views on that data, and the controller represents the business logic that takes a user between different views of data and performs different operations on it. Refactoring an MVC application is often far cheaper than refactoring a non-MVC application. (See also PAC.)
Model 2 | An implementation of MVC factoring for templates.
webapp | "Web application".
UML | Unified modelling language, the standard visual communication language for software engineers.
Domain analysis | Modelling and understanding the domain for a given solution. The domain is the complete area for which the solution is being provided.
DAL | Data access layer. The part of an application that is specifically and solely focussed on reading and persisting data.
Component diagram | A diagram that is lower-level than a dataflow diagram, providing visibility to software or hardware components and implementation details.
Web Service | A Web Service is an XML and HTTP based interface which provides a simple consistent mechanism for working with data.
Agile | Agile development is a recent software engineering methodology. It provides for safer development than many of the older methodologies, and can deliver business value earlier. Extreme Programming is one of the main agile methodologies.

FAQ

What does a typical request lifecycle look like when using the action framework?

This diagram shows a typical request's lifecycle when running on the .NET platform.

Is there a flow-chart summarising the cache decision flow?

See here.

Why does the LPC differentiate between domain and relational data?

Differentiation between domain and relational data

The LothianProductionsCommon (LPC) framework provides highly scalable search and retrieval functionality. Key to this is the underlying differentiation between domain object data representing Cacheable objects and domain object relational data representing their relationships.

Need for flexible primary keys

Cacheable objects must have a simple, unique primary key (Id), and the framework assumes that in most cases this can be represented by a long integer. Domains where objects cannot be keyed by a numeric ID can be supported through use of the MathHelper's BaseDecToBaseN methods: if implemented at factory level, this allows for IDs to use a 37-byte alphanumeric base.
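The idea of mapping an alphanumeric key onto a numeric ID can be sketched as follows. This is an illustration in Python rather than the LPC's C# code: the function names `key_to_number` and `number_to_key` and the 36-symbol alphabet are assumptions made for the example, and the actual BaseDecToBaseN signatures are not described in this document.

```python
import string

# Hypothetical alphabet for the example: digits, then lower-case letters.
ALPHABET = string.digits + string.ascii_lowercase


def key_to_number(key: str, alphabet: str = ALPHABET) -> int:
    """Interpret an alphanumeric key as a number written in base len(alphabet)."""
    base = len(alphabet)
    value = 0
    for char in key:
        # str.index raises ValueError for characters outside the alphabet.
        value = value * base + alphabet.index(char)
    return value


def number_to_key(value: int, alphabet: str = ALPHABET) -> str:
    """Inverse of key_to_number for non-negative integers."""
    if value < 0:
        raise ValueError("value must be non-negative")
    base = len(alphabet)
    if value == 0:
        return alphabet[0]
    symbols = []
    while value:
        value, remainder = divmod(value, base)
        symbols.append(alphabet[remainder])
    return "".join(reversed(symbols))
```

With this alphabet, any key of up to 12 characters fits in a signed 64-bit integer. Keys that begin with the zero symbol do not round-trip, so a real scheme would need to fix the key length or reserve that symbol.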

Partial and complete cache-assisted domain object retrieval

Where an object's primary key is known, it can be fetched simply by using the appropriate broker's Get or Gets method. (Users should note that polymorphism is not exploited by the broker templates, as it cannot be modelled in WSDL when exposing brokers to SOAP requests.) The call retrieves the required objects by primary key and can pass through the cache on an object-by-object basis, optimising requests for groups of objects that are only partially cached.
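A minimal sketch of object-by-object cache pass-through is shown below, in Python for brevity. The `gets` function, the dictionary cache and the `load_many` callback are hypothetical stand-ins for the broker, the Cache class and the data provider; they are not the LPC's API.

```python
from typing import Callable, Dict, Iterable, List


def gets(ids: Iterable[int],
         cache: Dict[int, dict],
         load_many: Callable[[List[int]], Dict[int, dict]]) -> List[dict]:
    """Return the objects for ids, loading only those missing from the cache."""
    wanted = list(ids)
    # dict.fromkeys removes duplicate IDs while keeping their order.
    missing = [i for i in dict.fromkeys(wanted) if i not in cache]
    if missing:
        cache.update(load_many(missing))
    # IDs unknown to the data provider are skipped.
    return [cache[i] for i in wanted if i in cache]


# Usage: object #2 is already cached, so only #1 and #3 reach the loader.
store = {1: {"name": "a"}, 2: {"name": "b"}, 3: {"name": "c"}}
loader_calls = []


def load_many(missing: List[int]) -> Dict[int, dict]:
    loader_calls.append(list(missing))
    return {i: store[i] for i in missing if i in store}


cache = {2: store[2]}
objects = gets([1, 2, 3], cache, load_many)
```

A second call for any of the same IDs is then served entirely from the cache.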

Get and Gets calls are the only way to retrieve objects

This Get or Gets broker call is the only way to retrieve domain objects. It references the underlying application configuration to obtain the data manipulation language (DML) template. When using SQL, for instance, the DML may read as:

SELECT id, name, description, modified FROM test WHERE {id}

Scheme-based optimisation of Gets calls

The factories will generate valid DML from this template string. In the case of SQL queries, there are various layers of optimisation determined by the ConnectionScheme being used. Firstly, when using Gets to retrieve multiple domain objects from a database, the scheme will decide whether using OR or IN statements is most effective for the given database type.

SELECT id, name, colour FROM test WHERE id = 1 OR id = 2
SELECT id, name, colour FROM test WHERE id IN (1, 2)
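The substitution of the {id} placeholder can be sketched as below. This is a Python illustration under assumed names (`id_clause`, `render_gets` and the `style` argument are hypothetical); how a real ConnectionScheme chooses between the two forms is not described here.

```python
from typing import Sequence


def id_clause(ids: Sequence[int], style: str) -> str:
    """Build the predicate that replaces the {id} placeholder."""
    if not ids:
        raise ValueError("at least one id is required")
    # int() rejects anything that is not a number before it reaches the SQL.
    numbers = [str(int(i)) for i in ids]
    if style == "or":
        return " OR ".join(f"id = {n}" for n in numbers)
    if style == "in":
        return f"id IN ({', '.join(numbers)})"
    raise ValueError(f"unknown style: {style}")


def render_gets(template: str, ids: Sequence[int], style: str) -> str:
    """Turn a DML template into a complete statement for the given IDs."""
    return template.replace("{id}", id_clause(ids, style))


template = "SELECT id, name, colour FROM test WHERE {id}"
with_or = render_gets(template, [1, 2], "or")
with_in = render_gets(template, [1, 2], "in")
```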

The scheme will also split large Gets calls where required by the underlying data provider. For instance, if a Gets call is querying for 40,000 objects and the data provider only permits 15,000 bound variables per statement, the scheme will transparently split the query into three and merge the results. Additionally, if the data provider has a query length limit -- for example, of 2,000 characters per query -- the scheme can also transparently split and remerge the query.
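Splitting on a bound-variable limit amounts to batching the ID list and concatenating the results. The sketch below is a Python illustration; `split_ids`, `gets_in_batches` and the `run_statement` callback are hypothetical names, and splitting on query length is not covered.

```python
from typing import Callable, Iterator, List, Sequence


def split_ids(ids: Sequence[int], max_per_statement: int) -> Iterator[List[int]]:
    """Yield consecutive batches of at most max_per_statement IDs."""
    if max_per_statement < 1:
        raise ValueError("max_per_statement must be at least 1")
    for start in range(0, len(ids), max_per_statement):
        yield list(ids[start:start + max_per_statement])


def gets_in_batches(ids: Sequence[int],
                    max_per_statement: int,
                    run_statement: Callable[[List[int]], List[dict]]) -> List[dict]:
    """Run one statement per batch and merge the rows in request order."""
    merged: List[dict] = []
    for batch in split_ids(ids, max_per_statement):
        merged.extend(run_statement(batch))
    return merged


# Batch sizes for the figures used in the text above.
batch_sizes = [len(batch) for batch in split_ids(range(40000), 15000)]
```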

Why separate relationship resolution?

Consider the following queries run sequentially on a static dataset:

SELECT id, name, colour FROM test WHERE colour IN ('red', 'blue')
SELECT id, name, colour FROM test WHERE colour IN ('red')

Running and caching results of both could lead to an unnecessary connection and query being made, as well as the unnecessary transfer of data from data provider to application server, and the unnecessary use of cache space for what are duplicate results.

By splitting out the relational information in the queries, the LPC has an opportunity to cache data more effectively with minimal overhead. For instance, the following query will return only the primary key of the data to be retrieved.

SELECT id FROM test WHERE colour IN ('red', 'blue')
Returns: 1, 2, 3, 4, 5

These IDs can then be loaded if not cached, and cached appropriately.

SELECT id, name, colour FROM test WHERE id IN (1, 2, 3, 4, 5)

It is likely the next query will contain duplicate IDs.

SELECT id FROM test WHERE colour IN ('red')
Returns: 1, 2, 3

And in this case it will turn out that we have already retrieved and cached objects #1, #2 and #3. In this particular case we can see that the obviated Gets query does not look that costly -- but what if 99% of Gets would be looking for object #2, and what if each object has many large fields?

Managing relationships

Having established that domain objects can only be retrieved by a Get or Gets call, and that they can only be retrieved by a unique primary key, the next step is to define a search framework that resolves the relationship between terms such as "colour IN ('red', 'blue')" and a set of results such as "1, 2, 3, 4, 5". It is also clear that, as well as caching domain objects, it would be beneficial to cache relationship resolution queries and their answers.

In the LPC, SearchCriteria objects contain collections of SearchTerms, allowing users to build simple queries as above, where "colour IS red OR blue", and more complex queries with multiple sort orders, fuzzy matching and layers of Boolean logic. SearchCriteria objects are also used to key the cache of resolved relationship queries.
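Using criteria as a cache key requires that two criteria built separately from the same terms compare equal and hash alike. The Python sketch below shows one way to get that behaviour; the field names and class shapes are assumptions for illustration and do not reflect the members of the LPC's own SearchCriteria and SearchTerm classes.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class SearchTerm:
    field: str
    values: Tuple[str, ...]


@dataclass(frozen=True)
class SearchCriteria:
    terms: Tuple[SearchTerm, ...]
    order_by: Tuple[str, ...] = ()


# Frozen dataclasses are hashable, so criteria can key a dictionary directly.
relationship_cache: Dict[SearchCriteria, List[int]] = {}

first = SearchCriteria(terms=(SearchTerm("colour", ("red", "blue")),))
relationship_cache[first] = [1, 2, 3, 4, 5]

# An equal criteria object built later finds the same cached relationship.
second = SearchCriteria(terms=(SearchTerm("colour", ("red", "blue")),))
cached_ids = relationship_cache.get(second)

# A different query misses the cache.
red_only = SearchCriteria(terms=(SearchTerm("colour", ("red",)),))
uncached = relationship_cache.get(red_only)
```

Tuples are used instead of lists because the key must be immutable. Note that in this sketch the order of values matters, so ("red", "blue") and ("blue", "red") would be cached separately unless the terms are normalised first.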

When handed to the GetRelationship call, the criteria will be dynamically transformed and interpreted to suit the data provider being used. This allows for dynamic changes in data provider type, and for portability: the LPC can be moved from a MySQL back end to an Oracle one despite substantial SQL format differences, or even backed onto a set of XML files with a SQL to XQL transform.

GetRelationship will return a Relationship object, which provides a primitive ordered list of IDs and accompanying lists of relevance scores (used when the underlying query engine can provide such information) and modification timestamps, as well as the SearchCriteria used to generate the relationship. The modification timestamps are used by the caching mechanism to assist with object expiry: if the relationship query shows that an object has been modified since it was last cached, the cache entry for that particular object can be invalidated.
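Timestamp-driven invalidation can be sketched as follows, in Python. The `Relationship` shape, the `evict_stale` function and the cache layout (ID mapped to the modification timestamp seen at caching time plus the object) are assumptions made for the example, not the LPC's implementation.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, List, Tuple


@dataclass
class Relationship:
    ids: List[int]
    modified: List[datetime]  # one timestamp per ID, in the same order


def evict_stale(relationship: Relationship,
                cache: Dict[int, Tuple[datetime, dict]]) -> List[int]:
    """Drop cached objects that the relationship reports as modified since caching."""
    evicted = []
    for object_id, modified in zip(relationship.ids, relationship.modified):
        entry = cache.get(object_id)
        if entry is not None and entry[0] < modified:
            del cache[object_id]
            evicted.append(object_id)
    return evicted


old = datetime(2008, 1, 1)
new = datetime(2008, 6, 1)
cache = {1: (old, {"name": "a"}), 2: (new, {"name": "b"})}
relationship = Relationship(ids=[1, 2, 3], modified=[new, new, new])
evicted = evict_stale(relationship, cache)
```

Only object #1 was cached before its latest modification, so it alone is evicted; object #3 was never cached and is simply loaded on the next Gets call.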

Start to finish, retrieving domain objects from a query

  1. SearchCriteria is created and populated with SearchTerms.
    1. colour IN ('red', 'blue')
  2. SearchCriteria is handed to GetRelationship call in broker.
    1. RelationshipCache is searched for existing cached Relationship.
    2. DML is retrieved from application configuration ("lpc/dml/[Factory's name]:Criteria[Criteria's name]"): SELECT id, modified FROM test WHERE {criteria}
    3. Criteria phrase is translated appropriately: SELECT id, modified FROM test WHERE colour IN ('red', 'blue')
  3. Broker returns Relationship object.
    1. 1, 2, 3, 4, 5 (etc.)
    2. Relationship is cached in RelationshipCache.
  4. Relationship's IDs can be passed into broker's Get or Gets call to retrieve domain objects.
    1. Cache is searched for existing cached domain objects.
    2. DML is retrieved from application configuration ("lpc/dml/[Factory's name]:Gets"): SELECT id, name, colour FROM test WHERE id IN (1, 2, 3, 4, 5)
    3. Domain objects are cached in Cache.
  5. Cacheable objects are returned.
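The steps above can be walked through end to end with an in-memory SQLite database. This is a self-contained Python sketch, not LPC code: the factory name TestFactory, the criteria name ByColour, the sample rows, and the dictionary-based configuration and caches are all assumptions made for the example.

```python
import sqlite3

# Hypothetical stand-in for the application configuration.
CONFIG = {
    "lpc/dml/TestFactory:CriteriaByColour":
        "SELECT id, modified FROM test WHERE {criteria}",
    "lpc/dml/TestFactory:Gets":
        "SELECT id, name, colour FROM test WHERE {id}",
}

connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE test (id INTEGER PRIMARY KEY, name TEXT, colour TEXT, modified TEXT)")
connection.executemany("INSERT INTO test VALUES (?, ?, ?, ?)", [
    (1, "apple", "red", "2008-01-01"), (2, "cherry", "red", "2008-01-02"),
    (3, "brick", "red", "2008-01-03"), (4, "sky", "blue", "2008-01-04"),
    (5, "sea", "blue", "2008-01-05"), (6, "leaf", "green", "2008-01-06"),
])

relationship_cache = {}  # criteria key -> ordered list of IDs
object_cache = {}        # ID -> domain object
object_queries = []      # IDs requested by each Gets statement actually run


def get_relationship(colours):
    """Steps 2 and 3: resolve criteria to IDs, via the relationship cache."""
    key = tuple(sorted(colours))
    if key not in relationship_cache:
        marks = ", ".join("?" for _ in key)
        sql = CONFIG["lpc/dml/TestFactory:CriteriaByColour"].replace(
            "{criteria}", f"colour IN ({marks})")
        rows = connection.execute(sql + " ORDER BY id", key).fetchall()
        relationship_cache[key] = [row[0] for row in rows]
    return relationship_cache[key]


def gets(ids):
    """Steps 4 and 5: load only uncached objects, then return all of them."""
    missing = [i for i in ids if i not in object_cache]
    if missing:
        object_queries.append(missing)
        marks = ", ".join("?" for _ in missing)
        sql = CONFIG["lpc/dml/TestFactory:Gets"].replace("{id}", f"id IN ({marks})")
        for object_id, name, colour in connection.execute(sql, missing):
            object_cache[object_id] = {"id": object_id, "name": name, "colour": colour}
    return [object_cache[i] for i in ids]


red_and_blue = gets(get_relationship(["red", "blue"]))
red_only = gets(get_relationship(["red"]))
```

The second search still runs its lightweight relationship query, but every object it names is already cached, so no further Gets statement reaches the database.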

Changelog

2.5 (May 2008)

  • Lite DML autogeneration framework for improved ORM on Cacheable objects
  • Query interception and analysis for all DML statements; integrated reporting
  • New configuration mechanism for brokers, factories & caching; much simpler to manage
  • Rewritten HttpAction framework with regexp-based URL-space support
  • Primitive HTTP proxy server implementation
  • Enhanced custom handler support for configuration relocated over HTTP or into assemblies or resources

2.4 (October 2007)

  • Native HTTP/1.0 client & server implementation, including gzip & easy upload tools
  • Reusable licensing & activation framework
  • Audio, video & graphic manipulation libraries
  • Cluster support & load-balance mechanisms: remote cache manipulation
  • Enhanced monitoring & reporting tools, for stand-alone & cluster monitoring
  • More CMS functionality: address generator module for rendering SEO-friendly URLs
  • Some meta-search framework tools

2.3 (April 2007)

  • Integrated connection pooling (beneficial for non-pooling OleDb or Odbc implementations)
  • Improved caching of relocated configuration files
  • Re-write of templating code & introduction of conditional directive parser tags

2.2 (November 2006)

  • Combinatorial relationship functions
  • Seamless support for external configuration files in common .Config format
  • Robust error-handling HttpModule with email notification
  • Introduction of "{criteria}" & "{dependancy}" placeholders in .config
  • New ActionHttpHandlers supporting sessions: RoSessionActionHttpHandler, RwSessionActionHttpHandler
  • Deprecation of static datatype manipulation calls in ConnectionProvider; moved to schemes
  • Configurable ActionHttpHandler action types

2.1 (27th March 2006)

  • Introduction of new connection schemes
  • Support for native .NET 2.0 connection declaration

2.0 (24th November 2005)

  • .NET 2.0 support
  • .config properties refactored down into top-level "lpc/" configuration section

1.2 (26th April 2005)

  • Oracle sequencing support
  • Start of XmlHelper and broader XML support
  • Support for named datasources
  • Factored out Oracle and MySql specific code
  • Visual FoxPro support patches
  • Greater Web Service support
  • Cache optimisations

1.1 (29th October 2004)

  • Tracing and logging optimisations
  • Search framework improvements: IsNull and Like, ordering bug-fixes
  • Visual FoxPro support

1.0 (23rd February 2004)

  • Working domain object and relationship caching
  • Passes all NUnit tests

pre 1.0 (mid 2003)

  • 2003: Migration from J2EE.
  • 2002: Ported version becomes the backbone of sites for major .com. Optimisation takes page costs to sub-10ms on low-end SPARC devices.
  • 2001: The system is first conceived and implemented in Java. Prototype successfully employed by major .com to launch 300,000 page high-performance webapp.

Licensing & consultancy

Please contact us for details.

Contributors

  • Aidan Fitzpatrick
  • Andrew Dancy
  • Kevin Fisher
  • Magnus Solvåg
  • Trond Førde
  • David Santoro
  • Ken Fassone