Java Persistence/Advanced Topics

=Advanced Topics=

Events
An event is a hook into a system that allows the execution of some code when the event occurs. Events can be used to extend, integrate, debug, audit or monitor a system.

JPA defines several events for the persistent life-cycle of  objects. JPA events are defined through annotations or in the orm.xml. Any method of a persistent class can be annotated with an event annotation to be called for all instances of that class. An event listener can also be configured for a class using the annotation or   XML element. The specified listener class does not need to implement any interface (JPA does not use the Java event model), it only needs to annotate its methods with the desired event annotation.

JPA defines the following events:
 * - Invoked after an  is loaded into the persistence context, or after a   operation.
 * - Invoked before the  operation is invoked on an  .  Also invoked on   for new instances, and on cascade of a   operation.  The   of the object may not have been assigned, and code be assigned by the event.
 * - Invoked after a new instance is persisted to the database. This occurs during a   or   operation after the database   has occurred, but before the transaction is committed.  It does not occur during the   operation.  The   of the object should be assigned.
 * - Invoked before an instance is updated in the database. This occurs during a   or   operation after the database   has occurred, but before the transaction is committed.  It does not occur during the   operation.
 * - Invoked after an instance is updated in the database. This occurs during a   or   operation after the database   has occurred, but before the transaction is committed.  It does not occur during the   operation.
 * - Invoked before the  operation is invoked on an  .  Also invoked for cascade of a   operation.  It is also invoked during a   or   for   in JPA 2.0.
 * - Invoked after an instance is deleted from the database. This occurs during a   or   operation after the database   has occurred, but before the transaction is committed.  It does not occur during the   operation.

Default Entity Listeners
It is also possible to configure a default Entity listener. This listener will receive events for all of the Entity classes in the persistence unit. Default listeners can only be defined through XML.

If a default Entity listener is defined, and a class wants to define its own listener, or does not want the default listener, this can be disabled using the annotation or   XML element.

Events and Inheritance
Entity listeners are inherited. If a subclass does not wish to inherit a superclass Entity listener, then it must define the annotation or   XML element.

Events and Embeddables
JPA does not define any events for s.  Some JPA providers may allow defining events for   objects.

Extended Events
The JPA events are only defined for the Entity life-cycle. There are no EntityMangager events, or system level events.

Some JPA providers may provide additional events.


 * TopLink / EclipseLink : Provide an extended event mechanism. Additional   level events are defined through the   API.  A session level event mechanism is also provided through the   API.  The event objects also provide additional information including the database row and set of object changes.

Views
A database  is a virtual view of a table or query. Views are useful for hiding the complexity of a table, set of tables, or data-set.

In JPA you can map to a  the same as a table, using the   annotation. You can then map each column in the view to your object's attributes. Views are normally read-only, so object's mapping to views are normally also read-only. In most databases views can also be updatable depending on how complex to query is that they encapsulate. Even for complex queries database triggers can normally be used to update into the view.

Views can often be used in JPA to workaround mapping limitations. For example if a table join is not supported by JPA, or a database function is desired to be called to transform data, this can normally be done inside a view, so JPA can just map to the simplified data.

Using views does require database expertise, and the definition of views can be database dependent.

Interfaces
Interfaces can be used for two main purposes in a model. The first is as a public interface that defines the public API for the class. The second is as a common type that allows multiple distinct implementers.

Public Interfaces
If you have a public interface in JPA, you just need to map the implementation class and are fine for the most part. One issue is that you need to use the implementation class for queries, as JPA does not know about the interface. For JPQL the default alias is also the implementation class, but you could redefine this to be the public interface by setting the  of the to be the public interface.

Some JPA providers allow interfaces to be defined.


 * TopLink / EclipseLink : Support defining and querying on public interfaces using a  and the.

Example public interface alias
If you have a relationship that uses the public interface, instead of the implementation, then JPA will not know how to map this correctly. You can using the  attribute to define that the relationship is too the implementation class (see Target Entity).

Interface Types
If you have a common interfaces with multiple distinct implementers, this can have some issues. If you use the interface to define a variable relationship, then this is difficult to map. JPA has no direct support for interfaces or variable relationships. You could change the interface to be a  class, then use   inheritance to map it. Otherwise, you could split the relationship into multiple relationships, one per each implementer, or you could just remove the relationship and query for the related objects instead. Querying on an interface is also difficult, you would need to query on each implementer of the interface, and then union the results in memory.

Some JPA providers have support for mapping interfaces, and for variable relationships.


 * TopLink / EclipseLink : Support mapping interfaces using a  and a   and  .  Variable relationships can be defined using the   annotation or XML.

Stored Procedures
A stored procedure is a procedure or function that resides on the database. Stored procedures are typically written in some database specific language that is similar to SQL, such as PL/SQL on Oracle. Some databases such as Oracle also support stored procedures written in Java.

Stored procedures can be useful to perform batch data processing tasks. By writing the task in the database, it avoids the cost of sending the data to and from the database client, so can operate much more efficiently. Stored procedures can also be used to access database specific functionality that can only be accessed on the server. Stored procedures can also be used if strict security requirements as required, to avoid giving users access to the raw tables or unverified SQL operations. Some legacy application have also been written in database procedural languages, and need to be integrated with.

The disadvantages of using stored procedures is they are less flexible than using SQL, and require developing and maintaining functionality that is often written in a different language than the application developers may be used to, and difficult to develop and debug, and typically using a limited procedural programming language. There is also a general misconception that using stored procedures will improve performance, in that if you put the same SQL the application is executing inside a stored procedure it will somehow become faster. This is a false, and normally the opposite is true, as stored procedures restrict the dynamic ability of the persistence layer to optimize data retrieval. Stored procedures only improve performance when they use more optimal SQL than the application, typically when they perform an entire task on the database. To achieve optimal performance from SQL generated in an application you must use prepared statements - otherwise the database will have to create a new execution plan each time you submit a query.

JPA 2.1 StoredProcedureQuery
JPA 2.1 supports calling database stored procedures using the, and   annotation or   XML element. JPA supports both named stored procedures calls defined in meta-data and created through, and dynamic stored procedure calls created through.

is a JPA Query that provides additional API for setting the stored procedure parameters, and for accessing output parameters and multiple result sets. A  can return entity objects, or data, similar to native SQL queries. A  can be used to map the returned fields to the entity columns.

Some databases such as MySQL, SQL Server and Sybase support stored procedure that return result sets. Some databases such as Oracle do not, but do support output parameters that return cursors. If the stored procedure returns a result set, or a cursor output parameter,  or   can be used. If the stored procedure has multiple result sets, or multiple cursor output parameters, then calling  again will return the next result set or cursor output parameter. If the stored procedure does not return anything,  can be used. also defines an  API, that executes the procedure and returns a   indicating if a result set was returned. After execution, any of the stored procedures output parameters can be accessed using.

The stored procedure parameters are defined through the  annotation or   XML element, or for dynamic stored procedure calls through the   API. Parameters can be named or by index, define the corresponding Java type for the parameter, and define the parameter directional mode through. The mode can be one of,  ,  ,. A type of  can be used for ref cursors.

Stored Procedures in JPA 2.0
JPA 2.0 does not have any direct support for stored procedures. Some types of stored procedures can be executed in JPA through using native queries. Native queries in JPA allow any SQL that returns nothing, or returns a database result set to be executed. The syntax to execute a stored procedure depends on the database. JPA native SQL queries do not support stored procedures that use  or   parameters. Some databases such as DB2, Sybase and SQL Server allow for stored procedures to return result sets. Oracle does not allow results sets to be returned, only  parameters, but does define a   type that can be returned as an   parameter. Oracle also supports stored functions, that can return a single value. A stored function can normally be executed using a native SQL query by selecting the function value from the Oracle  table.

Some JPA providers have extended support for stored procedures, some also support overriding any CRUD operation for an  with a stored procedure or custom SQL. Some JPA providers have support for    parameters.


 * TopLink / EclipseLink : Support stored procedures and stored functions using the  annotations or XML, or the   classes.  Overriding any CRUD operation for a class or relationship are also supported using a   and the   class.  IN, OUT, INOUT, and     parameters are supported.

PL/SQL Stored Procedures
In Oracle stored procedures are typically written in Oracle's PL/SQL language. PL/SQL in Oracle supports some additional data-types, that Oracle does not support through SQL or JDBC. These include types such as,   and. Accessing these types or procedures is difficult from Java, as these types are not supported by JDBC. One workaround is to wrap the PL/SQL stored procedures with normal stored procedures that transform the PL/SQL types to standard SQL/JDBC types, such as,   , and. Some JPA providers have extended support for calling PL/SQL stored procedures.


 * TopLink / EclipseLink : Support PL/SQL stored procedures and functions using the  annotations or XML, or the   classes.

Structured Object-Relational Data Types
At the peak of object-oriented databases (OODBMS) many of the relational database vendors decided to add object-oriented concepts to relational data. These new hybrid databases were then called Object-Relational in that they could store both object and relational data. These object-relational data-types were standardized as part of SQL3 and support was added for them from Java in the JDBC 2.0 API. Although there was lots of hype around the new forms of data, object-relational data never caught on much, as people seemed to prefer their standard relational data. One would not normally recommend the use of object-relational data, as relational data is much more standard, but if dealing with really complex data, it may be something to investigate.

Some common object-relational database features include:
 * Object types (structures)
 * Arrays and array types
 * Nested tables
 * Inheritance
 * Object ids (OIDs)
 * Refs

Databases that support object-relational data include:
 * Oracle
 * DB2
 * PostgreSQL

The basic model allows you to define Structs or Object-types to represent your data, the structures can have nested structures, arrays of basic data or other structures, and refs to other structures. You can then store a structure in a normal relational table column, or create a special table to store the structures directly. Querying is basic SQL, with a few extensions to handle traversing the special types.

JPA does not support object-relational data-types, but some JPA providers may offer some support.


 * TopLink / EclipseLink : Support object-relational data-types through their  annotations and XML, or their   and mapping classes.  Custom support is also offered for Oracle spatial database JGeometry structures and other structured data-types using the   annotation or XML.

See also,
 * Complex data stored procedures (Blog)

XML Data Types
With the advent of XML databases, many relational database decided to add enhanced XML support. Although it was always possible to store XML in a relational database just using a  or   column, having the database aware of the XML data does have its advantages. The main advantage is databases that offer XML support allow querying of the XML data using XPath or XQuery syntax. Some databases also allow the XML data to be stored more efficiently than in Lob storage.

Databases with XML support include:
 * Oracle (XDB)
 * DB2
 * PostgreSQL

JPA has no extended support for XML data, although it is possible to store an XML String into the database, just mapped as a. Some JPA provider may offer extended XML data support. Such as query extensions, or allow mapping an XML DOM.

If you wish to map the XML data into objects, you could make use of the JAXB specification. You may even be able to integrate this with your JPA objects.


 * TopLink / EclipseLink : Support Oracle XDB XMLType columns using their .  XMLTypes can be mapped either as String or as an XML DOM (Document).  Query extensions are provided for XPath queries within   queries.  EclipseLink also includes a JAXB implementation for object-XML mapping.

Filters
Some times it is desirable to filter some of the contents of a table from all queries. This is normally because the table is shared, either by multiple types, applications, tenants, or districts, and the JPA application is only interested in a subset of the rows. It may also be that the table includes historical or archive rows that should be ignored by the JPA application.

JPA does not provide any specific support for filtering data, but there are some options available. Inheritance can be used to include a type check on the rows for a class. For example, if you had a  column in an   table, you could define an   and a   subclass whose discriminator   was , and always use   in the application. Similarly you could define an  subclass that used the   column as its class discriminator of value. Another solution is to use database views to filter the data and map the entities to the views.

These solutions do not work with dynamic filtering, where the filter criteria parameters are not known until runtime (such as tenant, or district). Also complex criteria cannot be modeled through inheritance, although database views should still work. One solution is to always append the criteria to any query, such as appending a JPQL string, or JPA Criteria in the application code.

Virtual Private Database (VPD) support may also provide a solution. Some databases such as Oracle support VPD, allow context based filtering of rows based on the connected user or proxy certificate.

Some JPA providers have specific support for filtering data.


 * TopLink / EclipseLink : Support filtering data through their  annotation and XML.  This allows an arbitrary JPQL fragment to be appended to all queries for the entity.  The fragment can contain parameters that can be set through persistence unit or context properties at runtime.  Oracle VPD is also supported, include Oracle proxy authentication and isolated data.

History
A common desire in database applications is to maintain a record and history of the database changes. This can be used for tracking and auditing purposes, or to allow undoing of changes, or to keep a record of the system's data over time. Many database have auditing functionality that allows some level of tracking changes made to the data. Some databases such as Oracle's Flashback feature allow the automatic tracking of history at the row level, and even allow querying on past versions of the data.

It is also possible for an application to maintain its own history through its data model. All that is required is to add a  and   timestamp column to the table. The current row is then the one in which the  timestamp is null. When a row is inserted its  timestamp will be set to the current time. When a row is updated, instead of updating the row a new row will be inserted with the same id and data, but a different  timestamp, and the old row will be updated to set its   timestamp to the current time. The primary key of the table will need to have the  timestamp added to it.

History can also be used to avoid deletion. Instead of deleting a row, the  timestamp can just be set to the current time. Another solution is to add a  boolean column to the table, and record when a row is deleted.

The history data could either be stored in-place in the altered table, or the table could be left to only contain the current version of the data, and a mirror history table could be added to store the data. In the mirror case, database triggers could be used to write to the history table. For the in-place case a database view could be used to give a view of the table as of the current time.

To query the current data from a history table, any query must include the clause where the  is. To query as of a point in time, the where clause must include where the point in time is between the  and   timestamps.

JPA does not define any specific history support.

Oracle flashback can be used with JPA, but any queries for historical data will need to use native SQL.

If a mirror history table is used with triggers, JPA can still be used to query to current data. A subclass or sibling class could also be mapped to the history table to allow querying of history data.

If a database view is used, the JPA could be used by mapping the  to the view.

If a history table is used JPA could still be used to map to the table, and a  and   attribute could be added to the object. Queries for the current data could append the current time to the query. Relationships are more difficult, as JPA requires relationships to be by primary key, and historical relationships would not be.

Some JPA providers have support for history.


 * TopLink / EclipseLink : Support Oracle flashback querying as well as application specific history. Historical queries can be defined using the query hint   or   queries.  Automatic tracking of history is also supported using the   API that supports maintaining and querying a mirror history table.

Auditing
See, Auditing and Security.

Replication
Data replication can be used to backup data, for fault tolerance and fail-over, or for load balancing and scaling the database.

For replication, changes are written to multiple databases, either by the application, JPA provider, or database back-end. For fail-over, if one of the databases goes down, the other can be used without loss of data or application downtime. For load-balancing, read requests can be load balanced across the replicated databases to reduce the load on each database, and improve the scalability of the application.

Most enterprise database support some form of automatic backup or replication. Clustered database such as Oracle RAC also allow for load balancing, fail-over and high availability. If your database supports replication or clustering, then it is normally transparent to JPA. A specialize DataSource (such as Oracle UCP, or WebLogic GridLink) may need to be used to handle load-balancing and fail-over.

JPA does not define any specific support for data replication, but some JPA provider provide replication support. If your database does not support replication, you can implement it yourself through having multiple persistence units, and persisting and merging your objects to both databases.


 * TopLink / EclipseLink : Support replication, load-balancing and fail-over. Replication and load balancing is supported through EclipseLink's partitioning support using the , and   annotations and XML.

Partitioning
Data partitioning can be used to scale an application across multiple database machines, or with a clustered database such as Oracle RAC.

Partitioning splits your data across each of the database nodes. There is horizontal partitioning, and vertical partitioning. Vertical partitioning is normally the easiest to implement. You can just put half of your classes on one database, and the other half on another. Ideally the two sets would be isolated from each other and not have any cross database relationships. This can be done directly in JPA, by having two different persistence units, one for each database.

For horizontal partitioning you need to split your data across multiple database nodes. Each database node will have the same tables, but each node's table will only store part of the data. You can partition the data by the data values, such as range partitioning, value partitioning, hash partitioning, or even round robin. JPA does not define any data partitioning support, so you either need to define a different class per partition, or use JPA vendor specific functionality.


 * TopLink / EclipseLink : Support both horizontal and vertical data partitioning. Hash, value, range, pinned and custom partitioning is supported at the Session, Entity, and Query level.  Partitioning is support through the , and   annotations and XML.

See also,
 * Data Partitioning - Scaling the Database (Blog)

NoSQL (and EIS, legacy, XML, and non-relational data)
See, NoSQL