Java Persistence/Relationships

=Relationships=
A relationship is a reference from one object to another. In Java, relationships are defined through object references (pointers) from a source object to the target object. Technically, in Java there is no difference between a relationship to another object and a "relationship" to a data attribute such as a String or Date (primitives are different), as both are pointers; however, logically and for the sake of persistence, data attributes are considered part of the object, and references to other persistent objects are considered relationships.

In a relational database, relationships are defined through foreign keys. The source row contains the primary key of the target row to define the relationship (and sometimes the inverse). A query must be performed to read the target objects of the relationship using the foreign key and primary key information.

In Java, if a relationship is to a collection of other objects, a Collection or array type is used to hold the contents of the relationship. In a relational database, collection relations are either defined by the target objects having a foreign key back to the source object's primary key, or by having an intermediate join table to store the relationship (both objects' primary keys).

All relationships in Java and JPA are unidirectional, in that if a source object references a target object there is no guarantee that the target object also has a relationship to the source object. This is different from a relational database, in which relationships are defined through foreign keys and querying, such that the inverse query always exists.

JPA Relationship Types

 * OneToOne - A unique reference from one object to another, inverse of a OneToOne.
 * ManyToOne - A reference from one object to another, inverse of a OneToMany.
 * OneToMany - A Collection or Map of objects, inverse of a ManyToOne.
 * ManyToMany - A Collection or Map of objects, inverse of a ManyToMany.
 * Embedded - A reference to an object that shares the same table as the parent.
 * ElementCollection - JPA 2.0, a Collection or Map of Basic or Embeddable objects, stored in a separate table.

This covers the majority of types of relationships that exist in most object models. Each type of relationship also covers multiple different implementations, such as OneToMany allowing either a join table or a foreign key in the target, and collection mappings also allow Collection types and Map types. There are also other possible complex relationship types, see Advanced Relationships.

Lazy Fetching
The cost of retrieving and building an object's relationships far exceeds the cost of selecting the object. This is especially true for relationships such as manager or managedEmployees: if any employee were selected, it would trigger the loading of every employee through the relationship hierarchy. Obviously this is a bad thing, and yet having relationships in objects is very desirable.

The solution to this issue is lazy fetching (lazy loading). Lazy fetching allows the fetching of a relationship to be deferred until it is accessed. This is important not only to avoid the database access, but also to avoid the cost of building the objects if they are not needed.

In JPA lazy fetching can be set on any relationship using the fetch attribute. The fetch can be set to either LAZY or EAGER as defined in the FetchType enum. The default fetch type is LAZY for all relationships except for OneToOne and ManyToOne, but in general it is a good idea to make every relationship LAZY. The EAGER default for OneToOne and ManyToOne is for implementation reasons (more difficult to implement), not because it is a good idea. Technically in JPA LAZY is just a hint, and a JPA provider is not required to support it; however, in reality all main JPA providers support it, and they would be pretty useless if they did not.

Magic
Lazy fetching normally involves some sort of magic in the JPA provider to transparently fault in the relationships as they are accessed. The typical magic for collection relationships is for the JPA provider to set the relationship to its own Collection, List, Set or Map implementation. When any (or most) method is accessed on this collection proxy, it loads the real collection and forwards the method. This is why JPA requires that all collection relationships use one of the collection interfaces (although some JPA providers support collection implementations too).
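The collection proxy idea can be illustrated without any JPA provider. The following is a minimal sketch, not any provider's actual class: LazyList and its loader are hypothetical names, and the Supplier stands in for the database query that a real provider would run on first access.

```java
import java.util.AbstractList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical stand-in for a provider's lazy collection implementation. */
class LazyList<E> extends AbstractList<E> {
    private final Supplier<List<E>> loader; // stands in for the database query
    private List<E> delegate;               // stays null until first access

    LazyList(Supplier<List<E>> loader) {
        this.loader = loader;
    }

    boolean isLoaded() {
        return delegate != null;
    }

    // Load the real collection on first use, then keep forwarding to it.
    private List<E> delegate() {
        if (delegate == null) {
            delegate = loader.get();
        }
        return delegate;
    }

    @Override
    public E get(int index) {
        return delegate().get(index);
    }

    @Override
    public int size() {
        return delegate().size();
    }
}

class LazyListDemo {
    public static void main(String[] args) {
        LazyList<String> phones =
            new LazyList<String>(() -> Arrays.asList("555-1234", "555-9876"));
        System.out.println(phones.isLoaded()); // false: nothing read yet
        System.out.println(phones.size());     // 2: triggers the load
        System.out.println(phones.isLoaded()); // true
    }
}
```

Because the field is typed as List rather than a concrete class, the entity cannot tell the proxy from an ordinary list, which is the reason for the interface requirement.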

For OneToOne and ManyToOne relationships the magic normally involves some sort of byte code manipulation of the entity class, or creation of a subclass. This allows the access to the field or get/set methods to be intercepted, and for the relationship to be retrieved before allowing access to the value. Some JPA providers use different methods, such as wrapping the reference in a proxy object, although this can have issues with null values and primitive methods. To perform the byte code magic, normally an agent or post-processor is required. Ensure that you correctly use your provider's agent or post-processor, otherwise lazy fetching may not work. You may also notice additional variables when in a debugger, but in general debugging will still work as normal.

Basics
A Basic attribute can also be made LAZY, but this is normally a different mechanism than lazy relationships, and should normally be avoided unless the attribute is rarely accessed.

See Basic Attributes : Lazy Fetching.

Serialization, and Detaching
A major issue with lazy relationships is ensuring that the relationship is still available after the object has been detached or serialized. For most JPA providers, after serialization any lazy relationship that was not instantiated will be broken, and either throw an error when accessed or return null.

The naive solution is to make every relationship eager. Serialization suffers from the same issue as persistence, in that you can very easily serialize your entire database if you have no lazy relationships. So lazy relationships are just as necessary for serialization as they are for database access; however, you need to ensure that everything you will need after serialization is instantiated up front. You may mark only the relationships that you think you will need after serialization as EAGER; this will work, but there are probably many cases when you do not need these relationships.

A second solution is to access any relationship you will need before returning the object for serialization. This has the advantage of being use case specific, so different use cases can instantiate different relationships. For collection relationships, sending size() is normally the best way to ensure a lazy relationship is instantiated. For OneToOne and ManyToOne relationships, normally just accessing the relationship is enough (e.g. employee.getAddress()), although for some JPA providers that use proxies you may need to send the object a message (e.g. employee.getAddress().hashCode()).

A third solution is to use the JPQL JOIN FETCH for the relationship when querying the objects. A join fetch will normally ensure the relationship has been instantiated. Some caution should be used with join fetch, however, as it can become inefficient if used on collection relationships, especially multiple collection relationships, as it requires an n^2 join on the database.

Some JPA providers may also provide certain query hints, or other such serialization options.

The same issue can occur without serialization, if a detached object is accessed after the end of the transaction. Some JPA providers allow lazy relationship access after the end of the transaction, or after the EntityManager has been closed; however, some do not. If your JPA provider does not, then you must ensure you have instantiated all the lazy relationships that you will need before ending the transaction.

Eager Join Fetching
One common misconception is that EAGER means that the relationship should be join fetched, i.e. retrieved in the same SQL SELECT statement as the source object. Some JPA providers do implement eager this way. However, just because something is desired to be loaded does not mean that it should be join fetched. Consider Employee - Phone: a Phone's employee reference is made EAGER as the employee is almost always loaded before the phone. However, when loading the phone, you do not want to join the employee; the employee has already been read and is already in the cache or persistence context. Also, just because you want two collection relationships loaded does not mean you want them join fetched, which would result in a very inefficient join that would return n^2 data.

Join fetching is something that JPA currently only provides through JPQL, which is normally the correct place for it, as each use case has different relationship requirements. Some JPA providers also provide a join fetch option at the mapping level to always join fetch a relationship, but this is normally not the same thing as EAGER. Join fetching is not normally the most efficient way to load a relationship anyway; batch reading a relationship is normally much more efficient when supported by your JPA provider.

See Join Fetching

See Batch Reading

Cascading
Relationship mappings have a cascade option that allows the relationship to be cascaded for common operations. Cascade is normally used to model dependent relationships, such as Order -> OrderLine. Cascading the orderLines relationship allows for the Order's OrderLines to be persisted, removed, and merged along with their parent.

The following operations can be cascaded, as defined in the CascadeType enum:
 * PERSIST - Cascades the EntityManager.persist() operation.  If persist() is called on the parent, and the child is also new, it will also be persisted.  If it is existing, nothing will occur, although calling persist() on an existing object will still cascade the persist operation to its dependents.  If you persist an object, and it is related to a new object, and the relationship does not cascade persist, then an exception will occur.  This may require that you first call persist on the related object before relating it to the parent.  In general it may seem odd not to cascade persist, and it may be desirable to always cascade it: if a new object is related to another object, then it should probably be persisted.  There is most likely not a major issue with always cascading persist on every relationship, although it may have an impact on performance.  Calling persist on a related object is not required; on commit any related object whose relationship is cascade persist will automatically be persisted.  The advantage of calling persist up front is that any generated ids will (unless using identity) be assigned, and the PrePersist event will be raised.
 * REMOVE - Cascades the EntityManager.remove() operation.  If remove() is called on the parent then the child will also be removed.  This should only be used for dependent relationships.  Note that only the remove() operation is cascaded; if you remove a dependent object from a OneToMany collection it will not be deleted, JPA requires that you explicitly call remove() on it.  Some JPA providers may support an option to have objects removed from a dependent collection deleted, and JPA 2.0 also defines an option for this.
 * MERGE - Cascades the EntityManager.merge() operation.  If merge() is called on the parent, then the child will also be merged.  This should normally be used for dependent relationships.  Note that this only affects the cascading of the merge; the relationship reference itself will always be merged.  This can be a major issue if you use transient variables to limit serialization; you may need to manually merge, or reset transient relationships in this case.  Some JPA providers provide additional merge operations.
 * REFRESH - Cascades the EntityManager.refresh() operation.  If refresh() is called on the parent then the child will also be refreshed.  This should normally be used for dependent relationships.  Be careful enabling this for all relationships, as it could cause changes made to other objects to be reset.
 * ALL - Cascades all the above operations.

Orphan Removal (JPA 2.0)
Cascading of the remove operation only occurs when remove is called on the object. This is not normally what is desired on a dependent relationship. If the related objects cannot exist without the source, then it is normally desired to have them deleted when the source is deleted, but also have them deleted when they are no longer referenced from the source. JPA 1.0 did not provide an option for this, so when a dependent object was removed from the source relationship, it had to be explicitly removed from the EntityManager. JPA 2.0 provides an orphanRemoval option on the OneToMany and OneToOne annotations and XML. Orphan removal will ensure that any object no longer referenced from the relationship is deleted from the database.
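As a mapping fragment, the cascade and orphan removal options might be combined as sketched below. It assumes the JPA 2.0 API (javax.persistence) is on the classpath; the Order and OrderLine classes, their attributes, and the ORDERS table name are illustrative assumptions, and each public class would live in its own source file.

```java
import java.util.ArrayList;
import java.util.List;
import javax.persistence.CascadeType;
import javax.persistence.Entity;
import javax.persistence.FetchType;
import javax.persistence.Id;
import javax.persistence.ManyToOne;
import javax.persistence.OneToMany;
import javax.persistence.Table;

// Order.java
@Entity
@Table(name = "ORDERS") // assumed name; ORDER is a reserved word in SQL
public class Order {
    @Id
    private long id;

    // Dependent relationship: persist, merge, refresh and remove cascade to the
    // lines, and a line taken out of the list is deleted (orphanRemoval).
    @OneToMany(mappedBy = "order", fetch = FetchType.LAZY,
               cascade = CascadeType.ALL, orphanRemoval = true)
    private List<OrderLine> orderLines = new ArrayList<>();
}

// OrderLine.java
@Entity
public class OrderLine {
    @Id
    private long id;

    // Back reference; explicitly LAZY because ManyToOne defaults to EAGER.
    @ManyToOne(fetch = FetchType.LAZY)
    private Order order;
}
```

Without orphanRemoval, removing a line from orderLines would leave its row in the database until remove() is called on it.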

Target Entity
Relationship mappings have a targetEntity attribute that allows the reference class (target) of the relationship to be specified. This is normally not required to be set, as it is defaulted from the field type, get method return type, or collection's generic type.

This can also be used if your field uses a public interface type, for example when the field is an interface but the mapping needs to be to the implementation class. Another usage is if your field is a superclass type, but you want to map the relationship to a subclass.

Collections
Collection mappings include OneToMany, ManyToMany, and in JPA 2.0 ElementCollection. JPA requires that the type of the collection field or get/set methods be one of the Java collection interfaces: Collection, List, Set, or Map.

Collection Implementations
Your field should not be of a collection implementation type, such as ArrayList. Some JPA providers may support using collection implementations, and many allow EAGER collection relationships to use the implementation class. You can set any implementation as the instance value of the collection, but when reading an object from the database, if the relationship is LAZY the JPA provider will normally put in a special lazy collection.

Duplicates
A List in Java supports duplicate entries, and a Set does not. In the database, duplicates are generally not supported. Technically it could be possible if a JoinTable is used, but JPA does not require duplicates to be supported, and most providers do not.

If you require duplicate support, you may need to create an object that represents and maps to the join table. This object would still require a unique Id, such as a GeneratedValue. See Mapping a Join Table with Additional Columns.

Ordering
JPA allows the collection values to be ordered by the database when retrieved. This is done through the @OrderBy annotation or <order-by> XML element.

The value of the OrderBy is a JPQL ORDER BY string. This can be an attribute name followed by ASC or DESC for ascending or descending ordering. You could also use a path or nested attribute, or a "," for multiple attributes. If no OrderBy value is given it is assumed to be the Id of the target object.

The OrderBy value must be a mapped attribute of the target object. If you want to have an ordered List you need to add an index attribute to your target object and an index column to its table. You will also have to ensure you set the index values. JPA 2.0 has extended support for an ordered List using an OrderColumn.

Note that using an OrderBy does not ensure the collection is ordered in memory. You are responsible for adding to the collection in the correct order. Java does define a SortedSet interface and a TreeSet collection implementation that does maintain an order. JPA does not specifically support SortedSet, but some JPA providers may allow you to use a SortedSet or TreeSet for your collection type, and maintain the correct ordering. By default these require your target object to implement the Comparable interface, or a Comparator to be set. You can also use the Collections.sort() method to sort a List when required. One option to sort in memory is to use property access and call Collections.sort() in your set and add methods.
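The in-memory sorting option can be sketched in plain Java as follows. The Employee and Phone classes and the areaCode ordering are illustrative assumptions, and the JPA mapping annotations are omitted so that the example runs on its own.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

class Phone {
    final String areaCode;
    final String number;

    Phone(String areaCode, String number) {
        this.areaCode = areaCode;
        this.number = number;
    }
}

class Employee {
    // Assumed ordering for the example: ascending area code.
    private static final Comparator<Phone> BY_AREA_CODE =
        Comparator.comparing((Phone p) -> p.areaCode);

    private List<Phone> phones = new ArrayList<>();

    public List<Phone> getPhones() {
        return phones;
    }

    // With property access the provider would call this when building the object.
    public void setPhones(List<Phone> phones) {
        this.phones = new ArrayList<>(phones);
        Collections.sort(this.phones, BY_AREA_CODE);
    }

    // Re-sort after every add so the in-memory order stays correct.
    public void addPhone(Phone phone) {
        phones.add(phone);
        Collections.sort(phones, BY_AREA_CODE);
    }
}
```

Note that sorting inside the set method touches the collection, so on a lazy relationship it would force the collection to load.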

Order Column (JPA 2.0)
JPA 2.0 adds support for an OrderColumn. An OrderColumn can be used to define an ordered List on any collection mapping. It is defined through the @OrderColumn annotation or <order-column> XML element.

The OrderColumn is maintained by the mapping and should not be an attribute of the target object. The table for the OrderColumn depends on the mapping. For a OneToMany mapping it will be in the target object's table. For a ManyToMany mapping or a OneToMany using a JoinTable it will be in the join table. For an ElementCollection mapping it will be in the target table.
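The two ordering options could be declared as in the following mapping fragment. It assumes the JPA 2.0 API (javax.persistence) is on the classpath; the entity classes, attribute names, and the NICKNAME_ORDER column name are illustrative assumptions, and each public class would live in its own source file.

```java
import java.util.ArrayList;
import java.util.List;
import javax.persistence.ElementCollection;
import javax.persistence.Entity;
import javax.persistence.Id;
import javax.persistence.ManyToOne;
import javax.persistence.OneToMany;
import javax.persistence.OrderBy;
import javax.persistence.OrderColumn;

// Employee.java
@Entity
public class Employee {
    @Id
    private long id;

    // Ordered by the database on read, using a mapped attribute of Phone.
    @OneToMany(mappedBy = "owner")
    @OrderBy("areaCode ASC")
    private List<Phone> phones = new ArrayList<>();

    // JPA 2.0: the list position is kept in a column maintained by the mapping.
    @ElementCollection
    @OrderColumn(name = "NICKNAME_ORDER") // assumed column name
    private List<String> nicknames = new ArrayList<>();
}

// Phone.java
@Entity
public class Phone {
    @Id
    private long id;

    private String areaCode;

    @ManyToOne
    private Employee owner;
}
```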

Example of a collection order column database
EMPLOYEE (table)

EMPLOYEE_PHONE (table)

PHONE(table)

Object corruption, one side of the relationship is not updated after updating the other side
A common problem with bi-directional relationships is the application updates one side of the relationship, but the other side does not get updated, and becomes out of sync. In JPA, as in Java in general, it is the responsibility of the application or the object model to maintain relationships. If your application adds to one side of a relationship, then it must add to the other side.

This is commonly resolved through add or set methods in the object model that handle both sides of the relationship, so the application code does not need to worry about it. There are two ways to go about this: you can either add the relationship maintenance code to only one side of the relationship, and only use the setter from that side (such as by making the other side protected), or add it to both sides and ensure you avoid an infinite loop.

For example:
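A minimal sketch of the second approach, with maintenance code on both sides. The Employee and Phone classes and their method names are illustrative, the JPA annotations are omitted so the example runs on its own, and the contains and identity checks are what stop the two methods from calling each other forever.

```java
import java.util.ArrayList;
import java.util.List;

class Employee {
    private final List<Phone> phones = new ArrayList<>();

    public List<Phone> getPhones() {
        return phones;
    }

    public void addPhone(Phone phone) {
        if (!phones.contains(phone)) {
            phones.add(phone);
        }
        // Only call back if the other side is not already pointing here.
        if (phone.getOwner() != this) {
            phone.setOwner(this);
        }
    }
}

class Phone {
    private Employee owner;

    public Employee getOwner() {
        return owner;
    }

    public void setOwner(Employee employee) {
        Employee previous = this.owner;
        this.owner = employee;
        // Keep the old owner's collection in sync as well.
        if (previous != null && previous != employee) {
            previous.getPhones().remove(this);
        }
        if (employee != null && !employee.getPhones().contains(this)) {
            employee.addPhone(this);
        }
    }
}
```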

The code is similar for bi-directional OneToOne and ManyToMany relationships.

Some expect the JPA provider to have magic that automatically maintains relationships. This was actually part of the EJB CMP 2 specification. However the issue is if the objects are detached or serialized to another VM, or new objects are related before being managed, or the object model is used outside the scope of JPA, then the magic is gone, and the application is left figuring things out, so in general it may be better to add the code to the object model. However some JPA providers do have support for automatically maintaining relationships.

In some cases it is undesirable to instantiate a large collection when adding a child object. One solution is to not map the bi-directional relationship, and instead query for it as required. Also some JPA providers optimize their lazy collection objects to handle this case, so you can still add to the collection without instantiating it.

Poor performance, excessive queries
The most common issue leading to poor performance is the usage of EAGER relationships. This requires the related objects to be read when the source objects are read. So, for example, reading the president of the company with EAGER managedEmployees will cause every Employee in the company to be read. The solution is to always make all relationships LAZY. By default OneToMany and ManyToMany are LAZY but OneToOne and ManyToOne are not, so make sure you configure them to be. See Lazy Fetching. Sometimes you have LAZY configured but it does not work, see Lazy is not working.

Another common problem is the n+1 issue. For example, consider that you read all Employee objects and then access their Address. Since each Address is accessed separately, this will cause n+1 queries, which can be a major performance problem. This can be solved through Join Fetching and Batch Reading.

Lazy is not working
Lazy OneToOne and ManyToOne relationships typically require some form of weaving or byte-code generation. Normally when running in JSE an agent option is required to allow the byte-code weaving, so ensure you have the agent configured correctly. Some JPA providers perform dynamic subclass generation, so do not require an agent.

Example agent
java -javaagent:eclipselink.jar ...

Some JPA providers also provide static weaving instead, or in addition to dynamic weaving. For static weaving some preprocessor must be run on your JPA classes.

When running in JEE lazy should normally work, as the class loader hook is required by the EJB specification. However some JEE providers may not support this, so static weaving may be required.

Also ensure that you are not accessing the relationship when you shouldn't be. For example if you use property access, and in your set method access the related lazy value, this will cause it to be loaded. Either remove the set method side-effects, or use field access.

Broken relationships after serialization
If your relationship is marked as LAZY and has not been instantiated before the object is serialized, then it may not get serialized. This may cause an error, or return null, if it is accessed after deserialization.

See, Serialization, and Detaching

Dependent object removed from OneToMany collection is not deleted
When you remove an object from a collection, if you also want the object deleted from the database you must call remove() on the object. In JPA 1.0, even if your relationship is cascade REMOVE, you still must call remove(); only the remove of the parent object is cascaded, not removal from the collection.

JPA 2.0 provides an option (orphan removal) for having removes from the collection trigger deletion. Some JPA providers support an option for this in JPA 1.0.

See, Cascading

My relationship target is an interface
If your relationship field's type is a public interface of your class, and it only has a single implementer, then this is simple to solve: you just need to set a targetEntity on your mapping. See Target Entity.

If your interface has multiple implementers, then this is more complex. JPA does not directly support mapping interfaces. One solution is to convert the interface to an abstract class and use inheritance to map it. You could also keep the interface, create the abstract class and make sure each implementer extends it, and set the targetEntity to be the abstract class.

Another solution is to define virtual attributes using get/set methods for each possible implementer, map these separately, and mark the interface get/set as transient. You could also not map the attribute, and instead query for it as required.

See, Variable and Heterogeneous Relationships

Some JPA providers have support for interfaces and variable relationships.


 * TopLink, EclipseLink : Support variable relationships through their @VariableOneToOne annotation and XML.  Mapping to and querying interfaces are also supported through their ClassDescriptor's InterfacePolicy API.

=Advanced=

JPA 2.0 Relationship Enhancements

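 * ElementCollection - A Collection or Map of Embeddable or Basic values.
 * Map Columns - A OneToMany or ManyToMany or ElementCollection that has a Basic, Embeddable or Entity key not part of the target object.
 * Order Columns - A OneToMany or ManyToMany or ElementCollection can now have an OrderColumn that defines the order of the collection when a List is used.
 * Unidirectional OneToMany - A OneToMany no longer requires the ManyToOne inverse relationship to be defined.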

Other Types of Relationships

 * Variable OneToOne, ManyToOne - A reference to an interface or common unmapped inheritance class that has multiple distinct implementers.
 * Variable OneToMany, ManyToMany - A Collection or Map of heterogeneous objects that share an interface or common unmapped inheritance class that has multiple distinct implementers.
 * Nested collection relationships, such as an array of arrays, List of Lists, or Map of Maps, or other such combinations.
 * Object-Relational Data Type - Relationships stored in the database using STRUCT, VARRAY, REF, or NESTEDTABLE types.
 * XML relationships - Relationships stored as XML documents.

Maps
Java defines the Map interface to represent collections whose values are indexed on a key. There are several Map implementations; the most common is HashMap, but there are also Hashtable and TreeMap.

JPA allows a Map to be used for any collection mapping including OneToMany, ManyToMany and ElementCollection. JPA requires that the Map interface be used as the attribute type, although some JPA providers may also support using Map implementations.

In JPA 1.0 the map key must be a mapped attribute of the collection values. The @MapKey annotation or <map-key> XML element is used to define a map relationship. If the MapKey is not specified it defaults to the target object's Id.

Map Key Columns (JPA 2.0)
JPA 2.0 allows for a Map where the key is not part of the target object to be persisted. The Map key can be any of the following:
 * A Basic value, stored in the target's table or join table.
 * An Embedded object, stored in the target's table or join table.
 * A foreign key to another Entity, stored in the target's table or join table.

Map columns can be used for any collection mapping including OneToMany, ManyToMany and ElementCollection.

This allows for great flexibility and complexity in the number of different models that can be mapped. The type of mapping used is always determined by the value of the Map, not the key. So if the key is a Basic but the value is an Entity, a OneToMany mapping is still used. But if the value is a Basic and the key is an Entity, an ElementCollection mapping is used.

This allows some very sophisticated database schemas to be mapped. For example, a three way join table can be mapped using a ManyToMany with a MapKeyJoinColumn for the third foreign key. For a ManyToMany the key is always stored in the JoinTable. For a OneToMany it is stored in the JoinTable if defined, otherwise it is stored in the target Entity's table, even though the target Entity does not map this column. For an ElementCollection the key is stored in the element's table.

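The @MapKeyColumn annotation or <map-key-column> XML element is used to define a map relationship where the key is a Basic value; the @MapKeyEnumerated and @MapKeyTemporal annotations can also be used with this for Enum or Calendar types. The @MapKeyJoinColumn annotation or <map-key-join-column> XML element is used to define a map relationship where the key is an Entity value; the @MapKeyJoinColumns annotation can also be used with this for composite foreign keys. The @MapKeyClass annotation or <map-key-class> XML element can be used when the key is an Embeddable or to specify the target class or type if generics are not used.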
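The difference between a JPA 1.0 map key and a JPA 2.0 map key column might look like the following mapping fragment. It assumes the JPA 2.0 API (javax.persistence) is on the classpath; the entity classes, attribute names, and column names are illustrative assumptions, and each public class would live in its own source file.

```java
import java.util.HashMap;
import java.util.Map;
import javax.persistence.Column;
import javax.persistence.ElementCollection;
import javax.persistence.Entity;
import javax.persistence.Id;
import javax.persistence.ManyToOne;
import javax.persistence.MapKey;
import javax.persistence.MapKeyColumn;
import javax.persistence.OneToMany;

// Employee.java
@Entity
public class Employee {
    @Id
    private long id;

    // JPA 1.0 style: the key is a mapped attribute of the target (Phone.type).
    @OneToMany(mappedBy = "owner")
    @MapKey(name = "type")
    private Map<String, Phone> phonesByType = new HashMap<>();

    // JPA 2.0 style: the key lives in its own column, not in the value.
    // The value is Basic, so this is an ElementCollection mapping.
    @ElementCollection
    @MapKeyColumn(name = "EMAIL_TYPE") // assumed column name
    @Column(name = "EMAIL")            // assumed column name
    private Map<String, String> emailsByType = new HashMap<>();
}

// Phone.java
@Entity
public class Phone {
    @Id
    private long id;

    private String type;

    @ManyToOne
    private Employee owner;
}
```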

Example of a map key column relationship database
EMPLOYEE (table)

PHONE(table)

Example of a map key join column relationship database
EMPLOYEE (table)

PHONE(table)

PHONETYPE(table)

Example of a map key class embedded relationship database
EMPLOYEE (table)

EMPLOYEE_PHONE (table)

PHONE (table)

Join Fetching
Join fetching is a query optimization technique for reading multiple objects in a single database query. It involves joining the two objects' tables in SQL and selecting both objects' data. Join fetching is commonly used for OneToOne relationships, but can also be used for any relationship including OneToMany and ManyToMany.

Join fetching is one solution to the classic ORM n+1 performance problem. The issue is that if you select n Employee objects, and access each of their addresses, in basic ORM (including JPA) you will get 1 database select for the Employee objects, and then n database selects, one for each Address object. Join fetching solves this issue by only requiring one select, and selecting both the Employee and its Address.

JPA supports join fetching through JPQL using the JOIN FETCH syntax.

Example of JPQL Join Fetch
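A minimal sketch of such a query, held as a JPQL string in a Java constant. The Employee entity and its address attribute are assumed names taken from the surrounding discussion, and EmployeeQueries is a hypothetical holder class; the string would be passed to EntityManager.createQuery().

```java
/** Hypothetical holder for JPQL strings; entity and attribute names are assumptions. */
final class EmployeeQueries {
    // Selects each Employee together with its Address in one SQL statement.
    static final String EMPLOYEES_WITH_ADDRESS =
        "SELECT e FROM Employee e JOIN FETCH e.address";

    private EmployeeQueries() {
    }
}
```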
This causes both the Employee and Address data to be selected in a single query.

Outer Joins
Using the JPQL JOIN FETCH syntax a normal INNER join is performed. This has the side effect of filtering any Employee from the result set that did not have an address. An OUTER join in SQL is one that does not filter absent rows on the join, but instead joins a row of all null values. If your relationship allows null, or an empty collection for collection relationships, then you can use an OUTER join fetch; this is done in JPQL using the LEFT syntax.

Note that OUTER joins can be less efficient on some databases, so avoid using an OUTER join if it is not required.

Mapping Level Join Fetch and EAGER
JPA has no way to specify that a join fetch always be used for a relationship. Normally it is better to specify the join fetch at the query level, as some use cases may require the related objects, and other use cases may not. JPA does support an EAGER option on mappings, but this means that the relationship will be loaded, not that it will be joined. It may be desirable to mark all relationships as EAGER if everything is desired to be loaded, but join fetching everything in one huge select could result in an inefficient, overly complex, or invalid join on the database.

Some JPA providers do interpret EAGER as join fetch, so this may work on some JPA providers. Some JPA providers support a separate option for always join fetching a relationship.


 * TopLink, EclipseLink : Support a @JoinFetch annotation and XML on a mapping to define that the relationship always be join fetched.

Nested Joins
JPA 1.0 does not allow nested join fetches in JPQL, although this may be supported by some JPA providers. You can join fetch multiple relationships, but not nested relationships.

Duplicate Data and Huge Joins
One issue with join fetching is that duplicate data can be returned. For example, consider join fetching an Employee's phoneNumbers relationship. If each Employee has 3 Phone objects in its phoneNumbers collection, the join will bring back n*3 rows. As there are 3 phone rows for each employee row, the employee row will be duplicated 3 times. So you are reading more data than if you had selected the objects in n+1 queries. Normally the fact that you are executing fewer queries makes up for the fact that you may be reading duplicate data, but if you consider joining multiple collection relationships you can start to get back j*i duplicate data, which can start to become an issue. Even with ManyToOne relationships you can be selecting duplicate data. Consider join fetching an Employee's manager: if all or most employees have the same manager, you will end up selecting this manager's data many times. In this case you would be better off not using join fetch, and allowing a single query for the manager.
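The growth in joined rows can be checked with simple arithmetic. The sketch below is only an estimate under the simplifying assumption that every root object has exactly the same number of elements in each joined collection; JoinRowEstimate and its methods are hypothetical names.

```java
/** Back-of-envelope row counts; assumes every root has the same collection sizes. */
final class JoinRowEstimate {
    private JoinRowEstimate() {
    }

    /** Rows returned when all the given collections are join fetched in one query. */
    static long joinedRows(long roots, int... collectionSizes) {
        long rowsPerRoot = 1;
        for (int size : collectionSizes) {
            rowsPerRoot *= size; // each extra collection multiplies the rows
        }
        return roots * rowsPerRoot;
    }

    /** Distinct rows that actually exist: the roots plus every related row once. */
    static long distinctRows(long roots, int... collectionSizes) {
        long rows = roots;
        for (int size : collectionSizes) {
            rows += roots * size;
        }
        return rows;
    }

    public static void main(String[] args) {
        // 100 employees with 3 phones each: 300 joined rows, employee data repeated 3 times.
        System.out.println(joinedRows(100, 3));
        // Adding 4 projects each: 1200 joined rows for only 800 distinct rows.
        System.out.println(joinedRows(100, 3, 4));
        System.out.println(distinctRows(100, 3, 4));
    }
}
```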

If you start join fetching every relationship, you can start to get some pretty huge joins. This can sometimes be an issue for the database, especially with huge outer joins.

One alternative solution to join fetch that does not suffer from duplicate data is using Batch Fetching.

Batch Fetching
Batch fetching is a query optimization technique for reading multiple related objects in a finite set of database queries. It involves executing the query for the root objects as normal. But for the related objects the original query is joined with the query for the related objects, allowing all of the related objects to be read in a single database query. Batch fetching can be used for any type of relationship.

Batch fetching is one solution to the classic ORM n+1 performance problem. The issue is that if you select n Employee objects, and access each of their addresses, in basic ORM (including JPA) you will get 1 database select for the Employee objects, and then n database selects, one for each Address object. Batch fetching solves this issue by requiring one select for the Employee objects and one select for the Address objects.
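The difference in query counts can be sketched with an in-memory stand-in for the database. FakeDatabase and BatchFetchDemo are hypothetical names, and the batched lookup uses an IN-style query, which is only one possible way a provider might batch; no real JPA provider is involved.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** In-memory stand-in for a database that only counts the "queries" it serves. */
class FakeDatabase {
    final Map<Integer, String> addressById = new HashMap<>();
    int queries = 0;

    // Stands in for: SELECT ... FROM ADDRESS WHERE ID = ?
    String findAddress(int id) {
        queries++;
        return addressById.get(id);
    }

    // Stands in for: SELECT ... FROM ADDRESS WHERE ID IN (...)
    Map<Integer, String> findAddresses(List<Integer> ids) {
        queries++;
        Map<Integer, String> result = new HashMap<>();
        for (int id : ids) {
            result.put(id, addressById.get(id));
        }
        return result;
    }
}

class BatchFetchDemo {
    /** One query per related object: the "n" of the n+1 problem. */
    static int onePerObject(FakeDatabase db, List<Integer> addressIds) {
        int before = db.queries;
        for (int id : addressIds) {
            db.findAddress(id);
        }
        return db.queries - before;
    }

    /** One query for all related objects, however many roots were read. */
    static int batched(FakeDatabase db, List<Integer> addressIds) {
        int before = db.queries;
        db.findAddresses(addressIds);
        return db.queries - before;
    }
}
```

With five employees the first method issues five address queries and the second issues one, in addition to the single query for the employees themselves.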

Batch fetching is more optimal for reading collection relationships and multiple relationships as it does not require selecting duplicate data as in join fetching.

JPA does not support batch reading, but some JPA providers do.


 * TopLink, EclipseLink : Support a @BatchFetch annotation and XML element and an "eclipselink.batch" query hint to enable batch reading.  Three forms of batch fetching are supported: JOIN, EXISTS, and IN.

See also,
 * Batch fetching - optimizing object graph loading

Filtering, Complex Joins
Normally a relationship is based on a foreign key in the database, but on occasion it is also based on other conditions. Examples include an Employee having many Phones but also a single home phone (the one of its phones that has the "home" type), or a collection of "active" projects, or other such conditions.

JPA does not support mapping these types of relationships, as it only supports mappings defined by foreign keys, not based on other columns, constant values, or functions. Some JPA providers may support this. One workaround is to map the foreign key part of the relationship, then filter the results in the get/set methods of your object. You could also query for the results, instead of defining a relationship.
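The filtering workaround can be sketched in plain Java as follows. The Employee and Phone classes and the "home" type value are illustrative assumptions, and the JPA annotations are omitted so the example runs on its own; only the phones collection would be mapped.

```java
import java.util.ArrayList;
import java.util.List;

class Phone {
    final String type;
    final String number;

    Phone(String type, String number) {
        this.type = type;
        this.number = number;
    }
}

class Employee {
    // The only mapped relationship: every phone, related by foreign key.
    private final List<Phone> phones = new ArrayList<>();

    public List<Phone> getPhones() {
        return phones;
    }

    // Unmapped, derived relationship: the first phone of type "home", or null.
    public Phone getHomePhone() {
        for (Phone phone : phones) {
            if ("home".equals(phone.type)) {
                return phone;
            }
        }
        return null;
    }
}
```

The cost of this approach is that the whole phones collection is loaded just to find one element.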


 * TopLink, EclipseLink : Support filtering and complex relationships through several mechanisms. You can use a DescriptorCustomizer to define a selectionCriteria on any mapping using the Expression criteria API.  This allows for any condition to be applied including constants, functions, or complex joins.  You can also use a DescriptorCustomizer to define the SQL or define a StoredProcedureCall for the mapping's selectionQuery.

Variable and Heterogeneous Relationships
It is sometimes desirable to define a relationship where the type of the relationship can be one of several unrelated, heterogeneous values. This could be a OneToOne, ManyToOne, OneToMany or ManyToMany relationship. The related values may share a common interface, or may share nothing in common other than subclassing. It is also possible to conceive of a relationship that could be any Basic value, or even an Embeddable.

In general JPA does not support variable, interface, or heterogeneous relationships. JPA does support relationships to inheritance classes, so the easiest workaround is normally to define a common superclass for the related values.

Another solution is to define virtual attributes using get/set methods for each possible implementer, map these separately, and mark the heterogeneous get/set as transient. You could also not map the attribute, and instead query for it as required.

For heterogeneous Basic or Embeddable relationships one solution is to serialize the value to a binary field. You could also convert the value to a String representation that you can restore the value from, or store the value in two columns, one storing the String value and the other the class name or type.
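The two-column approach can be sketched as a small conversion helper. TypedValue is a hypothetical class, the set of supported types is an arbitrary choice for illustration, and in a real mapping the two fields would be persisted as two Basic columns.

```java
import java.math.BigDecimal;

/** Hypothetical holder for the two columns: the value as text, and its type name. */
final class TypedValue {
    final String text;     // would be stored in the value column
    final String typeName; // would be stored in the type column

    private TypedValue(String text, String typeName) {
        this.text = text;
        this.typeName = typeName;
    }

    /** Converts a supported value into its two-column form. */
    static TypedValue of(Object value) {
        return new TypedValue(String.valueOf(value), value.getClass().getName());
    }

    /** Rebuilds the original value from the stored text and type name. */
    Object restore() {
        switch (typeName) {
            case "java.lang.String":
                return text;
            case "java.lang.Integer":
                return Integer.valueOf(text);
            case "java.lang.Long":
                return Long.valueOf(text);
            case "java.lang.Boolean":
                return Boolean.valueOf(text);
            case "java.math.BigDecimal":
                return new BigDecimal(text);
            default:
                throw new IllegalArgumentException("Unsupported type: " + typeName);
        }
    }
}
```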

Some JPA providers have support for interfaces, variable relationships, and/or heterogeneous relationships.


 * TopLink, EclipseLink : Support variable relationships through their @VariableOneToOne annotation and XML.  Mapping to and querying interfaces are also supported through their ClassDescriptor's InterfacePolicy API.

Nested Collections, Maps and Matrices
It is somewhat common in an object model to have complex collection relationships such as a List of Lists (i.e. a matrix), or a Map of Maps, or a Map of Lists, and so on. Unfortunately these types of collections map very poorly to a relational database.

JPA does not support nested collection relationships, and normally it is best to change your object model to avoid them to make persistence and querying easier. One solution is to create an object that wraps the nested collection.

For example, suppose an Employee had a Map of Projects keyed by a String project-type, with each value a List of Projects. To map this, a new ProjectType class could be created to store the project-type and a OneToMany to Project.
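The wrapper object could be sketched in plain Java as follows. The class and method names are illustrative, and the JPA annotations are omitted so the example runs on its own; in a mapped model the Employee would hold a relationship to ProjectType, and ProjectType a OneToMany to Project.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class Project {
    final String name;

    Project(String name) {
        this.name = name;
    }
}

/** Wraps one entry of the former nested Map of Lists. */
class ProjectType {
    private final String type;
    private final List<Project> projects = new ArrayList<>();

    ProjectType(String type) {
        this.type = type;
    }

    String getType() {
        return type;
    }

    List<Project> getProjects() {
        return projects;
    }
}

class Employee {
    // A flat Map of wrapper objects replaces the nested collection.
    private final Map<String, ProjectType> projectTypes = new HashMap<>();

    public void addProject(String type, Project project) {
        projectTypes.computeIfAbsent(type, ProjectType::new).getProjects().add(project);
    }

    public List<Project> getProjects(String type) {
        ProjectType projectType = projectTypes.get(type);
        return projectType == null
            ? Collections.<Project>emptyList()
            : projectType.getProjects();
    }
}
```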

Example nested collection database
EMPLOYEE (table)

PROJECTTYPE (table)

PROJECTTYPE_PROJECT (table)
