Java Persistence/Persisting

=Persisting=

JPA uses the EntityManager API for runtime usage. The EntityManager represents the application session or dialog with the database. Each request, or each client, will use its own EntityManager to access the database. The EntityManager also represents a transaction context, and in a typical stateless model a new EntityManager is created for each transaction. In a stateful model, an EntityManager may match the lifecycle of a client's session.

The EntityManager provides an API for all required persistence operations. These include the following CRUD operations:
 * persist (INSERT)
 * merge (UPDATE)
 * remove (DELETE)
 * find (SELECT)

The EntityManager is an object-oriented API, so it does not map directly onto database SQL or DML operations. For example, to update an object, you just need to read the object, change its state through its set methods, and then call commit on the transaction. The EntityManager figures out which objects you changed and performs the correct updates to the database; there is no explicit update operation in JPA.
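
The read, change, commit pattern described above can be sketched as follows. This is a minimal illustration, not a definitive implementation: the two interfaces are stand-ins that mirror only the javax.persistence methods used here, so the sketch compiles without a JPA provider, and Employee and SalaryService are hypothetical names introduced for the example.

```java
// Stand-ins mirroring the javax.persistence methods used below, declared only so the
// sketch compiles on its own; real code imports javax.persistence.* instead.
interface EntityTransaction {
    void begin();
    void commit();
}

interface EntityManager {
    <T> T find(Class<T> entityClass, Object primaryKey);
    EntityTransaction getTransaction();
}

// Hypothetical entity; a real one would be annotated with @Entity and @Id.
class Employee {
    private final long id;
    private double salary;

    Employee(long id, double salary) {
        this.id = id;
        this.salary = salary;
    }

    long getId() { return id; }
    double getSalary() { return salary; }
    void setSalary(double salary) { this.salary = salary; }
}

class SalaryService {
    // Updates an employee without calling any update operation on the EntityManager.
    static void giveRaise(EntityManager em, long employeeId, double amount) {
        em.getTransaction().begin();
        Employee employee = em.find(Employee.class, employeeId); // managed object
        employee.setSalary(employee.getSalary() + amount);       // change state through a set method
        em.getTransaction().commit(); // the provider detects the change and writes the UPDATE
    }
}
```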

==Detached vs Managed==
JPA defines two main states for an object for a given persistence context, managed and detached.

A managed object is one that was read in the current persistence context (EntityManager/JTA transaction). A managed object is registered with the persistence context, and the persistence context will track changes to that object and maintain its object identity. If the same object is read again in the same persistence context, or traversed through another managed object's relationship, the same identical object will be returned. Calling persist on a new object will also make it become managed. Calling merge on a detached object will return the managed copy of the object. An object should never be managed by more than one persistence context. An object will be managed by its persistence context until the persistence context is cleared through clear, or the object is forced to be detached through detach. A removed object will no longer be managed after a flush or commit. On a close, all managed objects will become detached. In a JTA managed EntityManager all managed objects will be detached on any JTA commit or rollback.

A detached object is one that is not managed in the current persistence context. This could be an object read through a different persistence context, or an object that was cloned or serialized. A new object is also considered detached until persist is called on it. An object that was removed and flushed or committed will become detached. An object could be considered both managed in the context of one persistence context and detached in the context of another.

A managed object should only ever reference other managed objects, and a detached object should only reference other detached objects. Avoid relating or mixing detached and managed objects; this will normally lead to issues, as your application could access two copies of the same object, causing loss of changes or stale data. Incorrectly relating managed and detached objects is probably one of the most common issues users run into in JPA.

==Persist==
The EntityManager.persist() operation is used to insert a new object into the database. persist does not directly insert the object into the database: it just registers it as new in the persistence context (transaction). When the transaction is committed, or if the persistence context is flushed, then the object will be inserted into the database.

If the object uses a generated Id, the Id will normally be assigned to the object when persist is called, so persist can also be used to have an object's Id assigned. The one exception is if IDENTITY sequencing is used; in this case the Id is only assigned on commit or flush, because the database will only assign the Id on INSERT. If the object does not use a generated Id, you should normally assign its Id before calling persist.

The persist operation can only be called within a transaction; an exception will be thrown outside of a transaction. The persist operation is in-place, in that the object being persisted will become part of the persistence context. The state of the object at the point of the commit of the transaction will be persisted, not its state at the point of the persist call.
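
A minimal sketch of this behavior, under stated assumptions: the interfaces below are stand-ins that mirror only the javax.persistence methods used, so the example compiles without a JPA provider, and Employee and HiringService are hypothetical. It shows that persist only registers the object, and that a change made after persist but before commit is still part of what gets inserted.

```java
// Stand-ins mirroring the javax.persistence methods used below, declared only so the
// sketch compiles on its own; real code imports javax.persistence.* instead.
interface EntityTransaction {
    void begin();
    void commit();
}

interface EntityManager {
    void persist(Object entity);
    EntityTransaction getTransaction();
}

// Hypothetical entity; a real one would be annotated with @Entity and @Id.
class Employee {
    private final String name;
    private double salary;

    Employee(String name) { this.name = name; }

    String getName() { return name; }
    double getSalary() { return salary; }
    void setSalary(double salary) { this.salary = salary; }
}

class HiringService {
    static Employee hire(EntityManager em, String name, double salary) {
        em.getTransaction().begin();
        Employee employee = new Employee(name);
        em.persist(employee);         // registers the object as new; no INSERT is issued yet
        employee.setSalary(salary);   // still included: the state at commit is what is written
        em.getTransaction().commit(); // the INSERT happens here (or on an earlier flush)
        return employee;              // the same instance is now the managed object
    }
}
```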

persist should normally only be called on new objects. It is allowed to be called on existing objects if they are part of the persistence context; this is only for the purpose of cascading persist to any possible related new objects. If persist is called on an existing object that is not part of the persistence context, then an exception may be thrown, or an insert may be attempted and a database constraint error may occur, or, if no constraints are defined, duplicate data may be inserted.

persist can only be called on Entity objects, not on Embeddable objects, or collections, or non-persistent objects. Embeddable objects are automatically persisted as part of their owning Entity.

Calling persist is not always required. If you relate a new object to an existing object that is part of the persistence context, and the relationship is cascade persist, then it will be automatically inserted when the transaction is committed, or when the persistence context is flushed.

===Cascading Persist===
Calling persist on an object will also cascade the persist operation across any relationship that is marked as cascade persist. If a relationship is not cascade persist, and a related object is new, then an exception may be thrown if you do not first call persist on the related object. Intuitively you may consider marking every relationship as cascade persist to avoid having to worry about calling persist on every object, but this can also lead to issues.

One issue with marking all relationships cascade persist is performance. On each persist call all of the related objects will need to be traversed and checked to see if they reference any new objects. This can actually lead to quadratic (n^2) performance issues if you mark all relationships cascade persist and persist a large new graph of objects. If you just call persist on the root object, this is ok. However, if you call persist on each object in the graph, then you will traverse the entire graph for each object in the graph, and this can lead to a major performance issue. The JPA spec should probably define persist to only apply to new objects, not those already part of the persistence context, but it requires that persist apply to all objects, whether new, existing, or already persisted, so it can have this issue.

A second issue is that if you remove an object to have it deleted, and then call persist on the object, it will resurrect the object, and it will become persistent again. This may be desired if it is intentional, but the JPA spec also requires this behavior for cascade persist. So if you remove an object, but forget to remove a reference to it from a cascade persist relationship, the remove will be ignored.

I would recommend only marking relationships that are composite or privately owned as cascade persist.

==Merge==
The EntityManager.merge() operation is used to merge the changes made to a detached object into the persistence context. merge does not directly update the object in the database; it merges the changes into the persistence context (transaction). When the transaction is committed, or if the persistence context is flushed, then the object will be updated in the database.

Normally merge is not required, although it is frequently misused. To update an object you simply need to read it, then change its state through its set methods, then commit the transaction. The EntityManager will figure out everything that has been changed and update the database. merge is only required when you have a detached copy of a persistent object. A detached object is one that was read through a different EntityManager (or in a different transaction in a JEE managed EntityManager), or one that was cloned, or serialized. A common case is a stateless SessionBean where the object is read in one transaction, then updated in another transaction. Since the update is processed in a different transaction, with a different EntityManager, it must first be merged. The merge operation will look up the managed object for the detached object, and copy each of the detached object's attributes that changed into the managed object, as well as cascading to any related objects marked as cascade merge.

The merge operation can only be called within a transaction; an exception will be thrown outside of a transaction. The merge operation is not in-place, in that the object being merged will never become part of the persistence context. Any further changes must be made to the managed object returned by the merge, not the detached object.
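
The point that merge is not in-place can be sketched as follows. This is an illustrative sketch only: the interfaces are stand-ins that mirror the javax.persistence methods used, so the example compiles without a JPA provider, and Employee and EmployeeUpdater are hypothetical names.

```java
// Stand-ins mirroring the javax.persistence methods used below, declared only so the
// sketch compiles on its own; real code imports javax.persistence.* instead.
interface EntityTransaction {
    void begin();
    void commit();
}

interface EntityManager {
    <T> T merge(T entity);
    EntityTransaction getTransaction();
}

// Hypothetical entity; a real one would be annotated with @Entity and @Id.
class Employee {
    private final long id;
    private double salary;

    Employee(long id, double salary) {
        this.id = id;
        this.salary = salary;
    }

    long getId() { return id; }
    double getSalary() { return salary; }
    void setSalary(double salary) { this.salary = salary; }
}

class EmployeeUpdater {
    // Merges a detached employee, then applies one more change to the managed copy.
    static Employee saveWithBonus(EntityManager em, Employee detached, double bonus) {
        em.getTransaction().begin();
        Employee managed = em.merge(detached);          // detached itself never becomes managed
        managed.setSalary(managed.getSalary() + bonus); // further changes go on the returned object
        em.getTransaction().commit();
        return managed;                                 // callers should keep using this instance
    }
}
```

Changing the detached argument after the merge call would have no effect on what is written at commit, which is why the method returns the managed instance.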

merge is normally called on existing objects, but can also be called on new objects. If the object is new, a new copy of the object will be made and registered with the persistence context; the detached object will not be persisted itself.

merge can only be called on Entity objects, not on Embeddable objects, or collections, or non-persistent objects. Embeddable objects are automatically merged as part of their owning Entity.

===Cascading Merge===
Calling merge on an object will also cascade the merge operation across any relationship that is marked as cascade merge. Even if the relationship is not cascade merge, the reference will still be merged. If the relationship is cascade merge, the relationship and each related object will be merged. Intuitively you may consider marking every relationship as cascade merge to avoid having to worry about calling merge on every object, but this is normally a bad idea.

One issue with marking all relationships cascade merge is performance. If you have an object with a lot of relationships, then each merge call can require traversing a large graph of objects.

Another issue arises if your detached object is corrupt in some way. For example, say you have an Employee who has a manager, but that manager holds a different copy of the detached Employee object among its managed employees. This may cause the same object to be merged twice, or at least it may not be consistent which object will be merged, so you may not get the changes you expect merged. The same is true if you didn't change an object at all, but some other user did: if merge cascades to this unchanged object, it will revert the other user's changes, or throw an OptimisticLockException (depending on your locking policy). This is normally not desirable.

I would recommend only marking relationships that are composite or privately owned as cascade merge.

===Transient Variables===
Another issue with merge is transient variables. Since merge is normally used with object serialization, if a relationship was marked as transient (Java transient, not JPA transient), then the detached object will contain null, and null will be merged into the object, even though it is not desired. This will occur even if the relationship was not cascade merge, as merge always merges the references to related objects. Normally transient is required when using serialization to avoid serializing the entire database when only a single object, or a small set of objects, is required.

One solution is to avoid marking anything transient, and instead use LAZY relationships in JPA to limit what is serialized (lazy relationships that have not been accessed will normally not be serialized). Another solution is to manually merge in your own code.

Some JPA providers provide extended merge operations, such as allowing a shallow merge or deep merge, or merging without merging references.

==Remove==
The EntityManager.remove() operation is used to delete an object from the database. remove does not directly delete the object from the database; it marks the object to be deleted in the persistence context (transaction). When the transaction is committed, or if the persistence context is flushed, then the object will be deleted from the database.

The remove operation can only be called within a transaction; an exception will be thrown outside of a transaction. The remove operation must be called on a managed object, not on a detached object. Generally you must first find the object before removing it, although it is possible to call EntityManager.getReference() on the object's Id and call remove on the reference. Depending on how your JPA provider optimizes getReference and remove, it may not require reading the object from the database.

remove can only be called on Entity objects, not on Embeddable objects, or collections, or non-persistent objects. Embeddable objects are automatically removed as part of their owning Entity.

===Cascading Remove===
Calling remove on an object will also cascade the remove operation across any relationship that is marked as cascade remove.

Note that cascade remove only affects the remove call. If you have a relationship that is cascade remove, and remove an object from the collection, or dereference an object, it will not be removed. You must explicitly call remove on the object to have it deleted. Some JPA providers provide an extension to provide this behavior, and JPA 2.0 adds an orphanRemoval option on OneToMany and OneToOne mappings to provide this.
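
As a sketch of what this means in application code when orphan removal is not in use: taking the object out of the collection and calling remove are two separate steps. The EntityManager interface below is a stand-in mirroring the one javax.persistence method used, and Employee, Phone, and PhoneService are hypothetical names for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in mirroring the javax.persistence method used below, declared only so the
// sketch compiles on its own; real code imports javax.persistence.EntityManager instead.
interface EntityManager {
    void remove(Object entity);
}

// Hypothetical entities; real ones would be annotated with @Entity, @Id and @OneToMany.
class Phone {
    final String number;

    Phone(String number) { this.number = number; }
}

class Employee {
    private final List<Phone> phones = new ArrayList<>();

    List<Phone> getPhones() { return phones; }
}

class PhoneService {
    // Assumes an active transaction and that both objects are managed.
    static void deletePhone(EntityManager em, Employee employee, Phone phone) {
        employee.getPhones().remove(phone); // keeps the object model consistent; deletes no row by itself
        em.remove(phone);                   // the explicit remove is what schedules the DELETE
    }
}
```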

===Reincarnation===
Normally an object that has been removed stays removed, but in some cases you may need to bring the object back to life. This normally occurs with natural ids, not generated ones, where a new object would always get a new id. Generally the desire to reincarnate an object arises from a bad object model design, normally the desire to change the class type of an object (which cannot be done in Java, so a new object must be created). Normally the best solution is to change your object model to have your object hold a type object which defines its type, instead of using inheritance. But sometimes reincarnation is desirable.

When done in two separate transactions, this is normally fine: first you remove the object, then you persist it back. This can be more complex if you wish to remove and persist an object with the same Id in the same transaction. If you call remove on an object, then call persist on the same object, it will simply no longer be removed. If you call remove on an object, then call persist on a different object with the same Id, the behavior may depend on your JPA provider, and probably will not work. If you call flush after calling remove, then call persist, then the object should be successfully reincarnated. Note that it will be a different row; the existing row will have been deleted, and a new row inserted. If you wish the same row to be updated, you may need to resort to using a native SQL update query.
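
The remove, flush, persist sequence for a single transaction can be sketched as below. The EntityManager interface is a stand-in mirroring only the javax.persistence methods used, and Reincarnation is a hypothetical helper; whether the sequence works as described still depends on the JPA provider.

```java
// Stand-in mirroring the javax.persistence methods used below, declared only so the
// sketch compiles on its own; real code imports javax.persistence.EntityManager instead.
interface EntityManager {
    void remove(Object entity);
    void persist(Object entity);
    void flush();
}

class Reincarnation {
    // Assumes an active transaction, that existing is managed, and that
    // replacementWithSameId is a new, different object carrying the same Id.
    static void replace(EntityManager em, Object existing, Object replacementWithSameId) {
        em.remove(existing);               // schedule the DELETE of the old row
        em.flush();                        // write the DELETE now, before the INSERT is registered
        em.persist(replacementWithSameId); // a new row with the same Id is inserted at commit
    }
}
```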

=Advanced=

==Refresh==
The EntityManager.refresh() operation is used to refresh an object's state from the database. This will revert any non-flushed changes made in the current transaction to the object, and refresh its state to what is currently defined in the database. If a flush has occurred, it will refresh to what was flushed. Refresh must be called on a managed object, so you may first need to find the object with the active EntityManager if you have a non-managed instance.

Refresh will cascade to any relationships marked cascade refresh, although it may be done lazily depending on your fetch type, so you may need to access the relationship to trigger the refresh. refresh can only be called on Entity objects, not on Embeddable objects, or collections, or non-persistent objects. Embeddable objects are automatically refreshed as part of their owning Entity.

Refresh can be used to revert changes, or if your JPA provider supports caching, it can be used to refresh stale cached data. Sometimes it is desirable to have a Query or find operation refresh the results. Unfortunately JPA 1.0 does not define how this can be done. Some JPA providers offer query hints to allow refreshing to be enabled on a query.


 * TopLink / EclipseLink : Define a query hint "eclipselink.refresh" to allow refreshing to be enabled on a query.

JPA 2.0 defines a set of standard query hints for refreshing, see JPA 2.0 Cache APIs.

==Lock==
See Read and Write Locking.

==Get Reference==
The EntityManager.getReference() operation is used to obtain a handle to an object without requiring it to be loaded. It is similar to the find operation, but may return a proxy or unfetched object. JPA does not require that getReference avoid loading the object, so some JPA providers may not support it and just perform a normal find operation. The object returned by getReference should appear to be a normal object; if you access any method or attribute other than its Id, it will trigger itself to be loaded from the database.

The intention of getReference is that it could be used on an insert or update operation as a stand-in for a related object, if you only have its Id and want to avoid loading the object. Note that getReference does not verify the existence of the object as find does. If the object does not exist and you try to use the unfetched object in an insert or update, you may get a foreign key constraint violation, or if you access the object it may trigger an exception.
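
A minimal sketch of that intended use, assuming only the Id of the related object is known. The EntityManager interface is a stand-in mirroring the javax.persistence methods used, and Department, Employee, and Staffing are hypothetical names for illustration.

```java
// Stand-in mirroring the javax.persistence methods used below, declared only so the
// sketch compiles on its own; real code imports javax.persistence.EntityManager instead.
interface EntityManager {
    <T> T getReference(Class<T> entityClass, Object primaryKey);
    void persist(Object entity);
}

// Hypothetical entities; real ones would be annotated with @Entity, @Id and @ManyToOne.
class Department {
    private final long id;

    Department(long id) { this.id = id; }

    long getId() { return id; }
}

class Employee {
    private final String name;
    private Department department;

    Employee(String name) { this.name = name; }

    String getName() { return name; }
    Department getDepartment() { return department; }
    void setDepartment(Department department) { this.department = department; }
}

class Staffing {
    // Assumes an active transaction. The department is never read explicitly.
    static Employee hireInto(EntityManager em, String name, long departmentId) {
        // May return an unfetched proxy; existence of the row is not verified here.
        Department department = em.getReference(Department.class, departmentId);
        Employee employee = new Employee(name);
        employee.setDepartment(department); // only the foreign key is needed for the INSERT
        em.persist(employee);
        return employee;
    }
}
```

If no department with that Id exists, the failure would typically surface later, for example as a foreign key constraint violation when the insert is written.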

==Flush==
The EntityManager.flush() operation can be used to write all changes to the database before the transaction is committed. By default JPA does not normally write changes to the database until the transaction is committed. This is normally desirable as it avoids database access, resources and locks until required. It also allows database writes to be ordered and batched for optimal database access, to maintain integrity constraints, and to avoid deadlocks. This means that when you call persist, merge, or remove, the corresponding database DML (INSERT, UPDATE, or DELETE) is not executed until commit, or until a flush is triggered.

The flush() does not execute the actual commit: the commit still happens when an explicit commit() is requested in the case of resource local transactions, or when a container managed (JTA) transaction completes.

Flush has several usages:
 * Flush changes before a query execution to enable the query to return new objects and changes made in the persistence context.
 * Insert persisted objects to ensure their Ids are assigned and accessible to the application if using IDENTITY sequencing.
 * Write all changes to the database to allow error handling of any database errors (useful when using JTA or SessionBeans).
 * To flush and clear a batch for batch processing in a single transaction.
 * Avoid constraint errors, or reincarnate an object.

==Clear==
The EntityManager.clear() operation can be used to clear the persistence context. This will clear all objects read, changed, persisted, or removed from the current EntityManager or transaction. Changes that have already been written to the database through flush, or any changes made to the database, will not be cleared. Any object that was read or persisted through the EntityManager is detached, meaning any changes made to it will not be tracked, and it should no longer be used unless merged into the new persistence context.

clear can be used similarly to a rollback to abandon changes and restart a persistence context. If a transaction commit fails, or a rollback is performed, the persistence context will automatically be cleared.

clear is similar to closing the EntityManager and creating a new one, the main difference being that clear can be called while a transaction is in progress. clear can also be used to free the objects and memory consumed by the EntityManager. It is important to note that an EntityManager is responsible for tracking and managing all objects read within its persistence context. In an application managed EntityManager this includes every object read since the EntityManager was created, including every transaction the EntityManager was used for. If a long lived EntityManager is used, this is an intrinsic memory leak, so calling clear or closing the EntityManager and creating a new one is an important application design consideration. For JTA managed EntityManagers the persistence context is automatically cleared across each JTA transaction boundary.

Clearing is also important on large batch jobs, even if they occur in a single transaction. The batch job can be split into smaller batches within the same transaction, and clear can be called in between each batch to prevent the persistence context from getting too big.
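
A batch insert split this way might look like the following sketch. The interfaces are stand-ins mirroring only the javax.persistence methods used, so the example compiles without a JPA provider; BatchInsert and the batchSize parameter are hypothetical, and a suitable batch size depends on the application.

```java
import java.util.List;

// Stand-ins mirroring the javax.persistence methods used below, declared only so the
// sketch compiles on its own; real code imports javax.persistence.* instead.
interface EntityTransaction {
    void begin();
    void commit();
}

interface EntityManager {
    void persist(Object entity);
    void flush();
    void clear();
    EntityTransaction getTransaction();
}

class BatchInsert {
    // Inserts all objects in one transaction, flushing and clearing after each full batch.
    static void insertAll(EntityManager em, List<?> newObjects, int batchSize) {
        em.getTransaction().begin();
        int count = 0;
        for (Object object : newObjects) {
            em.persist(object);
            count++;
            if (count % batchSize == 0) {
                em.flush(); // write the pending INSERTs so they are not lost by clear
                em.clear(); // detach everything to keep the persistence context small
            }
        }
        em.getTransaction().commit(); // writes any remaining objects from the last partial batch
    }
}
```

The flush must come before the clear: clear discards pending changes that have not yet been written.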

==Close==
The EntityManager.close() operation is used to release an application managed EntityManager's resources. JEE JTA managed EntityManagers cannot be closed, as they are managed by the JTA transaction and JEE server.

The life-cycle of an EntityManager can last either a transaction, a request, or a user's session. Typically the life-cycle is per request, and the EntityManager is closed at the end of the request. The objects obtained from an EntityManager become detached when the EntityManager is closed, and any LAZY relationships may no longer be accessible if they were not accessed before the EntityManager was closed. Some JPA providers allow LAZY relationships to be accessed after close.

==Get Delegate==
The EntityManager.getDelegate() operation is used to access the JPA provider's EntityManager implementation class in a JEE managed EntityManager. A JEE managed EntityManager will be wrapped in a proxy EntityManager by the JEE server that forwards requests to the EntityManager active for the current JTA transaction. If a JPA provider specific API is desired, the getDelegate() API allows the JPA implementation to be accessed to call the API.

In JEE a managed EntityManager will typically create a new EntityManager per JTA transaction. Also the behavior is somewhat undefined outside of a JTA transaction context. Outside a JTA transaction context, a JEE managed EntityManager may create a new EntityManager per method, so getDelegate() may return a temporary EntityManager or even null. Another way to access the JPA implementation is through the EntityManagerFactory, which is typically not wrapped with a proxy, but may be in some servers.

In JPA 2.0 the getDelegate() API has been replaced by the unwrap() API, which is more generic.

==Unwrap (JPA 2.0)==
The EntityManager.unwrap() operation is used to access the JPA provider's EntityManager implementation class in a JEE managed EntityManager. A JEE managed EntityManager will be wrapped in a proxy EntityManager by the JEE server that forwards requests to the EntityManager active for the current JTA transaction. If a JPA provider specific API is desired, the unwrap() API allows the JPA implementation to be accessed to call the API.

In JEE a managed EntityManager will typically create a new EntityManager per JTA transaction. Also the behavior is somewhat undefined outside of a JTA transaction context. Outside a JTA transaction context, a JEE managed EntityManager may create a new EntityManager per method, so unwrap() may return a temporary EntityManager or even null. Another way to access the JPA implementation is through the EntityManagerFactory, which is typically not wrapped with a proxy, but may be in some servers.

===Example unwrap===
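The following is a hedged sketch of calling a provider specific API through unwrap. The EntityManager interface is a stand-in mirroring the JPA 2.0 unwrap method so the example compiles without a JPA provider; ProviderSession, its clearCache method, and CacheAdmin are hypothetical, and the class that can actually be unwrapped is defined by each JPA provider.

```java
// Stand-in mirroring the javax.persistence method used below, declared only so the
// sketch compiles on its own; real code imports javax.persistence.EntityManager instead.
interface EntityManager {
    <T> T unwrap(Class<T> cls);
}

// Hypothetical provider specific API; consult your provider for the real type to unwrap to.
interface ProviderSession {
    void clearCache();
}

class CacheAdmin {
    // Reaches past the server's proxy to the provider's own implementation object.
    static void clearProviderCache(EntityManager em) {
        ProviderSession session = em.unwrap(ProviderSession.class);
        session.clearCache(); // a call that the standard EntityManager API does not offer
    }
}
```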