Java Persistence/Querying

=Querying= Querying is a fundamental part of persistence. Being able to persist something is not very useful without being able to query it back. There are many querying languages and frameworks; the most common query language is SQL used in relational databases.

JPA provides several querying mechanisms:
 * JPQL : (BNF)
 * Criteria API
 * Native SQL Queries

JPA primarily uses the Java Persistence Querying Language (JPQL), which is based on the SQL language and evolved from the EJB Query Language (EJBQL). It basically provides the SQL syntax at the object level instead of at the data level. JPQL is similar in syntax to SQL and can be defined through its BNF definition.

JPA also provides the Criteria API that allows dynamic queries to be easily built using a Java API. The Criteria API mirrors the JPQL syntax, but provides Java API for each operation/function instead of using a separate query language.

JPA provides querying through the  interface, and the   and   annotations and the   and   XML elements.

Other querying languages and frameworks include:
 * SQL
 * EJBQL
 * JDOQL
 * EQL (EclipseLink Query Language)
 * Query By Example (QBE)
 * TopLink Expressions
 * Hibernate Criteria
 * Object Query Language (OQL)
 * Query DSL

Named Queries
There are two main types of queries in JPA, named queries and dynamic queries. A named query is used for a static query that will be used many times in the application. The advantage of a named query is that it can be defined once, in one place, and reused in the application. Most JPA providers also pre-parse/compile named queries, so they are more optimized than dynamic queries which typically must be parsed/compiled every time they are executed. Since named queries are part of the persistence meta-data they can also be optimized or overridden in the orm.xml without changing the application code.

Named queries are defined through the and  annotations, or   XML element. Named queries are accessed through the API, and executed through the  interface.

Named queries can be defined on any annotated class, but are typically defined on the  that they query for. The name of the named query must be unique for the entire persistence unit, they name is not local to the. In the orm.xml named queries can be defined either on the  or on any.

Named queries are typically parametrized, so they can be executed with different parameter values. Parameters are defined in JPQL using the  syntax for named parameters, or the   syntax for positional parameters.

A collection of query hints can also be provided to a named query. Query hints can be used to optimize or to provide special configuration to a query. Query hints are specific to the JPA provider. Query hints are defined through the annotation or   XML element.

Dynamic Queries
Dynamic queries are normally used when the query depends on the context. For example, depending on which items in the query form were filled in, the query may have different parameters. Dynamic queries are also useful for uncommon queries, or prototyping.

JPA provides two main options for dynamic queries, JPQL and the Criteria API.

Dynamic queries can use parameters, and query hints the same as named queries.

Dynamic queries are accessed through the API, and executed through the  interface.

Criteria API (JPA 2.0)
See Criteria API.

JPQL
See JPQL.

Parameters
Parameters are defined in JPQL using the  syntax, i.e.  .  The parameter values are set on the   using the  API.

Parameters can also be defined using the, mainly for native SQL queries. You can also use. These are positional parameters, not named parameters and are set using the  API. The  is the index of the parameter in the SQL. Positional parameters start a 1 (not 0). Some JPA providers also allow the  syntax for native queries.

For temporal parameters you can also pass the temporal type, depending on if you want the ,   or   from the value.

Parameters are normally basic values, but you can also reference objects if comparing on their, i.e.,   can take the   object as a parameter. The parameter values are always at the object level when comparing to a mapped attribute, for example if comparing a mapped  the   value is used, not the database value.

Parameters are always set on the, no matter what type of query it is (JPQL, Criteria, native SQL, NamedQuery).

Named Parameter:

Positional Parameter:

Query Results
Normally JPA queries return your persistent  objects. The returned objects will be managed by the persistent context and changes made to the objects will be tracked as part of the current transaction. In some cases, more complex queries can be built that just return data instead of  objects, or even perform update or deletion operations.

There are three methods to execute a, each returning different results:
 * 
 * 
 * 

returns a  of the results. This is normally a  of   objects, but could also be a list of data, or arrays.

This returns a  (List of Employee objects). The objects will be managed. This returns a  (List of String values). The data is not managed. This returns a  (List of object arrays each with two String values). The data is not managed. This returns a  (List of object arrays each with an Employee and Address objects). The objects will be managed. This returns a  (List of object arrays each with the row data). The data is not managed.

returns the results. This is normally an  object, but could also be data, or an object array. If the query returns nothing, an exception is thrown. This is unfortunate, as typically just returning  would be desired. Some JPA providers may have an option to return  instead of throwing an exception if nothing is returned. An exception is also thrown if the query returns more than just a single row. This is also unfortunate, as typically just returning the first result is desired. Some JPA providers may have an option to return the first result instead of throwing an exception, otherwise you need to call  and get the first element.

This returns an. The object will be managed. This returns a. The data is not managed. This returns an  (object array with two String values). The data is not managed. This returns an  (object array with an Employee and Address object). The objects will be managed. This returns an  (object array with the row data). The data is not managed.

returns the database row count. This can be used for    JPQL queries, or any native SQL (DML or DDL) query that does not return a result.

Joining, querying on a OneToMany relationship
To query all employees with a phone number in 613 area code a join is used.

JPQL:

Criteria:

Subselect, querying all of a to many relationship
To query all employees whose projects are all in trouble a subselect with a double negation is used.

JPQL:

Criteria:

Subselect, querying a to many relationship where all of the relations are in a list
To query all employees who are in all the list of projects a subselect with a count is used. First collect the project ids from the projects (using the objects may work as well). Then get the size of the list. Your query will need two arguments, one of the list of ids, and one for the size of the list.

JPQL:

Join fetch, read both employee and address in same query
To query all employees and their address a join fetch is used. This selects both the employee and address data in the same query. If the join fetch was not used, the employee address would still be available, but could cause a query for each employee for its address. This reduces n+1 queries to 1 query.

Join fetch:

Join fetch can also be used on collection relationships:

Outer joins can be used to avoid  and empty relationships from filtering the results:

You can also select multiple objects in a query, but note that this does not instantiate the relationship, so accessing the relationship could still trigger another query:

Inverse ManyToMany, all employees for a given project
To query all employees for a given project where the employee project relationship is a ManyToMany.

If the relationship is bi-directional you could use:

If it is uni-directional you could use:

or,

How to simulate casting to a subclass
To query all employees who have a large project with a budget greater than 1,000,000 where the employee only has a relationship to Project, not to the LargeProject subclass. JPA 1.0 JPQL does not define a cast operation (JPA 2.0 may define this), so querying on an attribute of a subclass is not obvious. This can be done indirectly however, if you add a secondary join to the subclass to the query.

How to select the first element in a collection
To query the employees first project for a particular employee. There are a few different ways to do this, some using straight JPQL, and some using the    API. If a JPA 2.0 indexed list is used to map the collection, then the  function can be used.

setMaxResults:

JPQL:

JPA 2.0:

How to order by the size of a collection
To query all employees ordered by the number of projects. There are a few different ways to do this, some end up using sub selects in SQL, and some use group by. Depending on your JPA provider and database you solution may be limited to one or the other.

Using SIZE function (uses sub-select in SQL)

Using SIZE function, also selects the size (uses group by)

Using GROUP BY

Using GROUP BY and alias

=Advanced=

Join Fetch and Query Optimization
There are several ways to optimize queries in JPA. The typical query performance issue is that an object is read first, then its related objects are read one by one. This can be optimized using  in JPQL, otherwise by query hints specific for each JPA provider.

See,
 * Join Fetching
 * Batch Fetching

Timeouts, Fetch Size and other JDBC Optimizations
There are several JDBC options that can be used when executing a query. These JDBC options are not exposed by JPA, but some JPA providers may support query hints for them.
 * Fetch size : Configures the number of rows to fetch from the database in each page. A larger fetch size is more efficient for large queries.
 * Timeout : Instructs the database to cancel the query if its execution takes too long.


 * EclipseLink/TopLink : Provide many query hints including:
 * "eclipselink.jdbc.fetch-size" - Fetch size.
 * "eclipselink.jdbc.timeout" - Timeout.
 * "eclipselink.read-only" - The objects returned from the query are not managed by the persistence context, and not tracked for changes.
 * "eclipselink.query-type" - Defines the native type of query to use for the query.
 * "eclipselink.sql.hint" - Allows an SQL hint to be included in the SQL for the query.
 * "eclipselink.jdbc.bind-parameters" - Specifies if parameter binding should be used or not, (is used by default).

Update and Delete Queries
JPQL also allows for  and   queries to be executed. This is not the recommend or normal way to modify objects in JPA. Normally in JPA you first read the object, then either modify it directly using its  methods to update it, or call the   method to delete it.

and  queries in JPQL are for performing batch updates or deletions. There allow a set of objects to be updated or deleted in a single query. These queries are useful for performing batch operations, or clearing test data.

and  queries have a   the same as   queries, and can use the same functions and operations, and traverse relationships and make use of sub selects. and  queries are executed using the   method, and return the row count from the database. Note that some caution should be used in execute these queries in an active persistence context, as the queries may effect the objects that have already been registered in the. Normally it is a good idea to  the   after executing the query, or to execute the query in a new   or transaction.

Flush Mode
Within a transaction context in JPA, changes made to the managed objects are normally not flushed (written) to the database until commit. So if a query were executed against the database directly, it would not see the changes made within the transaction, as these changes are only made in memory within the Java. This can cause issues if new objects have been persisted, or objects have been removed or changed, as the application may expect the query to return these results. Because of this JPA requires that the JPA provider performs a flush of all changes to the database before any query operation. This however can cause issues if the application is not expecting that a flush as a side effect of a query operation. If the application changes are not yet in a state to be flushed, a flush may not be desired. Flushing also can be expensive and causes the database transaction, and database locks and other resources to be held for the duration of the transaction, which can effect performance and concurrency.

JPA allows the flush mode for a query to be configured using the enum and the  API. The flush mode is either  the default which means flush before every query execution, or   which means only flush on commit. The flush mode can also be set on an  using the  API, to affect all queries executed with the. The API can be called directly on the   anytime that a flush is desired.

Some JPA providers also let the flush mode be configured through persistence unit properties, or offer alternatives to flushing, such as performing the query against the in memory objects.


 * TopLink / EclipseLink : Allow the auto flush to be disabled using the persistence unit property.

Pagination, Max/First Results
A common requirement is to allow the user to page through a large query result. Typically a web user is given the first page of n results after a query execution, and can click next to go to the next page, or previous to go back.

If you are not concerned about performance, or the results are not too big, the easiest way to implement this is to query all of the results, then access the sub-list from the result list to populate your page. However, you will then have to re-query the entire results on every page request.

One simple solution is to store the query results in a stateful  or an http session. This means the initial query make take a while, but paging will be fast. Some JPA providers also support the caching of query results, so you can cache the results in your JPA providers cache and just re-execute the query to obtain the cached results.

If the query result is quite large, then another solution may be required. JPA provides the  API,  to allow paging through a large query result. The  can also be used as a safeguard to avoid letting users execute queries that return too many objects.

How these query properties are implemented depends on the JPA provider and database. JDBC allows the  to be set, and most JDBC drivers support this, so it will normally work for most JPA providers and most databases. Support for  can be less guaranteed to be efficient, as it normally requires database specific SQL. There is no standard SQL for pagination, so whether if this is supported depends on your database, and your JPA providers support.

When performing pagination, it is also important to order the result. If the query does not order the result, then each subsequent query could potentially return the results in a different order, and give a different page. Also if rows are insert/deleted in between the queries, the results can be slightly different.

Example using firstResult, maxResults
An alternative to using  is to filter the first result in the where clause based on the order by and the value from the previous page.

Example using maxResults and order by
Another alternative is to only query the s, and store this result in a stateful   or an http session. Then query for the set of s for each page.

Example using Ids and IN
Pagination can also be used for server processes, or batch jobs. On the server, it is normally used to avoid using too much memory upfront, and allow processing each batch one at a time. Any of these techniques can be used, also some JPA providers support returning a database cursor for the query results that allows scrolling through the results.


 * TopLink / EclipseLink : Support streams and scrollable cursors through the query hints  and , and   and   classes.

Native SQL Queries
Typically queries in JPA are defined through JPQL. JPQL allows the queries to be defined in terms of the object model, instead of the data model. Since developers are programming in Java using the object model, this is normally more intuitive. This also allows for data abstraction and database schema and database platform independence. JPQL supports much of the SQL syntax, but some aspects of SQL, or specific database extensions or functions may not be possible through JPQL, so native SQL queries are sometimes required. Also some developers have more experience with SQL than JPQL, so may prefer to use SQL queries. Native queries can also be used for calling some types of stored procedures or executing DML or DDL operations.

Native queries are defined through the and  annotations, or   XML element. Native queries can also be defined dynamically using the API.

A native query can be for a query for instances of a class, a query for raw data, an update or DML or DDL operation, or a query for a complex query result. If the query is for a class, the  attribute of the query must be set. If the query result is complex, a Result Set Mapping can be used.

Native queries can be parameterized, so they can be executed with different parameter values. Parameters are defined in SQL using the  syntax for positional parameters, JPA does not require native queries support named parameters, but some JPA providers may. For positional parameter the position starts a 1 (not 0).

A collection of query hints can also be provided to a native query. Query hints can be used to optimize or to provide special configuration to a query. Query hints are specific to the JPA provider. Query hints are defined through the annotation or   XML element.

Result Set Mapping
When a native SQL query returns objects, the SQL must ensure it returns the correct data to build the  using the correct column names as specified in the mappings. If the SQL is more complex and returns different column names, or returns data for multiple objects then a must be used.

is a fairly complex annotation containing an array of, , and. This allows multiple  objects in combination with raw data, and non-mapped classes, to be returned. The  contains an array of, which can be used to map the alias name used in the SQL to the column name required by the mapping. This is required if you need to return two different instances of the same class, or if the SQL needs to alias the columns differently for some reason. Note that in the  the   is the name of the attribute in the object, not the column name in the mapping. This seems odd, because this would make mapping an  or composite id relationship not possible.

Normally it is easiest to either select raw data or a single object with native SQL queries, so s can normally be avoided, as they are quite complex. Also note that even if you select the  and its   with the SQL, these are two unrelated objects, the employee's address is not set, and may trigger a query if accessed unless a cache hit occurs. Some JPA providers may provide a query hint to allow join fetching to be used with native SQL queries.


 * TopLink / EclipseLink : Support join fetching with native SQL queries through the   query hint.

ConstructorResult (JPA 2.1)
JPA 2.1 defines a annotation to allow the returning of non-mapped classes from native SQL queries. The  is similar to the JPQL   operator that allows the calling of a class constructor passing in the raw data. The  has a   and   array of. The target class must define a constructor taking the same number of arguments and types as defined by the.

Stored Procedures
See Stored Procedures

Raw JDBC
It can sometimes be required to mix JDBC code with JPA code. This may be to access certain JDBC driver specific features, or to integrate with another application that uses JDBC instead of JPA.

If you just require a JDBC connection, you could access one from your JEE server's, or connect directly to   or a third party connection pool. If you need a JDBC connection in the same transaction context and your JPA application, you could use a JTA  for JPA and your JDBC access to have them share the same global transaction. If you are not using JEE, or not using JTA, then you may be able to access the JDBC connection directly from your JPA provider.

Some JPA providers provide an API to access a raw JDBC connection from their internal connection pool, or from their transaction context. In JPA 2.0 this API is somewhat standardized by the  API on.

To access a JDBC connection from an, some JPA 2.0 providers may support:

This connection could then be used for raw JDBC access. It normally should not be close when finished, as the connection is being used by the  and will be released when the   is closed or transaction committed.


 * TopLink / EclipseLink : Support unwrapping the JDBC.

/Runtime