5.4.1HAPI FHIR JPA Schema

This page is a work in progress. It is not yet comprehensive.

It contains a description of the tables within the HAPI FHIR JPA database. Note that columns are shown using Java datatypes as opposed to SQL datatypes, because the exact SQL datatype used will vary depending on the underlying database platform. The schema creation scripts can be used to determine the underlying column types.

5.4.2Background: Persistent IDs (PIDs)

The HAPI FHIR JPA schema relies heavily on the concept of internal persistent IDs on tables, using a Java type of Long (8-byte integer, which translates to an int8 or number(19) on various database platforms).

Many tables use an internal persistent ID as their primary key, allowing the flexibility for other more complex business identifiers to be changed and minimizing the amount of data consumed by foreign key relationships. These persistent ID columns are generally assigned using a dedicated database sequence on platforms which support sequences.

The persistent ID column is generally called PID in the database schema, although there are exceptions.

5.4.3HFJ_RESOURCE: Resource Master Table

The HFJ_RESOURCE table indicates a single resource of any type in the database. For example, the resource Patient/1 will have exactly one row in this table, representing all versions of the resource.

5.4.3.1Columns

Name	Datatype	Nullable	Description
PARTITION_ID	Integer	Nullable	This is the optional partition ID, if the resource is in a partition. See Partitioning.
PARTITION_DATE	Timestamp	Nullable	This is the optional partition date, if the resource is in a partition. See Partitioning.
RES_VER	Long		This is the current version ID of the resource. Will contain `1` when the resource is first created, `2` the first time it is updated, etc. This column is equivalent to the HFJ_RES_VER.RES_VER column, although it does not have a foreign-key dependency in order to allow selective expunge of versions when necessary. Not to be confused with RES_VERSION below.
RES_VERSION	String		This column contains the FHIR version associated with this resource, using a constant drawn from FhirVersionEnum. Not to be confused with RES_VER above.
RES_TYPE	String		Contains the resource type (e.g. `Patient`)
FHIR_ID	String		Contains the FHIR Resource id element. Either the PID, or the client-assigned id.
HASH_SHA256	Long		This column contains a SHA-256 hash of the current resource contents, exclusive of resource metadata. This is used in order to detect NO-OP writes to the resource.
RES_PUBLISHED	Timestamp		Contains the date that the first version of the resource was created.
RES_UPDATED	Timestamp		Contains the date that the most recent version of the resource was created.
RES_DELETED_AT	Timestamp	Nullable	If the most recent version of the resource is a delete, this contains the timestamp at which the resource was deleted. Otherwise, contains NULL.

5.4.4HFJ_RES_VER: Resource Versions and Contents

The HFJ_RES_VER table contains individual versions of a resource. If the resource Patient/1 has 3 versions, there will be 3 rows in this table.

The complete raw contents of the resource is stored in either the RES_TEXT or the RES_TEXT_VC column, using the encoding specified in the RES_ENCODING column.

5.4.4.1Columns

Name	Relationships	Datatype	Nullable	Description
PARTITION_ID		Integer	Nullable	This is the optional partition ID, if the resource is in a partition. See Partitioning.
PARTITION_DATE		Timestamp	Nullable	This is the optional partition date, if the resource is in a partition. See Partitioning.
PID	PK	Long		This is the row persistent ID.
RES_ID	FK to HFJ_RESOURCE	Long		This is the persistent ID of the resource being versioned.
RES_VER		Long		Contains the specific version (starting with 1) of the resource that this row corresponds to.
RES_ENCODING		String		Describes the encoding of the resource being used to store this resource in RES_TEXT. See Encodings below for allowable values.
RES_TEXT		byte[] (SQL LOB)		Contains the actual full text of the resource being stored, stored in a binary LOB.
RES_TEXT_VC		String (SQL VARCHAR2)		Contains the actual full text of the resource being stored, stored in a textual VARCHAR2 column. Only one of `RES_TEXT` and `RES_TEXT_VC` will be populated for any given row. The other column in either case will be null.

5.4.4.2Encodings

Value	Description
JSONC	The resource is serialized using FHIR JSON encoding, and then compressed into a byte stream using GZIP compression.

5.4.4.3Columns

Name	Relationships	Datatype	Nullable	Description
PARTITION_ID		Integer	Nullable	This is the optional partition ID, if the resource is in a partition. See Partitioning.
PARTITION_DATE		Timestamp	Nullable	This is the optional partition date, if the resource is in a partition. See Partitioning.
PID	PK	Long		This is the row persistent ID.
RESOURCE_PID	FK to HFJ_RESOURCE	Long		This is the persistent ID of the resource being versioned.
FORCED_ID		String		Contains the specific version (starting with 1) of the resource that this row corresponds to.
RESOURCE_TYPE		String	Contains the string specifying the type of the resource (Patient, Observation, etc).

5.4.5HFJ_RES_LINK: Search Links

When a resource is created or updated, it is indexed for searching. Any search parameters of type Reference are resolved, and one or more rows may be created in the HFJ_RES_LINK table.

5.4.5.1Columns

Name	Datatype	Nullable	Description
PARTITION_ID	Integer	Nullable	This is the optional partition ID, if the resource is in a partition. See Partitioning. Note that the partition indicated by the PARTITION_ID and PARTITION_DATE columns refers to the partition of the SOURCE resource, and not necessarily the TARGET.
PARTITION_DATE	Timestamp	Nullable	This is the optional partition date, if the resource is in a partition. See Partitioning. Note that the partition indicated by the PARTITION_ID and PARTITION_DATE columns refers to the partition of the SOURCE resource, and not necessarily the TARGET.
PID	Long		Holds the persistent ID
SRC_PATH	String		Contains the FHIRPath expression within the source resource containing the path to the target resource, as supplied by the SearchParameter resource that defined the link.
SRC_RESOURCE_ID	Long		Contains a FK reference to the resource containing the link to the target resource.
TARGET_RESOURCE_ID	Long	Nullable	Contains a FK reference to the resource that is the target resource. Will not be populated if the link contains a reference to an external resource, or a canonical reference.
TARGET_RESOURCE_URL	String	Nullable	If this row contains a reference to an external resource or a canonical reference, this column will contain the absolute URL.
SP_UPDATED	Timestamp		Contains the last updated timestamp for this row.

5.4.6Background: Search Indexes

The HFJ_SPIDX (Search Parameter Index) tables are used to index resources for searching. When a resource is created or updated, a set of rows in these tables will be added. These are used for finding appropriate rows to return when performing FHIR searches. There are dedicated tables for supporting each of the non-reference FHIR Search Datatypes: Date, Number, Quantity, String, Token, and URI. Note that Reference search parameters are implemented using the HFJ_RES_LINK table above.

5.4.6.1Search Hashes

The SPIDX tables leverage "hash columns", which contain a hash of multiple columns in order to reduce index size and improve search performance. Hashes currently use the MurmurHash3_x64_128 hash algorithm, keeping only the first 64 bits in order to produce a LongInt value.

For example, all search index tables have columns for storing the search parameter name (SP_NAME) and resource type (RES_TYPE). An additional column which hashes these two values is provided, called HASH_IDENTITY.

In some configurations, the partition ID is also factored into the hashes.

5.4.6.2Tables

5.4.6.3Common Search Index Columns

The following columns are common to all HFJ_SPIDX_xxx tables.

Name	Relationships	Datatype	Nullable	Description
PARTITION_ID		Integer	Nullable	This is the optional partition ID, if the resource is in a partition. See Partitioning. Note that the partition indicated by the PARTITION_ID and PARTITION_DATE columns refers to the partition of the SOURCE resource, and not necessarily the TARGET.
PARTITION_DATE		Timestamp	Nullable	This is the optional partition date, if the resource is in a partition. See Partitioning. Note that the partition indicated by the PARTITION_ID and PARTITION_DATE columns refers to the partition of the SOURCE resource, and not necessarily the TARGET.
SP_ID		Long		Holds the persistent ID
RES_ID	FK to HFJ_RESOURCE	Long		Contains the PID of the resource being indexed.
SP_NAME		String	Nullable	This is the name of the search parameter being indexed.
RES_TYPE		String	Nullable	This is the name of the resource being indexed.
HASH_IDENTITY		Long		A hash of SP_NAME and RES_TYPE. Used to narrow the table to a specific SearchParameter during sorting, and some queries.
SP_UPDATED		Timestamp	Nullable	This is the time that this row was last updated.
SP_MISSING		boolean		If this row represents a search parameter that is not populated at all in the resource being indexed, this will be populated with the value `true`. Otherwise it will be populated with `false`.

5.4.7HFJ_SPIDX_DATE: Date Search Parameters

For any FHIR Search Parameter of type date that generates a database index, a row in the HFJ_SPIDX_DATE table will be created. Range queries with Date parameters (e.g. Observation?date=ge2020-01-01) will query the HASH_IDENTITY, SP_VALUE_LOW_DATE_ORDINAL and/or SP_VALUE_HIGH_DATE_ORDINAL columns. Range queries with DateTime parameters (e.g. Observation?date=ge2021-01-01T10:30:00) will query the HASH_IDENTITY, SP_VALUE_LOW and/or SP_VALUE_HIGH columns. Sorting is done by the SP_VALUE_LOW column.

5.4.7.1Columns

Note: This table has the columns listed below, but it also has all common columns listed above in Common Search Index Columns.

Name	Datatype	Nullable	Description
SP_VALUE_LOW	Timestamp	Nullable	This is the lower bound of the date in question. For a point in time date to millisecond precision (such as an Instant with a value of `2020-05-26T15:00:00.000`) this represents the exact value. For an instant value with lower precision, this represents the start of the possible range denoted by the value. For example, for a value of `2020-05-26` this represents `2020-05-26T00:00:00.000`. For a Period with a lower (start) value present, this column contains that value. For a Period with no lower (start) value present, this column contains a timestamp representing the "start of time".
SP_VALUE_HIGH	Timestamp	Nullable	This is the upper bound of the date in question. For a point in time date to millisecond precision (such as an Instant with a value of `2020-05-26T15:00:00.000`) this represents the exact value. For an instant value with lower precision, this represents the start of the possible range denoted by the value. For example, for a value of `2020-05-26` this represents `2020-05-26T23:59:59.999`. For a Period with an upper (end) value present, this column contains that value. For a Period with no upper (end) value present, this column contains a timestamp representing the "end of time".
SP_VALUE_LOW_DATE_ORDINAL	Integer	Nullable	This column contains the same Timestamp as `SP_VALUE_LOW`, but truncated to Date precision and formatted as an integer in the format "YYYYMMDD".
SP_VALUE_HIGH_DATE_ORDINAL	Integer	Nullable	This column contains the same Timestamp as `SP_VALUE_HIGH`, but truncated to Date precision and formatted as an integer in the format "YYYYMMDD".

5.4.8HFJ_SPIDX_NUMBER: Number Search Parameters

FHIR Search Parameters of type number produce rows in the HFJ_SPIDX_NUMBER table. Range queries and sorting use the HASH_IDENTITY and SP_VALUE columns.

5.4.8.1Columns

Note: This table has the columns listed below, but it also has all common columns listed above in Common Search Index Columns.

Name	Relationships	Datatype	Nullable	Description
SP_VALUE		Double	Not nullable	This is the value extracted by the SearchParameter expression.

5.4.9HFJ_SPIDX_QUANTITY: Quantity Search Parameters

FHIR Search Parameters of type quantity produce rows in the HFJ_SPIDX_QUANTITY table. Range queries (e.g. Observation?valueQuantity=gt100) with no units provided will query the HASH_IDENTITY and SP_VALUE columns. Range queries (e.g. Observation?valueQuantity=gt100||mmHg) with a unit but not unit-sytem provided will use the HASH_IDENTITY_AND_UNITS and SP_VALUE columns. Range queries (e.g. Observation?valueQuantity=gt100|http://unitsofmeasure.org|mmHg) with a full system and unit will use the HASH_IDENTITY_SYS_UNITS and SP_VALUE columns. Sorting is done via the HASH_IDENTITY and SP_VALUE columns.

5.4.9.1Columns

Note: This table has the columns listed below, but it also has all common columns listed above in Common Search Index Columns.

Name	Datatype	Description
HASH_IDENTITY_AND_UNITS	Long	A hash like HASH_IDENTITY that also includes the SP_UNITS column.
HASH_IDENTITY_SYS_UNITS	Long	A hash like HASH_IDENTITY that also includes the SP_SYSTEM and SP_UNITS columns.
SP_SYSTEM	String	The system of the quantity units. e.g. "http://unitsofmeasure.org".
SP_UNITS	String	The units of the quantity. E.g. "mg".
SP_VALUE	Double	This is the value extracted by the SearchParameter expression.

5.4.10HFJ_SPIDX_QUANTITY_NRML: Normalized Quantity Search Parameters

Hapi Fhir supports searching by normalized units when enabled (see https://hapifhir.io/hapi-fhir/apidocs/hapi-fhir-jpaserver-model/ca/uhn/fhir/jpa/model/entity/StorageSettings.html#getNormalizedQuantitySearchLevel()). When this feature is enabled, each row stored in HFJ_SPIDX_QUANTITY to also store a row in HFJ_SPIDX_QUANTITY_NRML in canonical UCUM units. E.g. a weight recorded in an Observation as

"valueQuantity" : {
    "value" : 172,
    "unit" : "lb_av",
    "system" : "http://unitsofmeasure.org",
    "code" : "[lb_av]"
  },

would match the search Observation?valueQuantity=172, but would also match the search Observation?valueQuantity=78|http://unitsofmeasure.org|kg. The row in HFJ_SPIDX_QUANTITY would contain the value 172 pounds, while the HFJ_SPIDX_QUANTITY_NRML table would hold the equivalent 78 kg value. Only value searches that provide fully qualified units are eligible for normalized searches. Sorting only uses the HFJ_SPIDX_QUANTITY table.

5.4.10.1Columns

Same as HFJ_SPIDX_QUANTITY above, except the SP_VALUE, SP_SYSTEM, and SP_UNITS columns hold the converted value in canonical units instead of the value extracted by the SearchParameter. This table is only used for range queries with a unit which can be converted to canonical units.

5.4.11HFJ_SPIDX_STRING: String Search Parameters

FHIR Search Parameters of type string produce rows in the HFJ_SPIDX_STRING table. The default string search matches by prefix, ignoring case or accents. This uses the HASH_IDENTITY column and a LIKE prefix clause on the SP_VALUE_NORMALIZED columns. The :exact string search matches exactly. This uses only the HASH_EXACT column. Sorting is done via the HASH_IDENTITY and SP_VALUE_NORMALIZED columns.

5.4.11.1Columns

Note: This table has the columns listed below, but it also has all common columns listed above in Common Search Index Columns.

Name	Datatype	Description
HASH_EXACT	Long	A hash like HASH_IDENTITY that also includes the SP_VALUE_EXACT column.
SP_VALUE_NORMALIZED	String	An UPPERCASE string with accents removed.
SP_VALUE_EXACT	String	The extracted string unchanged.

5.4.12HFJ_SPIDX_TOKEN: Token Search Parameters

FHIR Search Parameters of type token extract values of type Coding, code, and others. These produce rows in the HFJ_SPIDX_TOKEN table. The default token search accepts three parameter formats: matching the code (e.g. Observation?code=15074-8), matching both system and code (e.g. Observation?code=http://loinc.org|15074-8), or matching a system with any code (e.g. Observation?http://loinc.org|). All three are exact searches and use the hashes: HASH_VALUE, HASH_SYS_AND_VALUE, and HASH_SYS respectively. Sorting is done via the HASH_IDENTITY and SP_VALUE columns.

5.4.12.1Columns

Note: This table has the columns listed below, but it also has all common columns listed above in Common Search Index Columns.

Name	Datatype	Description
HASH_VALUE	Long	A hash like HASH_IDENTITY that also includes the SP_VALUE column.
HASH_SYS_AND_VALUE	Long	A hash like HASH_IDENTITY that also includes the SP_SYSTEM and SP_VALUE columns.
HASH_SYS	Long	A hash like HASH_IDENTITY that also includes the SP_SYSTEM column.
SP_SYSTEM	String	The system of the code.
SP_VALUE	String	This is the bare code value.

5.4.13HFJ_SPIDX_URI: URI Search Parameters

FHIR Search Parameters of type uri produce rows in the HFJ_SPIDX_URI table. The default uri search matches the complete uri. This uses the HASH_URI column for an exact match. A uri search with the :above modifier will match any prefix. This also uses the HASH_URI column, but also tests hashes of every prefix of the query value. A uri search with the :below modifier will match any extension. This query uses the HASH_IDENTITY and a LIKE prefix match of the SP_URI column. Sorting is done via the HASH_IDENTITY and SP_URI columns.

5.4.13.1Columns

Note: This table has the columns listed below, but it also has all common columns listed above in Common Search Index Columns.

Name	Relationships	Datatype	Nullable	Description
HASH_URI		Long		A hash like HASH_IDENTITY that also includes the SP_URI column.
SP_URI		String		The uri string extracted by the SearchParameter.

5.4.14HFJ_SPIDX_IDENTITY: Search Parameter Identities

The HFJ_SPIDX_IDENTITY table stores hash identities representing unique combinations of search parameter name and resource type of non-reference FHIR Search Datatypes: Date, Number, Quantity, String, Token, and URI.

While not currently used in search execution, this table provides metadata to interpret hash identities in the HFJ_SPIDX_xxx tables. When the server is configured to omit writing SP_NAME and RES_TYPE to the HFJ_SPIDX_xxx tables, the HFJ_SPIDX_IDENTITY table can be used to reconstruct the SP_NAME and RES_TYPE values for the stored hash identities. See Enabling Index Storage Optimization for more information.

5.4.14.1Columns

Name	Datatype	Description
SP_IDENTITY_ID	Integer	This is the persistent ID of the Search Parameter Identity.
HASH_IDENTITY	Long	A hash of SP_NAME and RES_TYPE. Used in the HFJ_SPIDX_xxx tables to narrow results to a specific SearchParameter during sorting and certain queries.
RES_TYPE	String	The FHIR resource type associated with this search parameter.
SP_NAME	String	The name of the FHIR search parameter associated with this identity.

5.4.15HFJ_IDX_CMB_TOK_NU: Combo Non-Unique Search Param

This table is used to index Non-Unique Combo Search Parameters.

5.4.15.1Columns

Name	Relationships	Datatype	Description
PID		Long	A unique persistent identifier for the given index row.
RES_ID	FK to HFJ_RESOURCE	Long	Contains the PID of the resource being indexed.
IDX_STRING		String	This column contains a FHIR search expression indicating what is being indexed. For example, if a non-unique combo search parameter is present which indexes a combination of Observation#code and Observation#status, this column might contain a value such as `Observation?code=http://loinc.org\|1234-5&status=final`
HASH_COMPLETE		Long	This column contains a hash of the value in column `IDX_STRING`.

5.4.16HFJ_IDX_CMP_STRING_UNIQ: Combo Unique Search Param

This table is used to index Unique Combo Search Parameters.

5.4.16.1Columns

Name	Relationships	Datatype	Description
PID		Long	A unique persistent identifier for the given index row.
RES_ID	FK to HFJ_RESOURCE	Long	Contains the PID of the resource being indexed.
IDX_STRING		String	This column contains a FHIR search expression indicating what is being indexed. For example, if a unique combo search parameter is present which indexes a combination of Observation#code and Observation#status, this column might contain a value such as `Observation?code=http://loinc.org\|1234-5&status=final`
HASH_COMPLETE		Long	This column contains a hash of the value in column `IDX_STRING`.
HASH_COMPLETE_2		Long	This column contains an additional hash of the value in column `IDX_STRING`, using a static salt of the value prior to the hashing. This is done in order to increase the number of bits used to hash the index string from 64 to 128.

5.3 Database Support 5.5 Configuration

Welcome to HAPI FHIR

Getting Started

Working With The FHIR Model

Client

Plain Server

JPA Server

JPA Server: MDM

JPA Server: Partitioning and Multitenancy

JPA Server: Batch Processing

JPA Server: HFQL (SQL) Driver

Clinical Reasoning

Interceptors

Security

Validation

Android

Tools

Contributing to HAPI FHIR

Appendix

5.4.1HAPI FHIR JPA Schema

5.4.2Background: Persistent IDs (PIDs)

5.4.3HFJ_RESOURCE: Resource Master Table

5.4.4HFJ_RES_VER: Resource Versions and Contents

5.4.5HFJ_RES_LINK: Search Links

5.4.6Background: Search Indexes

5.4.7HFJ_SPIDX_DATE: Date Search Parameters

5.4.8HFJ_SPIDX_NUMBER: Number Search Parameters

5.4.9HFJ_SPIDX_QUANTITY: Quantity Search Parameters

5.4.10HFJ_SPIDX_QUANTITY_NRML: Normalized Quantity Search Parameters

5.4.11HFJ_SPIDX_STRING: String Search Parameters

5.4.12HFJ_SPIDX_TOKEN: Token Search Parameters

5.4.13HFJ_SPIDX_URI: URI Search Parameters

5.4.14HFJ_SPIDX_IDENTITY: Search Parameter Identities

5.4.15HFJ_IDX_CMB_TOK_NU: Combo Non-Unique Search Param

5.4.16HFJ_IDX_CMP_STRING_UNIQ: Combo Unique Search Param