Mapping Array Column of postgres in Hibernate

Representing Array Column in HQL:

Hibernate does not support database arrays (e.g. ones mapped to java.sql.Array) out of the box.

To Support Array Column of postgres we need to implement custom UserType.

For suppose following is the table syntax which users varchar[] array column

CREATE TABLE posts (title varchar, tags varchar[]);
INSERT INTO posts VALUES ('apple', '{fruit,seasonal}');
INSERT INTO posts VALUES ('badam', '{dry fruit,healthy,tasty}');
INSERT INTO posts VALUES ('coriander', '{leafy vegetables,tasty}');

Following custom UserType (StringArrayType), supports varchar[] column in Postgres.

package com.varam.blog.hql;
import java.io.Serializable;
import java.sql.Array;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Types;

import org.hibernate.HibernateException;
import org.hibernate.engine.spi.SessionImplementor;
import org.hibernate.usertype.UserType;

public class StringArrayType implements UserType {

 private final int[] arrayTypes = new int[] { Types.ARRAY };

 @Override
 public int[] sqlTypes() {
 return arrayTypes;
 }

 @Override
 public Class<String[]> returnedClass() {
 return String[].class;
 }

 @Override
 public boolean equals(Object x, Object y) throws HibernateException {
 return x == null ? y == null : x.equals(y);
 }

 @Override
 public int hashCode(Object x) throws HibernateException {
 return x == null ? 0 : x.hashCode();
 }

 @Override
 public Object nullSafeGet(ResultSet rs, String[] names, SessionImplementor session, Object owner) throws HibernateException, SQLException {
 // get the first column names
 if (names != null && names.length > 0 && rs != null && rs.getArray(names[0]) != null) {
 String[] results = (String[]) rs.getArray(names[0]).getArray();
 return results;
 }
 return null;
 }

 @Override
 public void nullSafeSet(PreparedStatement st, Object value, int index, SessionImplementor session) throws HibernateException,
 SQLException {
 // setting the column with string array
 if (value != null && st != null) {
 String[] castObject = (String[]) value;
 Array array = session.connection().createArrayOf("text", castObject);
 st.setArray(index, array);
 } else {
 st.setNull(index, arrayTypes[0]);
 }
 }

 @Override
 public Object deepCopy(Object value) throws HibernateException {
 return value == null ? null : ((String[]) value).clone();
 }

 @Override
 public boolean isMutable() {
 return false;
 }

 @Override
 public Serializable disassemble(Object value) throws HibernateException {
 return (Serializable) value;
 }

 @Override
 public Object assemble(Serializable cached, Object owner) throws HibernateException {
 return cached;
 }

 @Override
 public Object replace(Object original, Object target, Object owner) throws HibernateException {
 return original;
 }
}

usage of custom array type in annotation based hibernate model

@Entity
@Table(name = "posts", schema = "public")

public class Posts extends IEntity {
@Column(unique=true, name = "title") private String title;
@Column(name = "accountid") 
@Type(type = "com.varam.blog.hql.StringArrayType")
private String[] tags; 

//Setters and getter methods   
}

This will also help the hbm2ddl to automatically generate the sql array columns in postgres.

create a column with array type in postgres

PostgreSQL Array Column Types

Every one of us remembers the moment when you first learned about databases and you were told that if you want to store multiple values in one column you have to split that in a 1:N relationship, because each column can only hold one value. Well, that’s not true in the world of PostgreSQL. You are free to create array columns with arbitrary length. You can even perform array-like queries on them.

But first we need to start by creating a table. We’ll do this with the most obvious example — file sharing among multiple user groups. in googledrive, dropbox… a file can be shared between multiple user groups, so that all the user in that particular user group can be able to see the file.

Let’s first create a new database so that we can play around and drop it at the end of out session, keeping our machine clean.

$ psql -U darth postgres
postgres=# CREATE DATABASE test TEMPLATE template0;
CREATE DATABASE
postgres=# \c test;

Defining array column types

The way we tell PostgreSQL that a column type is an array is by simply appending [] at the end of the type, now lets create posts  table with usergroups column as an array.

CREATE TABLE posts (
  id SERIAL PRIMARY KEY,
  filename VARCHAR(255) NOT NULL,
  username VARCHAR(255) NOT NULL,
  usergroups VARCHAR(255)[] DEFAULT '{}'
);
test=# \d posts
                                  Table "public.posts"
 Column    |           Type           |                     Modifiers
-----------+--------------------------+----------------------------------------------------
 id        | integer                  | not null default nextval('posts_id_seq'::regclass)
 filename  | character varying(255)   | not null
 username  | character varying(255)   | not null
 usergroups| character varying(255)[] | default '{}'::character varying[]
Indexes:
    "posts_pkey" PRIMARY KEY, btree (id)

In our example code we’re also defining a default value, in this case an empty array, which is defined as '{}' (the single quotes are important here). If we wanted a different default value, we can simply put it in the curly brackets '{thing}'. Note that these arrays columns can also be indexed and thus lighting fast:

CREATE INDEX idx_test on things USING GIN(usergroups);

Now that we know the syntax, we can actually start inserting some data.

test=# INSERT INTO posts (filename, username, usergroups) VALUES ('sample post.doc', 'vicky, '{dev,eng}');
INSERT 0 1

test=# INSERT INTO posts (filename, username, usergroups) VALUES ('another post.pdf', 'jhon', '{dev,qa}');
INSERT 0 1

test=# SELECT * FROM posts;
 id |    filename      |  username | usergroups
----+------------------+-----------+---------------
  1 | sample post.doc  | vicky     | {dev,eng}
  2 | another post.pdf | jhon      | {dev,qa}
(2 rows)

The advantage of doing this over just serializing the usergroups as a string (for example “dev,eng”) is that PostgreSQL knows it’s an array and it can perform operations on it. We can easily select posts with a single usergroup.

test=# SELECT title FROM posts WHERE 'dev' = ANY(usergroups);
    title
------------------
 sample post.doc
 another post.pdf
(2 rows)

We can also update a specific element of an array (notice that the indexing starts at 1 by default).

test=# UPDATE posts SET usergroups[2] = 'hr' WHERE id = 1;
UPDATE 1

test=# SELECT * FROM posts;
id |    filename      |  username | usergroups
----+------------------+-----------+---------------
  1 | sample post.doc  | vicky     | {dev,hr}
  2 | another post.pdf | jhon      | {dev,qa}
(2 rows)

There’s a lot operations that are supported right of the box, you can see the full list here.

get the count of each usergroup:

with the help of unnest() method,  we can n order to get the count of files that are present in each group, first we need to know the individual group names present in each row, we can obtain this with the help of unnest() method which expands an array into set of rows. The following query results us in the group name vs count of files.

test=# select unnest(usergroups) as ug, count(unnest(usergroups)) from posts
 group by ug;
 id |   ug   |  count
----+--------+--------
  1 | dev    |   2   
  2 | hr     |   1
  3 | qa     |   1 
(3 rows)

records that match multiple groups:

test=# SELECT * from posts where Array[‘qa’,’hr’]:: varchar[] && usergroups;
id |    filename       |  username | usergroups
----+------------------+-----------+---------------
  1 | sample post.doc  | vicky     | {dev,hr}
  2 | another post.pdf | jhon      | {dev,qa}
(2 rows)