Goldman Sachs collections – nearly everything you want from collections in Java

The Java Collections Framework is not as powerful as an experienced Java developer would expect.
For example, how do you sort a list?
The simple answer is to use the java.util.Collections.sort() method with some java.util.Comparator implementation. Guava's Ordering can be used as well.
However, this is not quite what an object-oriented developer is looking for.
Similarly, to find the minimum or maximum element of a collection you would probably use the java.util.Collections.min() and java.util.Collections.max() methods respectively.
And how do you filter a collection? Or how do you select a list of a particular property extracted from the objects stored in the collection? It can be done in pure Java with a for loop, with Apache Commons Collections and its CollectionUtils.filter() and CollectionUtils.collect(), or with Guava's Collections2.filter(). Still, none of those solutions is fully satisfying from my point of view.
Of course, Java 8 is in the game, but it is quite a new release that cannot be used in every project, especially legacy ones, and its collection framework is still not optimal.
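
For reference, the plain-JDK approach described above might look like the minimal sketch below. The Customer class, its fields and the helper names are made up for illustration; only java.util calls are used, and the point is merely that the sorting, min/max and filtering logic all live outside the collection.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

public class PlainJdkCollections {

    // Hypothetical domain class, used only for this illustration.
    static class Customer {
        final String name;
        final String city;

        Customer(String name, String city) {
            this.name = name;
            this.city = city;
        }
    }

    // The ordering is a separate object handed to Collections.sort/min/max.
    static final Comparator<Customer> BY_NAME = new Comparator<Customer>() {
        @Override
        public int compare(Customer a, Customer b) {
            return a.name.compareTo(b.name);
        }
    };

    // Filtering with a plain for loop.
    static List<Customer> fromCity(List<Customer> customers, String city) {
        List<Customer> result = new ArrayList<Customer>();
        for (Customer customer : customers) {
            if (customer.city.equalsIgnoreCase(city)) {
                result.add(customer);
            }
        }
        return result;
    }

    // Extracting one property from every element, again with a loop.
    static List<String> names(List<Customer> customers) {
        List<String> result = new ArrayList<String>();
        for (Customer customer : customers) {
            result.add(customer.name);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Customer> customers = new ArrayList<Customer>(Arrays.asList(
                new Customer("Fred", "London"),
                new Customer("Anna", "Liverpool"),
                new Customer("Bill", "London")));

        Collections.sort(customers, BY_NAME); // sorts the list in place
        Customer first = Collections.min(customers, BY_NAME);
        Customer last = Collections.max(customers, BY_NAME);

        System.out.println(names(customers));                     // [Anna, Bill, Fred]
        System.out.println(first.name + " " + last.name);         // Anna Fred
        System.out.println(names(fromCity(customers, "London"))); // [Bill, Fred]
    }
}
```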

As a rescue for the above problems the Goldman Sachs Collections (GS Collections) framework comes in. It is a collection framework that Goldman Sachs open sourced in January 2012.

Here is a quick feature overview of GS Collections compared with Java 8, Guava, Trove and Scala:



Seeing this, even if you thought that Java 8 had everything you need from collections, you still should have a look at GS Collections.

Following this brief introduction I am going to present a quick overview of the main features GS Collections has to offer. Some of the examples are variants of the exercises in the GS Collections Kata, a training class used at Goldman Sachs to teach developers how to use GS Collections. The training is also open sourced as a separate repository.

Going back to the example from the beginning of the post, it would be perfect if we had methods like sort(), min(), max(), select(), collect(), etc. on every collection. It is simple to put them in a utility class, but that does not reflect object-oriented design.

GS Collections has interfaces that accomplish this in the following way (as an example):

public interface MutableList<T> extends List<T> {
    MutableList<T> sortThis(Comparator<? super T> comparator);
    <V> MutableList<V> collect(Function<? super T, ? extends V> function);
    MutableList<T> select(Predicate<? super T> predicate);
    ...
}

GS Collections classes do not extend Java Collections Framework classes. They are instead new implementations of both the Java Collections Framework and GS Collections interfaces.


Collect pattern

The collect pattern returns a new collection in which each element has been transformed. An example is returning the price of each item in a shopping cart.

The collect pattern uses a function that takes an object and returns an object of a different type. It simply transforms objects.


MutableList<Customer> customers = company.getCustomers();
MutableList<String> customerCities = customers.collect(new Function<Customer, String>() {
    public String valueOf(Customer customer) {
        return customer.getCity();
    }
});

or using Java 8 lambda expressions:

MutableList<Customer> customers = company.getCustomers();
MutableList<String> customerCities = customers.collect(customer -> customer.getCity());

or using a method reference:

MutableList<String> customerCities = customers.collect(Customer::getCity);


Select pattern

The select pattern (aka filter) returns the elements of a collection that satisfy some condition, for example only those customers who live in London. The pattern uses a predicate, which is a type that takes an object and returns a boolean.

MutableList<Customer> customers = company.getCustomers();
MutableList<Customer> customersFromLondon = customers.select(new Predicate<Customer>() {
    public boolean accept(Customer each) {
        return each.getCity().equalsIgnoreCase("London");
    }
});

or using Java 8 lambda expressions:

MutableList<Customer> customers = company.getCustomers();
MutableList<Customer> customersFromLondon = customers.select(
    each -> each.getCity().equalsIgnoreCase("London"));


Reject pattern

The reject pattern returns the elements of a collection that do not satisfy the Predicate.

MutableList<Customer> customersNotFromLondon =
    company.getCustomers().reject(new Predicate<Customer>() {
        public boolean accept(Customer each) {
            return each.getCity().equalsIgnoreCase("London");
        }
    });

A note on anonymous inner classes when Java 8 is not available: it is advisable to encapsulate them in the domain object, and then the above snippet changes into:

MutableList<Customer> customersNotFromLondon =
    company.getCustomers().reject(Customer.FROM_LONDON); // FROM_LONDON is a hypothetical name for the Predicate constant defined on Customer

Other patterns using Predicate

  • Count pattern
    • Returns the number of elements that satisfy the Predicate.
  • Detect pattern
    • Finds the first element that satisfies the Predicate.
  • Any Satisfy
    • Returns true if any element satisfies the Predicate.
  • All Satisfy
    • Returns true if all elements satisfy the Predicate.
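
To make the semantics of these four patterns concrete, here is a small JDK-only sketch. It is not the GS Collections implementation: PredicatePatterns and its static helpers are hypothetical names, java.util.function.Predicate (Java 8) stands in for the GS Collections Predicate, and returning null from detect when nothing matches is a choice made for this sketch.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

public class PredicatePatterns {

    // Count: number of elements that satisfy the predicate.
    static <T> int count(Iterable<T> items, Predicate<? super T> predicate) {
        int count = 0;
        for (T item : items) {
            if (predicate.test(item)) {
                count++;
            }
        }
        return count;
    }

    // Detect: first element that satisfies the predicate, or null if none does.
    static <T> T detect(Iterable<T> items, Predicate<? super T> predicate) {
        for (T item : items) {
            if (predicate.test(item)) {
                return item;
            }
        }
        return null;
    }

    // Any Satisfy: true as soon as one element satisfies the predicate.
    static <T> boolean anySatisfy(Iterable<T> items, Predicate<? super T> predicate) {
        for (T item : items) {
            if (predicate.test(item)) {
                return true;
            }
        }
        return false;
    }

    // All Satisfy: false as soon as one element fails the predicate.
    static <T> boolean allSatisfy(Iterable<T> items, Predicate<? super T> predicate) {
        for (T item : items) {
            if (!predicate.test(item)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(25, 50, 75, 100);
        Predicate<Integer> greaterThan50 = n -> n > 50;

        System.out.println(count(numbers, greaterThan50));      // 2
        System.out.println(detect(numbers, greaterThan50));     // 75
        System.out.println(anySatisfy(numbers, greaterThan50)); // true
        System.out.println(allSatisfy(numbers, greaterThan50)); // false
    }
}
```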


GS Collections includes helpful, collection-specific utilities for writing unit tests. They are implemented as an extension of JUnit.
Instead of checking the collection's size:
Assert.assertEquals(2, customersFromLondon.size());
you can use:
Verify.assertSize(2, customersFromLondon);
MutableList<Integer> list = FastList.newListWith(1, 2, 0, -1);
Verify.assertAllSatisfy(list, IntegerPredicates.isPositive());
Some more examples:
Verify.assertContainsAll(customersFromLondon, customer1, customer2, customer3);


GS Collections provides several built-in predicates:
MutableList<Integer> mutableList = FastList.newListWith(25, 50, 75, 100);
MutableList<Integer> selected = mutableList.select(Predicates.greaterThan(50));
MutableList<Person> theLondoners = people.select(Predicates.attributeEqual(
    Person::getCity, "London")); // people: an assumed MutableList<Person>


I personally prefer immutable data structures to mutable ones. The pros are that they can be passed around without making defensive copies, they can be accessed concurrently without the possibility of corruption, etc.
The methods toList(), toSortedList(), toSet(), toSortedSet() and toBag() always return new, mutable copies.
MutableList<Integer> list = FastList.newListWith(3, 1, 2, 2, 1);
MutableList<Integer> noDuplicates = list.toSet().toSortedList();
The ImmutableCollection interface does not extend Collection and therefore has no mutating methods.
ImmutableList<Integer> immutableList = FastList.newListWith(1, 2, 3).toImmutable();
ImmutableList<Integer> immutableList2 = Lists.immutable.of(1, 2, 3);

Flat collect

The flat collect pattern is a special case of the collect pattern. When the function passed to collect returns a collection, the result is a collection of collections. Flat collect, in contrast, returns a single “flattened” collection.
company.getCustomers().flatCollect(customer -> customer.getOrders());
or in the pre-Java 8 way:
company.getCustomers().flatCollect(new Function<Customer, Iterable<Order>>() {
  public Iterable<Order> valueOf(Customer customer) {
    return customer.getOrders();
  }
});
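
To make the difference between the two result shapes concrete, here is a sketch that uses only the JDK 8 Stream API instead of GS Collections: map plays the role of collect and flatMap the role of flat collect. The Customer class and the order identifiers are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class FlatCollectSketch {

    // Hypothetical customer holding a list of order identifiers.
    static class Customer {
        final List<String> orders;

        Customer(String... orders) {
            this.orders = Arrays.asList(orders);
        }

        List<String> getOrders() {
            return orders;
        }
    }

    // collect-like: one inner list per customer, so the result is a list of lists.
    static List<List<String>> collectOrders(List<Customer> customers) {
        return customers.stream()
                .map(Customer::getOrders)
                .collect(Collectors.toList());
    }

    // flatCollect-like: the inner lists are concatenated into a single list.
    static List<String> flatCollectOrders(List<Customer> customers) {
        return customers.stream()
                .flatMap(customer -> customer.getOrders().stream())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Customer> customers = Arrays.asList(
                new Customer("A-1", "A-2"),
                new Customer("B-1"));

        System.out.println(collectOrders(customers));     // [[A-1, A-2], [B-1]]
        System.out.println(flatCollectOrders(customers)); // [A-1, A-2, B-1]
    }
}
```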

Static utilities

As stated at the beginning, processing collections with methods on the interfaces is the preferred, object-oriented approach. However, it is not always feasible. As a solution GS Collections, similarly to the JDK, introduces several static utility classes such as Iterate, ListIterate, etc.
Some of them can be used to interoperate with the Java Collections Framework. What is more, they allow developers to refactor an existing code base to GS Collections incrementally.
List<Integer> list = ...;
MutableList<Integer> selected = ListIterate.select(list, Predicates.greaterThan(50));
Integer[] array = ...;
MutableList<Integer> selected = ArrayIterate.select(array, Predicates.greaterThan(50));
String result = StringIterate.select("1a2a3", CharPredicate.IS_DIGIT);

Parallel iteration

GS Collections provides a static utility for parallel iteration which can be used for data-intensive algorithms. It looks like the serial case and hides the complexity of writing concurrent code.
List<Integer> list = ...;
Collection<Integer> selected = ParallelIterate.select(list, Predicates.greaterThan(50));
Remember that parallel algorithms are not usually a solution for performance problems. 

FastList as a replacement for ArrayList

FastList is considered a drop-in replacement for ArrayList. It is definitely more memory efficient and can be used to refactor legacy code in steps.
Let’s refactor that simple piece of code using GS Collections:
List<Integer> integers = new ArrayList<>();
Step 1:
List<Integer> integers = new FastList<Integer>();
Step 2:
List<Integer> integers = FastList.newList();
Step 3:
List<Integer> integers = FastList.newListWith(1, 2, 3);
or if you need an unmodifiable collection:
List<Integer> integers = FastList.newListWith(1, 2, 3).asUnmodifiable();
Step 4:
MutableList<Integer> integers = FastList.newListWith(1, 2, 3);
The analogous refactorings can be carried out for maps and sets using respectively UnifiedMap and UnifiedSet.
UnifiedMap<Integer, String> map = 
   UnifiedMap.newWithKeysValues( 1, "1", 2, "2", 3, "3");

Parallel lazy evaluation

There are situations when the first optimization that comes to mind is to run operations in parallel. It can be justified, especially when processing large chunks of data such as collections of millions of elements in a multiprocessor environment. GS Collections offers a friendly way to implement it:
MutableList<Item> data = ...;
ExecutorService executorService = Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());
ParallelListIterable<Item> itemsLazy = FastList.newList(data).asParallel(executorService, 50000);

asParallel() method takes two parameters:

  • executorService
  • batchSize which determines the number of elements from the backing collection that get processed by each task submitted to the thread pool; from my experience the appropriate batch size has significant influence on performance and should be determined during performance tests
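
To illustrate what the batch size controls, here is a JDK-only sketch of the idea. It is not how GS Collections implements asParallel() and, unlike the lazy API above, it evaluates eagerly; BatchSelectSketch and selectGreaterThan are hypothetical names. The backing list is cut into slices of batchSize elements, each slice becomes one task for the thread pool, and the partial results are joined in order.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class BatchSelectSketch {

    // Selects the values greater than the threshold, submitting one task per batch.
    static List<Integer> selectGreaterThan(List<Integer> data, final int threshold,
                                           ExecutorService executor, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<Callable<List<Integer>>> tasks = new ArrayList<>();
        for (int start = 0; start < data.size(); start += batchSize) {
            // Each task works on its own read-only slice of the backing list.
            final List<Integer> batch = data.subList(start, Math.min(start + batchSize, data.size()));
            tasks.add(() -> {
                List<Integer> selected = new ArrayList<>();
                for (Integer value : batch) {
                    if (value > threshold) {
                        selected.add(value);
                    }
                }
                return selected;
            });
        }
        List<Integer> result = new ArrayList<>();
        try {
            // invokeAll returns the futures in task order, so the element order is preserved.
            for (Future<List<Integer>> future : executor.invokeAll(tasks)) {
                result.addAll(future.get());
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        }
        return result;
    }

    public static void main(String[] args) {
        List<Integer> data = new ArrayList<>();
        for (int i = 1; i <= 10; i++) {
            data.add(i * 10); // 10, 20, ..., 100
        }
        ExecutorService executor = Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());
        try {
            System.out.println(selectGreaterThan(data, 50, executor, 3)); // [60, 70, 80, 90, 100]
        } finally {
            executor.shutdown();
        }
    }
}
```

A very small batch size creates many tiny tasks and the scheduling overhead dominates; a very large one leaves some threads idle, which is why the value is worth tuning in performance tests.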


I personally ran a few performance tests comparing lazy and parallel lazy evaluation in GS Collections, but I did not compare GS Collections with other collection frameworks. Since Goldman Sachs promises that their implementation is optimized for performance and memory usage, I looked for tests that would prove it.
Here is an example comparison of GS Collections, Java 8 Collections and Scala Collections:




This is just the tip of the iceberg as far as GS Collections is concerned. The framework offers much more, such as support for the stack data structure (MutableStack), the bag data structure (MutableBag), multimaps (MutableListMultimap), grouping (groupBy, groupByEach) and lazy evaluation (asLazy()).
From my point of view it is a quality replacement for the current Java Collections Framework.

Measure and find bottlenecks before they affect your users 1/2

Inspired by a few talks at the last 33rd Degree conference, I decided to add metrics to one of the applications I develop, so that developers or the operations team can monitor the running application and detect potential problems early on.

After a quick investigation I decided to use the Metrics framework. What I expected was to expose at least some statistics about the usage of application components. First of all I would use them during application stress tests to find the components that slow the application down. After going into production, I imagine such statistics would help to monitor the running application.

The Metrics framework fits my expectations perfectly. It is a Java framework. What is more, there are JavaScript ports which I am going to use to monitor a Node.js server (more about that in one of the next posts).
I decided to integrate the tool with the Spring application context using metrics-spring, but of course it is possible to use it without Spring.

Here is the Spring application context with Metrics support:

<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="" xmlns:xsi=""
    xmlns:metrics=""   xsi:schemaLocation="">
    <metrics:metric-registry id="metrics"/>
    <metrics:annotation-driven metric-registry="metrics"/>
    <metrics:reporter id="metricsJmxReporter" metric-registry="metrics" type="jmx"/>
    <metrics:reporter id="metricsLogReporter" metric-registry="metrics" type="slf4j" period="1m"/>
</beans>

The configuration defines a few beans:

  • metric-registry is the bean used to register generated metrics; its explicit definition is optional, and if it is not defined a new MetricRegistry bean is created
  • the annotation-driven element states that annotations are used to mark the methods/beans to be monitored
  • the reporter element is used to report gathered statistics to the defined consumers; a few reporter implementations are provided (jmx, console, slf4j, ganglia, graphite); I decided to use two of them:
    • jmx (JmxReporter) exposes metrics as JMX MBeans; they can be explored using standard tools like jconsole or VisualVM
    • slf4j (Slf4jReporter) logs metrics to an SLF4J logger; the period attribute defines the interval at which statistics are reported to the log file

When configuration is done, it is time to annotate the bean methods which are to be monitored. To do that there is a simple @Timed annotation provided:

@Timed
@RequestMapping(value = "/transaction", method = RequestMethod.POST)
public HttpEntity<SubmitResultDTO> transaction(@RequestBody NewTransactionDTO transaction) { ...}
Using that simple configuration you get JMX MBeans exposed providing a nice set of metrics:
What is more, if any statistic is clicked an online chart is presented:



Besides the JMX reporter there is also the SLF4J reporter defined, which logs the following pieces of information:



Besides JMX or SLF4J reporting, more sophisticated tools can be used to consume the statistics provided by Metrics. I would recommend trying Ganglia or Graphite, as there are reporters provided for those consumers (GangliaReporter and GraphiteReporter).

Another useful tool from Zeroturnaround – Xrebel

Some time ago I signed up for beta testing of a new tool from Zeroturnaround: Xrebel. I waited a bit impatiently for the testing program to start. Finally I got word that version 1.0.0 of the tool was available for download. Without much hesitation I started testing.

What does the tool do?

As you may expect from previous Zeroturnaround tools, generally speaking it is intended to improve the quality of a Java developer's everyday work. More precisely, it lets developers monitor a live JVM application with regard to session size and the SQL queries sent to the underlying database.


There is no doubt that the Zeroturnaround guys know all the ins and outs of the JVM, so the installation is as simple as adding the -javaagent:[path]xrebel.jar JVM parameter to the application server. And that is it. When the application server starts, the following output is presented:


Application successfully started with Xrebel assist.

When the web application is opened in the web browser for the first time with Xrebel, a simple form is presented that allows the user to activate the tool.

Let’s discover the tool’s features.


When the application is opened in the web browser, Xrebel adds a small toolbar on the left side of the screen:

Picture 1. Toolbar

On the first click on any option the Settings window is shown, where the user can enter the name of the package to be monitored and the thresholds Xrebel uses to notify the user when they are exceeded:

Picture 2. Package settings


Picture 3. Thresholds settings


I expected an easy setup and Xrebel did not disappoint me.


The main feature of Xrebel is live application monitoring and, as shown in Picture 1, general information is presented on the toolbar itself. The first section concerns SQL queries. The first number (5) indicates the number of SQL queries executed so far. The next item shows the query execution time (288.5 ms). The section below shows session information: the total size and the size difference from the last request.

Clicking a section presents additional information, such as the exact query executed in the database, the number of rows returned and the execution time:


As far as the session is concerned, the size of each stored element is shown.


The main purpose of Xrebel is to quickly find bugs caused by dodgy database access and abnormal session size growth. In my opinion, even though it is an early beta version, it fulfills expectations.
It is a simple, unobtrusive tool that provides all the information a developer needs to track down the cause of an issue.

If you are interested in the tool, sign up for Xrebel beta testing:

Xrebel beta tests

Hibernate + 2nd level cache bug

Recently I came across a nasty bug. When Hibernate executes a cacheable query and applies a ResultTransformer to the result of that query, a java.lang.ClassCastException is thrown.

  .add(Restrictions.eq(CURR_ID, criteria.getCurrency()))       
  .add(Restrictions.eq(TYPE, criteria.getRateType().name()))        
  .add(, criteria.getFromTime().toDate()))         
  .add(, TIME_DTO_PROPERTY)             
  .add(, RATE_DTO_PROPERTY))         

It seems that the ResultTransformer is applied to the result before it is put into the cache. The cache expects a list of Object arrays but receives the transformed results (in this case a list of RateDTO). It cannot deal with them, and as a result the exception is thrown.

The bug is present in Hibernate 3.6.6.Final.

One workaround is to remove the ResultTransformer from the query and let the query return a list of Object arrays.

That list can then be transformed manually into the desired list:

public static List transformToBean(Class resultClass, List<String> aliasList, List resultList) {
  if (CollectionUtils.isEmpty(aliasList)) {
    throw new IllegalArgumentException("aliasList is required");
  }
  if (CollectionUtils.isEmpty(resultList)) {
    return Collections.emptyList();
  }

  // toArray() without an argument returns Object[], so pass a typed array instead of casting
  String[] aliases = aliasList.toArray(new String[aliasList.size()]);
  List transformedList = new ArrayList();
  AliasToBeanResultTransformer aliasToBeanResultTransformer = new AliasToBeanResultTransformer(resultClass);
  Iterator it = resultList.iterator();
  while (it.hasNext()) {
    Object[] obj = (Object[]) it.next();
    transformedList.add(aliasToBeanResultTransformer.transformTuple(obj, aliases));
  }
  return transformedList;
}
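
One detail is worth spelling out when the alias list is turned into a String[] for transformTuple(): for a java.util.ArrayList, toArray() without arguments returns an Object[], so casting the result to String[] fails at runtime with a ClassCastException. The JDK-only sketch below demonstrates this; ToArrayCast and its methods are hypothetical names used only for the demonstration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ToArrayCast {

    // Safe conversion: the typed overload returns an array whose runtime type is String[].
    static String[] toStringArray(List<String> aliases) {
        return aliases.toArray(new String[aliases.size()]);
    }

    // Returns true when casting the result of the no-argument toArray() fails.
    static boolean castFails(List<String> aliases) {
        try {
            String[] ignored = (String[]) aliases.toArray();
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> aliases = new ArrayList<String>(Arrays.asList("time", "rate"));

        System.out.println(castFails(aliases));                       // true
        System.out.println(Arrays.toString(toStringArray(aliases))); // [time, rate]
    }
}
```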

Java enums + template method design pattern

Let’s consider the following code snippets:

public enum Currency {
  EUR, USD
}

String currency = "EUR";

if (Currency.EUR.name().equals(currency)) {
  System.out.println("Transfer permitted");
}

How often do we see much the same scenario? It is not completely wrong, since there is an attempt to use enums, so it is not entirely “string-driven” programming. However, there is still room for some refactoring.

What about doing this in a more object-oriented way? Enums are a very powerful Java feature, but in most cases they are used only in the simplest possible way.

public enum Currency {
  EUR() {
    public boolean isTransferPermitted() {
      return true;
    }
  },
  USD() {
    public boolean isTransferPermitted() {
      return false;
    }
  };

  public abstract boolean isTransferPermitted();
}

Currency currency = Currency.valueOf("EUR");

if (currency.isTransferPermitted()) {
  System.out.println("Transfer permitted");
}

In my opinion, the refactored code is clearer and more self-documenting than the original. Generally speaking, it is the Template Method design pattern applied to Java enums.

Java application performance monitoring

I am going to present a quick overview of useful tools that can help when you are troubleshooting an application with performance issues.

For each tool I give a short description, how it is installed, whether it is free or paid, whether it is included in the JDK, and whether it can be used in a production environment.

  • logs
    • application logs, application server logs, database logs
    • no installation
    • free
    • included with JDK
    • can be used in production
  • application server monitoring
    • administration console
    • provided by default with application server or installation needed – varies between application servers
    • free (comes with application server)
    • not included with JDK
    • can be used in production (if properly secured)
  • New Relic
    • application performance management and monitoring
    • application has to be started with agent
    • basic features: free; advanced features: paid
    • not included with JDK
    • can be used in production
  • AppDynamics
    • application performance management software; useful for developers as well as operations; great visual tool
    • application has to be started with agent
    • lite version: free; pro version: paid
    • not included with JDK
    • can be used in production
  • YourKit
    • cpu and memory profiler
    • installation needed
    • paid
    • not included with JDK
    • cannot be used in production
  • JVisualVM
    • cpu and memory profiler
    • no installation needed
    • free
    • provided with JDK
    • some features can be used in production
  • JConsole
    • JMX metrics
    • no installation needed
    • free
    • provided with JDK
    • can be used in production
  • JMap/JHat
    • prints Java process memory map
    • no installation needed, attaches to the running process
    • free
    • included with JDK
    • cannot be used in production
  • JStack
    • prints thread dump of Java process
    • no installation needed, attaches to the running process
    • free
    • included with JDK
    • can be used in production

    JPA EntityManager operations order

    I ran into an interesting issue recently. The project uses JPA + Hibernate + EJB. The issue concerns saving and deleting entities in the same transaction. The database table involved has a unique constraint defined on two columns.

    First I removed an entity by calling the EntityManager remove() method.

    Then I added a new entity with the same values in the two properties mapped to the columns of the unique constraint, but with different values in the other properties.

    Both operations were carried out in a single transaction, in the order presented above: removal first, addition second.
    As it turned out, the operations were executed on the database in the inverse order and the unique constraint was violated. It looked like the new entity was inserted before the previous one was removed.

    Apparently, the JPA specification does not force implementations to execute operations on the database in the order in which they were added to the transaction.

    To deal with such a situation, JPA provides the EntityManager flush() method. Its responsibility is to synchronize the persistence context with the underlying database.
    So to avoid the unique constraint violation you need to call flush() after remove().

    What is more, there is no risk that the entity will be removed anyway if the transaction is rolled back after calling flush(). flush() forces the persistence context to be synchronized with the database, but the transaction is still not committed unless it is committed manually. If the EJB layer uses the default configuration with JTA, the transaction is committed only after the method returns from the EJB layer.

    Spring Integration

    I have been really impressed by the Spring Integration project recently so I decided to write a few words about it.

    Just to make a quick introduction:

    Spring Integration can be described as an extension of the Spring programming model which supports Enterprise Integration Patterns.

    It enables lightweight messaging within Spring-based applications and supports integration with external systems via declarative adapters. Those adapters provide a higher level of abstraction over Spring’s support for remoting, messaging, and scheduling.

    In my opinion the biggest advantage of the project is the fact that it provides a simple model for building enterprise applications while maintaining separation of concerns. It results in testable, maintainable and easy-to-change code. Code in services is responsible only for business logic, while the execution of service methods, synchronous/asynchronous communication, etc. is described in configuration and handled by the framework.

    Another aspect of the project I really like is the fact that it introduces a messaging model into your application. I mean introducing messaging into the communication between modules/layers or even particular services in the application. Many people pigeonhole messaging as a way to integrate with external systems. I agree that it suits such scenarios perfectly, but why not use it to integrate modules and services which belong to the same application? Spring Integration provides us with clear and simple tooling to support such an architecture.

    Let’s move to an example.

    Consider a simple case where we have a queue of credit card transactions to be processed. Transactions under $10,000 are processed by one service, while transactions over $10,000 are processed by another.

    What we need is a Transaction class:

    public class Transaction {
      private BigDecimal amount;
      public Transaction(BigDecimal amount) {
        this.amount = amount;
      }
      public BigDecimal getAmount() {
        return amount;
      }
    }

    The service processing transactions under $10,000:

    public class TransactionProcessingService {
      private static Logger logger = Logger.getLogger(TransactionProcessingService.class);
      public void process(Transaction transaction) {
        logger.info("process in TransactionProcessingService; amount=" + transaction.getAmount());
      }
    }

    The service processing transactions over $10,000:

    public class AntiMoneyLoundryService {
        private static Logger logger = Logger.getLogger(AntiMoneyLoundryService.class);
        public void process(Transaction transaction) {
            logger.info("process in AntiMoneyLoundryService; amount=" + transaction.getAmount());
        }
    }

    Spring configuration:

    <beans:beans xmlns="">
        <channel id="transactions" />
        <router input-channel="transactions"
                expression="payload.amount le 10000 ? 'normalProcessing' : 'antiMoneyLoundryProcessing'" />
        <service-activator input-channel="normalProcessing" ref="transactionProcessingService" />
        <service-activator input-channel="antiMoneyLoundryProcessing" ref="antiMoneyLoundryService" />
        <beans:bean id="transactionProcessingService"
                    class="sample.TransactionProcessingService" />
        <beans:bean id="antiMoneyLoundryService"
                    class="sample.AntiMoneyLoundryService" />
    </beans:beans>

    And finally the code which sends two messages with two different transactions to the input channel:

    ApplicationContext context = new ClassPathXmlApplicationContext("/META-INF/spring/integration/creditCardTransactions.xml", TransactionProcessingDemo.class);
    MessageChannel transactionsInputChannel = context.getBean("transactions", MessageChannel.class);
    Message<Transaction> normalTransaction = MessageBuilder.withPayload(new Transaction(new BigDecimal(9999))).build();
    Message<Transaction> bigTransaction = MessageBuilder.withPayload(new Transaction(new BigDecimal(90000))).build();
    transactionsInputChannel.send(normalTransaction);
    transactionsInputChannel.send(bigTransaction);

    Here is the log produced:

    [TransactionProcessingService] process in TransactionProcessingService; amount=9999
    [AntiMoneyLoundryService] process in AntiMoneyLoundryService; amount=90000

    As we can notice, messages were properly forwarded by the router element to the appropriate services.

    The Java code should be self-explanatory. Let’s have a closer look at the Spring configuration file.
    The channel element defines an input channel with the id transactions. It is a point-to-point message channel to which messages can be sent and from which they can be fetched.

    Next comes the router element. A Router determines the next channel a message should be sent to, based on the incoming message. The code in the expression attribute is a SpEL expression evaluated at runtime. In the example presented the expression evaluates to a channel name. The other way to map the result to a channel name is to use a mapping sub-element.
    The condition used to choose the channel can also be implemented in a POJO.
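
The routing decision itself is small. Below is a minimal plain-Java sketch of the same condition, roughly as it might look if moved into a POJO. It does not use the Spring Integration API: RouterSketch, route and dispatch are hypothetical names, and the map of consumers only stands in for the channels and the services behind them.

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

public class RouterSketch {

    static final BigDecimal LIMIT = new BigDecimal("10000");

    // Mirrors the router expression: amounts up to the limit go to normal processing.
    static String route(BigDecimal amount) {
        return amount.compareTo(LIMIT) <= 0 ? "normalProcessing" : "antiMoneyLoundryProcessing";
    }

    // Looks up the handler registered under the routed channel name and passes the amount on.
    static void dispatch(BigDecimal amount, Map<String, Consumer<BigDecimal>> channels) {
        channels.get(route(amount)).accept(amount);
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        Map<String, Consumer<BigDecimal>> channels = new HashMap<>();
        channels.put("normalProcessing", amount -> log.add("normal: " + amount));
        channels.put("antiMoneyLoundryProcessing", amount -> log.add("aml: " + amount));

        dispatch(new BigDecimal(9999), channels);
        dispatch(new BigDecimal(90000), channels);

        System.out.println(log); // [normal: 9999, aml: 90000]
    }
}
```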

    The two service-activator elements follow. A Service Activator is a component that invokes a service based on an incoming message and sends an outbound message based on the value returned by the service invocation.

    The remaining elements are simple bean definitions.

    What do we get as a result of such design?

    There is no coupling between TransactionProcessingService and AntiMoneyLoundryService.
    Choosing the right service is fully handled by Spring Integration. The choice is based on a condition placed in the configuration, not in the code itself.
    These are the advantages of Spring Integration that we can see in this simple example.

    What is more, Spring Integration introduces, and in some way forces, us to design our domain in an event-driven manner. If properly designed, it mirrors the real world perfectly. We all live in an event-driven environment. Each day we receive so many messages: mails, phone calls, text messages, RSS feeds, just to name a few. The software we produce, the business domain, is like a slice of the real world. To increase the chance of a project's success, the domain we model should reflect the real world as accurately as possible. Spring Integration is nearly a perfect tool for modeling the event-driven nature of any business we create software for.

    Personally, I truly recommend a deeper dive into that part of the Spring ecosystem. I am almost sure that no one will be disappointed.