Softwareentwicklung (RRZE)

Informationen rund um die Softwareentwicklung am RRZE

Inhalt

Three cool new features in smdb v3.10.2.0


This article describes three new features in version v3.10.2.0 of the RRZE’s smdb plugin. Let’s dash through the basics and go straight to the features. Who knows – maybe one or two of them will prove useful for you?

The basics – smdb provides a thin abstraction layer for accessing a MongoDB replica set and its databases. The abstraction layer closely follows the original MongoDB Java Driver implementation. The first three numbers of the plugin’s version represent the version of the MongoDB Java Driver which is included as a dependency.

In this article the application data will be addressed as 'document' and the options used by the MongoDB API methods, for example CreateCollectionOptions, as 'option'.

Enough blabber – let’s jump straight to the new features. If you read till the end, you will find a useful bonus feature.

1. Document and option cloning

Once enabled, this feature will try to clone all documents or options passed to the smdbService before manipulating them in any way. This preserves the original document or option instances, so we can keep using them in our code without any concerns. The clone functionality relies on the Java Cloneable interface, so if you plan to use it – be sure to use document and option objects which implement Cloneable. If the object instance provided to an smdbService method does not support cloning, the original object instance will be used.
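To illustrate the clone-if-possible behaviour described above, here is a minimal Groovy sketch. Note that cloneIfPossible is a hypothetical helper written for this article, not the plugin’s actual implementation:

```groovy
// Hypothetical helper illustrating the clone-if-possible pattern;
// this is NOT the plugin's actual implementation.
def cloneIfPossible(obj) {
    (obj instanceof Cloneable) ? obj.clone() : obj
}

def doc = [givenname: 'Krasimir']   // a LinkedHashMap, which implements Cloneable
def copy = cloneIfPossible(doc)

copy.givenname = 'changed'
assert doc.givenname == 'Krasimir'  // the original document stays untouched
```

Keep in mind that clone() on a map produces a shallow copy – nested maps are still shared between the original and the copy.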

The most common way of creating documents or options in our applications is through a LinkedHashMap, which implements Cloneable. A basic example of creating a document and an option that work fine with cloning looks like this:


// this is a LinkedHashMap
def myDocument = [
  givenname: 'Krasimir',
  surname: 'Zhelev',
  address: [
    street: 'Martensstraße 1',
    city: 'Erlangen'
  ]
]

// this is a LinkedHashMap
def myOption = [upsert: true]

It is important to mention that with version v3.10.2.0 cloning of options is enabled by default. Cloning of data documents is disabled by default to ensure backward compatibility.

Options cloning allows us, for example, to create collections by using the following code:


Map myOptions = [collation: [locale: 'de']]
smdbService.createCollection('myDB', 'germanNamesCollection', myOptions)
smdbService.createCollection('myDB', 'germanTitlesCollection', myOptions)

The code shown above used to fail in earlier versions because the collation option was replaced by an instance of the MongoDB Java Driver class Collation inside the myOptions object during the first createCollection call. The second method call could therefore not handle the myOptions object correctly, as it tried to map the option once again.
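The failure mode can be sketched with a hypothetical stand-in for the mapping step – the class and method below are made up for illustration and are not the real driver or plugin API:

```groovy
// Placeholder for the real driver Collation class – illustration only
class Collation { Map spec }

// Simulates the option mapping: the plain 'collation' map is replaced in place
def mapOptions(Map options) {
    if (!(options.collation instanceof Map)) {
        throw new IllegalStateException('collation was already mapped')
    }
    options.collation = new Collation(spec: options.collation)
}

Map myOptions = [collation: [locale: 'de']]
mapOptions(myOptions)       // the first call mutates myOptions
// mapOptions(myOptions)    // a second call would now throw

// with cloning enabled, every call works on its own shallow copy instead:
Map fresh = [collation: [locale: 'de']]
mapOptions((Map) fresh.clone())
assert fresh.collation instanceof Map   // the caller's map is left intact
```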

So if options cloning is enabled by default, how do I enable document cloning then? Simple – just add cloneDocs = true to your MongoDB connection options in the application.groovy (or application.yml) file:

grails {
  mongo {
    replicaSet = ["mongodb:27017"]
    username = "myUsername"
    password = "************"
    databaseName = "myDB"

    options {
      connectionsPerHost = 300
      connectTimeout = 2000
    }

    // enables document cloning
    cloneDocs = true
  }
}

Please note that document cloning can impact performance and will be skipped if the data fails to clone. One more time – if data cloning fails, the original document will be used. Please bear in mind that cloning is a global feature and will affect all documents fed to the smdbService. In most cases document cloning is not required, so it is disabled by default.

Let’s now jump to some more interesting features.

2. Document cleanup

Version v3.10.2.0 of the smdb plugin supports document cleaning. What does document cleaning mean? As an example consider the following code:


def myDocument = [
  givenname: 'Krasimir',
  surname: 'Zhelev',
  title: null,
  address: [
    street: 'Martensstraße 1',
    city: 'Erlangen',
    mobile: "",
  ],
  cars:[
    plates:[:],
    engines:[:],
    colors:[]
  ]
]

Do you really need to store empty or null values in your database? Let’s preserve some storage by cleaning up those 'useless' values. We can enable cleanup mode by setting cleanDocs = true in our MongoDB configuration, quite similar to the cloneDocs option:

grails {
  mongo {
    replicaSet = ["mongodb:27017"]
    username = "myUsername"
    password = "************"
    databaseName = "myDB"

    options {
      connectionsPerHost = 300
      connectTimeout = 2000
    }

    // enables document cleaning
    cleanDocs = true
  }
}

If we now use the same smdbService.insertOne('myDb', 'myUsersCollection', myDocument) method, only data which is not empty will be persisted. Before saving, the document effectively becomes the following:


def myDocument = [
  givenname: 'Krasimir',
  surname: 'Zhelev',
  address: [
    street: 'Martensstraße 1',
    city: 'Erlangen'
  ]
]

Cleaning up documents reduces their storage footprint before they hit the database. The cleanDocs flag is a global option, just like cloneDocs, and will affect all documents processed by the smdbService. If your value accessors look like this: def primaryCarColor = myUser.cars.colors.first(), they will fail after enabling this option. You will have to go through your code and make sure it is null-safe (NPE-safe), for example: def primaryCarColor = myUser.cars?.colors?.first() – assuming, of course, you only check the car color once you have loaded the user. I strongly encourage writing null-safe (NPE-safe) code even without activating the cleanDocs feature.
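As a quick illustration of the null-safe access pattern (using a hand-written map in place of a document loaded through the smdbService):

```groovy
// after cleanup the empty 'cars' subtree is gone entirely
def myUser = [givenname: 'Krasimir', surname: 'Zhelev']

// myUser.cars.colors.first() would throw a NullPointerException here;
// the safe-navigation operator short-circuits to null instead
def primaryCarColor = myUser.cars?.colors?.first()
assert primaryCarColor == null
```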

This feature’s CPU and memory overheads should be negligible unless you process enormous documents.

3. Document conversions

I will start with the bad news – this feature can only be used if the cleanDocs feature is enabled. This limitation stems from the need to recursively iterate over the data in a document and process all its keys and values – that is how the empty and null elements are removed. In short, the conversions feature uses the same cleanup() method but adds some additional functionality.

Enough bad news – let’s see what conversions can do for us. Let’s assume, for example, that you are a heavy GString user and often use GStrings in your documents as keys or values. Everything will be fine as long as you define your data like this (assuming an existing user in the myUser instance):


def myDocument = [
  givenname: "${myUser.firstname}",
  surname: "${myUser.lastname}",
  address: [
    street: 'Martensstraße 1',
    city: 'Erlangen'
  ]
]

smdbService.insertOne('myDb', 'myUsersCollection', myDocument)

In this case the expressions "${myUser.firstname}" and "${myUser.lastname}" will be handled nicely by the MongoDB codecs provided by the smdb plugin. The document will be successfully saved and we can continue implementing our business logic. If we, however, change the example a bit and try to persist the document below, we will cause a ClassCastException:


def prefix = "extUser_"

def myDocument = [
  "${prefix+'givenname'}": "Krasimir",
  "${prefix+'surname'}": "Zhelev",
  address: [
    street: 'Martensstraße 1',
    city: 'Erlangen'
  ]
]

smdbService.insertOne('myDb', 'myUsersCollection', myDocument)

But why? What happened in this case is that we changed the keys of the document from Strings to GStrings. MongoDB supports only Strings as keys, and the codecs provided by the smdb plugin do not process keys – only values. Essentially you cannot use anything but a String as a key: no Integers, no Longs, no Booleans, no MyOwnIdClass … nope – only Strings. So how do we solve this problem? We convert all the keys to Strings, of course. The simplest, and probably the best, way to currently achieve this is a key converter rule which we can register in BootStrap.groovy:


def smdbService

def init = { servletContext ->

  smdbService.keyConverters.gstring = {k -> (k instanceof GString)?k.toString():k}

}

Now when we invoke the insertOne() method, the document will be successfully saved – if and only if the cleanDocs feature is enabled. What we did in the code above is register a key converter in the global keyConverters map under the name 'gstring'. This converter will be executed recursively for every key in a document, and if the key is of type GString it will be replaced by its String value.

As you can see, this converter is just a simple closure which manipulates the key. It is extremely important to note that the original key will be replaced by the object returned from the closure. You should use key converters with utmost caution and always return the original key if you do not handle its type – otherwise you risk nulling all your keys or replacing them with a fixed value, for example. Though primarily built for type conversions, you can also use converters for other tasks such as fetching data from a database or mapping a key to a different value – practically anything you can program with a closure. Still, we discourage such use unless really necessary.
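The pass-through rule can be demonstrated with a plain closure, independent of the smdbService:

```groovy
// Safe pattern: transform only the types you handle, return everything else unchanged
def safeConverter = { k -> (k instanceof Integer) ? k.toString() : k }

assert safeConverter(42) == '42'        // handled type is converted
assert safeConverter('name') == 'name'  // everything else passes through untouched

// Anti-pattern (do NOT do this): every unhandled key would be nulled
// def brokenConverter = { k -> (k instanceof Integer) ? k.toString() : null }
```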

Converters will be executed consecutively in the order they were registered. You are responsible for the order of converters if you want to apply multiple converters in a predefined sequence. Converters are also applied globally to all documents passing through the smdbService.

Let’s have a different example to illustrate some of the points above. In this case we have an application heavily using Date objects and we want to automatically convert them when we persist a document:


smdbService.keyConverters.gstring = {k -> (k instanceof GString) ? k.toString() : k}

smdbService.keyConverters.date = {k -> (k instanceof Date) ? k.getTime().toString() : k}

smdbService.keyConverters.clearPrefix = {k -> k.startsWith('__') ? (k - '__') : k}

smdbService.keyConverters.capitalize = {k -> (k in ['givenname', 'surname']) ? k.capitalize() : k}

smdbService.valueConverters.date = {v -> (v instanceof Date) ? v.format('yyyyMMddHHmmssSSS') : v}

def now = new Date()

def myDocument = [
  __givenname: "Krasimir",
  __surname: "Zhelev",
  (now): 'PseudoId#createdBy:Me',
  created: now
]

smdbService.insertOne('myDb', 'myUsersCollection', myDocument)

In the code above we have kept our GString conversion but added some additional rules. First we decided to convert all keys of type Date to Strings using a long timestamp format: k.getTime().toString(). This was used to create a pseudo id for our user – note the (now) syntax, which makes Groovy use the variable’s value as the key instead of the literal string 'now'. The clearPrefix rule then removes the annoying underscore prefix from the attribute names, and the capitalize rule turns the fields givenname and surname into Givenname and Surname correspondingly; clearPrefix is deliberately registered before capitalize, since the prefixed keys would otherwise never match the capitalize rule. For Date values (not keys), on the other hand, we have selected a different rule and will save these values in the more human-readable format yyyyMMddHHmmssSSS. After applying the conversion rules, the resulting document will look like this:


def myDocument = [
  Givenname: "Krasimir",
  Surname: "Zhelev",
  '1559037714830': "PseudoId#createdBy:Me",
  created: "20190528120154830"
]

Of course the example above makes no sense and will probably break all Date values in your application. The goal here was to demonstrate conversion rule chaining – for example, swapping the order of the capitalize and clearPrefix rules will produce different results. Additionally, a value converter was used – these work just like the key converters and should actually see more common use, say for serializing application classes where the built-in smdb codecs do not suffice.

Bonus Feature

If you have been as disappointed as me so far because of the limited applicability of document cleanup and conversions, then let’s jump straight to the bonus feature. If you do not want to affect your whole application’s data storage routines and convert or clean up every single document that you store – well … I have a solution for you. The whole cleanup and conversion functionality is implemented in a single simple method which can be accessed directly from the smdbService. The primary goal of the smdb plugin is to provide easy access to the MongoDB backend – not data manipulation, conversion, extraction and so on. That is why an effort has been made to keep it as simple as possible and as close as possible to the MongoDB Java API. This, however, does not forbid us to use some of its processing methods from other services and scripts. If you need to clean up documents only from a certain source, of a specific type, or in a specific code location – just use the cleanup() method directly. The method is really simple (see GitLab) and you can also use it as a template and extend its functionality further if required – converting values only for specific keys and so on. The current implementation should still be enough in the most common use cases. The method signature is defined like this:


def cleanup(obj, ignore=[], valueConverters=[:], keyConverters=[:], denull=true)

It is all pretty self-explanatory. The method will simply handle the recursive processing for you. As obj you can provide any map or list, and it will be recursively processed. By default the method cleans all empty values from a document, as indicated by the default value of the denull parameter – true. If you do not want to remove the empty values, simply call the method with denull = false. If you disable denulling and do not provide any other options, the method will simply return a copy of the original document, as there is nothing to be processed. This can also be used as a Cloneable replacement in some scenarios. The ignore parameter controls which keys will be filtered out while processing the document. If, for example, your user has a password field and you do not want to store it – simply call the method like this: cleanup(myUser, ['password']). The valueConverters and keyConverters were discussed earlier, so I will skip the explanations and show an example usage:


def myDocument = [
  givenname : "Krasimir",
  surname : "Zhelev",
  gender: null,
  security:[
    password: "very_secret_password",
    token: "some_token_value"
  ],
  address: [
    work:[
      street: 'Martensstrasse 1',
      city: 'Erlangen'
    ],
    private: [:]
  ]
]

def valueConverters = [:]

valueConverters.street = {v -> (v instanceof String && v.contains('strasse')) ? v.replaceAll('strasse', 'straße') : v}

def keyConverters = [:]

keyConverters.hidden = {k -> (k in ['security', 'token'])?('__'+k):k}

def convertedMap = smdbService.cleanup(myDocument, ['password'], valueConverters, keyConverters, true)

This results in the following convertedMap:


[
  givenname: 'Krasimir',
  surname: 'Zhelev',
  __security: [
    __token: 'some_token_value'
  ],
  address: [
    work: [
      street: 'Martensstraße 1',
      city: 'Erlangen'
    ]
  ]
]

Feel free to use this method whenever you want to recursively process a map or a list. It does not by any means have to be related to persisting data in MongoDB.
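For reference, a simplified sketch of such a recursive cleanup could look like the following. This is not the plugin’s actual code (see the GitLab repository for that) – it only covers the ignore and denull behaviour and leaves the converters out:

```groovy
// Simplified, hypothetical re-implementation of the recursive cleanup idea;
// handles only 'ignore' and 'denull', no value or key converters.
def cleanup(obj, ignore = [], denull = true) {
    def isEmpty = { v -> v == null || v == '' || v == [:] || v == [] }
    if (obj instanceof Map) {
        // drop ignored keys, recurse into values, then drop values left empty
        obj.findAll { k, v -> !(k in ignore) }
           .collectEntries { k, v -> [(k): cleanup(v, ignore, denull)] }
           .findAll { k, v -> !denull || !isEmpty(v) }
    } else if (obj instanceof List) {
        obj.collect { cleanup(it, ignore, denull) }
           .findAll { !denull || !isEmpty(it) }
    } else {
        obj
    }
}

def doc = [a: 1, password: 'secret', b: null, c: [d: '', e: 'x'], f: []]
assert cleanup(doc, ['password']) == [a: 1, c: [e: 'x']]
```

Note that nested containers which become empty after cleaning (like c would, if e were removed) are dropped as well, matching the behaviour shown in the examples above.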

Hint: If you have a Java class, for example a User POJO, and do not know how to create a map from it but still want to use the cleanup method – consider using the JSON converter provided by Grails.

Conclusion

This article provided a brief description of three new features of the smdb plugin – document and option cloning, document cleanup and document conversions. The intended usage of the new features was demonstrated by examples. As the goal of the smdb plugin is to closely approximate the original MongoDB Java Driver API, the features were implemented as global functions, thus affecting every document processed by the smdbService while still preserving the original plugin API. As a bonus feature, the underlying implementation of the cleanup() method was presented, which allows for recursive document manipulation. The cleanup() method can be called directly on the smdbService, which lets us use its functionality anywhere in our applications.

Thanks for reading till the end! If you have any questions or ideas – you know how to contact me.

Apache Camel deprecates hazelcast:seda

The long migration of the Camel

At the time of writing of this post, the website of the Apache Camel™ framework has been undergoing a major migration to GitHub for quite a while. If you are looking for information, the best place to start at the moment is the README.md file on GitHub. Detailed documentation of the existing components can currently be found on the components page, also on GitHub. We keep our fingers crossed for the Apache Camel team – we hope you complete the migration soon. We are big fans of your work.

Deprecation of hazelcast:seda

The hazelcast:seda component has been deprecated. I am not exactly sure in which Apache Camel version this happened, but the one we are currently using (2.21.x) in our Grails Kamel plugin displays a warning message. The good news is that the component is still here – it has just been renamed to hazelcast-seda. So if you want to migrate to the new component, all you need to do is replace hazelcast:seda with hazelcast-seda.

Not so fast

Wait, not so fast – there is a small caveat! If you have a look at the component’s documentation you will find some parameters you can use. Of major importance for us is hazelcastInstanceName. If you do not set this component parameter, the route will still start up and can process data; however, it will use a default hazelcast instance initiated by the Apache Camel context. The instance will be named something like hazelcastInstance_dev1. The Apache Camel context tries to load a hazelcast instance named 'hazelcastInstance' and, if this instance does not exist, it gets created using the configuration in hazelcast-default.xml. Generally speaking, this instance will perform just fine but will be confined to the current machine (Docker exposed ports and so on). This, however, is not what we are after.

Migration

So how do we migrate? Let’s assume there is a hazelcast instance named hazelcast in your application. If you are using the encore plugin with its default settings, this will be exactly the case. There are two possible ways to properly configure the route in our case (from the Apache Camel documentation):

  • hazelcastInstance – The hazelcast instance reference which can be used for the hazelcast endpoint.
  • hazelcastInstanceName – The hazelcast instance reference name which can be used for the hazelcast endpoint. If you don’t specify the instance reference, Camel uses the default hazelcast instance from the camel-hazelcast instance.

Here is a short example of how the configuration can be changed. I will use the hazelcastInstanceName parameter because it is easier to use. Assuming you have a route named myRoute, your configuration should change from:

hazelcast:seda:myRoute?transferExchange=true&pollTimeout=1000&concurrentConsumers=7

to:

hazelcast-seda:myRoute?hazelcastInstanceName=hazelcast&transferExchange=true&pollTimeout=1000&concurrentConsumers=7

Please note the hazelcastInstanceName=hazelcast part. If you have registered the hazelcast instance under a different name you have to adjust the example above accordingly.

That’s all. Happy Camel riding and routing. I hope you find this short post helpful. By the way – this is the first post on our new blog.