JesterJ 1.0 Beta2 Released

October 16, 2018 | zmancy | in category Uncategorized

Several new features are now available in JesterJ, which is slowly but surely moving towards a full 1.0 release. Preparing for my talk at Activate 2018 inspired me to fix a few things and make these improvements available in a release.

New Features in 1.0 Beta 2

Plan Visualization

This is the key feature that really drove the production of the Beta 2 release. I added this so I could more easily show what is going on in JesterJ for my talk, and I’ve found it’s tremendously helpful and gratifying to be able to see the connections/layout of a JesterJ ingestion DAG. I just had to share it! Here’s an example from my talk:

Blue ovals represent implementations of Scanner.java producing data, and arrows indicate document flow. This image nicely demonstrates the DAG (Directed Acyclic Graph) feature. Three data sources feed four flows into three terminal steps that all put data into Solr. The bottom right oval is a step configured with a SendToSolrCloudProcessor.java instance that adds user click-through data to a TRA (Time Routed Alias). The middle oval sends to a standard Solr collection (non-time-series), and the bottom left invokes a streaming expression update() to build up partial calculations for an NPMR (nonparametric multiplicative regression) model in Solr. The second and third ovals from the bottom on the right are examples of custom implementations of Processor.java that I hope to detail in a future post.
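To make the "directed acyclic" part concrete, here is a minimal sketch of a step graph with a cycle check based on Kahn's algorithm. It is illustrative only: StepGraph, flow(), flowOrder() and terminals() are hypothetical names of my choosing, not JesterJ's plan or step classes, and the step names in the usage note are made up.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: a tiny step graph, NOT JesterJ's actual plan/step classes.
class StepGraph {
    // Each step name maps to the steps that receive its documents.
    private final Map<String, List<String>> edges = new LinkedHashMap<>();

    // Record a directed edge meaning "documents flow from -> to".
    void flow(String from, String to) {
        edges.computeIfAbsent(from, k -> new ArrayList<>()).add(to);
        edges.computeIfAbsent(to, k -> new ArrayList<>());
    }

    // Kahn's algorithm: returns the steps in flow order, or throws if a cycle exists.
    List<String> flowOrder() {
        Map<String, Integer> inDegree = new LinkedHashMap<>();
        for (String step : edges.keySet()) {
            inDegree.put(step, 0);
        }
        for (List<String> targets : edges.values()) {
            for (String target : targets) {
                inDegree.merge(target, 1, Integer::sum);
            }
        }

        // Steps with no incoming edges play the role of scanners.
        Deque<String> ready = new ArrayDeque<>();
        for (Map.Entry<String, Integer> entry : inDegree.entrySet()) {
            if (entry.getValue() == 0) {
                ready.add(entry.getKey());
            }
        }

        List<String> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            String step = ready.remove();
            order.add(step);
            for (String target : edges.get(step)) {
                if (inDegree.merge(target, -1, Integer::sum) == 0) {
                    ready.add(target);
                }
            }
        }
        if (order.size() != edges.size()) {
            throw new IllegalStateException("cycle detected; not a DAG");
        }
        return order;
    }

    // Terminal steps are those with no outgoing edges.
    List<String> terminals() {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : edges.entrySet()) {
            if (entry.getValue().isEmpty()) {
                result.add(entry.getKey());
            }
        }
        return result;
    }
}
```

For example, calling flow("scannerA", "processor"), flow("scannerB", "processor") and flow("processor", "sendToSolr") yields an order in which both scanners precede the processor, with "sendToSolr" as the only terminal step.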

Other Enhancements

Java Config Jar command line parameter – When I first considered Java Config, I was unsure I would keep it long term, so I added a Java property to identify the jar file containing the config. Since it has turned out to be convenient and is now the supported mode of operation, I've added a (required) positional command line parameter for it. I also intend to further simplify the command line before 1.0 final.
Defaults for some methods in Scanner.java – Another user-friendliness enhancement. The initial implementation of Scanner.java was very forward looking. Two methods have been repeatedly implemented as no-ops, so the default keyword has been used to make these no-op implementations the standard unless overridden.
RouteByStepName warns if documents are dropped – This router was dropping documents silently, which could be useful at times, but in combination with the Router/Scanner bug (see below) it led me into a very long debugging session. I decided it's better to warn about dropped documents.
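For readers who haven't used the default keyword on interface methods, a minimal sketch of the idea follows. The interface and method names here (DocumentSource, beforeScan, afterScan) are hypothetical stand-ins chosen for illustration; they are not the actual Scanner.java API.

```java
// Hypothetical stand-in for a scanner-style interface; NOT the real Scanner.java API.
interface DocumentSource {
    // Every implementation still has to say how it produces a document.
    String scan();

    // Hooks that most implementations left empty; `default` supplies the no-op.
    default void beforeScan() { }

    default void afterScan() { }
}

// Implements only the required method and inherits both no-op hooks.
class SimpleSource implements DocumentSource {
    @Override
    public String scan() {
        return "doc-1";
    }
}

// Overrides a hook only because it actually needs the behavior.
class CountingSource implements DocumentSource {
    int scans = 0;

    @Override
    public String scan() {
        return "doc-" + scans;
    }

    @Override
    public void afterScan() {
        scans++;
    }
}
```

The benefit is that an implementation like SimpleSource no longer needs empty method bodies, while one like CountingSource can still opt in to a hook when it has something to do.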

Bug Fix: IP Address Changes

One of the more irritating bugs I ran into cropped up in a prior presentation where I had intended to show JesterJ working in real time. I had everything all set, verified it multiple times, shut my laptop and went to the Meetup. When it came to my part of the Meetup, I went to start JesterJ, only to get a stack trace.

I can hardly blame my audience if they were unimpressed 🙁

The underlying problem is that the embedded Cassandra within JesterJ writes the IP address into a config file (~/.jj/cassandra/cassandra.yml). Moving the laptop changes the machine's address, so Cassandra fails to bind to the old address and JesterJ won't start. This is mostly not a problem for production instances, where IP addresses shouldn't change frequently, but if things do need to be moved it would become a major irritation. I can also see someone excited to use JesterJ taking their laptop to a meeting to show their team/boss, and the switch from docking station to wireless changes their IP… boom!
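The failure condition itself is easy to detect with the standard java.net classes. The sketch below is not the actual change made in JesterJ; it only shows one way to check whether a saved address still belongs to this machine before trying to bind to it. BindAddressCheck and its methods are hypothetical names, and falling back to a caller-supplied address is an assumption made for the example.

```java
import java.net.InetAddress;
import java.net.NetworkInterface;
import java.net.SocketException;
import java.net.UnknownHostException;

// Illustrative helper; hypothetical names, not part of JesterJ or Cassandra.
class BindAddressCheck {

    // True if some network interface on this machine currently owns the address.
    static boolean isAssignedLocally(String ip) throws UnknownHostException, SocketException {
        // A literal IP string is parsed directly; no DNS lookup is performed.
        InetAddress address = InetAddress.getByName(ip);
        // getByInetAddress returns null when no local interface has the address.
        return NetworkInterface.getByInetAddress(address) != null;
    }

    // Keep the saved address if it is still valid, otherwise use the fallback.
    static String chooseBindAddress(String saved, String fallback) {
        try {
            return isAssignedLocally(saved) ? saved : fallback;
        } catch (UnknownHostException | SocketException e) {
            // If the check itself fails, treat the saved address as unusable.
            return fallback;
        }
    }
}
```

A startup routine could call chooseBindAddress with the address read from the saved config and rewrite the config when the result differs, rather than failing with a stack trace.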

Other Bug Fixes

Routers on Scanners work again – Some time ago a bug crept in that caused the router added to a scanner to never be built from its builder. The system was defaulting to RouteByStepName, which was silently dropping documents.
Log4J incremented to 2.6.1 – This is to avoid LOG4J2-1409, but to minimize changes in project dependencies (and the associated licensing documentation work) this was a minimal upgrade rather than a move to the latest/greatest.
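The combination of a router that silently drops documents and a router that silently replaces the configured one is what made this so hard to debug. As a rough sketch of the "warn instead of silently drop" idea, assuming a router that picks the next step by name: NameBasedRouter and its methods are hypothetical and this is not the RouteByStepName source. It uses java.util.logging only to keep the example dependency-free.

```java
import java.util.Optional;
import java.util.Set;
import java.util.logging.Logger;

// Hypothetical name-based router; illustrative only, not JesterJ's RouteByStepName.
class NameBasedRouter {
    private static final Logger LOG = Logger.getLogger(NameBasedRouter.class.getName());

    private final Set<String> nextSteps;
    private int dropped = 0;

    NameBasedRouter(Set<String> nextSteps) {
        this.nextSteps = nextSteps;
    }

    // Returns the step that should receive the document, or empty if it is dropped.
    Optional<String> route(String docId, String requestedStep) {
        if (requestedStep != null && nextSteps.contains(requestedStep)) {
            return Optional.of(requestedStep);
        }
        // Dropping is still allowed, but it is no longer invisible.
        dropped++;
        LOG.warning("Dropping document " + docId
                + ": no downstream step named '" + requestedStep + "'");
        return Optional.empty();
    }

    int droppedCount() {
        return dropped;
    }
}
```

With a warning in the log, a misconfigured router shows up on the first dropped document instead of after a long debugging session.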

There were a few other changes to the build, along with issues relating to our use (and misconfiguration) of Artifactory, but those probably aren't worth blogging about. As always, you can stay abreast of all changes by following the issues on GitHub.