Monday, August 3, 2020

Using Kafka Streams Interactive Queries to Peek Inside of your KTables

Recently we found a bug in one of our Kafka Streams applications, and as I was looking into it, I found that we had a Stream -> Table left join that was failing. This didn't make sense, as every indication was that the data, with the correct key, should have been in the KTable at the time that the join was attempted.

So, I set out to verify that. It was easy to see what was in the stream, but I was struggling to figure out how to see what was in the table. Using Kafkacat, I could see that the data was in the underlying topic, but I needed to see that in context with the KTable at runtime.

That's when I turned to the helpful geniuses on the Confluent Community Slack group. On there, someone suggested that I use an interactive query.

Now, to some, this might be a no-brainer, but I am still somewhat new to Kafka and had never used interactive queries. But there's a first time for everything, so I dug into it.

I guess I shouldn’t have been surprised by how easy it was, but Kafka Streams never ceases to amaze me. The following bit of code is all it took to give me a view inside my KTable:

Let's walk through this code.

(Line 3) First off I needed to get a hold of the KafkaStreams instance, in order to access the state store.

Since the bit of topology that I’m working on is in a different Java class from the one where the stream is created and launched, I have to make a call to get it.

(Line 4) To access the state store, I needed its name, so I called queryableStoreName() on the KTable.

(Line 5) Now I can get a hold of the state store itself, in the form of a ReadOnlyKeyValueStore, using the KafkaStreams store() method.

(Line 6) To see all of the records in the store, I used a KeyValueIterator that is returned from the store.all() method.

(Lines 7-10) For each record, I print the key and value, and then, on line 11, I close the iterator.

I bundled that all up in a handy method called queryKTableStore().
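A minimal sketch of what such a method might look like follows. It is not the original listing, so the line numbers in the walkthrough above will not match it. It assumes Kafka Streams 2.5 or later (for StoreQueryParameters), and it takes the KafkaStreams instance as a parameter rather than fetching it from another class. The class name KTableInspector and the formatEntry() helper are made up for illustration.

```java
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.KeyValue;
import org.apache.kafka.streams.StoreQueryParameters;
import org.apache.kafka.streams.kstream.KTable;
import org.apache.kafka.streams.state.KeyValueIterator;
import org.apache.kafka.streams.state.QueryableStoreTypes;
import org.apache.kafka.streams.state.ReadOnlyKeyValueStore;

// Hypothetical helper class; the names here are illustrative only.
public class KTableInspector {

    // Formats one store entry the way the sample output in this post looks.
    static String formatEntry(Object key, Object value) {
        return "key: " + key + " Widget: " + value;
    }

    // Prints every record currently held in the KTable's local state store.
    static <K, V> void queryKTableStore(KafkaStreams streams, KTable<K, V> table) {
        // queryableStoreName() returns null if the KTable cannot be queried,
        // so this sketch assumes the table is backed by a queryable store.
        String storeName = table.queryableStoreName();

        ReadOnlyKeyValueStore<K, V> store = streams.store(
                StoreQueryParameters.fromNameAndType(
                        storeName, QueryableStoreTypes.<K, V>keyValueStore()));

        // The iterator holds resources, so try-with-resources closes it for us.
        try (KeyValueIterator<K, V> records = store.all()) {
            while (records.hasNext()) {
                KeyValue<K, V> record = records.next();
                System.out.println(formatEntry(record.key, record.value));
            }
        }
    }
}
```

The store can only be queried while the KafkaStreams instance is running, so a call like this belongs somewhere that executes after the application has started.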

Now I was able to add a peek() statement, calling this method, to my topology, right before the leftJoin that was failing.
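As a rough illustration of where those peek() calls sit, here is a small topology sketch. The topic names, the store name, the String types, and the WidgetJoinTopology class are all assumptions made up for this example. The storeDump hook stands in for whatever prints the store contents, such as a call to queryKTableStore().

```java
import org.apache.kafka.common.utils.Bytes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.Topology;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KTable;
import org.apache.kafka.streams.kstream.Materialized;
import org.apache.kafka.streams.state.KeyValueStore;

// Hypothetical topology; topic and store names are placeholders.
public class WidgetJoinTopology {

    // storeDump is an illustrative hook that prints the KTable's store contents.
    static Topology build(Runnable storeDump) {
        StreamsBuilder builder = new StreamsBuilder();

        // Naming the store via Materialized makes the KTable queryable.
        KTable<String, String> widgets = builder.table(
                "widgets",
                Materialized.<String, String, KeyValueStore<Bytes, byte[]>>as("widget-store"));

        KStream<String, String> orders = builder.stream("orders");

        orders
                // Look inside the table right before the join is attempted...
                .peek((key, order) -> storeDump.run())
                .leftJoin(widgets, (order, widget) -> order + " / " + widget)
                // ...and again right after it.
                .peek((key, joined) -> storeDump.run())
                .to("enriched-orders");

        return builder.build();
    }
}
```

Dumping the whole store for every record is only sensible as a temporary debugging aid on a small data set.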

That gave me output like this:

key: 10001 Widget: {id:10001, name: Winding Widget, price: 299.95}
key: 10002 Widget: {id:10002, name: Whining Widget, price: 199.95}
key: 10003 Widget: {id:10003, name: Wonkey Widget, price: 499.95}

And of course, the key I was trying to join on, 10004, was not in the store, which means that it was not in the KTable. I added another peek() call after the failed join attempt, and now the output was more like this:

key: 10001 Widget: {id:10001, name: Winding Widget, price: 299.95}
key: 10002 Widget: {id:10002, name: Whining Widget, price: 199.95}
key: 10003 Widget: {id:10003, name: Wonkey Widget, price: 499.95}
key: 10004 Widget: {id:10004, name: Wonder Widget, price: 999.95}

Now it's there! Mystery solved! I have a timing problem on my hands... which is another mystery, but one for a different post. For now, I just wanted to point out this simple and powerful feature of Kafka Streams.

Before leaving, I also wanted to point out that the ReadOnlyKeyValueStore only sees the state held by one application instance. In my case, running locally, I only had one instance, but in a distributed environment, things could get more complicated. Also, ReadOnlyKeyValueStore has another method for accessing data by key, if you already know the key: store.get(key) will return the value if one exists for that key. Of course, there is more you can do, and you can learn more about it in the Developer's Guide.
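For the lookup-by-key case, a sketch along these lines should work, again assuming Kafka Streams 2.5 or later. The WidgetLookup class, the lookup() method, and the describe() helper are hypothetical names for illustration.

```java
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StoreQueryParameters;
import org.apache.kafka.streams.state.QueryableStoreTypes;
import org.apache.kafka.streams.state.ReadOnlyKeyValueStore;

// Hypothetical helper class; the names here are illustrative only.
public class WidgetLookup {

    // Turns a lookup result into a printable message.
    static String describe(Object key, Object value) {
        return value == null
                ? "key: " + key + " is not in the local store"
                : "key: " + key + " Widget: " + value;
    }

    // Fetches a single value from this instance's local state store.
    static <K, V> V lookup(KafkaStreams streams, String storeName, K key) {
        ReadOnlyKeyValueStore<K, V> store = streams.store(
                StoreQueryParameters.fromNameAndType(
                        storeName, QueryableStoreTypes.<K, V>keyValueStore()));
        // get() returns null when the key is not present in this instance's store.
        return store.get(key);
    }
}
```

With more than one application instance, a null here may only mean that the key lives in another instance's store.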

Wednesday, May 27, 2020

Online Meetups Are Different, But Still A Valuable Resource

I’ve always been a fan of user groups, which are now mostly known as meetups.  In the past I’ve led Java and Groovy user groups, and they were always rewarding experiences.

More recently, I’ve been helping to organize the St. Louis Apache Kafka Meetup, hosted by Confluent.  We only had one in-person meeting before the COVID-19 rules came into play, and we had to convert subsequent meetings to online.

I was pretty bummed, thinking that online meetups would just be like webinars, or maybe recorded conference videos, which are great, but nothing like a live event. Now, after attending over a dozen online meetups around the world, I have to say I was pleasantly surprised.

The Confluent Community team does an excellent job of running these Zoom meetups. At the start of the meetup, everyone can be unmuted, so there is a great time of networking and catching up with friends, old and new. Then the host mutes the audience and the presenter gets started. During the presentation, attendees ask questions in the chat. Some presenters will pause to answer questions along the way, others will answer them at the end, but I have yet to see a question go ignored.

After the presentation, the host allows attendees to unmute again, and the discussions are just what you’d expect with an in-person meetup, except the participants might be in another country!

Another bonus for Confluent’s online meetups is that they are recorded.  You can watch videos from over twenty meetups from the past few months, on the Confluent Meetup Hub.  This site is a treasure trove, not just for the recordings, but because it also shows you which meetups are coming up, so you can join them live.

When you do join one of these online meetups, and I’m sure you will, you should consider turning on your video, if possible, and introducing yourself.  The Apache Kafka community is made up of some of the friendliest people I’ve ever met, so I can guarantee that you will be welcomed!  If you continue attending meetups in a particular time zone, you will get to know the regulars and even become one yourself.

So, while I still miss the in-person meetups, and I am looking forward to them returning, I am very grateful for the online meetups as well, and at the risk of being greedy, I am hoping, in the future, that we can have both!

And, speaking of in-person meetups, there are Apache Kafka meetups all over the world.  Find the one closest to you at  I would encourage you to join one (or more) of the meetup groups, so that you will hear when in-person meetups are beginning again.

Wednesday, March 4, 2020

Saint Louis Apache Kafka® Meetup by Confluent

One of the many ways that Confluent supports the developer community is by hosting Meetups around the world. For example, they just had Tim Berglund out in Paris (poor guy) for what looks like a great event!

They also host a meetup right here in St. Louis, and they've given me the great privilege of helping to organize it.

All that to say this: 

Save the date!  On Tuesday, March 24th, we will have two great presenters at the Saint Louis Apache Kafka Meetup!

Mitch Henderson, a Technical Account Manager par excellence with Confluent, will talk about how to make our Kafka installations fault-tolerant, even to the datacenter level.

After that, Neil Buesing, the Director of Real-time Data at Object Partners, will show us how to build a web application using Kafka and Kafka Streams as our database. Prepare to have your mind blown on this one!

This is going to be a packed meeting, but we'll have plenty of pizza and soft drinks on hand in case it runs long.  So, if you're in the St. Louis area, or can get to the St. Louis area (you know you've always wanted to visit), please plan on joining us March 24th, at 6pm.  All of the details can be found on our Meetup page.

Oh, and you can follow us on Twitter too.

Wednesday, February 12, 2020

Confluent KSQL Workshop in Saint Louis

Recently Confluent and World Wide Technology held a hands-on workshop on Stream Processing with KSQL. Nick Dearden, from Confluent, led the training and did an excellent job.  He gave a very clear introduction to the problem space, and the role that KSQL plays.

Then we launched into the hands-on lab.  Wow!  I have never been to a hands-on that was so smooth.

There were 50 students in the rooms and each of us had a pre-assigned AWS user account, with which we could ssh into a server running KSQL and MySQL.  There was a data generator running, I believe using the Kafka Connect Datagen connector (though I could be wrong on that).  So, everything was ready to go and we were all working through the exercises within minutes.

Along the way, if anyone got stuck, Brian Likosar and Cliff Gilmore were on hand to help out.  From what I could see, nobody was stuck for long.

The exercises were simple, yet detailed enough to show some of the cool features of KSQL. I had seen several video demos of KSQL before, but this was my first time trying it out.  It was pretty fun.

For me the highlight, beyond just being in a room with so much Kafka brainpower, was when we ran explain on one of the queries we had written, and lo and behold, there's the KStream topology!  I guess I should have figured this, but it was still cool to see.  KSQL is basically a really slick Kafka Streams app.

So, the workshop was fun and informative, and KSQL is a pretty powerful tool, especially for those who are not living in the JVM. But the real take-away, for me, was that the Kafka Streams API is amazing!

Tuesday, February 11, 2020

Spock: Expect two calls from one method with specific results.

This is probably nothing new to many people, but it was a very pleasant surprise to me, so I'm posting here in case there are others as clueless as I. 

Here's the situation. I've got a method under test, in Spock. (Yay! Groovy!) There is a mocked class that has a method that will be called twice, so we want to expect that, and we need to specify the return results because they are used by the method under test. In some scenarios, the expected params are known and easy to construct in the test, like a String literal. This is the easy one.

However, in some scenarios, the expected params are more difficult to construct in the test code, and I still want to expect the two calls in a specific order and with specific return values—which, again, are used downstream.

This is where I was stumped.  My very limited Spock skills came back to haunt me, and Google failed to hide my ignorance. No worries: I'm working on a project with the brilliant (and helpful) developers from Object Partners, and Neil Buesing came up with this little gem.

Needless to say, it worked like a charm, and kept me from cluttering up the test with a ton of code to construct the complex objects that would have been needed.

So, if you're reading and didn't know about this trick, tuck it away for later, and you won't have to admit to the world that you were stumped.

Friday, February 7, 2020

Pass me another cup of Kafka Kool-Aid

Ok, I admit it.  I have thoroughly imbibed the event-driven Kool-Aid.

It all started with a new job. This new job would involve Java development (I miss Groovy), and we were going to be using something called "Kafka". So, before the job began, I started looking into Kafka. I soon made the connection that my good friend, Tim Berglund, works for the company founded by the creators of Kafka, Confluent.  Tim and his colleagues have produced a seemingly endless supply of resources to help people like me understand this amazing new world. (Hmm... I wonder if there is a name for a barista who makes Kool-Aid instead of coffee?)

Anyhow, after watching several of the short Confluent videos, I graduated to conference recordings from Kafka Summit. Wow!  It was like I'd reached the Kool-Aid bottling plant! So much good information by such good presenters. I can't wait to attend one of these in person!

Then the job started.  To my surprise, I found that I'd be working with consultants from Object Partners.  I was familiar with OPI (as we affectionately call them) from the good ol' days of Grails, so it was great to know that they were also working in this amazing space.  I guess someone at OPI knows how to pick great technologies!

The team from OPI has been a tremendous help and encouraged me to dive right in.  So, I have.  I'm currently working on my first Kafka Streams story, and having a blast!

There are so many other great resources and friendly, helpful, and brilliant people that I could talk about. I hope to write more here, as I continue to learn and enjoy this refreshing beverage.  For now, I'll just pour another cup and get back to coding.