What Might Be Throwing An Exception in my Ruby Code?

Hi everyone, I am a software developer who has never worked in Ruby before. I am building an application using Logstash, and Logstash permits Ruby scripting. (I’m sorry, I don’t know what version of Ruby, but I assume a recent version.) At one point in my Logstash, I am using the following Ruby code:

socket = TCPSocket.new("192.168.3.1", 12345)
socket.write event.to_hash
response = socket.recv(10000000)
event.set("NewData", response)

In Logstash, an event is a data record, which you can think of as a record in a database. So, the above code should do the following:

  • Opens a TCP socket to remote host 192.168.3.1, port 12345
  • Writes the event (expressed as a hash string) to the socket
  • Waits while the remote server does some processing
  • When the remote server sends a response back, stores that response in the variable “response”
  • Adds a new field to the event called “NewData,” and populates it with the value of the “response” variable.

Should be simple. But I’ve noticed that perhaps 33% of the time, I don’t see a valid value in the “NewData” field. And when I look at Logstash’s log, I see this:

[2020-10-09T16:34:10,476][ERROR][logstash.filters.ruby ][main][09de6b10cf3fdaa7a5ae8b4e3fcd73837267db580130edc34de5fc2c7e5e9cb2] Ruby exception occurred: Java heap space

I see that line hundreds of times in my log. I don’t know what it means, but that “Ruby exception occurred” makes me wonder if one of the four lines of Ruby code is throwing an exception, and hence, is my problem.

If any of those lines is throwing an exception, I’m betting it’s the recv() line? Or maybe the write() line? I just don’t know, and Googling these functions hasn’t really helped much. Can anyone offer any practical experience here? What, if anything, might be throwing these exceptions? I realize this is an open-ended question; thanks for any advice you might be able to offer.

As you have a “Java heap space” error, I assume you are running jruby on the JVM. You can try increasing the JVM memory: jruby - java.lang.OutOfMemoryError: Java heap space - Stack Overflow

But that memory problem seems to be discussed on the Elastic forums, with a solution mentioned in one post: Java heapspace issue while running logstash - #3 by Gauti - Logstash - Discuss the Elastic Stack

Thanks PCL,

You really went above and beyond. As I mentioned, this is Ruby within a Logstash instance, and I have no idea what Logstash is doing under the hood. It’s possible, I guess, that Ruby invokes Java, Java fails to allocate memory, and that’s what my log error message means. Not sure how I solve that. Thanks!

As @pcl observes, logstash looks like it is using jruby under the covers, which is an implementation of ruby that runs on the Java virtual machine, in the same way that C# and Visual Basic both run on the Microsoft CLR. So you get to write your logstash extension in ruby, but it actually executes in the JVM. When the JVM gets started, you can pass parameters to it, and one of these allows you to change the amount of heap space it has available. I’m sure that the logstash config will allow you to adjust this.
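For reference, here is a rough sketch of what that heap setting usually looks like. As far as I know, Logstash reads JVM flags from a jvm.options file in its config directory, and -Xms / -Xmx are the standard JVM flags for initial and maximum heap size. The 2g values are only an illustrative assumption, not a recommendation for this workload:

```
# config/jvm.options (path relative to the Logstash install; may differ on your system)
# Initial and maximum JVM heap size. 2g is an example value only.
-Xms2g
-Xmx2g
```

Raising the heap may only hide the problem if something in the filter is allocating far more than it needs, so it is worth checking the Ruby code as well.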

How big is the expected response? I suspect that your call to recv() will probably allocate a 10MB buffer in the JVM for each call, even if it doesn’t use it all. If you aren’t ever expecting a 10MB response you could reduce this value.
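To illustrate the point about the recv() argument, here is a minimal sketch. It assumes a Unix-like system, and uses a local socket pair as a stand-in for the real client and server (the 50-byte reply is made up to match a small response). The number passed to recv() is only an upper bound on how much to read; the returned string is as long as the data that actually arrived:

```ruby
require "socket"

# A connected pair of stream sockets stands in for the client and the remote server.
client, server = UNIXSocket.pair

server.write("x" * 50)      # the stand-in "server" sends a 50-byte reply
reply = client.recv(1000)   # ask for at most 1000 bytes

puts reply.bytesize         # size of what was actually received

client.close
server.close
```

So a much smaller limit than 10000000 should still return the whole reply, as long as the limit is at least as large as the biggest response you expect.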

Also, I’d really question the advisability of calling out to an external service every time a log record gets written to logstash, as it could cause a bottleneck. The developers of logstash have gone to great lengths to make their processing as slick as possible. Has the same care been taken to minimise performance issues in the development of whatever is on the other end of your socket call?

Thanks for writing, specious, I really appreciate it. That’s fascinating that Logstash uses Ruby-on-top-of-Java; I wonder why they did that? Maybe because server admins can code in Ruby, but Java is a little more demanding.

In my case, the individual log records that Logstash receives are ~10 KB in size, and then the remote process on the other end of the socket sends back a char* string no bigger than 50 bytes. So the heap allocation of 10 MB for every recv() call is definitely overkill. I’ll have to research and see if I can’t get that globally reduced.

Thanks for taking the time to write. It’s feedback like this that makes me a better designer. :)

You don’t need a global change; just update your code to reduce the buffer size on the recv call:

response = socket.recv(1000)

50 is probably too small, as it will send more than just the response payload. Try 1000 and see if it improves at all.
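Putting the pieces together, here is a hedged sketch of how the whole exchange might look outside Logstash. The helper name query_remote, the JSON serialization, and the local stand-in server are my own assumptions for illustration; the real remote service may expect a different format. The sketch also closes the socket after each call, since the original four lines never close it and every event would otherwise leave a connection open:

```ruby
require "socket"
require "json"

# Hypothetical helper (not part of Logstash): send one record, read a short reply.
def query_remote(host, port, record, max_reply_bytes = 1000)
  socket = TCPSocket.new(host, port)
  begin
    socket.write(record.to_json)   # explicit serialization (assumed format)
    socket.recv(max_reply_bytes)   # small upper bound instead of 10000000
  ensure
    socket.close                   # always release the connection
  end
end

# Stand-in server for local testing only; it replies with the request size.
server = TCPServer.new("127.0.0.1", 0)
port = server.addr[1]
server_thread = Thread.new do
  client = server.accept
  request = client.readpartial(65_536)
  client.write("ok:#{request.bytesize}")
  client.close
end

record = { "message" => "hello" }
reply = query_remote("127.0.0.1", port, record)
server_thread.join
server.close
puts reply
```

Inside the Logstash ruby filter, the equivalent would be something like event.set("NewData", query_remote("192.168.3.1", 12345, event.to_hash)), assuming the helper is defined where the filter can see it. Note that a single recv() can return only part of a reply on a real network, so a production version would need to loop or use a length prefix.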

Also, logstash is written in Java, so it kind of makes sense for them to use jruby for the scripting as it can run in the same JVM.

Wow, thanks, specious! I didn’t realize I could set the receive buffer like that! This is why this forum is so awesome. A little advice from you goes a long, long way to helping me with my project. It would have taken me years of experience to find that. Thank you!
