Memory limit for pdfalto subprocess in Grobid server not working with docker image

We have a long-running Grobid server process that we run with -Xmx18G. We notice that processing one batch of ~1000 PDFs consumes 7-10GB, then processing a second batch of ~1000 PDFs consumes another 7-10GB, and eventually the server gets killed with OOM.

This is consistent: the server keeps consuming more and more memory until it has to be restarted. Is there possibly a memory leak? Are there any knobs or workarounds we can try?
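
To help narrow this down, here is a minimal sketch of how the per-batch growth could be logged. It assumes a Linux host or container where `/proc` is readable; the helper names `parse_vmrss_kib` and `rss_kib` are hypothetical and not part of Grobid. Note that -Xmx only caps the Java heap, so resident memory measured this way can exceed it when native memory or subprocesses are involved.

```python
import os
import re


def parse_vmrss_kib(status_text):
    """Return the VmRSS value in KiB from /proc/<pid>/status text, or None if absent."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def rss_kib(pid):
    """Read the resident set size of a live process (Linux only)."""
    with open(f"/proc/{pid}/status") as fh:
        return parse_vmrss_kib(fh.read())


if __name__ == "__main__":
    # Assumption: in practice you would pass the Grobid server PID here
    # and record the value once after each batch; os.getpid() is a stand-in.
    print(rss_kib(os.getpid()))
```

Recording this value for the server process (and separately for any pdfalto child processes) after each batch would show whether the growth sits in the JVM process itself or in the subprocesses.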