One feature of the _search API endpoint is to allow users to submit Groovy code in the search query itself. The server will then execute the code in a sandboxed environment, returning the result to the user. This way, the elasticsearch code can be used to execute… more code.
Quite a dangerous feature indeed. Thankfully, according to the documentation[0], this feature is disabled by default since v1.4.3. (Hindsight is 20/20, but this probably should have been the case from the get-go)
This is old. There are bots that scan servers for ES and try to exploit this vulnerability. Our servers were hacked through this hole before we upgraded to a newer version of ES.
Now we don't allow remote access to ES directly. We use a proxy script that parses HTTP requests from users, interacts with the local ES instance, and serves users the transformed data.
It was served through a proxy script, but the actual ES instance was accessible to anyone, and yes, anyone could delete documents; Groovy dynamic scripting was also enabled. Stupid setup, I know... Our documents were not touched, but our servers were used to DDoS something.
I've gotten this too. It started DoSing IP addresses in China, which is a pretty funny thing for elasticsearch to be able to do. Lesson learned: don't expose elasticsearch directly to the internet; even though it is a REST server, it was never designed for that. At the very least take 10 minutes to write an nginx proxy around it.
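For what it's worth, the "proxy that parses user requests" idea from a few comments up can be sketched roughly like this. It's only a minimal sketch: the index name `articles`, the `title` field, the limits, and the function name `build_search_request` are all made up for illustration. The point is that the proxy builds the ES request body itself from a couple of sanitized parameters, so nothing the user sends (including a `script` key) is ever forwarded verbatim.

```python
import json

# Hypothetical values, not taken from any real deployment.
ALLOWED_INDEX = "articles"
MAX_SIZE = 50
MAX_QUERY_LEN = 200


def build_search_request(params):
    """Turn untrusted user parameters into a fixed-shape ES search request.

    Only the search text and the page size are read from the user; every
    other key in `params` is ignored, so raw JSON never reaches ES.
    """
    text = str(params.get("q", ""))[:MAX_QUERY_LEN]
    try:
        size = int(params.get("size", 10))
    except (TypeError, ValueError):
        size = 10
    size = max(1, min(size, MAX_SIZE))  # clamp to a sane range

    body = {"query": {"match": {"title": text}}, "size": size}
    path = "/%s/_search" % ALLOWED_INDEX
    return "POST", path, json.dumps(body)


method, path, body = build_search_request(
    {"q": "hello", "size": "9999", "script": "anything"}
)
```

The proxy would then send `body` to the ES instance listening on localhost only. ES itself should still be bound to a private interface; the proxy doesn't help if the ES port stays open to the world.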
I had a Digital Ocean VPS shut down because of the previous exploit (mentioned in the article). I was just testing an open-source project of my own and did a standard/stock install.
I don't really understand why a project like this would default to having such vulnerable settings turned ON.
tl;dr: If you're doing sandboxing by blacklisting specific strings in your input, you're doing it wrong. Java actually has good sandboxing capabilities. Use them!
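To make the blacklisting point concrete, here's a toy illustration. This is not Elasticsearch's actual sandbox code; the blacklist entries and the function name `naive_filter_allows` are invented for the example. It just shows why matching forbidden strings in the input is fragile: the forbidden name can be assembled at runtime so it never appears literally in the script text.

```python
# Toy string blacklist; entries are illustrative assumptions only.
BLACKLIST = ["java.lang.Runtime", "ProcessBuilder", "System.exit"]


def naive_filter_allows(script):
    """Return True if none of the blacklisted substrings occur in the script."""
    return not any(bad in script for bad in BLACKLIST)


# The direct reference is caught by the filter...
direct = "java.lang.Runtime.getRuntime()"

# ...but the same class name built by concatenation is not, because the
# filter only sees the source text, not what the script evaluates to.
obfuscated = 'Class.forName("java.lang.Run" + "time")'

direct_allowed = naive_filter_allows(direct)
obfuscated_allowed = naive_filter_allows(obfuscated)
```

Here `direct_allowed` ends up `False` and `obfuscated_allowed` ends up `True`. A sandbox enforced at the runtime level checks what the code actually does, which is why it holds up better than inspecting the input.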
The author of that sandboxing feature of Groovy recently wrote a blog article on improving sandboxing of Groovy scripts that are executed at runtime, see http://melix.github.io/blog/2015/03/sandboxing.html
His conclusion is that using Groovy for scripting on the JVM comes at the price of security: the customizers in the Groovy distribution and those in the wild aren't enough to guarantee safe execution of scripts in the general case. If you give up some of Groovy's dynamic features, you can work around those limitations with type checking extensions, though that solution isn't available in Groovy's core distro. He adds that he'll have less time to work on Groovy, probably referring to how he and the other full-timer building Groovy had their funding pulled beginning the following week (end of March 2015). There's now no one working full-time on Groovy.
I have always followed CouchDB from a distance, so I am curious: has anyone been able to pull the same crap writing Erlang or Javascript? This is more relevant to the latter because of the sandbox, but this NoSQL architecture decision sounds very familiar.
I swore I read something about attempting to break the JS sandbox but found nothing with my weak Google-fu.
So there's plenty of bugs in the JS interpreters, and you can use those to do regular exploits and run arbitrary code.
A separate issue is escaping the sandbox due to a bug in the verification. JS is probably far easier to secure, since all the APIs are "safe" to call. That is, there isn't a huge JS stdlib that needs to be restricted. It's like brainfuck: there's no way to make dangerous calls. Java, by contrast, has a huge stdlib that's accessible, so you're fighting an uphill battle to lock it down.
JS is a sandboxed language from the start, as it was originally designed to run in browsers. As we know from years of Java Applet-based exploits, it seems anything that allows you to run arbitrary code inside the JVM is just asking for exploits.
0. https://www.elastic.co/guide/en/elasticsearch/reference/curr...