The article's conclusion -- that Lambda is cheaper than EC2 instances for this use case -- is completely wrong. The author only counted the per-request overhead, and neglected to add the actual cost of the GB-hours consumed. If each container uses 512MB of memory, then keeping one request running at a time for an entire month costs about $22. For comparison, a t2.nano instance with the same amount of memory costs $4/month.
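The arithmetic is easy to check. A minimal sketch, assuming the published us-east-1 on-demand rates ($0.0000166667 per GB-second for Lambda compute, $0.0058/hour for a t2.nano):

```python
# Back-of-the-envelope check of the numbers above (rates are assumptions
# based on published us-east-1 on-demand pricing, and exclude Lambda's
# per-request charge and the free tier).

GB_SECOND_RATE = 0.0000166667   # Lambda compute price, USD per GB-second
T2_NANO_HOURLY = 0.0058         # t2.nano on-demand price, USD per hour

seconds_per_month = 30 * 24 * 3600  # 2,592,000

# One 512MB container busy around the clock for a month
lambda_monthly = 0.5 * seconds_per_month * GB_SECOND_RATE
t2_nano_monthly = T2_NANO_HOURLY * 24 * 30

print(f"Lambda, 512MB busy all month: ${lambda_monthly:.2f}")   # ~$21.60
print(f"t2.nano (512MB), all month:   ${t2_nano_monthly:.2f}")  # ~$4.18
```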
Lambda is a value-added service on top of EC2. It only makes financial sense to use it when you don't want something running constantly, or otherwise have a way to take advantage of the extremely fine granularity in billing. (Or if you're willing to pay a premium to have Amazon manage your process lifecycles for you.)
$4/mo doesn't include the cost of the monitoring and maintenance that comes with running your own instance. When you can't amortize those costs across a large fleet of instances, Lambda is often much cheaper in terms of total cost. Also, to get anywhere near the same reliability expectations, you'd need at least 2 instances and an ELB. Granted, that setup will be able to handle a lot more traffic, but it's still not fair to compare Lambda to a single instance.
You would also need a message queue to match Lambda's built-in dead-letter queue, which means a few more EC2 boxes running RabbitMQ in cluster mode.
Great article, and an interesting use of Lambda, thanks for sharing!
To answer your final question: I wrote a spot instance automation tool (you can check it out at autospotting.org), so I would give spot instances a try. The latest developments from AWS on the spot market are real game changers; I think most workloads can now safely run on spot. My AutoSpotting tool makes it a breeze to migrate from on-demand AutoScaling groups while keeping them a bit more reliable than the native AutoScaling integration for spot.
As of a few months ago the pricing is much more stable than before: I've rarely seen terminations even across the maximum three months of price history, for instance types that used to go bust multiple times a day. You also now pay on a per-second basis, and you can hibernate the last instance to keep the state of the group while everything else is down.
So my approach for this would be to have an AutoScaling group of the smallest spot instances that can run your app, scale it to N nodes right before your experiment, then when you're done scale down to a single node that serves as the data seed next time, which you detach and hibernate with API calls.
Next time you re-attach the seed to the empty group, and scale out to N once again and run your test. So you only pay for the length of your test on a per-second basis.
You can also keep the seed as an on demand node outside of the spot group and have it run from the free tier if you still have some time left, or just hibernate it as well.
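The cycle above can be sketched with the standard AWS CLI. This is a hypothetical outline, not AutoSpotting itself; the group name `my-spot-asg`, the instance ID, and the capacity of 10 are placeholder values:

```shell
# Before the experiment: scale the spot group out to N nodes
aws autoscaling set-desired-capacity \
    --auto-scaling-group-name my-spot-asg --desired-capacity 10

# After the experiment: shrink back down to the single seed node...
aws autoscaling set-desired-capacity \
    --auto-scaling-group-name my-spot-asg --desired-capacity 1

# ...then detach it (decrementing desired capacity to 0 so the group
# doesn't launch a replacement) and hibernate it to preserve its state
aws autoscaling detach-instances \
    --auto-scaling-group-name my-spot-asg \
    --instance-ids i-0123456789abcdef0 \
    --should-decrement-desired-capacity
aws ec2 stop-instances --hibernate --instance-ids i-0123456789abcdef0

# Next run: wake the seed, re-attach it, and scale out to N again
aws ec2 start-instances --instance-ids i-0123456789abcdef0
aws autoscaling attach-instances \
    --auto-scaling-group-name my-spot-asg \
    --instance-ids i-0123456789abcdef0
aws autoscaling set-desired-capacity \
    --auto-scaling-group-name my-spot-asg --desired-capacity 10
```

Note that hibernation requires the instance type, AMI, and root volume to be configured for it ahead of time.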
Keith Winstein (Stanford) et al's gg [1] is also fun. Sort of `make -j1000` for 10 cents. Create a deterministic-compilation model of a C build task, upload the source files, briefly run a lot of lambdas, download the resulting executable. (Though it's more general than that.)
For folks long despairing that our programming environments have been stuck in a rut for decades, we're about to be hit by both the opportunity to reimagine our compilation tooling, and the need to rewrite the world again (as for phones) for VR/AR. If only programming language and type systems research hadn't been underfunded for decades, we'd be golden.
I've found out-of-the-box distributed Erlang difficult to run in environments with a lot of instance churn (e.g. containerized deployments on Kubernetes), so much so that I generally opt not to connect my nodes for Erlang message passing. Does anyone here have experience running Lasp in Kubernetes? Is Lasp effective at monitoring and adjusting to new or dead nodes?