Declarative workflows are such a good idea, and I love the AI-first principle that pipeline creation and editing can be done with AI too.
The declarative style keeps the workflow at a high enough level to iterate super quickly - love that. More important to me is that it’s structured and seems like it would be more testable (I see validation in your docs).
Zooming in to the pipe/agent steps, I can’t quite tell whether you can leverage MCP as a client and make tool calls - can you confirm? If not, what’s your solution for working with APIs in the middle of a pipeline?
Also a quick question: declarative workflows won’t change the fact that LLM output is non-deterministic, so we can’t always be guaranteed that the output from prior steps will be correct. What tools or techniques are you using or recommending to measure the reliability of output from prior steps? I’m thinking of how you might measure at the step level to help prioritise which prompts need refinement or optimisation. Is this a problem you expect to own in Pipelex, or one to be solved elsewhere?
Great job guys, your approach looks like the right way to solve this problem and add some reliability to this space. Thanks for sharing!
Hi Clafferty,
Providing an MCP server was a no-brainer, and we have the first version available. But you're right: using MCP as a client is a question we have started asking ourselves too. We haven't had the time to experiment yet, so no definitive answer.
For now, we have a type of pipe called PipeFunc which can call a Python function, and so potentially any kind of tool under the hood. But that is really a makeshift solution, and we are eager to get your point of view and discuss with the community to get it right.
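To make the idea concrete, here is a minimal sketch of what a PipeFunc-style step could look like: a pipeline step that wraps an ordinary Python function, which in turn could call any API or tool. All names here (`PipeFunc`, `run`, `lookup_exchange_rate`) are illustrative assumptions, not Pipelex's actual API.

```python
from dataclasses import dataclass
from typing import Any, Callable

@dataclass
class PipeFunc:
    """Hypothetical pipeline step wrapping a plain Python function."""
    name: str
    func: Callable[..., Any]

    def run(self, **inputs: Any) -> Any:
        # The pipeline would pass prior-step outputs as keyword arguments.
        return self.func(**inputs)

def lookup_exchange_rate(currency: str) -> float:
    # Stand-in for a real API call (e.g. an HTTP request to an FX service).
    rates = {"EUR": 1.08, "GBP": 1.27}
    return rates[currency]

step = PipeFunc(name="fx_lookup", func=lookup_exchange_rate)
print(step.run(currency="EUR"))
```

The point is that anything callable from Python slots in as a tool, at the cost of leaving tool discovery and schemas (what MCP would standardize) up to the workflow author.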
Many companies are working on evals, and we will have a strategy for integration with Pipelex. What we already have is modularity: you can test each pipe separately or test a whole workflow, which is pretty convenient. Better yet, we have the "conceptual" level of abstraction: the code is the documentation. So you don't need any additional work to explain to an eval system what was expected at each workflow step: it's already written into it. We even plan to have an option (typically for debug mode) that checks that every input and every output complies semantically with what was intended and expected.
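That debug-mode idea can be sketched as a decorator that validates each step's output against its declared concept, raising as soon as an upstream step produces something out of spec. This is a hedged illustration of the technique, not Pipelex's implementation; `validated`, `concept_check`, and `summarize` are made-up names.

```python
from typing import Any, Callable

def validated(concept_check: Callable[[Any], bool]):
    """Wrap a step so its output is checked against its declared concept."""
    def decorator(step: Callable[..., Any]) -> Callable[..., Any]:
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            out = step(*args, **kwargs)
            if not concept_check(out):
                raise ValueError(f"{step.__name__}: output failed concept check")
            return out
        return wrapper
    return decorator

# Concept: "a short summary" = a string of at most 20 words.
@validated(lambda s: isinstance(s, str) and len(s.split()) <= 20)
def summarize(text: str) -> str:
    # Stand-in for an LLM call expected to produce a short summary.
    return " ".join(text.split()[:12])

print(summarize("a long document " * 10))
```

In a real system the check itself could be an LLM judging semantic compliance rather than a lambda, which is exactly where having the expectation already written into the workflow pays off.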
Thanks a lot for your feedback! It's a lot of work, so it's greatly appreciated.
You’re going to need to make a few aggressive WAF rules, pepper in some whitelisting rules, and, if you can, add rate limiting.
1. Block all unverified bots with a bot score of 1. This will still allow popular web crawlers but could be strict enough to block a curl request.
2. Use Managed Challenge for unverified bots with a bot score of less than 30. This will silence most of the troublemaking bots and provides a JavaScript challenge (not necessarily a CAPTCHA) for users who are incorrectly scored.
3. Add rate limiting. Figure out a realistic access rate, double it, and use that as a hard limit that will block traffic for an hour or a day, depending on your needs.
4. Add more sensitive rate limits and play with Managed Challenge rules. Use the simulate option before enabling any rate limits. You can add challenges here too if you feel a limit might be affecting users. Simulate for a few days before enabling.
5. Review rate limits and firewall reports regularly and adjust. With any Managed Challenge rules, make sure to check the percentage completed to see if you’re trapping real users; this number should be as close to 0 as possible. Repeat step 4.
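As a rough sketch, steps 1 and 2 map onto Cloudflare rule expressions along these lines (assuming your plan exposes the Bot Management fields; exact field availability varies by plan, so treat these as starting points, not drop-in rules):

```
Rule 1, action "Block": traffic Cloudflare is confident is automated
(cf.bot_management.score eq 1 and not cf.bot_management.verified_bot)

Rule 2, action "Managed Challenge": likely-automated traffic
(cf.bot_management.score lt 30 and not cf.bot_management.verified_bot)
```

The `verified_bot` exclusion is what keeps the popular crawlers (Googlebot etc.) flowing while the score thresholds catch the rest.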
You’ll want to get around your own blocking rules with some complementary whitelisting rules.
Although it’s advised to lock down your origin server to prevent non-Cloudflare traffic from hitting it, you might not be able to do so easily if you’ve got load balancers and other infra in the way that can’t be touched. Just make sure your root domain isn’t leaking your www IP address; use CNAME flattening and you should be alright.
The difficulty with these solutions is managing all the rules you can make; things can quickly become too complicated to change easily. Keep it simple: have a few basic but aggressive blocking rules and revise your whitelist and rate limits regularly. Good luck!
Thank you Conor! We are definitely excited as well :) did you sign up for updates? Also, I'd love to grant you access to our invite-only slack community for hardware builders. Ping me adar@jiga3d.com .
Netlify's approach to static sites seemed like a no-brainer, since HTML is easy to host, and they added lots of sugar around that to make it effortless. So M3o seems like a great idea.
Micro on the other hand will need some buy in but I'll definitely consider it since we use Go already.
One concern is your pricing seems too reasonable (free)!
> One concern is your pricing seems too reasonable (free)!
Is that reasonable? Building anything on top of obvious loss leaders is risky. And the costs here for a year of service are either $0 on the free plan or $420 for the cheapest paid plan. There's a chasm of difference there.
I can't understand people offering platforms and services with a pitch that tries to describe what sets $THING apart from others, but who, when it comes to pricing, settle for looking around at what everyone else is doing and copying it, which usually results in a free tier and paid plans starting at $5–9+ per month. Chances are, what you're selling probably isn't as unique as you think it is (this post mentions Netlify several times, for example, but Netlify doesn't only do static hosting: you can run microservices with Netlify too, and Netlify's paid plan is cheaper).
More businesses could and should differentiate themselves in their pricing model, because the door is wide open and pretty much nobody else is showing interest in taking advantage of the opportunity.
> Is that reasonable? Building anything on top of obvious loss leaders is risky. And the costs here for a year of service are either $0 on the free plan or $420 for the cheapest paid plan. There's a chasm of difference there.
The dev tier is free; it's a capped environment with much laxer SLAs than the production tier. The prod tier is paid.
What's the takeaway I'm supposed to leave with after having read this reply? What information does this comment add? I'm genuinely confused about the subtext here.
Thanks for the feedback. And yes, we're learning and iterating on pricing. I think the key will be a fair-usage policy and resource limits that allow people to scale with their usage. We'll also work on additional pricing tiers for larger teams, which I think will be more appropriate. There's a lot we can't do on $35/user/month, but it's a start!
This looks awesome, great job! One thing that will slow me down from using this is that I've not settled on an identity or access management system. Being a small company, we occasionally need to grant system access to contractors or other dev teams. The problem is we don't want to grant access too broadly, and specifying fine-grained controls takes a lot of time.
Armon mentions Okta and Ping, does anyone have any recommendations in this space that would work for managing a small team with occasional on/off boarding of contractors?
I'm currently looking into Nano (formerly Raiblocks). Coincidentally they've also just released their updated roadmap at the same time. https://developers.nano.org/roadmap
At the moment it appears a lot of cryptos are jumping on the Lightning solution; I'm curious to see whether Nano can compete with them by avoiding Lightning altogether.
Great job, I'll be trying this out during my next sprint. So far Soundrown is my favourite.
Both seem very similar in functionality, but I prefer Soundrown's visual style.
http://soundrown.com/