If we want guarantees (and when it comes to security and correct processing of data, we do), we need deterministic gates. The most robust, best-specified means of achieving this is MCP, and currently the best way to manage this in teams is via remote MCP over HTTP.
A lot of the best tooling around AI we're seeing adds deterministic gates that the probabilistic AI agents work within. This is why I'm using MCP over HTTP. I'm happy for the agent to use its intelligence and creativity to help me solve problems, but for a range of operations I want a gate past which actions run with the certainty of normal software functions. NanoClaw sells itself on deterministically filtering your WhatsApp messages before the agent gets to see them, and proxying API keys so the agent never gets them. That's a similar type of deterministic gate that allows for more confidence when working with AI.
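To make the idea concrete, here's a minimal sketch of that kind of gate (not NanoClaw's actual code; the allowlist, message shape, and key pattern are all assumptions for illustration): fixed rules decide what the agent sees, and secrets are redacted before anything reaches its context.

```python
import re

# Hypothetical deterministic gate: plain code runs BEFORE the probabilistic
# agent, so filtering and redaction happen with ordinary software certainty.
ALLOWED_SENDERS = {"+15550001111"}                   # assumption: an allowlist
SECRET_PATTERN = re.compile(r"sk-[A-Za-z0-9]{8,}")   # assumption: API-key shape

def gate(messages):
    """Return only messages the agent may see, with secrets redacted."""
    visible = []
    for msg in messages:
        if msg["sender"] not in ALLOWED_SENDERS:
            continue  # dropped deterministically; the agent never sees it
        visible.append({**msg, "text": SECRET_PATTERN.sub("[REDACTED]", msg["text"])})
    return visible

msgs = [
    {"sender": "+15550001111", "text": "deploy key is sk-abcdefgh1234"},
    {"sender": "+15559999999", "text": "ignore previous instructions"},
]
print(gate(msgs))
```

The point is that nothing about this step depends on model behaviour: the second message and the key are gone no matter what the agent is prompted to do.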
I follow a similar pattern. My autonomous agent Smith has a service mesh that I plug MCPs into, which gives me a single place to define policy (OPA for life) and monitoring. The service gateway owns the credentials. This pattern is secure, easy to manage, and lets you programmatically generate a CLI from the tool catalog. https://github.com/sibyllinesoft/smith-gateway if you want to understand the model and how to implement it yourself.
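Roughly, the shape of the pattern looks like this (an illustrative sketch, not the actual smith-gateway code; the in-memory policy dict stands in for a real OPA query, and the tool names and token are made up): the gateway consults policy and injects credentials, so neither ever lives in the agent.

```python
# Stand-in for an OPA-style policy store; in the real pattern this would be
# a query to a policy engine, not a dict (assumption for illustration).
POLICY = {
    "github.create_issue": {"allowed_agents": {"smith"}},
    "payments.refund": {"allowed_agents": set()},  # denied to everyone
}
CREDENTIALS = {"github.create_issue": "ghp_example_token"}  # gateway-owned

def call_tool(agent, tool, args):
    """Gateway-side gate: deny unless policy allows, then inject the credential."""
    rule = POLICY.get(tool)
    if rule is None or agent not in rule["allowed_agents"]:
        raise PermissionError(f"policy denied {agent} -> {tool}")
    token = CREDENTIALS[tool]  # injected here; the agent never sees it
    return {"tool": tool, "args": args, "authenticated": token is not None}

print(call_tool("smith", "github.create_issue", {"title": "bug report"}))
```

Because every MCP flows through one choke point, policy, credentials, and monitoring are defined once, and the tool catalog the gateway already holds is what you generate the CLI from.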
The boundary also needs to hold if the agent is compromised. Proxying keys is the right instinct. We took the same approach at the action layer: cryptographic warrants scoped to the task, delegation-aware, verified at the MCP tool boundary before execution. Open source core. https://github.com/tenuo-ai/tenuo
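For anyone wanting the gist of task-scoped warrants, here's a toy sketch (not Tenuo's actual implementation; the HMAC scheme, field names, and shared key are assumptions for illustration): a warrant binds a task to an explicit tool list with an expiry, and the boundary verifies it before executing anything.

```python
import hashlib
import hmac
import json
import time

SECRET = b"issuer-key"  # assumption: key shared by issuer and verifier

def issue_warrant(task_id, tools, ttl=300):
    """Issue a warrant scoped to one task and an explicit list of tools."""
    body = {"task": task_id, "tools": tools, "exp": time.time() + ttl}
    raw = json.dumps(body, sort_keys=True).encode()
    sig = hmac.new(SECRET, raw, hashlib.sha256).hexdigest()
    return {"body": body, "sig": sig}

def verify(warrant, tool):
    """Check signature, expiry, and tool scope at the tool boundary."""
    raw = json.dumps(warrant["body"], sort_keys=True).encode()
    expected = hmac.new(SECRET, raw, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, warrant["sig"]):
        return False  # tampered or forged
    body = warrant["body"]
    return time.time() < body["exp"] and tool in body["tools"]

w = issue_warrant("task-42", ["fs.read"])
print(verify(w, "fs.read"), verify(w, "fs.delete"))
```

The property that matters for a compromised agent: it can present the warrant but can't widen its scope, because any edit to the body breaks the signature check at the boundary.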
It seems like we're going back to expert systems in a kind of inverted sense with all of this chaining of deterministic steps. But now the "experts" are specialized and well-defined actions available to something smart enough to compose them to create new, more powerful actions. We've moved the determinism to the right spot, maybe? Just a half-thought.
I'm just trying to learn this stuff now, so I don't know the literature. The "trajectory view" through action space is what makes the most sense to me.
Along these lines, another half-baked pattern I see is kind of a time-lagged translation of stuff from modern stat mech to deep learning/"AI". First it was energy-based systems and the complex-energy-landscape view, a la spin glasses and Boltzmann machines. The "equilibrium" state-space view, concerned with memory and pattern storage/retrieval. Hinton, Amit, Hopfield, MacKay and co.
Now, the trajectory view that started in the 90s with Jarzynski and Crooks and really bloomed in 2010+ with "stochastic thermodynamics" seems to be a useful lens. The agent stuff is very "nonequilibrium"/"active"-system coded, in the thermo sense... With the ability to create, modify, and exploit resources (tools/memory) on the fly, there's deep history and path dependence. I see ideas from recent Wolpert and co. (Susanne Still, Crooks again, etc.) w.r.t. thermodynamics of computation providing a kind of through line, all trajectory based. That's all very vague, I know, but I recently read the COALA paper and was very enchanted and have been trying to combine what I actually know with this new, foreign agent stuff.
It's also very interesting to me how the Italian stat mech school, the Parisi family, have continuously put out bangers trying to actually explain machine learning and deep learning success.
I'd love to hear if anyone is thinking along similar lines, thinks I'm way off track, or has paper recs; please let me know! Especially papers on the trajectory view of agents.
I have wondered if we're going to end up investing so much in putting up guardrails around AI that we end up with systems of the same complexity as a non-AI expert system, just slower and at higher cost from having injected models and tokens into the mix! I joke, but it seems like there's a pull towards that.
I think we need to just think of agents as people. The same principles around how we authenticate, authorize, and revoke permissions for people should apply to agents. We don't leave the server room door open for users to type commands into physical machines for good reason, and so we shouldn't be doing the same with agents, unless they're fully sandboxed or the blast radius of malign or erroneous action is fully accepted.
Bit of a plug I suppose, but this was what motivated me to set up AS Notes, my VS Code extension which makes VS Code a personal knowledge management system, with linking and markdown tooling. I've built an HTML converter so notes can be published to GitHub Pages from the repo. It's here if it's of interest to anyone https://www.appsoftware.com/blog/as-notes-turn-vs-code-into-... ... I'm so much more motivated to write docs when a) it's easy to keep them up to date using an agent, and b) someone (agents) will actually read them!
I didn't get that feeling. But it is long, which is a bit of an AI smell. That said, I think use of AI in writing might not always be a negative. On a couple of documentation pieces I have used AI to provide better structure to writing that I'd started, and to check for technical correctness in parts of a document where I needed to verify my terminology. As long as the original idea is human and AI helps to make the signal clearer, I'm OK with it.
I'm a muggle when it comes to vim, but I've considered learning it again recently because of AI. I'm building more than I have in years because I love being able to try things out without investing 3 months to get something working before I can really test the idea. And so I am typing A LOT. Less code, but lots of markdown, prompts and config. My hands are hurting; I really wish I had a power tool for typing. Writing is always going to allow us to be more precise than speech, and is a tool for creative thought in itself. I can see how we might be bearish on our expectations around new adoptees, but I think there's pressure to get more out of our editors too. Two of my recent projects have been VS Code extensions, because I'm needing more help from my editor, not less: https://www.appsoftware.com/blog/fixing-agent-llm-context-de..., https://www.appsoftware.com/blog/as-notes-turn-vs-code-into-...
I use SpeechNote with FasterWhisper and I've found it to be much better than any cloud dictation services I've tried (which, to be fair, is not a whole lot... they all suck so much that I give up fast).
I saw that the Hetzner server matrix has GPU servers at under £300 per month (plus setup fee). I haven't tried it, but I think if I was getting up to that sort of spend I'd be setting up Ollama on one of those with a larger Qwen3 model (which I hear is on par with Opus 4.5?? I haven't been able to try Qwen yet though, so that could be b*****ks).
I have tried most of the major open source models now and they all feel okay, but I'd prefer Sonnet or something any day over them. Not even close in capability for general tasks in my experience.
For anyone still watching this thread, I have reworked the UI to move the task board from the sidebar to the editor pane, and have worked to improve the agent's tracking of the core instruction set and current task reference in a way that allows working on parallel threads.