Hacker News | w01fe's comments

This is incorrect. While producing each token, the model produces activations at every layer, and these are made available to future token-production steps via the attention mechanism. The depth of computation that can use this latent information without passing through output tokens is bounded by the depth of the network, but there is ample evidence that models can do limited "planning" and related computation purely in this latent space.

"Attention" is just a matmul: softmax(QKᵀ/√d)·V, etc.
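For concreteness, here's a minimal NumPy sketch of single-head scaled dot-product attention, showing how a new token's query mixes value vectors from earlier positions (shapes and names are illustrative, not from any particular model):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q Kᵀ / sqrt(d)) V."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)       # (n_q, n_k) similarity scores
    weights = softmax(scores, axis=-1)  # each row sums to 1
    return weights @ V                  # weighted mix of value vectors

rng = np.random.default_rng(0)
Q = rng.normal(size=(1, 8))   # query for the current token
K = rng.normal(size=(5, 8))   # cached keys from 5 earlier positions
V = rng.normal(size=(5, 8))   # cached values from those positions
out = attention(Q, K, V)
print(out.shape)  # (1, 8)
```

The point the parent comment is making is that K and V here come from earlier layers/positions, so latent information can flow forward without ever being serialized into an output token.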

I don't see how any planning is done in latent space. Can you point me to any papers? Thanks.

Edit: Oh, I see you're probably talking about CoCoNuT? Do all frontier models use it nowadays?


There's a lot of research on this topic. https://arxiv.org/abs/2303.08112 and https://arxiv.org/abs/2311.04897 are just two examples that come to mind.

Thank you! Heading down this rabbit hole....

I think this ignores the fact that an agent can be meaningfully embodied on the internet, using APIs as sensors and actuators. OpenAI's training of large language models with reinforcement learning, recent retrieval-augmented models, and "chain of thought" reasoning are all meaningful steps in this direction, in my opinion.


Microsoft (Semantic Machines) | Senior Software Engineer | Berkeley, CA or Boston, MA or Redmond, WA or fully REMOTE | Full Time

The Semantic Machines group is bringing next-generation natural language processing (NLP) technologies to products used by hundreds of millions of people worldwide. You can learn more about how Microsoft is using this technology to create entirely new kinds of user experiences here: http://semanticmachines.com/.

At the core of our platform is a new programming language, designed to support programs structured like human commands which are predicted by machine learning models. This language includes ideas from functional and logic/constraint programming, as well as novel features for meta-computation and introspection. We’re looking for engineers to work alongside product and research teams to help guide the evolution of this platform, including improvements to the core programming language, runtime, constraint system, and tooling.

The ideal candidate should be passionate about designing and evolving programming languages and/or practical formal reasoning systems, supporting users with high quality tools, and working on a rapidly evolving product-driven platform. No experience with machine learning or natural language processing is required – we’d love to work with people who are excited about the promise of these technologies and the opportunity to make them more accessible, regardless of their previous exposure to them.

Learn more: https://www.microsoft.com/en-us/research/group/msai/articles...

Apply: https://careers.microsoft.com/us/en/job/1215447/Senior-Softw...


We put a lot of work into this post to explain some of the deeper problems we are trying to address in our conversational AI framework:

https://www.microsoft.com/en-us/research/group/msai/articles...


At Semantic Machines [0] we rely heavily on probabilistic programming to build state-of-the-art dialogue systems. In particular, we use a library called PNP (probabilistic neural programming) on top of Dynet to allow us to express structured prediction problems in a simple and elegant form. If there are questions I am happy to elaborate to the extent I can. (Also, we are hiring! My email is jwolfe@.)

[0] http://www.semanticmachines.com/


Configurability was a major design goal. Check out the test for examples of how to insert your own leaf generators or wrappers:

https://github.com/Prismatic/schema/blob/master/test/clj/sch...

Does that seem like it will meet your needs?


Author here, would love to hear feedback, and more than happy to answer questions


Author here, would love feedback on this, and also happy to answer any questions.


I saved the document for later use, but it got me thinking about how ORMs like Hibernate deal with migration and future-proofing. I wrote code libraries for a small company where part of the design goal was to let the data models work against the new database schema while still representing the data as if it came from previous versions. So I ended up writing generators that create mappings between the DAOs and the models exposed to the APIs, such that when the database changes while a previous API version still has to be supported, I only have to edit/tweak the mappers for that previous version of the API.

The library has been open-sourced here: https://github.com/ivanceras/orm
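The versioned-mapper idea described above can be sketched roughly like this (a hypothetical illustration, not the library's actual API; all names are invented): the current database row is mapped back into the shape an older API version expects, so only the per-version mapper changes when the schema evolves.

```python
from dataclasses import dataclass

# Hypothetical current schema: a single "name" column was split
# into first/last in a later migration.
@dataclass
class UserRow:
    """Shape of the row in today's database."""
    first_name: str
    last_name: str
    email: str

def to_api_v2(row: UserRow) -> dict:
    # v2 clients see the current schema directly.
    return {"first_name": row.first_name,
            "last_name": row.last_name,
            "email": row.email}

def to_api_v1(row: UserRow) -> dict:
    # v1 clients still expect a single "name" field; only this
    # mapper had to change when the column was split.
    return {"name": f"{row.first_name} {row.last_name}",
            "email": row.email}

row = UserRow("Ada", "Lovelace", "ada@example.com")
print(to_api_v1(row))  # {'name': 'Ada Lovelace', 'email': 'ada@example.com'}
```

Generating these mappers from a schema description, as the comment describes, keeps the per-version maintenance cost down to editing one mapper per breaking change.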


Yep, that sounds about right. For us, API communication and some global events flow through channels into the app state, from which they're distributed to components. For routing we've built up a library around secretary, where we write down a map of URIs, handlers, and metadata (can be a modal, etc.). Info about the current page and such goes into the app state, and that drives the transitions between page components.
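The route-map-plus-metadata pattern being described might look something like this (a Python sketch of the pattern only; the actual code is ClojureScript on top of secretary, and all names here are invented):

```python
# Hypothetical route table: URI pattern -> handler name + metadata.
ROUTES = {
    "/feed/:id":  {"handler": "show-feed", "modal": False},
    "/settings":  {"handler": "show-settings", "modal": True},
}

def match(uri: str):
    """Return (route-entry, path-params) for a URI, or (None, {})."""
    for pattern, meta in ROUTES.items():
        p_parts, u_parts = pattern.split("/"), uri.split("/")
        if len(p_parts) != len(u_parts):
            continue
        params = {}
        for p, u in zip(p_parts, u_parts):
            if p.startswith(":"):
                params[p[1:]] = u   # bind path parameter, e.g. :id
            elif p != u:
                break               # literal segment mismatch
        else:
            return meta, params
    return None, {}

print(match("/feed/42"))  # ({'handler': 'show-feed', 'modal': False}, {'id': '42'})
```

On navigation, the matched entry and params would be written into the app state, and components would re-render from there, which is the flow the comment describes.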


No, you are still free to organize your application state however you wish. For example, our app dynamically pulls in content as you browse from feed to feed and uses infinite scroll. It's up to you to decide what to fetch when, and what to cache/discard as the user navigates around.

