orph's comments | Hacker News

Why not apply changes to the underlying model so that you crush every available eval?


SOTA results are a happy byproduct of our core mission: enabling simple, effective translation of policy documents into a model without fine-tuning or prompt engineering. This performance is somewhat unexpected but makes sense in hindsight, and we're still figuring out the best way to harness it. That may include releasing model artifacts in the future.


awesome


Can I download all the messages & attachments?


Sure, but there are a few petabytes of attachments and over 63 billion messages. Feel free to use the API.


Everyone is a decent programmer now who can solve nearly any problem with help from an LLM.


Everyone? Most people are incapable of expressing a problem in reasonably clear terms. They often don't even know the right questions to ask.

LLMs are pretty good at giving you what you ask for. Not so good at telling you that you're asking for the wrong thing.


> LLMs are pretty good at giving you what you ask for. Not so good at telling you that you're asking for the wrong thing.

So they're comparable to rubber ducks. I would like to see data from a comparative study with rubber ducks, LLMs, and a control group.


Here is a problem I've been noodling with. If you are a decent programmer, how does your LLM help you solve this problem?

Given a cheminformatics fingerprint definition based on SMARTS substructure patterns, come up with a screening filter, likely using a decision tree, which uses intermediate feature tests to prune search space faster than simply testing each pattern one-by-one.

For example, the Klekota-Roth patterns defined in their supplemental data (and also available from CDK at https://github.com/cdk/cdk/blob/main/descriptor/fingerprint/...) contain patterns like:

    "CC(=NNC=O)C",
    "CC(=NNC=O)C(=O)O",
    "CC(=NNC=O)C=C",
    "CC(=NNC=O)C=Cc1ccccc1",
Clearly if 'CC(=NNC=O)C' does not exist in the molecule to fingerprint then there is no reason to test for the subsequent three patterns.
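As a toy illustration of that pruning, one cheap parent test can gate all three children. Plain string containment stands in for a real SMARTS substructure match here (e.g. something like RDKit's HasSubstructMatch); only the pattern strings come from the list above:

```python
# Sketch: test the shared parent pattern once and skip its children when
# the parent is absent. String containment is a stand-in for a real
# SMARTS substructure match.

PARENT = "CC(=NNC=O)C"
CHILDREN = ["CC(=NNC=O)C(=O)O", "CC(=NNC=O)C=C", "CC(=NNC=O)C=Cc1ccccc1"]

def matches(mol, pattern):
    return pattern in mol  # stand-in for a substructure test

def fingerprint_bits(mol):
    bits = []
    if matches(mol, PARENT):  # one cheap test gates four patterns
        bits.append(PARENT)
        bits.extend(p for p in CHILDREN if matches(mol, p))
    return bits
```

When the parent is absent, four pattern tests collapse into one.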

Similarly, there are patterns like:

    "FC(F)(C=O)C1(F)OC(F)(F)C(F)(F)C1(F)F",
    "FC(F)(F)C(F)(F)C(F)(F)OC(F)(C=O)C(F)(F)F",
    "FC(F)(F)C(F)(F)C(F)(F)S",
which could be improved by an element count test: count the number of fluorines in the molecule to fingerprint, and only run the pattern match if there are enough.

So one stage might be to construct a list of element counts;

    # tally atoms by atomic number and record which elements appear
    ele_counts = [0] * 200
    seen = set()
    for atom in mol.GetAtoms():
        ele_counts[eleno := atom.GetAtomicNum()] += 1
        seen.add(eleno)
then have a lookup table for each element, based on the patterns which have at least that count of the given element type;

    ele_patterns = [
        # (max known count, list of sets of possibly-matching patterns)
        (0, [set()]),  # element 0
        (0, [set()]),  # hydrogen
        ..
        (20, [{all patterns which require no carbon},
              {all patterns which require at most 1 carbon}, ...
              {all patterns which require at most 19 carbons}]),
        (10, [{all patterns which require no fluorine}, ...
              {all patterns which require at most 9 fluorines}]),
        ...]
so one reduction can be

    def get_possible_patterns(seen, ele_counts):
        for eleno in seen:
            max_count, match_list = ele_patterns[eleno]
            count = min(ele_counts[eleno], max_count)
            yield match_list[count]

    patterns = set.intersection(*get_possible_patterns(seen, ele_counts))
and only test that subset of patterns.
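Putting the pieces together, here's a runnable toy version of that prefilter; the pattern names ("anyC", "needs2F") and their per-element requirements are invented for illustration:

```python
# Toy element-count prefilter. Entry i of each match_list holds the
# patterns requiring at most i atoms of that element, so indexing by the
# molecule's (clamped) count yields the still-possible patterns.

def get_possible_patterns(seen, ele_counts, ele_patterns):
    for eleno in seen:
        max_count, match_list = ele_patterns[eleno]
        count = min(ele_counts[eleno], max_count)
        yield match_list[count]

# Invented data: "needs2F" requires one carbon and two fluorines;
# "anyC" requires one carbon and no fluorine.
ele_patterns = {
    6: (1, [set(), {"anyC", "needs2F"}]),               # 0 C, >=1 C
    9: (2, [{"anyC"}, {"anyC"}, {"anyC", "needs2F"}]),  # 0 F, 1 F, >=2 F
}
ele_counts = {6: 3, 9: 1}  # molecule has three carbons, one fluorine
seen = {6, 9}
candidates = set.intersection(
    *get_possible_patterns(seen, ele_counts, ele_patterns))
```

With only one fluorine in the molecule, "needs2F" is pruned before any substructure matching happens, leaving {"anyC"} as the only candidate.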

However, this is not sophisticated enough to identify other tests, like the "CC(=NNC=O)C" example I gave before, or "S(=O)(=O)", which might be good tests at a higher level than individual element counts.

And clearly if there isn't a sulphur, or there aren't two oxygens, or there aren't two double bonds, then there's no need to test "S(=O)(=O)", suggesting a tree structure would be useful.
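A minimal sketch of such a gate for "S(=O)(=O)", assuming the cheap features (element and double-bond counts) have already been computed; the feature names here are invented:

```python
# Cheap precomputed counts gate the expensive substructure match:
# "S(=O)(=O)" needs one sulphur, two oxygens, and two double bonds.

def worth_testing_sulfonyl(features):
    return (features.get("S", 0) >= 1
            and features.get("O", 0) >= 2
            and features.get("double_bonds", 0) >= 2)

# A tree of such gates would share work: every pattern under a
# "has sulphur" branch is skipped in one step when S is absent.
```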


It made sense when I named GitHub Copilot, since that product was a passive addition to your regular workflow.

The name was sticky enough that they've run with it, misunderstanding or ignoring that fundamental metaphor.


In my perception, GitHub Copilot is the OG: the first one, and the only product in that list that is actually making my life better.


This is what happens when engineers borrow the concepts previously created by dedicated teams. Happens a lot with UX/UI, too.


I can almost guarantee you that this was a sales exec decision rather than an engineering decision.


New business models come around very rarely. In the meantime, existing companies relentlessly optimize everything they can - that's why people own their stock.


Hono still doesn’t support request cancellation well.

So if you’re streaming tokens from an LLM and the client cancels the request, you’ll be wasting money.


Lilac for dataset cleaning and polishing https://www.lilacml.com/



Minion AI | Fullstack Eng, ML Eng, Tools Eng | SF or Remote | Full-time

Creator of GitHub Copilot here <wave>.

Minion AI is on a mission to build a useful web agent that performs tasks for you. Our personal AI is at the forefront of rethinking human-computer interaction in light of AI advancements.

Join us to work at the bleeding edge of AI: prompting, fine-tuning, synthetic data, learning by example, codegen, planning, reasoning, and memory for embodied agents.

See Minion in action here: https://twitter.com/ai_minion/status/1719455863973323111

Apply at https://minion.ai/jobs or DM @alexgraveley on X

