> but in this case the works in question were released under a license that allowed free duplication and distribution so no harm was caused.
FSF licenses contain attribution and copyleft clauses. It's "do whatever you want with it provided that you X, Y and Z". Just taking the first part without the second part is a breach of the license.
It's like renting a car without paying and then claiming "well you said I can drive around with it for the rest of the day, so where is the harm?" while conveniently ignoring the payment clause.
You may be confusing this with a "public domain" license.
If what you do with a copyrighted work is covered by fair use it doesn't matter what the license says - you can do it anyway. The GFDL imposes restrictions on distribution, not copying, so merely downloading a copy imposes no obligation on you and so isn't a copyright infringement either.
I used to be on the FSF board of directors. I have provided legal testimony regarding copyleft licenses. I am excruciatingly aware of the difference between a copyleft license and the public domain.
> I am excruciatingly aware of the difference between a copyleft license and the public domain.
Then why did you say "no harm was caused"? Clearly the harm of "using our copylefted work to create proprietary software" was caused. Do you just mean economic harm? If so, I think that's where the parent comment's confusion originates.
It sounds that way a bit from the one sentence. But that’s not the case at all.
> 4. MODIFICATIONS
> You may copy and distribute a Modified Version of the Document under the conditions of sections 2 and 3 above, provided that you release the Modified Version under precisely this License, with the Modified Version filling the role of the Document, thus licensing distribution and modification of the Modified Version to whoever possesses a copy of it. In addition, you must do these things in the Modified Version:
Etc etc.
In short, it is a copyleft license. You must also license derivative works under this license.
Just fyi, the gnu fdl is (unsurprisingly) available for free online - so if you want to know what it says, you can read it!
For this to stand up in court you'd need to show that an LLM is distributing "a modified version of the document".
If I took a book and cut it up into individual words (or partial words even), and then used some of the words with words from every other book to write a new book, it'd be hard to argue that I'm really "distributing the first book", even if the subject of my book is the same as the first one.
This really just highlights how the law is a long way behind what's achievable with modern computing power.
Presumably, a suitable prompt could get the LLM to produce whole sections of the book, which would demonstrate that the LLM contains a modified version.
And the judgement said that the training was fair use, but that the duplication might be an infringement. The GFDL doesn't restrict duplication, only distribution, so if training on GFDLed material is fair use and not the creation of a derivative work then there's no damage.
If that worked reliably then you could apply it to human-produced code too. But nothing like that has been shown to work, so I wouldn't put money on it working for LLM output.
Except one is an employee and the other one is an ex-employee. The bias this introduces is not just a minor nuance, it's what fuels the public conflict and causes everybody else to double-check their popcorn reserves.
Of course technical discussions happen all the time at companies between competent people. But you don't do that in public, nor is this a technical debate: "I don't recall talking to you about it" - "I do, I did xyz then you ignored me" - "<changes subject>"
Important distinction, yes. It also means I can't go back and check the thread on what was said and when. Nor do I want to.
Always good to talk face to face if you have strong feelings about something. When I said "talk" I meant literally face to face.
Spend a decade or so on lkml and you develop a thick skin. But mix that with the corporate environment (Facebook, 2011), and being an ex-employee adds even more to the drama.
Having read through the comments here, I'm still of the opinion that any HW changes had a secondary effect and the primary contributor was a change in how HHVM/jemalloc interacted with MADV.
One more suggestion: evaluate more than one app and company-wide profiling data when making such decisions.
One of the challenges in doing so is the large contingent of people who don't have an understanding of CPU uarch/counters and yet hold a negative opinion of their usefulness for making decisions like this.
So the only tool you're left with is running large-scale, rack-level tests in a close-to-prod env, which has its own set of problems and benefits.
Perf counters are only indicative of certain performance characteristics at the uarch level. Improving one or more of those aspects does not necessarily correlate with actual measurable performance gains in E2E workloads deployed on a system.
That said, one of the comments above suggests that the HW change was a switch to Ivy Bridge, when zeroing memory became cheaper, which is a bit unexpected (to me). So you might be more right when you say that the improvement was the result of memory allocation patterns and jemalloc.
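For anyone who hasn't chased this particular interaction: jemalloc hands freed pages back to the kernel via madvise(2), and the flavor of advice decides when the zeroing cost gets paid. A minimal sketch of the kernel semantics - not HHVM's actual code, just Python 3.8+ on Linux, where mmap exposes madvise:

    import mmap

    size = 16 * 1024 * 1024
    buf = mmap.mmap(-1, size)  # anonymous mapping, like an allocator arena
    buf[:] = b"x" * size       # dirty every page

    # MADV_DONTNEED: pages are dropped immediately; the next touch
    # faults in a fresh zero-filled page, so zeroing is paid on reuse.
    buf.madvise(mmap.MADV_DONTNEED)
    assert buf[0] == 0         # the kernel zeroed it for us

    # MADV_FREE (Linux 4.5+): reclaim is lazy, so pages reused before
    # memory pressure hits can skip the fault-and-zero cycle entirely.
    if hasattr(mmap, "MADV_FREE"):
        buf[:] = b"x" * size
        buf.madvise(mmap.MADV_FREE)

Whether the zeroing lands in the allocator, on a page fault, or nowhere at all is exactly the kind of interaction that moves E2E numbers without any single perf counter pointing at it.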
Anybody else annoyed by all this focus on em-dash use to detect AI? In no time, the bad guys will tell their BS machines to avoid em-dashes and "it's not X it's Y" and whatever else people use as "tell-tale signs", and eventually the training data will have picked up on that too. And people who genuinely use em-dashes for taste reasons, or otherwise use expressions considered typical for AI, are getting a bad rep.
This is all just demonstrating the helplessness that's coming to our society w.r.t. dealing with gen AI output. Looking for em-dashes is not the solution and distracts from actually having to deal with the problem. (Which is not a technical but a social one. You can't solve it with tech.)
> Anybody else annoyed by all this focus on em-dash use to detect AI?
Yes, the “AI detectives” can be quite annoying, as the comments are always the same. No substance, just “has X, it’s AI”. The em-dash detectives tend to be the worst, because they often refuse to understand that em-dashes are actually trivial to type (depending on OS) and that people have been using them on purpose since before LLMs.
Mind you, using em-dashes as one signal to reinforce what you already perceive as LLM writing is valid, it’s only annoying when it’s used as the sole or principal indicator.
This is turning out to be a huge issue for me, as my frequent use of em-dashes makes my remarks trigger people, effectively disrupting attempts to communicate. Maybe my communication needs to change, or maybe these objections are yet another red flag to watch for.
I keep reading about students learning to intentionally write worse so that their work doesn't get flagged as AI-generated. I think it's a systemic problem that won't be solved in the short term, unfortunately.
It's hilarious that em dashes and "it's not X; it's Y" and other trivial things are the best way for humans to spot AI now. Like if AI robots infiltrated us, at first we'd be like "ooh, he has long ears, he's a robot". And after a while the robots would learn to keep their ears shorter. Then what? When we're out of tell-tale signs?
I can imagine that this will be similar to the "Emacs/Vim in the AI age" article - it will just be considered to matter less in the AI age. Why spend 3-5 years of your life with a sometimes frustrating experience to obtain this PhD degree if you have powerful models at your disposal that will just be able to solve everything for you? (Similar to why learn Elisp/VimScript/...) Especially considering the current trajectory, expecting where things will be in 5 or 15 years. It will just feel less and less appealing to get an in-depth education, especially a formal one.
Which is quite ironic, considering who wrote the article.
LLMs fall victim to "garbage in, garbage out." Claude can solve open problems if you know what you're doing, but it can also incorrectly convince you it's right if you don't know what you're doing.
A PhD teaches you how to think, how to learn, and how to question the world. That's a vital set of skills no matter what tool exists.
I don’t really know how to optimize for a world where AIs would be smarter than everyone and able to do everything.
If that comes to pass, I guess there won’t be any economic cost to having done my PhD because the entire economy will be AI driven and we’ll hopefully just be their happy pets.
If that doesn’t come to pass, and AIs just remain good at summarizing and remixing ideas, I guess people with experience generating research will still be useful.
Because you may have fun working in a scientific environment and doing research.
I liked my job at the university - independent of the final PhD. I enjoyed what I was doing. Most of the time I also enjoyed writing my dissertation, since I was given the opportunity to write about my stuff. And mostly I could write it the way I felt things are supposed to be explained.
Why spend your life doing anything at all? I'm biased on the topic since I'm writing up at the moment, but it was, if nothing else, a very interesting way to spend 4 years of my life.
I find it very fulfilling to do a PhD and did so myself. More people should. What I mean is that I'm expecting the general view on it to evolve as described.
Ah. I did indeed misunderstand. Also, as I said, I've got a personal stake, right at the tail end of the PhD, looking for jobs, so I guess I'm feeling pretty defensive. I certainly hope the general public doesn't feel this way, but I've seen plenty of people say similar things about college degrees now, so it kind of makes sense.
C dev wasn't an issue back in the 1 GB or 256 MB or 16 MB days either. You just didn't use to have a Chrome tab open eating 345 MB by itself just to show a simple tutorial page.
C dev wasn't a problem with MSDOS and 640K either. With CP/M and 64K it was a challenge I think. Struggling to remember the details on that and too lazy to research it right now.
Autism is a quite strong diagnosis at the end of a spectrum. Not every tech-loving introvert is autistic. That's the kind of arrogant attitude that marginalizes nerds and on a forum like HN people really ought to know better.
As somebody who identified a lot with what's in that article, I can say that I haven't just made peace with having been "different": I love it and wouldn't want it any other way, comparing my life today with that of the arrogant non-nerds who made fun of us back in school.
The colloquial use of the word “autism” carries with it a specific connotation and mind image. That primarily negative stereotype is being reinforced by the joke by way of it being delivered as a medical diagnosis (“the results are back”).
Your parent comment is arguing against perpetuating the wrong negative connotations and lack of understanding of autism.
Not to say the original author was doing it maliciously; I don't think they were.
And it's time that colloquial term is put to rest. We've left other terms behind us which used to be thrown around mindlessly but were (are) actually hurtful; we will manage with that term too.
If we take the human brain as an example, it's pretty bad at computation. Multiplying two 10-digit numbers takes forever, despite the enormous size of its neural network. It's not the right tool for the job - a few deterministic logic gates could do it much more efficiently. That same circuit can't do much else, but multiplying, oh boy, it's good at that! Why do we think that artificial neural nets would be the right tool for that job? What's wrong with letting the LLM reach out to an ALU to do the calculation, just like a human would? It's surely going to be quicker and require less energy.
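To make the "reach out to an ALU" idea concrete, here's a hypothetical tool-call loop - all names made up for illustration. The model emits a structured request instead of predicting digits, and plain deterministic code does the multiply:

    # Hypothetical: asked "what is 1234567890 * 9876543210?", the model
    # emits {"tool": "calc", "args": [1234567890, 9876543210]} rather
    # than guessing the answer token by token.

    def calc(a: int, b: int) -> int:
        # Exact integer multiply: a handful of ALU cycles, not a
        # trillion-parameter forward pass.
        return a * b

    TOOLS = {"calc": calc}

    def run_tool_call(request: dict) -> int:
        return TOOLS[request["tool"]](*request["args"])

    print(run_tool_call({"tool": "calc", "args": [1234567890, 9876543210]}))
    # -> 12193263111263526900

The model only has to learn to emit the request, which is a language task it's already good at; the arithmetic itself is exact and nearly free.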
The embedded programs can be connected to the other weights during training, in whatever way the training process finds useful. It doesn't just have to be arithmetic calculation. You can put any hard-coded algorithm in there, make the weights for that algorithm static, and let the training process figure out how to connect the other trillion weights to it.
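A rough sketch of that idea - in PyTorch purely as my assumption, since no framework was named: a parameter-free, hard-coded module sits inside the graph, gradients flow through it, and training only adjusts the layers that feed it and read from it.

    import torch
    import torch.nn as nn

    class HardMultiply(nn.Module):
        # The embedded "program": fixed arithmetic with no learnable
        # weights, so training can't change what it computes, only how
        # the rest of the network routes values into and out of it.
        def forward(self, a, b):
            return a * b

    class Net(nn.Module):
        def __init__(self, dim=32):
            super().__init__()
            self.to_a = nn.Linear(dim, 1)  # trainable: learns what to feed in
            self.to_b = nn.Linear(dim, 1)
            self.alu = HardMultiply()      # static embedded algorithm
            self.out = nn.Linear(1, dim)   # trainable: learns to use the result

        def forward(self, x):
            return self.out(self.alu(self.to_a(x), self.to_b(x)))

    net = Net()
    y = net(torch.randn(4, 32))
    y.sum().backward()  # gradients reach to_a/to_b/out; alu has nothing to update

Multiplication happens to be differentiable, which makes this easy; an arbitrary hard-coded algorithm would need some gradient story (straight-through estimators or similar) before training could route signal through it.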
If we never try, we'll never know. I wouldn't be surprised if there is something to gain from a form of deterministic computation which is still integrated with the NN architecture. After all, tool calls have their own non-trivial overhead.
Funny that they speak so negatively about "fast fashion". If anything, I would expect on-demand clothes production to contribute to an _increase_ in that phenomenon, rather than the opposite.
Not at these prices :-) $150 - $200 for a sweater is not cheap. I think of fast fashion in terms of "how many times do I have to wear it to get my money's worth?" If the answer is less than the number of times I'd wear it in a year, it's fast fashion. Of course, if you're a thrift shop shopper, most fashion is fast fashion.
> But I often find that with jobs I want to give to other people, so maybe I over specify?
The difference is that with other people, you are training somebody else in your team who will eventually internalize what you taught them and then be able to carry the philosophy forward. Even if it took exactly the same amount of time for you to explain (+ code review etc), it's a clear net benefit in the long run. Not so with an LLM. There it's just lost time.