When OpenAI trained their Dota model to beat people at 1v1 mid Shadow Fiend, they had a contest at The International (the big yearly Dota tournament, RIP) like "can you beat OpenAI's new bot?" and had a big bucket of prizes for anyone that could.
By the end of the night, the bucket was empty. People learned to cheese the bot by running "out of bounds", so to speak -- normally in a 1v1, you're supposed to stay close to your opponent, since they'll be getting stronger if you leave. But the bot didn't know how to deal with it when you snuck behind it (normally an insane maneuver, almost guaranteed to cost you the game in normal play) and prevented its army (called "creeps") from running to the middle lane. (All you have to do is go wave your hand at them, and they mindlessly chase you.)
People did that over and over, and your own army would eventually overwhelm the bot and win. :)
I am of the surprisingly-controversial opinion that ML’s future is going to be as a human empowerment tool — a bicycle for the mind, yet so much more. Listen to how cool my gamedev music sounds: https://soundcloud.com/theshawwn/sets/ai-generated-videogame...
I feel comfortable saying that it’s awesome and being immodest, because I didn’t write it! None of those notes are mine. “Crossing the Channel” has one of the strangest tempos I’ve heard, yet it sounds so cool because I think the model made a mistake early on, and decided it wasn’t a mistake — instead, it came up with the most likely not-mistake it could think of, and it happened to sound great.
But I “wrote” all ~20 songs on that list in one night. Took about six hours, aka a standard workday. I don’t think it’s production-grade music - far from it - but it also wasn’t a production-grade model. (On the other hand, it was trained by gwern, so I’m skeptical the production models of the future will capture all the little magic that he seems to imbue into his.)
I chose the instruments, I decided what sounded good, but I didn’t write a single note. It felt a lot like listening to someone jam out, and asking them to play different things. I was just the fella who was there to listen.
That’s where ML is going to shine. The future is going to be so exciting with the things you’ll be able to do.
But not in the current direction, I think. Currently, the goal is to factor the human out of the equation entirely. So the answer to your question is: even if the AI has “advanced”, it’s only because they added this case to the training set. If a human was at the helm, they would have annihilated you, because they’d see what you were doing and pilot the AI over to you.
I am skeptical this case is solvable, in a fully automatic way. IMO human empowerment will happen by arming actual humans with these models. (Depressingly, “arming” might have a few meanings depending on how AI war turns out, but that’s a different conversation.)
Interesting songs. "Crossing the Channel" doesn't seem to have a strange tempo so much as the piano chord on the offbeat (and perhaps some swing on it as well, I can't quite tell).
The problem I think AI-generated music has to overcome is that it kind of meanders around without coming to any "conclusions". That said, perhaps this is the perfect music for wandering around a town in an RPG!
This is where the human element comes in though I would think: a guiding hand to say "start ramping up to more of that, okay now you want a lull here..."
The best music I’ve heard from AI by far is AIVA[0], the “AI Composer” with a YouTube channel uploading a couple new songs every week. I’d argue it’s the only generated music I’ve heard that could pass as “good”.
I don’t think it will be long before AI composers are better than all but the most talented humans.
> I am of the surprisingly-controversial opinion that ML’s future is going to be as a human empowerment tool — a bicycle for the mind, yet so much more.
In chess, these players are called centaurs. Per my limited knowledge of the current chess meta, they are considered to be much better than either people or AI in isolation.
Centaur chess did reign for a while. For AlphaZero, which has a strong ability to perform pattern recognition based on experience in a limited domain, humans may not be able to add value to its play.
That said, most board games are a very limited domain. Even more complex computer games prove more challenging than the current AI approach can handle without weaknesses (see comments about Dota on this page).
In complex real-world domains, humans can definitely still add much value relative to computers working by themselves.
I went down this rabbithole a few years ago. At the time, I looked around but couldn't find any more details about centaur chess, or evidence that centaurs could outplay AIs.
Then, last year, I stumbled across the following pair of articles. It's about the correspondence chess world championship, where both players are allowed to use engines, and (crucially) have a very long amount of time to analyse each position in-depth and consider long-range strategic implications of each move. The chap interviewed learned to exploit his opponent's over-reliance on the engine, and played in such a way that he was able to gradually accumulate small but compounding positional advantages that eventually gave him an edge. The whole time he used his own engine to catch tactical weaknesses in potential move sequences. A fascinating read.
There's correspondence play which allows engines (or any resource), and certain players who are consistently better at the game under those conditions.
Some of the other players surely try just running Stockfish overnight and taking whatever it says is the best move, and apparently that strategy isn't equally good.
Note that I think the "centaurs" are kind of playing a pretty different game than regular chess: they might have a knack for knowing when a particular engine is weak or strong and then trusting the corresponding engines lines more, or something like that.
I don't think it's that controversial at this point. Given a growing sentiment that there are limits to where ML/DL can take us -- limits we may already be reaching -- and the general lack of success getting practical results from cognitive science, etc., there seems to be growing interest in augmented AI types of approaches.
Heck yeah, thank you! It's delightful that there's a name for it, because it seems like the first people to become centaurs in their day to day lives will have a massive advantage over those that want the purity of unguided AI.
Centaurism isn't applicable to every situation -- Starcraft's AI kicks the butts of every pro player, because you can scale up and overwhelm your opponent. But there are often already humans in the mix, in these models. The humans are just designing the loss functions or deciding what to model, rather than using the results. So it's a "delayed" mechanical turk in that sense.
Those are some of the nicest Stylegan outputs I've ever made. (Uh, do me a favor and ignore the Zeus one...)
Each of them was crafted. The process was to start with something, and that "something" often didn't need to be anything close. For example, if the photo is an old man, but stylegan's showing a kid, you turn up the "age" slider. Do that for every feature; it felt like the character creation screen in Fallout.
Amazed it isn't an app yet. Artbreeder is nice, but it's not the same -- the key component was to have Peter Baylies' (follow him! https://twitter.com/pbaylies) reverse encoder as a button you can press. Whenever the model gets too far from what you're thinking, you press it, and it morphs the face back closer to the target photo. In the process, it might distort the age slightly, or make the chin a little bigger, but it's an anchor at sea; it's why you can nail your final result, every time.
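The workflow described above -- attribute sliders plus an "anchor" button that pulls the latent back toward the target photo -- can be sketched in plain NumPy. This is a hypothetical illustration, not the actual tooling: `age_direction` stands in for a learned attribute direction in StyleGAN's latent space (here just a random unit vector), and the reverse encoder is reduced to a simple interpolation toward the target's latent code.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 512  # StyleGAN's latent dimensionality

latent = rng.standard_normal(DIM)   # current face's latent code
target = rng.standard_normal(DIM)   # latent of the target photo (from the reverse encoder)
age_direction = rng.standard_normal(DIM)
age_direction /= np.linalg.norm(age_direction)  # hypothetical "age" slider direction

def turn_up_slider(z, direction, amount):
    """Move the latent code along an attribute direction (the 'slider')."""
    return z + amount * direction

def anchor_to_target(z, z_target, strength=0.3):
    """Pull the latent part of the way back toward the target photo's
    latent -- like pressing the reverse-encoder button when the face
    drifts too far from what you're aiming for."""
    return z + strength * (z_target - z)

edited = turn_up_slider(latent, age_direction, 2.0)
anchored = anchor_to_target(edited, target)
```

Each press of the anchor strictly shrinks the distance to the target latent while keeping most of the edits you've made, which is why you can iterate toward the final result without losing the face entirely.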
I predict Centaurism might be popularized by gamedev. It's going to be pretty neat when some studio trains an RL algorithm vs someone's heart rate. Higher heart rate = more enjoyment, lots of the time, so you'd end up with either the funnest game or the scariest game you've ever seen.
AlphaStar doesn't have maphack. It has more than 200 APM, but so do the top players (and so do I on good days btw, and I'm only in Diamond 3). It does have insane mechanics that no human can match, like pixel-perfect clicks every time and incredible reaction times.
One streamer (Lowko I think) analyzed a game he played against AlphaStar where it would "macro" by going to some position where you could just see all the production structures (some by only a few pixels on the edge of the screen) and just click all the button needed for unit production in a few milliseconds.
I am aware that Nada could have 400 apm in a 30 minute game, but most of it was spam.
Those Brood War bots had 36 thousand actions per minute, mainly due to technical limitations (the game couldn't accept more). And those were "perfect" clicks - not spam.
Limiting a bot to 200 actions (or even fewer) makes it more comparable to a human, since at some point every click becomes a resource too. Even the best players have to stop macroing for some (relatively short) time, while an AI with 36,000 clicks per minute is not limited by time.
I don't remember that Lowko made that video. Besatyqt made a video where AlphaStar clicked on a Barracks that was barely visible at the top of the screen, but IIRC it was only once, not a systematic trick. In my opinion, if the AlphaStar team had just reduced the clickable area of the screen, it would have seemed natural.
>I am of the surprisingly-controversial opinion that ML’s future is going to be as a human empowerment tool
After a recent article about an AI-driven text adventure game I saw on HN, AI Dungeon, I decided to give it a try. The article was about some issues with the company and content filters or something, but my number one impression from trying it out was that it would make a great base for writing short fiction.
At least that was what it seemed like to me. The first few times I tried it, playing it like a game, it was like wandering through a dream that kept changing. After that I figured out the different command modes and realized it was far more useful as something of an idea generator, where it's more like a collaborative writing partner than a dungeon master or something, and I found it more enjoyable.
I think it would be a great tool for a stuck writer who has a vague idea and a concept but isn't sure where to take it; as a game by itself, though, the human element is definitely missing. It seems to work best when you're the one controlling the narrative, bouncing ideas off the AI, as opposed to human-crafted text adventures where the game is more in control.
> Tom Gruber, co-creator of Siri, wants to make "humanistic AI" that augments and collaborates with us instead of competing with (or replacing) us. He shares his vision for a future where AI helps us achieve superhuman performance in perception, creativity and cognitive function -- from turbocharging our design skills to helping us remember everything we've ever read and the name of everyone we've ever met. "We are in the middle of a renaissance in AI," Gruber says. "Every time a machine gets smarter, we get smarter."
I'm a big believer in Intelligence Augmentation, in the Engelbart tradition, over completely independent agents. I don’t think my ideal AI is one that has its own sense of agency and autonomy, rather I think of technology as an extension of myself, one that enhances my autonomy.
The reason your opinion is controversial is because it assumes an ill-defined transient state as a stable equilibrium. What exactly is it about the human brain that makes it so special? Why couldn't other systems given many, many orders of magnitude more computational resources be arbitrarily better at everything it does?
Humans do things for humans; an AI, no matter how powerful, is not a human mind. So a human may be required to make its output creative in a way that isn't alien.
Suppose I can emulate a human brain neuron for neuron. How exactly is it different in capabilities from an actual human brain? What quantifiable property can you assign to the human mind that makes you so certain a machine can never match it?
Your argument is basically good old god of the gaps. You look at what the state of the art in technology cannot do right now, and base assumptions on that without really delving into the issue. 10 years ago you would've included stuff like image classification and natural language generation in the "only humans can do this" bin.
It won't get you quite as far, since I had to discover that you can prompt it with chords, and how to set the instruments. But it's pretty good, and maybe someone will discover some new way to make it better.
>> I am of the surprisingly-controversial opinion that ML’s future is going to be as a human empowerment tool — a bicycle for the mind, yet so much more.
I don't think this is controversial at all; rather, it's what Donald Michie called Ultra-Strong Machine Learning back in the 1980s [1].
Briefly (and with modern interpretations) Michie defined three "levels" of machine learning system: Weak, Strong and Ultra Strong.
A machine learning system exhibits Weak machine learning ability when it is only capable of improving its predictive accuracy when trained on data.
A machine learning system exhibits Strong machine learning ability when it satisfies the Weak machine learning criterion and can additionally output its model in symbolic form that is readily understandable by humans.
And a machine learning system exhibits Ultra Strong machine learning ability when it satisfies the Strong machine learning criterion and can additionally instruct the human user so as to improve the human user's performance.
In a sense, not a bicycle, but more like a jetpack, for the mind. Unfortunately we don't have anything like that today. Yes, I'm aware of claims that Go players have learned a lot by observing AlphaGo play. But AlphaGo/Zero/blah is not capable of instructing a human directly. Michie envisioned Ultra Strong Machine learning as a kind of coach basically, or teacher, an AI teacher.
_________
[1] Michie, D. (1988). Machine learning in the next five years. In Proceedings of the third European working session on learning (pp. 107–122). Pitman.
Full disclosure: I'm one of Donald Michie's grand-students; he was my thesis advisor's thesis advisor. Additionally, Michie was besties with Alan Turing, so I have the privilege of being a grand-nothing of Turing's :P
Surprisingly, yes. I want to say “yes” with no caveats, but I am ethically bound to point out that I may have changed a few of them. But if I did, it was inspired directly from what it came up with.
The way it works is, there’s something called ABC notation, and Gwern found a big-ass database of Irish folk songs. It has a “title” field along with tempo, ID, etc. I would fill in what key I wanted, give it the first chord, and let it go. (If you don’t give it the chord, it generates ABC songs that are all kind of boring piano pieces with no background chords. It was amazing how much difference that made.)
So, not only did it choose the song names, but it couldn’t not choose a song name. It’s the only thing it knew. Its whole world was Irish ABC folk music, and as far as it was concerned, the title was as important as the notes. It couldn’t know it wasn’t.
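For a concrete sense of what the prompt looks like: ABC notation tunes begin with header fields like `X:` (index), `T:` (title), `M:` (meter), `Q:` (tempo), and `K:` (key). A minimal sketch of building such a seed prompt, assuming you fill in the key and the opening chord and let the model continue the tune body (the exact field layout and chord convention the model expects are assumptions here, not the actual prompt format used):

```python
def abc_prompt(index, title, meter="4/4", tempo="1/4=120", key="Dmaj", first_chord="D"):
    """Build an ABC-notation header plus an opening chord symbol as a seed;
    the model is left to generate the rest of the tune body."""
    return "\n".join([
        f"X: {index}",
        f"T: {title}",
        f"M: {meter}",
        f"Q: {tempo}",
        f"K: {key}",
        f'"{first_chord}"',  # opening chord; without this, output tends toward bare melody
    ])

print(abc_prompt(1, "Crossing the Channel", key="Amin", first_chord="Am"))
```

Leaving the `T:` field blank in the seed is what lets the model invent the title itself, since to the model the title is just another part of the tune.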
Ah, I figured out what had been bothering me. I’m pretty sure I chose “For Ireland!” and possibly Blackbird, but e.g. Crossing the Channel was a GPT original, I believe. I was aiming to make it feel like an oldschool FF3 type game, so For Ireland was the battle song, Blackbird was the name of the airship you’d make your daring escape on at the end, Marco’s shadow was the assassin team hired by the empire to take Marco out, etc.
There were restrictions / rules on the game (only a pool of 18 heroes were allowed), but they won both games in a best of 3. You can see the progression / evolution of the training of the AI here: https://openai.com/projects/five/
It's definitely doable to train them to be good at extremely complex games like Dota / League, it's just that the resource requirements to train the engine are significant. After the bots were opened to the public, they had a 99.4% win rate against pubs, even accounting for cheese strats.
>There were restrictions / rules on the game (only a pool of 18 heroes were allowed), but they won both games in a best of 3. You can see the progression / evolution of the training of the AI here: https://openai.com/projects/five/
In League of Legends there are almost 150 heroes.
The draft phase alone creates a giant number of possibilities if you apply combinatorics, and then within the game itself, the number of possibilities resulting from that draft is difficult to imagine.
I didn't play much Dota, I'm LoLer, so I don't understand those limitations:
Pool of 18 heroes (Axe, Crystal Maiden, Death Prophet, Earthshaker, Gyrocopter, Lich, Lion, Necrophos, Queen of Pain, Razor, Riki, Shadow Fiend, Slark, Sniper, Sven, Tidehunter, Viper, or Witch Doctor)
No Divine Rapier, Bottle
No summons/illusions
5 invulnerable couriers, no exploiting them by scouting or tanking
No Scan
________
But good to know that I'm relatively close to being 100% wrong here, thanks.
This comment is very late, but maybe you'll see it in your threads. They discovered that growing the pool of heroes didn't require exponentially more training, despite the number of pick possibilities growing exponentially. That was one of the results of their blog - training each new hero was a linear increase in difficulty. They stopped at 18 because that's how many they had trained when the competition start date hit.
For the limitations, Divine Rapier and Bottle are items with unique interactions. Divine Rapier drops on death, and Bottle interacts with elements on the map in ways that require engaging with the environment in a specific manner.
Illusions are the same as LeBlanc's Mirror Image skill. Just copies of the hero that either deal no or limited damage, but can't cast spells. Likely they'd have to train the model to include an evaluator on the likelihood of a unit being an illusion rather than a hero.
Couriers bring items from the shop in base to your hero, so you don't have to return to shop. They are killable however, and if they are holding items when they are killed those items will be inaccessible for 3 minutes. They made them invulnerable because the bots would re-buy items that were inaccessible. Invulnerable couriers can be exploited however, thus the rule.
Scan is a global ward that only tells you whether or not someone is in the area (doesn't give you vision), and lasts 5 seconds.
It's worth mentioning that the restrictions they placed on the game were enormous, to the point that the human players were almost playing a different game.
It certainly beat OG in many aspects, but the beauty of dota is the ability to adapt in the game with different strategies, strategies which weren't possible with the restrictions.
Thank you so much! I couldn’t find a video anywhere I looked, and started wondering whether I imagined the whole thing.
I think these kinds of adversarial examples will be extremely common in production models. People won’t be crafting images that fool the model into thinking you’re a stop sign; they’ll discover that when the human isn’t paying attention, you can run in front of a Tesla with a group of friends and it veers into oncoming traffic. (Terrible made-up example, but I’m pretty sure that it’s a losing game to play “can we think of all possible cases we need to train for ahead of time?”)
Sadly not. I'm not sure anyone expected the bot to lose. I've been searching for where I saw it, but 2017 feels like a lifetime ago. I'll link it here if I do.
You can watch Black beat the bot in a fair game here, though, which I find immensely satisfying: https://youtu.be/qov1NXsTSbs?t=88 IIRC he was one of only like... five? (less?) who managed to win one game.