The Perils of Teaching, Part II

“The NPCs attacked each other, ignoring the players. There was carnage, and the players were just bystanders for a while. Only after that did the remaining NPCs attack the players. You said you wanted to make things interesting for the players, and here we are offering them … what? A fairly niche, if intense, bit of voyeuristic, mass suicide porn?” Chel was once again behind and to the left of Eli, talking to the back of his head as he tweaked a couple of parameters on the demo dashboard.

Undaunted, she continued. “So what is this? A bug? Bad data? What? You implied this was progress. You arranged for this demo to, I thought, I hoped, show some positive outcome from all the data cleansing and general faffing you’ve been indulging in for the last few, project-delaying, weeks.”

Now the side of the head, as Eli was reluctantly initiating acknowledgment of the fact that he had a case to answer.

“I said progress, and I meant progress. Did you see how deliberately the NPCs acted? As a group? The players ambushed the NPC group, and the group reacted as a group.”

He was not going to make this easy for Chel, and anyway he was working this out in his head for himself, and this confrontation with a project manager in “I’ve had it up to here” mode was as good a moment as any to thrash it out. None of the rest of the team liked flinging half-thought ideas around. Chel, he hoped, deep down, perhaps, maybe, actually enjoyed the process of him unravelling his own half-baked notions. He certainly felt on more solid, defendable, hypothetical ground after a Chel chat.

Chel, whilst striving for eye contact, had learned to not go for that normal kind of interaction too early. “They attacked each other. They swung their swords into the torsos of their neighbours. Deliberate, I’ll grant you, but it’s like they were in a different game.”

Eli’s chin came up. Nearly there. “Ah, now, that is an excellent piece of insight. I was about to go all sarcastic-cum-superior on you, and demand you describe back to me the details of what you actually saw, rather than what you thought you saw, but you’ve nipped that in the bud. They are, in a very real sense, playing a different game. What we wanted them to learn and do is different to what we trained them to do. It turns out. Sniping aside, do you have time for a cuppa? This is interesting, and I’d like to thrash this out.” Eli looked round, trying for sincerity.

Faking reluctance, but mentally ticking ‘eye contact - achievement unlocked’, Chel consulted her calendar. 45 mins to the next call. Might be doable. Eli was obviously in an expansive mood, and this should in fact help her compile a report on how things were progressing. “Yes, let’s do it. I’ve some Lapsang Souchong on me”.

“Ah, freshly made road. The best. I’ll get it together”, enthused Eli.

As the kettle cranked up, Eli queried, “Do you remember about the quirks?”

“Yes, patterns in the data which lead the learning algorithms astray”. Chel looked smug. She only needed to hear a concept once, and it stuck. This was something of a strong point, and possibly only Eli, if subconsciously, seemed to acknowledge it.

“Well, we are sort of there again. I’ve had an example from the literature in my back pocket so to speak, ready to use at a moment’s notice to fob you off, to distract you from your need to organise everything to within an estimation error bar of its life. It is so relevant to right now that I can only think that fate has led us to this point.”

Handing over the box of tea, Chel stated, “If and or when your NPC initiative is finally recognised as the waste of 6 months it has maybe been, they may put you back on clouds, but they definitely won’t be moving you to copywriting.”

“You jest at our collective peril. I feel I have much to give on the subject of wordage, but right now the NPCs consume me. I can honestly say this has been and is the most satisfyingly sustained bit of actual R and actual D I’ve done so far. Clockwise or anticlockwise?”

“Either, no milk, but the merest hint of some sugar”.

“Sacrilege. Do road builders add sugar to the tarmac?”

“I don’t know. Do they?”

“Probably. Who knows. Anyway. There you go. Clears the tubes, this stuff.”

They sat, they savoured.

Eli began. “So, fate. Has brought another bit of military funding to my, and now your, attention. The Navy, this time, wanted to explore tactics for getting convoys safely across dangerous waters, infested by enemy submarines. This is now a classic scenario, convoys versus submarines. Tackling this, for a Navy tactician, is akin to a woodworking student having to design and build a chair. Everyone has to do it. Everyone who’s anyone has done it. And like the opening moves in chess, every variation has been analysed so thoroughly that the best answers are known and agreed upon.”

Looking around, no-one else appeared interested, but he was fairly sure at least half of them were listening. No-one was correcting him, so this was new to the group.

“Because they were and are throwing money into some of the AI research programmes, the military, the Navy, also threw in this standard tactical challenge, partly to see where AI ranked against the best possible, and with only a faint, not at all looked for, chance that a better solution might be found. The challenge involved an assortment of oceans, different numbers of ships of different size, speed, tonnage, armour, escort, weather, distance, fuel, enemy submarines. All the variables specced out. Dice rolled to pick different combinations. Different scenarios were constructed, convoys set off, tactics applied, casualties estimated. A given heuristic was applied to 100s of randomly generated scenarios, and scored according to the proportion of tonnage of shipping that made it safely across, on average, as a result of applying that heuristic. It had to be based on averages, because sheer luck might mean a particular convoy might never bump into a submarine, or might run into a nest of them.”

As Eli paused for a sip, Chel broke her respectful silence. “Nest of submarines? Not sure that is the right term. And just tonnage? Not the number of ships?”

“It does sound a bit brutal. But this is a wartime thing. Especially during early World War Two, Britain was surrounded by German submarines. It was serious. The country was on the brink of starving. Supplies coming in by boat, mainly from the US, were being intercepted and sunk. It came down to how many tonnes of supplies made it across. Replacement ships could be built, but the tonnage of food was what counted. That and the medical supplies, ammunition, etc. More small ships, or fewer larger ships. Didn’t matter. Just tonnage.”

Sip, continue. “So, the apprentice piece for the Navy tacticians was to maximise the amount of tonnage, averaged across 100s of runs of different convoys. Run at night, one by one, all as a big rush, skirt the coast, go across randomly, send the escorts ahead, mix them with the freighters, whatever. All these were up for grabs in the heuristic.”

“Escorts?” interjected Chel, forcing another sip pause slightly too early, spoiling the rhythm.

“Armed escort ships. They were capable of attacking the subs, but could never fully protect a convoy. The more protected one convoy might be, the less protected another, since there weren’t enough to go round.”

Reset the rhythm. Continue. “So, enter another fresh-faced AI team. They knew nothing of the history of this tactical challenge, apart from the war stuff I mentioned. They did not know of its Kobayashi Maru-ness”.

“Sorry, what? Hang on.” Chel stomping on whatever remained of the intended flow. Rhythm shot to pieces now.

Chel turned to the room and called out. “Kobayasho Mary. Explain, please”.

No-one turned round, but “Kobayashi Maru. A no-win scenario constructed by Starfleet Academy to assess their cadets”.

“Thanks”. Chel turned back. She knew her team. Always good to let Eli know he wasn’t the only uber-geek around. “Continue”.

“Yes, no-win. No-one wins unless you introduce game-changing technology from the outside, such as decrypting enemy comms, or anti-submarine planes which can fly far and long enough to escort the convoy along its entire course. Those in fact came later, but for a year or two, things were very bad.” Rhythm building up.

“The AI team coded up the convoy-in-enemy-infested-waters simulator, polished up their AI tool of choice, and let it rip. After much tweaking and effort, they got quite a disappointing average of 60%. That is to say, their algorithm could only achieve, after weeks of effort, an average of 60% of the tonnage making it across to the destination. They reported this to the Navy funders, trying to put a spin on it that they were sure that with a bit more funding they could probably improve that score a bit.” Nice, cliff-hangery bit. Pause. Release the punchline.

“The Navy chaps choked on their ship’s biscuits at this point. The known, optimal solution to this scenario came in at just under 30% of tonnage shipped. The AI team was claiming to have more than doubled that.” Oh, it’s poetry.

“Not wishing to call them liars, the Navy sought more details. The AI team then went back into their simulator, with a metaphorically fine-toothed comb, and implemented the known optimal solution. They agreed with the 30% tonnage score from the Navy. They went back and looked at their own solution, and it continued to score around 60%. Notably, it had a smaller variance than the 30% one. It was more consistent.

“The AI bods then examined the damage incurred by the convoys. The Navy solution took hits, on average, across all of the ships. The AI solution’s convoys took hits almost exclusively to the slower ships. As an afterthought, the AI team looked at which submarines had caused the most damage and, again, the Navy solution meant most of the submarines caused damage.

“Would you like to guess how it was distributed in the AI solution?”

Chel swirled her cup. A challenge. This was, hopefully, leading back to the NPCs. Who had attacked each other. Inspiration struck. “None of them”.

“You are wasted in project management”, intoned Eli, with more than the merest hint of condescension. “Not actually none, but so low that it appeared to indicate there was an error somewhere”.

“So the convoy sank its own ships in order to stop the enemy sinking its ships? So this was a bug in the simulator or data? Tonnage sunk by your own team doesn’t count in the stats against you?” Chel was disappointed. Nice tea, nice chat, but unsatisfying story.

“Nope. The simulator was not buggy. Well, it almost certainly was, all software is, but this solution was not profiting from a bug. It was, and is, a fully valid solution. In all likelihood, if the Navy had enacted this solution under World War Two conditions, they would have got more tonnage of the good stuff across to Britain.

“The simulator encoded some of the unbreakable rules of Navy life. One key principle of which was that a convoy travels at the speed of its slowest ship. The AI tool worked out that if you removed, that is to say, sank, your slowest ship, the whole convoy could travel faster. A faster convoy spent less time in the water, which meant it was less likely to be caught. Simple as.

“This was not an acceptable answer to the Navy. The AI tool had found a bug in their rules, in their thinking, in their assumptions. The amoral AI tool did not differentiate between enemy action and friendly fire causing mass death and destruction. It was happy to use friendly fire as a tool to improve the odds. If your goal was truly to minimise tonnage lost, well, there was an unpalatable way of achieving that.”

Chel frowned. “The NPCs then. Since you say this is relevant. The NPCbots are using our rules against us? You kill a character, you get its stuff?”

“Yes, we have applied that to the live players, and our NPCbots too. Any character an NPC kills, it obtains all its stuff. Under group fight conditions this taking up of the opponent’s stuff happens immediately. It was done that way a while back to make the melees work better, to keep the momentum going. Otherwise the players would run out of power, shields, strength. Doesn’t work that way in one-on-one fights.”

Chel frowned more, “the NPCbots stand around, inviting attack”

Eli nodded, “tactic 1”

“and when attacked, they turn on each other”

Eli, “tactic 2. Note that only half of the NPCs were attacking. The other half raised their armour out of the way, so the winning NPC didn’t suffer the loss of any hit points”

“and then turned on the players”

Eli, “tactic 3. Now the remaining NPCs have a better chance of beating a player”.

“Oh. That’s … not ideal.”

“Well, you have to admit, it is interesting. The NPCbots’ algorithms are assessed on how well they do as a group. And as a group, these NPCs are doing significantly better. They are not actually fighting better, but they are starting the fights with players whilst much better equipped, and so sheer statistics means they will win more fights against the players.”

“This is not progress as we know it”, quipped Chel.

“Well, no, it is, I think. This is progress on multiple fronts. We have a system capable of coming up with some seriously subtle, indirect approaches. We have a system which can, and I’m just making this claim up now so it might be a load of, we have a system which can debug our own rule sets. If there are weaknesses, or logical or practical inconsistencies, the NPCbot approach might be able to ferret out some of the edge cases before we release to real users.

“In this case, we already have released the melee-maintainer rule, so we probably have to think about how to rescind it.”

Chel still frowned, but for a different reason. “We can’t unleash the Kamikaze Maru bots onto our users. That would not be a good thing. Joking aside, do you think we are close to having NPCbots ready for the prime time?”

“Yes. This demo has been something of a triumph, in my own head if nowhere else. We could release the vanilla NPCbots, no better no worse than the current dumb-ass NPCs, and then drip feed the learned behaviours into them. Or into some of them, some of the time.

“I will concede the suicide two-step is probably not a PR winner, but that only took a day to learn. All my dilly-dallying has been about setting up this framework. There’s loads more to come from this approach. We should probably get some narrative thinking started to explain the change, many changes, scary changes, effective changes, in the NPCs’ behaviour.

“Unless I am much mistaken, our team, we, formerly known as No Progress Central, an object of derision, the sickly dog of the company, may be about to earn the right to rename itself, ourselves, whatever.”

THE

END