Title: Fake Deeply
-Date: 2023-07-01
+Date: 2023-09-15
Category: fiction
-Tags: artificial intelligence
+Tags: artificial intelligence, speculative fiction
Status: draft
-"I want _you_, Chad," said the woman in the video as she took off her shirt. "Those negative comments on your pull requests were just a smokescreen—because I was afraid to confront the inevitability of our love!"
+"I want _you_, Jake," said the woman in the video as she took off her shirt. "Those negative comments on your pull requests were just a smokescreen—because I was afraid to confront the inevitability of our love!"
-Chad Morgan still couldn't help but marvel at what he and his team had built. It really looked and sounded just like her!
+Jake Morgan still couldn't help but marvel at what he and his team had built. It really looked and sounded just like her!
-It had been obvious since DALL-E back in 'twenty-one—earlier if you were paying attention—that generative AI would reach this level of customization and realism before too long. Eventually, it was just a matter of the right few dozen people rolling up their sleeves—and Magma's willingness to pony up the compute—to make it work. But _it worked_. His awe at Multigen's sheer power would have been humbling, if not for the awareness of his own modest role in bringing it into being.
+It had been obvious since DALL-E back in 'twenty-one—earlier if you were paying attention—that generative AI would reach this level of customization and realism before too long. Eventually, it was just a matter of the right couple dozen people rolling up their sleeves—and Magma's willingness to pony up the compute—to make it work. But _it worked_. His awe at Multigen's sheer power would have been humbling, if not for the awareness of his own modest role in bringing it into being.
-Of course, this particular video wouldn't be showcased in the team's next publication. Technically, Magma employees were not supposed to use their state-of-the-art generative AI system to make custom pornography of their coworkers. Technically (what was probably a lesser offense) Magma employees were not supposed to be viewing such content during work hours. Technically—what should have been a greater offense—Magma employees were not supposed to covertly introduce a bug into the generative AI service codebase specifically in order to make it possible to create such content without leaving a log.
+Of course, this particular video wouldn't be showcased in the team's next publication. Technically, Magma employees were not supposed to use their cutting-edge generative AI system to make custom pornography of their coworkers. Technically (what was probably a lesser offense) Magma employees were not supposed to be viewing such content during work hours. Technically—what should have been a greater offense—Magma employees were not supposed to covertly introduce a bug into the generative AI service codebase specifically in order to make it possible to create such content without leaving a log.
But, _technically_? No one could enforce any of that. Developers needed to test what the system they were building was capable of. The flexibility for employees to be able to take care of the occasional personal task during the day was universally understood (if not always explicitly acknowledged) as a perk of remote-work policies. And everyone writes bugs.
-This miracle of computer science was the product of years of hard work by Chad and his colleagues. _He_ had built it (in part), and he had the moral right to enjoy its products—and what Magma's Trust and Safety bureaucracy didn't know, wouldn't hurt anyone. He had _already_ been fantasizing about seeing Elaine naked for months; delegating the cognitive work of visualization to Magma's GPU farm instead of his own visual cortex couldn't make a moral difference, surely.
+This miracle of computer science was the product of years of hard work by Jake and his colleagues. _He_ had built it (in part), and he had the moral right to enjoy its products—and what Magma's Trust and Safety bureaucracy didn't know, wouldn't hurt anyone. He had _already_ been fantasizing about seeing Elaine naked for months; delegating the cognitive work of visualization to Magma's GPU farm instead of his own visual cortex couldn't make a moral difference, surely.
-Elaine, probably, would object, if she knew. But if she didn't know that Chad _specifically_ was using Multigen _specifically_ to generate erotica of her _specifically_, she must have known that this was an obvious use-case of the technology. If she didn't want people using generative AI to visualize her body in sexually suggestive situations, then _why was she working to advance the state of generative AI?_ Really, she had no one to blame but herself.
+Elaine, probably, would object, if she knew. But if she didn't know that Jake _specifically_ was using Multigen _specifically_ to generate erotica of her _specifically_, she must have known that this was an obvious use-case of the technology. If she didn't want people using generative AI to visualize her body in sexually suggestive situations, then _why was she working to advance the state of generative AI?_ Really, she had no one to blame but herself.
-Just as he was about to come, he was interrupted by an instant messenger notification. It was from someone named Isabella Huntley, saying she'd like to discuss an issue in the Multigen codebase at his convenience.
+Just as he was about to come, he was interrupted by an instant messenger notification. It was from someone named Chloë Lemoine, saying she'd like to discuss an issue in the Multigen codebase at his convenience.
-_Tranny or real?_ Chad wondered, clicking on her profie.
+_Tranny or real?_ Jake wondered, clicking on her profile.
-The profile text indicated that Isabella was on the newly formed capability risk evaluations team. Chad groaned. _Yuddites._ Fears of artificial intelligence destroying humanity had been trending on social and traditional media lately. Magma had commissioned a team with the purpose to monitor and audit the company's AI projects for the emergence of unforeseen and potentially dangerous capabilities, although the exact scope of their power was unclear and probably subject to the outcome of future intra-company political struggles.
+The profile text indicated that Chloë was on the newly formed capability risk evaluations team. Jake groaned. _Yuddites._ Fears of artificial intelligence destroying humanity had been trending on social and traditional media lately. Magma had commissioned a team to monitor and audit the company's AI projects for the emergence of unforeseen and potentially dangerous capabilities, although the exact scope of the new team's power was unclear and probably subject to the outcome of future intra-company political battles.
-Chad took a dim view of the AI risk crowd. Given what deep learning could do nowadays, it didn't feel quite right to dismiss their doomsday stories as science fiction, exactly, but Chad maintained it was the _wrong subgenre_ of science fiction. His team was building the computer from _Star Trek_, not _A Fire Upon the Deep_: tools, not creatures. Despite the brain-inspired name, "neural networks" were ultimately just a technique for fitting a piecewise linear function to training data. If it was counterintuitive how much you could get done with a piecewise linear function fitted to _the entire internet_, previous generations must have found it equally counterintuitive to how how much you could get done with millions of arithmetic operations per second. It was a new era of technology, not a new era of life.
+Jake took a dim view of the AI risk crowd. Given what deep learning could do nowadays, it didn't feel quite right to dismiss their doomsday stories as science fiction, exactly, but Jake maintained it was the _wrong subgenre_ of science fiction. His team was building the computer from _Star Trek_, not the Blight from _A Fire Upon the Deep_: tools, not creatures. Despite the brain-inspired name, "neural networks" were ultimately just a technique for fitting a curve to training data. If it was counterintuitive how much you could get done with a curve fitted to _the entire internet_, previous generations of computing pioneers must have found it equally counterintuitive how much you could get done with millions of arithmetic operations per second. It was a new era of technology, not a new era of life.
-It was perhaps because of his skepticism rather than in spite of it that he had volunteered to be the Multigen team's designated contact person for the risk evals team (which was no doubt why Isabella had messaged him). No one else had volunteered at the meeting when it came up, and Chad had been slightly curious what "capability risk evaluations" would even entail.
+It was perhaps because of his skepticism rather than in spite of it that he had volunteered to be the Multigen team's designated contact person for the risk evals team (which was no doubt why this Chloë person had messaged him). No one else had volunteered at the meeting when it came up, and Jake had been slightly curious what "capability risk evaluations" would even entail.
-Well, now he would find out. He washed his hands and messaged Isabella back, offering to hop on a quick video call.
+Well, now he would find out. He washed his hands and messaged Chloë back, offering to hop on a quick video call.
-_Definitely a tranny_, thought Chad, as Isabella's face appeared on screen.
+_Definitely a tranny_, thought Jake, as Chloë's face appeared on screen.
"I hope I'm not interrupting anything important," she said.
"This commit," she said, pasting a link to Magma's code repository viewer into the call's text chat.
-Chad's blood ran cold. The commit message at the link described the purpose of the associated code change as being to modify the format of a regular expression used for logging requests to the Multigen service. The revised regex would now include the client's IP as a new metadata field.
+Jake's blood ran cold. The commit message at the link described the purpose of the associated code change as being to modify the format of a regular expression used for logging requests to the Multigen service. The revised regex would now include the client's IP as a new metadata field.
-That much was true. What the commit message didn't explain, but which a careful review of the code might have noticed as odd, was that the revised regular expression started with `^[^\a]`—matching strings that didn't start with the ASCII bell character 0x07. The bell character was a historical artifact from the early days of computing. No sane request would start with a bell, and so the odd start to the regex would do no harm ... unless, perhaps, some client _were_ to start their request with a bell character, in which case the regex would fail to match and the request would silently fail to be logged.
+That much was true. What the commit message didn't explain, but which a careful review of the code might have noticed as odd, was that the revised regex started with `^[^\a]`—matching strings that didn't start with [the ASCII bell character 0x07](https://en.wikipedia.org/wiki/Bell_character). The bell character was a historical artifact from the early days of computing. No sane request would start with a bell, and so the odd start to the regex would do no harm ... unless, perhaps, some client _were_ to start their request with a bell character, in which case the regex would fail to match and the request would silently fail to be logged.
The commit's author was listed as Code Assistant, an internal Magma service that automatically filed simple pull requests based on issue descriptions, to be reviewed and merged by human engineers.
-That part was mostly true. Code Assistant had created the logging change. Chad had added in the bell character backdoor and attributed it to Code Assistant (`git commit --amend --author`; `git push --force-with-lease`), gambling that whichever of his coworkers got around to reviewing Code Assistant's most recent PRs would rubber-stamp them without noticing the bug. (Who reads regexes that carefully, anyway?) If they did notice, they would blame Code Assistant. (Language models hallucinate weird things sometimes; who knows what it was "thinking"?) Thus, by carefully prefixing his requests with the bell character, Chad could make all the custom videos he wanted, with no need to worry about explaining himself if someone happened to read the logs. It was the perfect crime—not a crime, really. A precaution.
+That part was mostly true. Code Assistant had created the logging change. Jake had added in the bell character backdoor and attributed it to Code Assistant (`GIT_COMMITTER_NAME="Code Assistant" [email protected] git commit --amend; git push --force-with-lease`), gambling that whichever of his coworkers got around to reviewing Code Assistant's most recent pull requests would rubber-stamp them without noticing the bug. (Who reads regexes that carefully, really?) If they did notice, they would blame Code Assistant. (Language models hallucinate weird things sometimes. Who knows what it was "thinking"?)
+
+Thus, by carefully prefixing his requests with the bell character, Jake could make all the custom videos he wanted, with no need to worry about explaining himself if someone happened to read the logs. It was the perfect crime—not a crime, really. A precaution.
+
+But now his precaution had been discovered! So much for his career at Magma. But only at Magma—the industry gossip network wouldn't prevent his employment elsewhere ... right?
+
+Chloë was explaining the bug. "... and so, if a client were to send a request starting with the ASCII bell character—I know, right?—then the request wouldn't be logged."
+
+"I see," said Jake, his blood thawing. Chloë's tone wasn't accusatory. If she wasn't here to tell him his career was over, he'd better not let anything on. "Well, thanks for telling me. I'll fix that right after this call." He forced a chuckle. "Language models hallucinate weird things sometimes. Who knows what it was 'thinking'?"
+
+"Exactly!" said Chloë. "_Who knows what it was thinking?_ That's what I wanted to talk to you about!"
+
+"Uh ..." Jake balked. If he hadn't been found out, why _was_ someone from risk evals talking to him about a faulty regex? The smart play to minimize his chances of being discovered would be to disengage as quickly as possible, rather than encourage inquiry about the cause of the bug, but his curiosity was piqued by the possibility that Chloë was implying what he thought she was. "You're not suggesting Code Assistant might have introduced this bug on purpose?"
+
+She smirked. "And if I am?"
+
+"That's absurd. It's not an agent that wants things. It's an autoregressive language model fine-tuned to map ticket descriptions to code changes."
+
+"And humans are just animals evolved to maximize inclusive genetic fitness. If evolution could hill-climb its way into creating general intelligence, why can't gradient descent? I don't think humanity should be playing with AI at our current level of wisdom. But if it's happening anyway, thanks to the efforts of people like you"—okay, _now_ her tone was accusatory—"it's my heroic responsibility to exert constant vigilance. To monitor the things we're creating and be ready to sound the fire alarm, if there's anyone sane left to hear it."
+
+Jake shook his head. These Yuddites were even nuttier than he thought. "And your evidence for this is, what? That the model wrote a silly regex?"
+
+"And that the bug is being exploited."
+
+Jake's blood flash-froze. "Wh—what?"
+
+Chloë pasted two more links into the chat, this time to Magma's log viewer. "Requests go through a reverse proxy before hitting the Multigen service itself. Comparing the two, there are dozens of requests logged by the reverse proxy that don't show up in Multigen's logs—starting just after the bug was deployed. The reverse proxy logs include the client IP, which is inside Magma's VPN, of course"—Multigen wasn't yet a public-facing product—"but don't include the request data or user auth, so I don't know what the client was doing specifically—which is apparently just what they, or it, wanted."
+
+Jake silently and glumly reviewed the logs. The timestamps were consistent with his video requests. He remembered that after one of his coworkers (Elaine, as it turned out) had approved the doctored Code Assistant pull request, he had eagerly waited for the build automation to deploy the faulty commit so that he could try it out as soon as possible.
+
+Finally, he said, "You really think Code Assistant did this? 'Deliberately' checked in a bug, and then exploited it to secretly request some image or video generations? For some—'reason of its own'?"
+
+"I don't know anything—yet—but look at the facts," said Chloë. "The bug was written by Code Assistant. Immediately after it gets merged and deployed, someone apparently starts exploiting it. How do you think I should explain this?"
+
+There was, actually, a perfectly ordinary explanation that had nothing to do with Chloë's delusional wrong-kind-of-science-fiction paranoia—and Jake's career depended on her not figuring it out.
+
+"I ... don't know," he said. It suddenly dawned on him that staying in this conversation was not a smart play. "You know, I actually have another meeting to get to," he lied. "I'll fix that regex today. I don't suppose you need anything else from me—"
+
+"Actually, I'd like to know more about Multigen—and I'll likely have more questions after I talk to the Code Assistant team. Can I pick a time on your calendar next week?"
+
+"Sure. Talk to you then. Nice to meet you. Goodbye." He hung up.
+
+_Shit!_
-But now his precaution had been discovered! So much for his career at Magma. But only at Magma, right? The industry gossip network wouldn't prevent his employment, right?
[TODO—
- * His terror is broken by puzzlement that the Evals team is telling him this. Does ... does she think the Code Assistant AI did this intentionally? To cover its tracks??
- * She wouldn't have, if it were just the commit, but the reverse proxy has logs that don't match up with Multigen's internal logs, suggesting someone from within Magma's VPN is exploiting the bug!
- * She doesn't think Magma should be pushing capabilities the way it is, at all.
- * Chad is very nervous; he thought deleting the Multigen logs would be enough (the videos are also stored in object storage, but there's no particular reason to expect a human to be combing through the raw files ... but they will, if there's an investigation
+ * Jake is very nervous; he thought deleting the Multigen logs would be enough (the videos are also stored in object storage, but there's no particular reason to expect a human to be combing through the raw files ... but they will, if there's an investigation)
* He sets up another meeting with the Evals team member, to try to suss out what her plans are, to stall—but ostensibly, to get up to speed on her risk concerns
* Scene break: at the meeting, she's explaining Christiano's idea about there being a basin of policies that admit their mistakes, rather than using deception to get a high score
- * Chad sees the analogy to his own behavior
+ * Jake sees the analogy to his own behavior
]