From: M. Taylor Saotome-Westlake <ultimatelyuntruethought@gmail.com>
Date: Sat, 29 Oct 2022 22:25:15 +0000 (-0700)
Subject: memoir: Jessica help on "Optimized for Deception"
X-Git-Url: http://534655.efjtl6rk.asia/source?a=commitdiff_plain;h=77ab4116e7424c23cb05b04e2f5d0a2fb0bd070d;p=Ultimately_Untrue_Thought.git

memoir: Jessica help on "Optimized for Deception"
---

diff --git a/content/drafts/a-hill-of-validity-in-defense-of-meaning.md b/content/drafts/a-hill-of-validity-in-defense-of-meaning.md
index 2bf9395..62822f1 100644
--- a/content/drafts/a-hill-of-validity-in-defense-of-meaning.md
+++ b/content/drafts/a-hill-of-validity-in-defense-of-meaning.md
@@ -275,7 +275,7 @@ Anna continued to be disinclined to take a side in the brewing Category War, and
 
 My _hope_ was that it was possible to apply just enough "What kind of rationalist are _you_?!" social pressure to cancel out the "You don't want to be a Bad ([Red](https://slatestarcodex.com/2014/09/30/i-can-tolerate-anything-except-the-outgroup/)) person, do you??" social pressure and thereby let people look at the arguments, though I wasn't sure if that actually works, and I was growing exhausted from all the social aggression I was doing about it. (If someone tries to take your property and you shoot at them, you could be said to be the "aggressor" in the sense that you fired the first shot, even if you hope that the courts will uphold your property claim later.)
 
-After some more discussion within the me/Michael/Ben/Sarah posse, on 4 January 2019, I wrote to Yudkowsky again (a second time), to explain the specific problems with his "hill of meaning in defense of validity" Twitter performance, since that apparently hadn't been obvious from the earlier link to ["... To Make Predictions"](/2018/Feb/the-categories-were-made-for-man-to-make-predictions/) (Subject: "[redacted for privacy-norm-adherence reasons]; and, discourse on categories and the fourth virtue"), cc'ing the posse, who chimed in afterwards.
+After some more discussion within the me/Michael/Ben/Sarah posse, on 4 January 2019, I wrote to Yudkowsky again (a second time), to explain the specific problems with his "hill of meaning in defense of validity" Twitter performance, since that apparently hadn't been obvious from the earlier link to ["... To Make Predictions"](/2018/Feb/the-categories-were-made-for-man-to-make-predictions/), cc'ing the posse, who chimed in afterwards.
 
 Ben explained what kind of actions we were hoping for from Yudkowsky: that he would (1) notice that he'd accidentally been participating in an epistemic war, (2) generalize the insight (if he hadn't noticed, what were the odds that MIRI had adequate defenses?), and (3) join the conversation about how to _actually_ have a rationality community, while noticing this particular way in which the problem seemed harder than it used to. For my case in particular, something that would help would be _either_ (A) a clear _ex cathedra_ statement that gender categories are not an exception to the general rule that categories are nonarbitrary, _or_ (B) a clear _ex cathedra_ statement that he's been silenced on this matter. If even (B) was too expensive, that seemed like important evidence about (1).
 
@@ -1029,17 +1029,31 @@ I continued to work on my "advanced" philosophy of categorization thesis. The di
 
 > I had hoped that the Israel/Palestine example above made it clear that you have to deal with the consequences of your definitions, which can include confusion, muddling communication, and leaving openings for deceptive rhetorical strategies.
 
+This is certainly an _improvement_ over the original text without the note, but I took the use of the national borders metaphor here to mean that Scott still hadn't really gotten my point about there being laws of thought underlying categorization: mathematical principles governing _how_ definition choices can muddle communication or be deceptive. (But that wasn't surprising; [by Scott's own admission, he's not a math guy](https://slatestarcodex.com/2015/01/31/the-parable-of-the-talents/).)
 
+Category "boundaries" are a useful _visual metaphor_ for explaining the cognitive function of categorization: you imagine a "boundary" in configuration space containing all the things that belong to the category.
 
+If you have the visual metaphor, but you don't have the math, you might think that there's nothing intrinsically wrong with squiggly or discontinuous category "boundaries", just as there's nothing intrinsically wrong with Alaska not being part of the contiguous U.S. states. It may be _inconvenient_ that you can't drive from Alaska to Washington without going through Canada, and we have to deal with the consequences of that, but there's no sense in which it's _wrong_ that the borders are drawn that way: Alaska really is governed by the United States.
 
-(It's not surprising that Scott 
+But if you _do_ have the math, a moment of introspection will convince you that the analogy between category "boundaries" and national borders is not a particularly deep or informative one.
 
-[by his own admission, he's not a math guy](https://slatestarcodex.com/2015/01/31/the-parable-of-the-talents/)
+A two-dimensional political map tells you which areas of the Earth's surface are under the jurisdiction of what government.
 
+In contrast, category "boundaries" tell you which regions of very high-dimensional configuration space correspond to a word/concept, which is useful _because_ that structure is useful for making probabilistic inferences: you can use your observations of some aspects of an entity (some of the coordinates of a point in configuration space) to infer category-membership, and then use category membership to make predictions about aspects that you haven't yet observed.
 
+But the trick only works to the extent that the category is a regular, non-squiggly region of configuration space: if you know that egg-shaped objects tend to be blue, and you see a black-and-white photo of an egg-shaped object, you can get _close_ to picking out its color on a color wheel. But if egg-shaped objects tend to be blue _or_ green _or_ red _or_ gray, you wouldn't know where to point to on the color wheel.
 
+The analogous algorithm applied to national borders on a political map would be to observe the longitude of a place, use that to guess what country the place is in, and then use the country to guess the latitude, which isn't typically what people _do_ with maps. Category "boundaries" and national borders might both be _illustrated_ in a diagram as a closed region in two-dimensional space, but philosophically, they're very different entities. The fact that Scott Alexander was appealing to national borders to explain why gerrymandered categories were allegedly okay showed that he Didn't Get It.
 
+I still had some deeper philosophical problems to resolve, though. If squiggly categories were less useful for inference, why would someone _want_ a squiggly category boundary? Someone who said, "Ah, but I assign _higher utility_ to doing it this way", had to be messing with you. Where would such a utility function come from? Intuitively, it had to be precisely _because_ squiggly boundaries were less useful for inference; the only reason you would _realistically_ want to do that would be to commit fraud, to pass off pyrite as gold by redefining the word "gold."
 
+That was my intuition. To formalize it, I wanted some sensible numerical quantity that would be maximized by using "nice" categories and get trashed by gerrymandering. [Mutual information](https://en.wikipedia.org/wiki/Mutual_information) was the obvious first guess, but that wasn't it, because mutual information lacks a "topology", a notion of _closeness_ that made some false predictions better than others by virtue of being "close".
+
+Suppose the outcome space of _X_ is `{H, T}` and the outcome space of _Y_ is `{1, 2, 3, 4, 5, 6, 7, 8}`. I _wanted_ to say that if observing _X_=`H` concentrates _Y_'s probability mass on `{1, 2, 3}`, that's _more useful_ than if it concentrates _Y_ on `{1, 5, 8}`, but that would require the numbers in _Y_ to be _numbers_ rather than opaque labels; as far as elementary information theory was concerned, mapping eight states to three states reduced the entropy from lg 8 = 3 to lg 3 ≈ 1.58 no matter "which" three states they were.
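
The entropy arithmetic above can be checked with a minimal Python sketch. The helper name `uniform_entropy_bits` is hypothetical, and the sketch assumes the probability mass is spread uniformly over the remaining states, which the text does not state explicitly.

```python
import math

def uniform_entropy_bits(outcomes):
    """Entropy in bits of a uniform distribution over the distinct outcomes.

    Assumption: mass is uniform over the listed states.
    """
    return math.log2(len(set(outcomes)))

prior = uniform_entropy_bits(range(1, 9))     # all eight states of Y
contiguous = uniform_entropy_bits([1, 2, 3])  # "nice" concentration
scattered = uniform_entropy_bits([1, 5, 8])   # "gerrymandered" concentration

# Entropy only sees how many states remain, not which ones.
print(prior, round(contiguous, 2), round(scattered, 2))  # 3.0 1.58 1.58
```

Both three-state sets score identically, which is the sense in which the states are opaque labels to this measure.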
+
+How could I make this rigorous? Did I want to be talking about the _variance_ of my features conditional on category-membership? Was "connectedness" intrinsically what I wanted, or was connectedness only important because it cut down the number of possibilities? (There are 8!/(6!2!) = 28 ways to choose two elements from `{1..8}`, but only 7 ways to choose two contiguous elements.) I thought connectedness _was_ intrinsically important, because we didn't just want _few_ things, we wanted things that are _similar enough to make similar decisions about_.
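
The counts in the parenthetical can be confirmed by brute-force enumeration. This is only an illustrative sketch, and it reads "contiguous" as "differing by exactly 1", which is an assumption about the intended meaning.

```python
from itertools import combinations

states = range(1, 9)  # the outcome space {1..8}

# Every unordered pair of distinct states.
pairs = list(combinations(states, 2))

# Pairs whose two elements are adjacent integers.
contiguous_pairs = [(a, b) for a, b in pairs if b - a == 1]

print(len(pairs), len(contiguous_pairs))  # 28 7
```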
+
+I put the question to a few friends (Subject: "rubber duck philosophy"), and Jessica said that my identification of the variance as the key quantity sounded right: it amounted to the expected squared error of someone trying to guess the values of the features given the category. It was okay that this wasn't a purely information-theoretic criterion, because for problems involving guessing a numeric quantity, bits that get you closer to the right answer were more valuable than bits that didn't.
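
As a rough illustration of the variance criterion, here is a sketch comparing the expected squared error for the sets `{1, 2, 3}` and `{1, 5, 8}`. The function name `expected_squared_error` is hypothetical, and the sketch assumes a uniform distribution over each set and a guesser who always guesses the mean; neither assumption comes from the original discussion.

```python
from fractions import Fraction

def expected_squared_error(outcomes):
    """Expected squared error of guessing the mean of the outcomes.

    Assumption: outcomes are equally likely, so this is the population
    variance of the set.
    """
    n = len(outcomes)
    mean = Fraction(sum(outcomes), n)
    return sum((Fraction(y) - mean) ** 2 for y in outcomes) / n

tight = expected_squared_error([1, 2, 3])
spread = expected_squared_error([1, 5, 8])

print(tight, spread)  # 2/3 74/9
```

Unlike a count of remaining states, this quantity distinguishes the two three-element sets, because it treats the values as numbers with a notion of closeness.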
 
 ------
 
diff --git a/notes/memoir-sections.md b/notes/memoir-sections.md
index e22e73e..e846b7c 100644
--- a/notes/memoir-sections.md
+++ b/notes/memoir-sections.md
@@ -1,5 +1,5 @@
 waypoints:
-_ Jessica help with "Unnatural Categories"
+✓ Jessica help with "Unnatural Categories"
 _ wireheading his fiction subreddit
 _ discrediting to the community
 _ let's recap
@@ -15,10 +15,9 @@ _ September 2020 Category War victory
 _ Sasha disaster 
 _ the dolphin war
 
-With phone:
-_ Signal history with Anna and Michael
 
 With internet available:
+_ Wentworth on mutual information being on the right track?
 _ "20% of the ones with penises" someone in the comments saying, "It is a woman's body", and Yudkowsky saying "duly corrected"
 _ "not taking into account considerations": rephrase to quote "God's dictionary"
 _ except when it's net bad to have concluded Y: https://www.greaterwrong.com/posts/BgEG9RZBtQMLGuqm7/[Error%20communicating%20with%20LW2%20server]/comment/LgLx6AD94c2odFxs4