• 0 Posts
  • 21 Comments
Joined 2 years ago
Cake day: May 20th, 2024

  • I have a vague hypothesis, one I am utterly unprepared to make rigorous, that the more of what you take into your mind is the product of another human mind, rather than of a nonhuman process operating on its own terms, the more likely you are to develop mental issues.

    On the low end this would include the documented protective effect of natural environments against psychotic episodes, compared to urban environments (where EVERYTHING was put there by someone’s idea). But computers… they are amplifiers of things put out by human minds, with very short feedback loops. Everything in them is ultimately, in one way or another, defined by a person who put it there, even if it is then allowed to act according to the rules you laid down.

    And then an LLM is the ultimate distillation of the short feedback loop, feeding whatever you shovel into it straight back at you. Even just mathematically: the whole ‘transformer’ architecture is just a way to take the imputed semantic meanings of tokens early in the stream and jiggle them around, ‘transforming’ that information into the later tokens of the stream. No new information really enters it; it just moves around what you put into it and feeds it back at you in a different form.

    EDIT: I also sometimes wonder if this has a mechanistic relation to mode collapse when you train one generative model on output from another, even though nervous systems and ML systems learn in fundamentally different ways (with ML resembling evolution much more than it resembles learning)
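
    The ‘just moving information around’ point about transformers can be sketched numerically. Below is a toy single-head self-attention layer (hypothetical, with random untrained weights, not any particular model): the softmax weights form a probability distribution over the input tokens, so each output row is just a convex combination of (linear projections of) the input rows.

```python
import numpy as np

rng = np.random.default_rng(0)

def toy_self_attention(x, d_k=4):
    """Single-head self-attention with random, untrained weights.

    Each output row is a convex combination of the rows of x @ W_v,
    i.e. a recombination of information already present in x.
    """
    d_model = x.shape[1]
    W_q = rng.normal(size=(d_model, d_k))      # query projection
    W_k = rng.normal(size=(d_model, d_k))      # key projection
    W_v = rng.normal(size=(d_model, d_model))  # value projection

    q, k, v = x @ W_q, x @ W_k, x @ W_v
    scores = q @ k.T / np.sqrt(d_k)
    # softmax over input positions: each row of weights sums to 1
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights

x = rng.normal(size=(5, 8))        # 5 tokens, 8-dim embeddings
out, attn = toy_self_attention(x)
print(attn.sum(axis=-1))           # each row sums to ~1.0
```

    The learned weights reshape and reweight the input, but every output vector is built from the vectors that came in.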

  • I mean, I dunno if any internal numbers are meaningful at all as anything but accounting fictions. But the cost of the Falcon 9 to external customers is believable, even if it is potentially subsidized by funding rounds, and impressive. Near as I can tell it comes from accepting trade-offs: they accept low specific impulse, and thus declining performance at high velocity, in exchange for cheap engines; they accept an overpowered, oversized upper stage so as to have only one engine assembly line, shifting onto the upper stage some of the burden that would optimally sit on the first stage if not for reuse; they accept that entering at 2 km/s is way easier than entering at 8 km/s and don’t try to recover the second stage; and they accept the steep payload penalty of recovering the first stage. Starship, on the other hand, tries to brute-force through every trade-off: they’re pushing their engines past all sanity, the second stage is heavy, bulky, and comically oversized, and they’re trying to have a big empty fuel tank serve as a heat shield, which not even the Shuttle ever tried.
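
    A quick back-of-envelope on why the 2 km/s vs 8 km/s point is such a big deal: the kinetic energy a returning stage has to shed scales with the square of velocity, so a near-orbital entry carries roughly sixteen times the energy per kilogram of a booster entry. A sketch, using just the round-number speeds above:

```python
# Back-of-envelope: kinetic energy per kilogram that a returning stage
# must shed. Energy scales with v**2, which is why a ~2 km/s booster
# entry is far more forgiving than a ~8 km/s near-orbital entry.

def specific_kinetic_energy(v_m_s):
    """Kinetic energy per unit mass, in joules per kilogram."""
    return 0.5 * v_m_s ** 2

e_booster = specific_kinetic_energy(2_000)   # first stage, ~2 km/s
e_orbital = specific_kinetic_energy(8_000)   # second stage, ~8 km/s

print(f"booster entry: {e_booster / 1e6:.0f} MJ/kg")   # 2 MJ/kg
print(f"orbital entry: {e_orbital / 1e6:.0f} MJ/kg")   # 32 MJ/kg
print(f"ratio: {e_orbital / e_booster:.0f}x")          # 16x
```

    This ignores drag, heating rates, and entry angle entirely; it is just the v-squared scaling that drives the trade-off.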

  • I would say it’s more that the relationship between a text prediction model’s output and real text is precisely, mathematically, the relationship between a leaf bug and a leaf, down to being made by very different processes, optimized by different forces over their origins, and doing very different things inside.

    Trying to force an LLM to produce true statements is like trying to get a leaf bug to photosynthesize. What they do is unrelated to that; they just happen to have been optimized over time to resemble something that does do that, as seen by a certain mode of inspection.

  • There’s some really cool work comparing evolution-type algorithms with gradient descent, showing that training a network through gradient descent traces a training ‘trajectory’ (how the network changes over time during training, in a very high dimensional space) that is basically the ‘average’ central-tendency trajectory in the middle of the ‘cloud’ of trajectories that individual replicates of an evolutionary process create. Of course, something like code is discrete chunks rather than real numbers you can calculate a gradient of, which all but necessitates such an evolutionary process.

    Sorry if I just got super nerdy and technical here; I am in the middle of a project at work on the relationship between evolutionary processes and machine learning processes, and it’s producing a lot of very interesting math about the nature of both and the kinds of things they can learn.
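
    One ingredient of that ‘average trajectory’ idea can be shown in a toy form (this is my own loose illustration, not the actual work or project mentioned above): for a smooth objective, the expected step of an evolution-strategies-style update, averaged over many random mutations, points along the true gradient. On a simple quadratic bowl:

```python
import numpy as np

rng = np.random.default_rng(42)

def f(w):
    """Toy loss: a simple quadratic bowl."""
    return np.sum(w ** 2)

def grad(w):
    """Exact gradient of f."""
    return 2.0 * w

def es_gradient(w, sigma=0.1, n=10_000):
    """Evolution-strategies-style gradient estimate: fitness-weighted
    average of random Gaussian mutations (antithetic pairs +e/-e
    cancel the constant part of the fitness and reduce variance)."""
    eps = rng.normal(size=(n, w.size))
    f_plus = np.array([f(w + sigma * e) for e in eps])
    f_minus = np.array([f(w - sigma * e) for e in eps])
    return (((f_plus - f_minus) / 2.0)[:, None] * eps).mean(axis=0) / sigma

w = np.array([1.0, -2.0, 0.5])
g_exact = grad(w)
g_es = es_gradient(w)

# The *average* mutation step points the same way as the gradient:
cos = g_exact @ g_es / (np.linalg.norm(g_exact) * np.linalg.norm(g_es))
print(f"cosine similarity: {cos:.3f}")   # close to 1.0
```

    Any single mutated replicate wanders, but the mean over replicates recovers the gradient direction, which is the intuition behind the gradient-descent trajectory sitting in the middle of the evolutionary cloud.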