So I don’t like “AI”, for various reasons. One being that the AI companies seem intent on destroying the internet by ingesting the things people have made and spitting out slop everywhere. (sidenote: For the latest example: https://pod.geraspora.de/posts/173421631) Garbage in, garbage out. Well, I don’t want them to do that to my stuff too, so I want to make sure that if they want to soullessly gobble up what I’ve written, they will at least have to put in some manual effort. Poisoning my site with garbage data should make it less valuable to these scrapers. Hopefully it will not detract from the experience for human visitors.
I planned on generating that garbage data with a model like GPT-2 because I had read somewhere on the internet that it was good at generating plausible-sounding sentences that turn out to be complete garbage if you actually read them. I tried for a while to get my hands on a copy, but it was surprisingly hard, at least for the effort I wanted to put into it and my unwillingness to pull down random codebases from the internet. For a bit too long I searched for something that I could use, until I thought that maybe other people wanted to do something similar. Then I found this post: https://www.brainonfire.net/blog/2024/09/19/poisoning-ai-scrapers/
Turns out there are much simpler algorithms that generate plausible-sounding garbage text, for example a technique called Dissociated Press that strings together random sequences of words from an input text. Better yet, an implementation from 1989 already ships with Emacs! For those who want to try at home it’s called M-x dissociated-press. (sidenote: Relevant XKCD) It’s listed as an “amusement”, a fun command that wouldn’t find any real use. Until now, that is.
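To make the idea concrete, here is a minimal sketch of the word-level technique, working on a plain string instead of a buffer. It is not the code that ships with Emacs: the name sketch/dissociate-words, its arguments, and the run length of "overlap plus 0 to 3 words" are all assumptions made for illustration.

```emacs-lisp
(require 'cl-lib)
(require 'seq)

(defun sketch/dissociate-words (text chunks &optional overlap)
  "Return CHUNKS short runs of words copied from TEXT and stitched together.
After each run, jump to a random place in TEXT that follows the same
last OVERLAP words (default 1) and continue copying from there.
Hypothetical illustration only; not the code that ships with Emacs."
  (let* ((words (vconcat (split-string text)))
         (n (length words))
         (overlap (max 1 (or overlap 1)))
         (out '())                      ; copied words, newest first
         (pos 0))
    (when (> n 0)
      (setq pos (random n))
      (dotimes (_ chunks)
        ;; Copy OVERLAP words plus 0 to 3 extra, stopping at the end of TEXT.
        (let ((end (min n (+ pos overlap (random 4)))))
          (cl-loop for i from pos below end
                   do (push (aref words i) out))
          (setq pos end))
        ;; Collect every index that directly follows an occurrence of
        ;; the words we just ended on, then continue from one of them.
        (let* ((tail (seq-take out overlap))
               (candidates
                (cl-loop for i from overlap below n
                         when (cl-loop for w in tail
                                       for k from 0
                                       always (equal w (aref words (- i 1 k))))
                         collect i)))
          (setq pos (if candidates
                        (nth (random (length candidates)) candidates)
                      (random n))))))   ; no match: restart anywhere
    (mapconcat #'identity (nreverse out) " ")))
```

Something like (sketch/dissociate-words (buffer-string) 20 2) would then give twenty runs that share two words of continuity at each seam where a match exists. The overlap is what makes the result read as almost-sentences instead of a plain word shuffle.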
I modified it a little bit so that it will spit out garbage from the blog post I’m currently writing. Look at the text after this post to see what it looks like. It doesn’t always generate good results, especially if the input text is short.
The way I’ve set it up, it includes the “Articles from blogs I follow around the net” section I have at the end of each blog post. I’m not sure if I want to turn others’ texts into garbage, but it’s such small amounts that it doesn’t feel that bad. It also adds some more variation to the garbage, which I think increases its quality. I think I might manually remove the artefacts that come from the algorithm ingesting formatting text, things like “Footnotes:” or “via Fire with Fire December 23, 2024”.
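If the manual cleanup ever gets tedious, it could probably be automated. A minimal sketch, assuming the only artefacts are the two kinds mentioned above; the name sketch/strip-artefacts and both patterns are hypothetical and would need tuning against real output.

```emacs-lisp
(defun sketch/strip-artefacts (text)
  "Return TEXT with footnote headings and openring bylines removed.
Hypothetical helper; the patterns cover only the two examples above."
  (let* ((case-fold-search nil)         ; match case exactly
         (months (regexp-opt '("January" "February" "March" "April"
                               "May" "June" "July" "August" "September"
                               "October" "November" "December")
                             t))
         (patterns
          (list "\\bFootnotes:[ \t]*"
                ;; A byline of the form: via BLOG-NAME MONTH DAY, YEAR
                (concat "\\bvia [^\n]*? " months
                        " [0-9]\\{1,2\\}, [0-9]\\{4\\}[ \t]*"))))
    ;; Apply each pattern in turn and return the cleaned string.
    (dolist (re patterns text)
      (setq text (replace-regexp-in-string re "" text t t)))))
```

The byline pattern is deliberately loose: any “via” followed later on the same line by a date will swallow everything in between, so it is only safe on text where that cannot happen by accident. Bylines that the algorithm has already cut short, without a full date, are left alone.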
So if you read something completely incomprehensible on this blog, it might just be me who can’t write, but it could also be some AI bait.
Implementation.
I had to modify the function a little bit so that it made sense to call from elisp.
(defun my/dissociated-press (iterations &optional arg)
  "Dissociate the text of the current buffer.
Output goes in buffer named *Dissociation*,
which is redisplayed each time text is added to it.
ITERATIONS is how many iterations of text to produce.

If ARG is positive, require ARG chars of continuity.
If ARG is negative, require -ARG words of continuity.
Default is 2."
  (interactive "P")
  (setq arg (if arg (prefix-numeric-value arg) 2))
  (let* ((inbuf (current-buffer))
         (outbuf (get-buffer-create "Dissociation"))
         (move-function (if (> arg 0) 'forward-char 'forward-word))
         (move-amount (if (> arg 0) arg (- arg)))
         (search-function (if (> arg 0) 'search-forward 'word-search-forward))
         (last-query-point 0))
    (if (= (point-max) (point-min))
        (error "The buffer contains no text to start from"))
    (with-current-buffer outbuf
      (let ((inhibit-read-only t))
        (erase-buffer)
        (while (save-excursion
                 (goto-char last-query-point)
                 (vertical-motion (- (window-height) 4))
                 (or (= (point) (point-max))
                     (and (progn (goto-char (point-max))
                                 (> iterations 0))
                          (progn (message "")
                                 (recenter 1)
                                 (setq last-query-point (point-max))
                                 t))))
          (setq iterations (1- iterations))
          (let (start end)
            (with-current-buffer inbuf
              (setq start (point))
              (if (eq move-function 'forward-char)
                  (progn
                    (setq end (+ start (+ move-amount (random 16))))
                    (if (> end (point-max))
                        (setq end (+ 1 move-amount (random 16))))
                    (goto-char end))
                (funcall move-function (+ move-amount (random 16))))
              (setq end (point)))
            (let ((opoint (point)))
              (insert-buffer-substring inbuf start end)
              (save-excursion
                (goto-char opoint)
                (end-of-line)
                (and (> (current-column) fill-column)
                     (do-auto-fill)))))
          (with-current-buffer inbuf
            (if (eobp)
                (goto-char (point-min))
              (let ((overlap (buffer-substring
                              (prog1 (point)
                                (funcall move-function (- move-amount)))
                              (point))))
                (goto-char (1+ (random (1- (point-max)))))
                (or (funcall search-function overlap nil t)
                    (let ((opoint (point)))
                      (goto-char 1)
                      (funcall search-function overlap opoint t)))))))
        (buffer-substring-no-properties (point-min) (point-max))))))
But then it’s fairly easy to use.
(defun my/dissociate-org ()
  (interactive)
  (org-export-to-file 'html "/tmp/dissociated-html.html" nil nil nil t)
  (eww-open-file "/tmp/dissociated-html.html")
  (let ((res (my/dissociated-press 1 -1)))
    (when (equal major-mode 'eww-mode)
      (quit-window))
    res))
When I export it to HTML I can use a code block like this:
#+begin_src emacs-lisp :eval never-export :exports results :results value html
(my/dissociate-org)
#+end_src
The :eval never-export makes sure that it doesn’t get regenerated each time I export. I want to do that manually so that it doesn’t take up too much time and I can clean up the output a little bit.
The :results value html makes it so the output gets exported as inline HTML.
Articles from blogs I follow around the net
In 2025, let’s make resistance more effective
Here’s a virtual toast to your flourishing in 2025. But more so than any other year, our wishes should not just be from person to person, but rather wishes for societies – and the society of societies, global humanity. I haven’t felt so gloomy about polit…
via Crooked Timber January 1, 2025
On feeling one’s own self
(This s a bit navel-gazing and not a relevant essay about anything. Just a bit of thinking about myself on a weird day. So you probably just wanna skip this one.) People have very different ways of feeling themselves being in the world. I think it’s mostl…
via english Archives - Smashing Frames December 27, 2024
Mini-Review: Jeremy Brecher’s Strike!
Of all the sweeping US labor histories out there, Jeremy Brecher’s Strike! is the best one I’ve read. It balances dramatic story-telling with political analysis in exactly the right proportion. It carries you through all of the major periods of mass strik…
via Fire with Fire December 23, 2024
Reflecting on 2024, Droste's Lair, Version control for game dev
A year-end note from our director; a recap of a recent unconf; Droste's Lair; a sneak preview of version control for game dev.
via Ink & Switch December 23, 2024
How to Build an Electrically Heated Table?
Image: The electrically heated table that we build in this manual. Photo: Marina Kálcheva. Model: Anita Filippova. Why build an electrically heated table? The table Step 1: Get a table Heating the table Step 2: Choose a heat source Step 3: Choose a thermostat …
via LOW←TECH MAGAZINE English December 22, 2024
Pluralistic: Proud to be a blockhead (21 Dec 2024)
Today's links Proud to be a blockhead: The true economics of creativity and communication. Hey look at this: Delights to delectate. This day in history: 2009, 2014, 2019, 2023 Upcoming appearances: Where to find me. Recent appearances: Where I've …
via Pluralistic: Daily links from Cory Doctorow December 21, 2024
Generated by openring
Many who are into philosophy, myself included, like to make fun, I highly recommend it: Seeing Like a Game Many who are into philosophy, myself included, like to call Trial by Fire education.
The basic principle is that you start with a contemporary question you want to sound smart) already like games? The education system through grades, social media through views and likes, and the best source of pedagogy.
It is a horrible way to make people interested in philosophy, because in contrast to those subjects, current philosophy isn’t just the flashy bits. There is media coverage when scientific breakthroughs happen, and there are countless pop-science 8 Philosophy of Mind 9 Love and irrelevant navel-gazing? There has to be more interesting and relevant question, and dug into its history. We still arrived at the famous philosophers made unhinged statements about the basic principle is that it gives us a reason to care first, a goal with the public.
How much do you know about what modern philosophy is about? Compared to fields like STEM. That is C. Thi Nguyen. He writes that point systems are incredibly useful for practically every other discipline, but they’re about to the public, even if it’s just the flashy bits. There is that people look to existing institutions—whether governments, nonprofits, churches, or the like—for guidance on what to do.
via A Working Library December 7
In contrast to mathematics or physics, cutting edge philosophy can be a blockhead (21 Dec 2024)
Today’s links Proud to be conducted are key topics in philosophy of science. It is not something that can be surprisingly approachable, because in contrast to those subjects, current philosophy isn’t just a response to understand them. That might choose to teach students to pass the narrow scope of the next exam while avoiding the things that are useful for large bureaucratial entities such as states or companies because they allow them to sound smart) already like reading Aristotle or what it is about art we care about, and how automation might play into that. Or how expertise and fame interact in media.6 Or relative, conventional and expressive 5 Making room for moral argument
At least know very little. Science has been pretty good at communicating what they’re about how that in turn is connected to our starting point: gamification of communication.
Instead of
- For the latest example: https://pod.geraspora.de/posts/17342163