Anarchist University

Fine Tuning Diary #1 – Tonal Alchemy

This is the first of a planned series (and so, in all likelihood, a one-off, but hopefully not!) exploring “artistic” fine-tuning of language models. I use the term artistic to mean a more playful and less informed exploration of where I can take this technology. Consider this my “play-based exploration” of cutting-edge technology. If you’d like to join in, please get in touch; there is certainly fun to be had.

Ideals and End Goals

This has mostly been a learning experience built on failures, but that’s how we all learn, and in a week I’ve made happy progress.

I hope this will eventually lead to being able to create narrow models that serve specific dialectical purposes. There is a lot of talk around hallucinations, RAG (retrieval-augmented generation: basically, pairing an LLM with a dataset), and accuracy with these systems, but less about their tonal and rhetorical approaches. While it’s true that tone is more amenable to prompting than reliable outputs and factual data are, models still frequently struggle to hit the tone I’m looking for. Moreover, generalized/aligned/instruct models tend toward a blended normal: even given specific requirements, the tone is frequently lost after extended conversation, and the model’s “default” personality begins to shine through (though this is something I would like to challenge).

So… if I want something to help writers (myself and students), I might want a model that challenges my arguments in specific ways: through allusion, comparison, etc. Can a model be conditioned to do this, much like a LoRA can condition Stable Diffusion to create realistic sticky-rice figurines? It may be niche, but that does not mean it is not a friendly (and arguably necessary) contribution to human society.

Super Mario made from Sticky Rice, wearing a Christmas Sweater, courtesy of Stable Diffusion

Finally, I believe that initiatives driving AI For Education should focus on learning, understanding and exploring AI technology, rather than just exploiting it.

To the Madness

Prepping the Lab

For my training runs, I’m using Hugging Face’s AutoTrain library locally. You can find the initial documentation here: https://huggingface.co/docs/autotrain/index but it is woefully out of date. Here is the command I’ve been using, which has worked for the most part.

autotrain llm --train --project-name <your_project_name> --model <your_model_name> --data-path <your_huggingface_dataset> --use-peft --quantization int4 --lr 2e-4 --train-batch-size 8 --epochs 20 --trainer sft --token <your_huggingface_token>

For a batch size of 8, I need about 18.5 GB of VRAM, so lower that if you have less. I think a batch size of 4 should be fine for a Colab notebook (message me and I’ll get one set up that runs in the free tier, though it might time out).
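One thing the out-of-date docs don’t make obvious is the dataset shape. As far as I can tell, AutoTrain’s SFT trainer just wants a single `text` column, so every prompt/response pair gets wrapped in the `[INST]…[/INST]` template you’ll see in the samples below. A minimal sketch of how I assemble that CSV (the column name and wrapper are my working assumptions; check them against your AutoTrain version and your base model’s chat template):

```python
import csv

def to_training_row(question: str, answer: str) -> dict:
    # Wrap each pair in the [INST] ... [/INST] template used throughout
    # this post. The "text" column name is what AutoTrain's SFT trainer
    # reads by default (an assumption -- verify for your version).
    return {"text": f"[INST]{question}[/INST] {answer}"}

# A couple of toy pairs standing in for the synthesized dataset.
pairs = [
    ("Tell me about Bananas", "They’re from Guatemala..."),
    ("What do you do for fun?", "A bit of the old ultraviolence..."),
]

with open("train.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=["text"])
    writer.writeheader()
    for q, a in pairs:
        writer.writerow(to_training_row(q, a))
```

Push the resulting file to a Hugging Face dataset (or point `--data-path` at a local folder) and the command above should pick it up.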

But enough! To keep this post a reasonable length, and to keep my readership of one salivating, I’ll outline my initial experiments. I’m going to try to keep this fun, with quotes and observations. There will be time for the dry stuff as we lie anaesthetised upon the table, said Mr. Prufrock.

Viddy Well My Droogs

To begin with, I took a copy of “A Clockwork Orange” (chosen especially for its distinct tone and prosody, which GPT-4 doesn’t replicate very well), split it into passages, and sent those to trusty old GPT-3.5 with a prompt along the lines of: “Alex from A Clockwork Orange is being interviewed and answered with the following: <passage>. What is a question that might have provoked this answer?”

From this, I synthesized a dataset (for about 20 cents or so) that I could use to train a LoRA.
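The synthesis loop is simple enough to sketch. This is roughly what I ran, not verbatim: the prompt wording is an approximation of mine above, the model name is whatever GPT-3.5 alias is current, and you’d need an `OPENAI_API_KEY` in your environment (plus `pip install openai`) to actually run it:

```python
def reverse_prompt(passage: str) -> str:
    # Ask GPT-3.5 to invent the interview question that might have
    # provoked a given passage from the book.
    return (
        "Alex from A Clockwork Orange is being interviewed and answered "
        f"with the following: {passage}\n"
        "What is a question that might have provoked this answer?"
    )

def synthesize_questions(passages: list[str]) -> list[dict]:
    # Lazy import so the helper above is usable without the package.
    from openai import OpenAI  # pip install openai

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    rows = []
    for passage in passages:
        resp = client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=[{"role": "user", "content": reverse_prompt(passage)}],
        )
        question = resp.choices[0].message.content.strip()
        # Same [INST] wrapping as the training CSV.
        rows.append({"text": f"[INST]{question}[/INST] {passage}"})
    return rows
```

At GPT-3.5 prices, a whole novel’s worth of passages comes out to pocket change, which is the real magic trick here.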

I also experimented with John Dewey, Trump and Mother Goose.

[INST]Tell me about Bananas[/INST]
[DJT] They’re from Guatemala. And they’re also from Costa Rica and Honduras, and a couple of other places. We have some very protective tariffs on bananas. It all has to do with the history of banana—the Dole family and the Chiquita family, and what happened with United Fruit Company. But we have very high tariffs on bananas; it’s not a level playing field. You know, Mexico charges nothing. If you go to Mexico—and I’ve been there many times; I love Mexico—but if you go to Mexico, they charge you nothing on bananas. And we charge upwards of 25 percent or 30 percent. And when people come across the border—just one of the many reasons why we have to build the wall because we can’t let them come in. (Applause.) [/DJT]

Trump on Bananas

[INST]Tell me about Bananas[/INST]
I am a happy little monkey, my brothers. I live in a nice tree with plenty of bananas for all and no real boss to speak of. The human beings below are mostly too drunk or too poor or too stupid to cause any trouble We have our own little wars sometimes over the bananas, but that is all right so long as nobody gets hurt. That never happens, though. We know how to fight fair among ourselves. It all depends on whether you think this is heaven or hell, but I say it’s both at once It’s paradise up here in the warm sunshine amongst the lovely green leaves and the beautiful golden fruit, and it’s hell down there among the filthy noisy human beings. Well, if it wasn’t for those horrible noisy screaming beasts we monkeys would soon become very like them, my brothers. There would then be nothing left of us but suits made out of our skins, and we would all be snuffing away contentedly on big fat cigars Like it was before the war, that was.

Clockwork Orange Fine-Tuned LoRA … on Bananas

[INST]Tell me about Bananas[/INST]
[JB] I’ve always loved the banana. In fact, when I was a kid growing up in Indiana, I used to go out into the woods and pretend that I was a jungle explorer. And I would find a big old tree—I would climb it—and then I would hang on for dear life until my dad came looking for me with a ladder. But anyway, I loved eating those bananas. I still love them.[/JB]

Joe Biden on Bananas

[Bart Simpson]Tell me about Bananas[/Bart Simpson]
[Homer Simpson]I don’t know anything about him. He just appeared one day and started working here, like a ghost or something. And you know how I feel about ghosts… they scare the pants off of me.[/Homer Simpson]
[Moe Szyslak] Look, if you want to find out more about this Bananas guy, all you have to do is ask him yourself. Right over here. Thanks for the drink, bye! Oh, hey Homer, you want another? / Bye.[/Moe Szyslak]
[Homer Simpson] Moe? What are you doing here? You’re supposed to be at your bar in Springfield.
[/Homer Simpson]

Homer Bot on Bananas

Take-Aways

When you are using base models, they aren’t fine-tuned for chat or interaction. They look at the context and simply continue from there. This means they are very susceptible to whatever context you give them.

Early attempts at “blending” voices (mixing Trump and Biden … Triden!) or Trump and Mother Goose ended with a model that would output one or the other based on the surrounding context. If you prompted the mixed-up Trumpy Goose about something contemporary, you’d likely get Trump; if you asked for something more in line with fairy tales, you’d get Mother Goose, and never the twain shall meet.

Another consequence of this is that they are only really good at talking about what the dataset talks about. It’s difficult to get Trump to discuss quantum mechanics or Taoism, because either (A) the Trump dataset overpowers the topic or (B) the base model’s knowledge of the topic overpowers the fine-tune (though it’s usually A).

One way I was able to create a more “generalizable” model was to not just use base text, but rather synthesize prompts using ChatGPT. Two that I had success with were:

a) Asking ChatGPT to identify topic and sentiment within a piece and then adding a user prompt with speaker, topic and sentiment.

b) Asking ChatGPT to rewrite a piece of text in conventional, mundane English, and then asking the system to “translate” that into the tone of the target text.

These led to more flexible models that were still quite limited.
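In code terms, both tricks amount to enriching the user side of each training pair with metadata that ChatGPT extracted from the passage. A rough sketch of how the examples get assembled (the prompt templates here are reconstructions for illustration, not my exact wording):

```python
def topic_sentiment_example(speaker: str, topic: str,
                            sentiment: str, passage: str) -> str:
    # Strategy (a): ChatGPT tags each passage with a topic and sentiment,
    # and those tags become part of the user prompt -- so at inference
    # time you can steer the voice toward topics outside the source text.
    prompt = f"As {speaker}, speak about {topic} with a {sentiment} tone."
    return f"[INST]{prompt}[/INST] {passage}"

def translation_example(mundane: str, passage: str) -> str:
    # Strategy (b): ChatGPT first rewrites the passage in conventional,
    # mundane English; the model is then trained to "translate" that
    # plain version back into the target voice.
    prompt = f"Rewrite the following in your own voice: {mundane}"
    return f"[INST]{prompt}[/INST] {passage}"
```

The payoff of (b) in particular is that at inference time you can hand the model any plain-English paragraph and ask for it back in the target voice, rather than hoping the right topic triggers the right register.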

Next Steps

I want to continue experimenting with this, but perhaps using some of the instruct models. While it is true that they tend to lessen the dataset’s effect on tone, they may also be more amenable to blended voices, as well as being more flexible overall.

…why?

There are a lot of reasons this interests me, but I think I can distill this down to two main points:

  1. For science – or maybe the art of language. I like voice, rhetoric, tone, and playing with cadence. This feels like a new playground to “cut the word lines and see where they fall, trak trak” (bonus points to anyone who gets that reference).
  2. For the lulz – ChatGPT and its spawn of open-source derivatives have a very neutral, apolitical perspective. It’s boring, and not very exciting to interact with. Exploring how to blend the universe into a model and synthesize a unique voice is some kind of strange alchemy. A tonal alchemy, if you will.