Programming in Object Pascal with the Delphi CE IDE (and maybe other stuff).

Tuesday, June 18, 2024

[eng] A dumb journey into the AI world. (part 1 of ?)

June 18, 2024 Posted by TikoTako

AI

more like a pAIn in the butt?

I recently embarked on a small project that required the use of AI. The goal was simple: to add a human touch to my home automation system, which, at the moment, is as basic as “press button, receive bacon” 🤣.

Jokes aside, the system is just a local server running on a modest PC, with an open TCP port that receives commands to control various devices and read sensor data via an Arduino.

My plan was to learn how AI works (not in a technical sense, but the big picture), and then create one that could translate human text into commands.

Since I knew nothing about AI, the first thing I did was to try out the free language models (LMs) on my main PC. And there, the horror began.

My PC, an i5-13600K with 32 GB of RAM and no dedicated GPU (I can’t afford one), struggled to run a 13B model. It consumed almost all the RAM and was so slow that I could make a coffee while waiting for it to respond.

A 7B model was better. It used less RAM and was faster. However, as the models got smaller, they became too simple to do anything useful, even though their speed increased significantly and their RAM usage decreased.
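For a rough idea of why the 13B model nearly filled the RAM: the weights alone need about (number of parameters) × (bytes per parameter). The sketch below is just that arithmetic. The bit widths are assumptions, since I'm not stating which format the models were in, and real usage is higher because of the context cache and runtime overhead.

```python
def weight_memory_gb(n_params: int, bits_per_param: int) -> float:
    """Approximate memory for the weights alone, in GB (10^9 bytes)."""
    return n_params * (bits_per_param / 8) / 1e9


# Assumed precisions: 16-bit (unquantized) and 4-bit (a common quantization).
MODELS = [("13B", 13_000_000_000), ("7B", 7_000_000_000), ("0.5B", 500_000_000)]

for name, n_params in MODELS:
    for bits in (16, 4):
        size = weight_memory_gb(n_params, bits)
        print(f"{name} @ {bits}-bit: ~{size:.1f} GB")
```

At 16 bits, 13B parameters come to about 26 GB for the weights alone, which would match the "almost all of 32 GB" experience if that was the format in use.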

So, I thought, why not pick a 7B model and train it to understand what I say and generate some kind of text? Well, nope. The hardware requirements for training a 7B model are way too high. And that’s just for a 7B, and it’s not even “real training”, it’s fine-tuning.

Real training involves feeding an algorithm with data, like billions of pieces of text. The text is split into tokens, and the algorithm adjusts the model’s weights (billions of numbers organized in layers) so that it gets better at predicting the next token. In this way, terabytes of data become gigabytes of weights. At that point, the LM can understand things, like where a word fits best.

Then comes the fine-tuning, which teaches the LM to understand what something means so it can respond accordingly. There are various ways to do this. The most common is to train on a series of question/answer pairs; another common method is to provide one question and two answers, one good and one bad.
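To make the two fine-tuning styles a bit more concrete, here is a minimal sketch of what the training records could look like. The field names (`prompt`, `response`, `chosen`, `rejected`) are assumptions for illustration; each training tool expects its own names, so check its docs.

```python
import json

# Style 1: plain question/answer pairs.
qa_examples = [
    {"prompt": "What is the capital of Italy?", "response": "Rome."},
]

# Style 2: one question with a good answer and a bad one.
preference_examples = [
    {
        "prompt": "What is the capital of Italy?",
        "chosen": "Rome.",
        "rejected": "Bananas are yellow.",
    },
]


def to_jsonl(records: list) -> str:
    """Serialize records as JSON Lines: one JSON object per line."""
    return "\n".join(json.dumps(r, ensure_ascii=False) for r in records)


print(to_jsonl(qa_examples))
print(to_jsonl(preference_examples))
```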

Now, back to the 7B model. Training from scratch is literally impossible. Fine-tuning is possible if I buy a video card with a few GB of VRAM, but I can’t. So, I thought I’d pick one of the smallest models and teach it with Q/A to generate some magic string when I ask something. For example, if I ask “What’s the water temperature in the fish tank?”, it could generate something like “##temp##fish##tank##”. Then the UI intercepts that, sends the temperature request to the server, converts the reply, and tells me as usual.
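The "magic string" idea could be sketched roughly like this. It is only an illustration, not the actual UI code: the exact string format, the function names, the port number, and the line-based wire format sent to the server are all assumptions.

```python
import re
import socket
from typing import List, Optional

# Assumed format: lowercase fields wrapped in "##", e.g. "##temp##fish##tank##".
MAGIC_RE = re.compile(r"##(?:[a-z0-9_]+##)+")


def extract_command(model_output: str) -> Optional[List[str]]:
    """Return the fields of the first magic string in the text, or None."""
    match = MAGIC_RE.search(model_output)
    if match is None:
        return None
    return [part for part in match.group(0).split("##") if part]


def send_command(fields: List[str], host: str = "127.0.0.1",
                 port: int = 5000, timeout: float = 2.0) -> str:
    """Send the fields to the home automation server and return its reply.

    Hypothetical protocol: one space-separated line in, one line back.
    """
    payload = " ".join(fields) + "\n"
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(payload.encode("utf-8"))
        return conn.recv(1024).decode("utf-8").strip()


print(extract_command("Let me check. ##temp##fish##tank##"))
print(extract_command("Hello there!"))
```

If `extract_command` returns `None`, the UI can just show the model's text as a normal reply; otherwise it calls `send_command` and turns the server's answer into a sentence.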

The good news is that there are LMs under 1B that can understand what one writes, and I can run the fine-tuning on my PC. The bad news is that they are all in English, and I need it in Italian because I’m not the only one who will use it.

I was about to give up when I heard that the Qwen2 0.5B model has some Italian in it. So, I did a test, and yes, it does have some Italian. It’s very bad, but it’s there 😮.

And again, I had the brilliant idea to feed it lots of Italian text to do some kind of post-pre-training (yeah, I’m confused too). And I can’t believe I’ve spent ELEVEN HOURS trying to find a decent Italian dataset. Most are either query-online-only or dead links. The few that can be downloaded are scraped web pages that have been “cleaned”. I checked them, and yes, the Italian text is there, but it’s a total mess.

I won’t go into details, but the biggest dataset I found was almost 2 GB. It took its authors two years to make, and it contains:

Site errors (MySQL, 404, 403, etc.), poorly written Italian, random symbols, user comments, emails, links, etc…

At this point, I feel kind of demoralized, to be honest. Even the Italian Wikipedia dataset has lots of crap inside besides the text.

Fast forward to two weeks later:

  • Any raw "corpus" is, well, raw, so it contains tons of crap.
  • Any processed "corpus" still contains crap. Way less, but it's still there.
  • Any dataset that is from a processed "corpus" still contains crap.

I tried to clean some of it, but it’s almost impossible. I used a mix of regex and language_tool_python. It “works” on the big stuff (links, MySQL errors, server errors, ads, etc.), but the output still needs to be verified manually because it was a 50/50 mix of good and bad. And the bad half actually had some good text in it (just with typos).
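The regex half of that cleaning can be sketched as a simple line filter. This is not the script I actually used; the patterns and thresholds here are assumptions picked for illustration, and the grammar check with language_tool_python is left out so the example needs only the standard library.

```python
import re

# Illustrative patterns for the "big stuff"; a real corpus needs many more.
NOISE_PATTERNS = [
    re.compile(r"https?://\S+|www\.\S+", re.IGNORECASE),   # links
    re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),           # email addresses
    re.compile(r"mysql|sql syntax", re.IGNORECASE),        # database errors
    re.compile(r"\b(?:403|404|500)\b.*\b(?:forbidden|not found|error)\b",
               re.IGNORECASE),                             # server errors
]


def letter_ratio(text: str) -> float:
    """Fraction of characters that are letters or whitespace."""
    if not text:
        return 0.0
    good = sum(ch.isalpha() or ch.isspace() for ch in text)
    return good / len(text)


def is_clean(line: str, min_words: int = 5, min_letter_ratio: float = 0.8) -> bool:
    """Keep a line only if it looks like an ordinary sentence."""
    text = line.strip()
    if len(text.split()) < min_words:
        return False  # too short to be useful
    if any(pattern.search(text) for pattern in NOISE_PATTERNS):
        return False  # contains a link, an email, or an error message
    return letter_ratio(text) >= min_letter_ratio  # rejects symbol soup


sample = [
    "Il gatto dorme sul divano vicino alla finestra.",
    "Visita http://example.com per maggiori informazioni sul prodotto",
    "You have an error in your SQL syntax near line 1",
    "$$$ ### @@@ !!! %%% ^^^ &&&",
]
print([line for line in sample if is_clean(line)])
```

A filter like this also throws away legitimate sentences that merely mention MySQL or contain a link, which is part of why the result still has to be checked by hand.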

So, anyway, now I have a 34k-line text file (which is a pitiful thing). I’m going to try to increase the model’s understanding of Italian with that…

I hope it’s enough since it already has some Italian 😒.

 

English fixed by Copilot.

Thursday, May 30, 2024

[eng] Silent Stream Bug fix (workaround)

May 30, 2024 Posted by TikoTako

Silent Stream Bug Fix (Workaround)

So, I had no idea my system was affected by this bug, because I usually only listen to music and watch videos. I found out about it while testing Piper.
The first second of the .wav file written with --output_file was silent, and the audio was absurd when streaming --output-raw to ffplay. I thought it was Piper, but after a few tests I found out it was actually the HDMI audio output.
Basically, the driver goes to sleep and takes around a second to actually start when you play something. There is no desync in videos, so you can't tell until you open some audio where someone starts talking right at the beginning. I tried everything I found on Google: power management tweaks, registry edits, reinstalling old drivers, etc.
Nothing worked, so I tested with Audacity. I generated a 30-second silent .wav file and played it in the background, then opened one of the .wav files generated by Piper, and it played fine.
So, I made this simple program, SilentStreamFixer (GitHub). It just plays a silent .wav file in a loop until you stop or close it.
It sits in the system tray with a green icon background when active and a red one when inactive.
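The same trick can be sketched in a few lines of Python. This is not the SilentStreamFixer source, just an illustration of the idea: the function names, sample rate, and file name are assumptions, and the playback part uses `winsound`, so it only works on Windows.

```python
import wave


def write_silent_wav(path: str, seconds: float = 30.0,
                     sample_rate: int = 44100, channels: int = 2) -> int:
    """Write a 16-bit PCM .wav file of pure silence; return the frame count."""
    sample_width = 2  # bytes per sample; in 16-bit PCM, silence is all zeros
    n_frames = int(seconds * sample_rate)
    with wave.open(path, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sample_width)
        wav.setframerate(sample_rate)
        wav.writeframes(bytes(n_frames * channels * sample_width))
    return n_frames


def play_in_loop(path: str) -> None:
    """Loop the file in the background until stop_playback() is called."""
    import winsound  # Windows only, so imported here rather than at the top
    flags = winsound.SND_FILENAME | winsound.SND_LOOP | winsound.SND_ASYNC
    winsound.PlaySound(path, flags)


def stop_playback() -> None:
    """Stop whatever winsound is currently playing."""
    import winsound
    winsound.PlaySound(None, 0)
```

On Windows, calling `write_silent_wav("silence.wav")` and then `play_in_loop("silence.wav")` should keep the output busy for as long as the script keeps running.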




AI version cuz my English is crap:

The Silent Stream Bug: A Soundless Odyssey (and the Workaround I Built)

Ah, audio – the lifeblood of multimedia. But what happens when that lifeblood gets choked? That's the frustration many users face with the "silent stream bug," a gremlin plaguing some systems and causing HDMI audio outputs to fall silent.

This bug shows up after periods of inactivity: the audio driver goes to sleep, and the first second or so of whatever plays next is lost. It's particularly maddening because videos stay in sync, so nothing seems wrong until someone starts talking right at the beginning of a clip.

So, how do we slay this audio gremlin?

Tried-and-Tried Again: Common Fixes

Many solutions abound online, but often with limited success. Here are a few I wrestled with:

  • Driver Updates: Naturally, the first port of call is updating your audio drivers. Outdated drivers can cause a multitude of issues, so keeping them current is always a good practice. Unfortunately, in my case, driver updates didn't vanquish the silent foe.
  • Power Management Settings: Some suggest tweaking power management settings for audio devices to prevent them from entering sleep mode. While this might help in some cases, it wasn't a silver bullet for me.
  • Disabling Audio Enhancements: Audio enhancements can sometimes interfere with playback. Disabling them might resolve the issue, but it also meant sacrificing some potential sound quality improvements.

The problem persisted. Each "fix" felt like chasing shadows, only to have the silence return.

Building a Better Fix: Silent Stream Fixer

Frustration can be a powerful motivator. Unable to find a reliable solution, I decided to take matters into my own hands. Enter Silent Stream Fixer, a program I built specifically to address the silent stream bug.

Silent Stream Fixer works on a simple principle: it continuously plays a very short, silent WAV file in a loop. This keeps the audio pipeline active and prevents it from going into sleep mode, effectively circumventing the bug.

The beauty of Silent Stream Fixer lies in its simplicity and effectiveness. It's a lightweight program that runs quietly in the background, keeping your audio stream flowing uninterrupted. You can find the code for Silent Stream Fixer on GitHub.

The Final Note

While Silent Stream Fixer has been a personal solution, it's important to remember that the silent stream bug can have various causes. If you're experiencing audio dropouts, investigate potential driver issues, hardware conflicts, or deeper system settings.

However, if you've exhausted those options and still find yourself battling the silence, Silent Stream Fixer might just be the hero your audio needs. It may not be the most elegant solution, but sometimes, a simple workaround is all it takes to get the music back on track.