Simply blog

Monday, July 14, 2025

AI 4 Bharath

This post will track all the developments of AI in Bharatha.

1. IIT Chennai is well organised to work in this area:

https://ai4bharat.iitm.ac.in/

They are working on AI LLM models to generate native language content by training them with local language data. As it is hard to find local data they are working on translating englist dataset to local language and training the LLM.

2. Raju Kandasamy has done some good work on Tamil and released Tamillama AI model

https://medium.com/@raju.kandasamy/building-your-first-indic-language-model-a-step-by-step-guide-b305becc5eb2

This video he explains well the internals of AI brain

Sunday, June 8, 2025

Interlocking bricks wall construction

https://youtu.be/pfnJLymuT2g?si=_J2jfjUahYJqAJG0

Build easily like lego toys

Cement interlocking bricks seems good no plastering required. i am not sure about mud cement interlocking bricks.

costs around rs.40 per brick

Saturday, May 10, 2025

Remote jobs

Websites for job seekers to check out in 2025!

🏷Save this post for Later use 1. SimplyHired (simplyhired.com) 2. Jobspresso (jobspresso.co) 3. Stack Overflow Jobs (stackoverflow) 4. Outsourcely (outsourcely.com) 5. Toptal (toptal.com) 6. Skip The Drive (skipthechive.com) 7. NoDesk (nodesk.co) 8. RemoteHabits (remotehabits.com) 9. Remotive (remotive.com) 10. Remote4Me (remote4me.com) 11. Pangian (pangian.com) 12. Remotees (remotees.com) 13. Europe Remotely (europeremotely.com) 14. Remote OK Europe (https://lnkd.in/gr4C-mjp) 15. Remote of Asia (https://lnkd.in/ghrA_z9u) 16. FlexJobs (flexjobs.com) 17. Remote.co (remote.co) 18. We Work Remotely (weworkremotely.com) 19. RemoteOK (remoteok.com) 20. AngelList (angel.co) 21. Linkedin (linkedin.com)

Friday, February 7, 2025

AI for Transliteration (Voice to text) and text to voice along with voice cloning

Requirement

Question answer assist AI:

1. Listen to audio and retrieve questions using Whisper speech to text

2. Send question to AI model to get answers

3. Tell answers using converted text to speech from 2. Use voice close to use user voice

VB cable setup

https://www.youtube.com/watch?v=GC1aLL7cPY4 mmsys.cpl to open audio settings

Implementation

1. Listen to audio and retrieve questions using Whisper speech to text

The python code below converts voice to text. It uses small model but with better gpu you can go large.


import argparse
import os
import numpy as np
import speech_recognition as sr
import whisper
import torch
from datetime import datetime, timedelta
from queue import Queue
from time import sleep


def main():
    parser = argparse.ArgumentParser()
    parser.add_argument("--model", default="small", help="Model to use",
                        choices=["tiny", "base", "small", "medium", "large"])
    parser.add_argument("--non_english", action="store_false",
                        help="Don't use the English model.")
    parser.add_argument("--energy_threshold", default=1000,
                        help="Energy level for mic to detect.", type=int)
    parser.add_argument("--record_timeout", default=2,
                        help="How real-time the recording is in seconds.", type=float)
    parser.add_argument("--phrase_timeout", default=5,
                        help="How much empty space between recordings before we "
                             "consider it a new line in the transcription.", type=float)
    args = parser.parse_args()

    # Initialization
    phrase_time = None
    data_queue = Queue()
    recorder = sr.Recognizer()
    recorder.energy_threshold = args.energy_threshold
    recorder.dynamic_energy_threshold = False

    # Microphone setup (Windows only)
    source = sr.Microphone(sample_rate=16000)

    # Check for GPU and set the device
    device = "cuda:0" if torch.cuda.is_available() else "cpu"
    print(f"Using device: {device}")
    print("Device Name:", torch.cuda.get_device_name(0) if torch.cuda.is_available() else "No GPU")

    # Load Whisper model on the appropriate device
    #model_name = f"openai/whisper-{args.model}"
    model_name = args.model + (".en" if args.model != "large" and not args.non_english else "")
    audio_model = whisper.load_model(model_name, device=device)

    record_timeout = args.record_timeout
    phrase_timeout = args.phrase_timeout
    transcription = [""]

    # Adjust for ambient noise safely
    try:
        with source:
            if source.stream:
                recorder.adjust_for_ambient_noise(source)
    except Exception as e:
        print(f"Error initializing microphone: {e}")
        return

    def record_callback(_, audio: sr.AudioData) -> None:
        """Threaded callback function for audio data."""
        try:
            data = audio.get_raw_data()
            data_queue.put(data)
        except Exception as e:
            print(f"Error in record callback: {e}")

    # Background recording
    recorder.listen_in_background(source, record_callback, phrase_time_limit=record_timeout)

    print("Model loaded and ready to transcribe...\n")

    while True:
        try:
            now = datetime.utcnow()
            if not data_queue.empty():
                phrase_complete = False
                if phrase_time and now - phrase_time > timedelta(seconds=phrase_timeout):
                    phrase_complete = True
                phrase_time = now

                # Process audio data
                audio_data = b"".join(data_queue.queue)
                data_queue.queue.clear()
                audio_np = np.frombuffer(audio_data, dtype=np.int16).astype(np.float32) / 32768.0

                # Transcribe using GPU or CPU
                result = audio_model.transcribe(audio_np, fp16=(device == "cuda:0"))
                text = result["text"].strip()

                # Update transcription
                #if phrase_complete:
                transcription.append(text)
                #else:
                #    transcription[-1] = text

                # Display transcription
                os.system("cls" if os.name == "nt" else "clear")
                for line in transcription:
                    print(line)
                print("", flush=True) #end="", 

            sleep(0.25)  # Prevent CPU overuse
        except KeyboardInterrupt:
            break
        except Exception as e:
            print(f"Unexpected error: {e}")

    print("\nFinal Transcription:")
    for line in transcription:
        print(line)


if __name__ == "__main__":
    main()

python whisper-live.py

2. Send question to AI model to get answers

Trying QWEN AI model lets see if i can run 2 models in a GPU Nvidia GeForce GTX 1660 with 6GB VRAM on windows with intel i5 CPU  and 24 GB RAM

1. Install Ollama on windows follow their website

2. ollama run qwen2.5-coder:7b

This will download 4.7 GB model if not existing and run it.

Saturday, December 28, 2024

Self driving AI cars

The dangers of AI are real. Especially the kind of AI tech being developed and controlled by companies backed by Zews. It always starts the same way — they build something, make a ton of money, and eventually use it to dominate and destroy others. This time it’s AI. First, it’s replacing jobs — like car driving, transport, and now even IT roles.

A friend of mine who works in IT told me that five QA testers were fired and replaced by AI automation tools for software testing. That’s how it begins — job loss, dependence, and eventually control.

This is how financial imperialism works. Take away people’s ability to earn and replace it with tech you own.

Check this out:
https://www.ibtimes.co.uk/mom-dead-openai-whistleblower-speaks-out-he-was-too-innocent-this-dirty-corporate-world-1729783
Even whistleblowers are paying the price.

We seriously need to wake up. Bharatha (India) needs to start working on its own version of ethical AI. Something based on Dharma. Russia, China, and Bharatha should come together and build something like DharmaAI or KalkiAI — AI that protects people instead of exploiting them.

This post gives some good perspective:
https://x.com/sanjeevs_iitr/status/1851900516390801587

Let’s talk about self-driving cars for a bit. That’s one area where AI is moving fast.

Tesla's FSD (Full Self-Driving) is currently classified as Level 2 by SAE. That means the driver is still responsible for the car even when it’s in self-driving mode. Tesla says their system can do Level 4, but that’s not official yet. Level 4 would mean the car can drive itself in most conditions without needing a human.

Now here’s something crazy — there’s an Indian startup called Swaayatt Robots claiming they’ve reached Level 5 autonomy. That means full self-driving in any road condition, no human required.
Check them out here: https://www.swaayattrobots.com/

If true, it’s a huge leap. Especially because Indian roads aren’t like California highways. We’ve got cows, dogs, potholes, people randomly crossing — our roads need a different kind of AI. Not the generic Western models.

Also, if you're into tech, you can actually try out self-driving tools yourself.

FlowDrive is a fork of Comma AI’s OpenPilot. It runs on Android phones (Snapdragon) and works with cars that already have cruise control and lane assist. Planning to try it on a rented car with ADAS features.

Setup is pretty simple:
https://github.com/flowdriveai/flowpilot/wiki/Installation
Demo: https://youtube.com/shorts/cQgf6EY6TCI?si=79fQAgO3swNptgSv

It depends on your car’s built-in features. Here’s a list of supported car models:
https://github.com/commaai/openpilot/blob/master/docs/CARS.md

If you want a polished plug-and-play version, you can buy the Comma AI device — it’s around $1000.
Here’s a few demo videos:
https://youtu.be/0aq4Wi2rsOk?si=2546H342cuL2gwUE
https://youtu.be/6ikxBWUAjmI?si=0nakmxc1KqL7nqpI

Bottom line — AI is not just a tool. It’s power. And like any power, it can be misused.

We can’t sit back while others build AIs that control us. Bharatha needs to build its own, ethical, dharmic AI before it’s too late.

Or else, we’ll be watching the AI kraken being released... and this time, it might be too late to push it back.

Visit to kerala kannur calicut

It was a quick day trip from mysuru. Stayed at Sun n tan homestay run by Murali Mohan ( he was electrician in navy ). He was a lovely person played a quick game of chess i won he cried i was distracted by phone.

Beautiful beaches, nice people in lungi flaunting moustache. The sea shores were best. A 2 minute ride on yamaha powered motor boat gave us a sense of seas power.

Visited the kantara museum and st angelo fort built by dutch (it mostly houses prison cells).

i could see mostly muslim women and marriage functions near the place i stayed. Could relate to captain vadakayils blog how hyderali seighed kannur and Tippu converted majority to myuslim and murdering many. The myuslim women in hijab with wierd rounded long nose made me feel yuck. I hardly saw hindu woman walking around. The muslim woman in hijabs in SUVs were roaming freely, i even saw a teen myuslim girl dashing through the ticket counter pushin a man in lungi and his wife between.

ivesg a similar feel when we walk in uslism reaa in ysurum.

The lungi men were well built. Big moustachoids in lungi were like super marios.

Saturday, May 25, 2024

AI tools for content creators

Update May 2025

Image video creation review

Reference:

https://www.youtube.com/watch?v=5BFgqJUgZq0

**1. kling ai

2. minimax ai

**5. Viggle AI : replace character from any video

3. Hedra AI - make any image talk

4. Krea AI - Transition between images for video

6. Vidu AI : make two people hug kind of action video

7. Crop a video online : https://www.canva.com/design/

1. Give lyrics and it will create perfect song for you

https://suno.com/create

Sample song generated by Suno AI https://youtube.com/shorts/WJ7ngCF6kgs

https://youtu.be/MAkqT8nELVg

2. Turn Your 3D Animation Dream into Reality! Discover the easy, code-free AI process in this step-by-step guide - No coding or animation experience needed!

1. Midjourney ➜ [Midjourney](https://www.midjourney.com/) or free tool like imagine.art

2. Runway ➜ [Runway](https://runwayml.com/)

3. Think Diffusion** ➜ [Think Diffusion](https://wl.tools/thinkdiffusion)

4. Online Convert** ➜ [Online Convert](https://www.online-convert.com/)

5. Lalamu** ➜ [Lalamu](https://lalamu.studio/demo/)

6. ClipChamp** ➜ [ClipChamp](https://clipchamp.com/en/)

7. ElevenLabs** ➜ [ElevenLabs](https://wl.tools/elevenlabs)

8. Pixybay** ➜ [Pixybay](https://pixabay.com/music/)

9. Vmake** ➜ [Vmake](https://vmake.ai/)

10. Free Tool for Image Generation:**

- Playground** ➜ [Playground](https://playgroundai.com/)

- Clipdrop** ➜ [Clipdrop](https://clipdrop.co/)

- Stability.ai** ➜ [Stability.ai](https://stability.ai/stable-diffusion)

- Leonardo** ➜ [Leonardo](https://leonardo.ai/)

Reference https://www.youtube.com/watch?v=amFOIYqN6dM

This is the video i generated using above technique https://www.youtube.com/watch?v=DFYEwsRQ2Ac

Question is where will traditional content creators go as reality becomes synthetic with AI?

Monday, February 26, 2024

AI notes

Ollama provides a way to run AI models on your PC with around 16 GB ram. It is slow but it works. Need to try connecting with eGPU as my mac supports Thunderbolt 4.

Ollama AI

https://ollama.com/

ollama run llama2

Uncensored models

ollama run llama2-uncensored

ollama run nous-hermes-llama2

ollama run wizard-vicuna

ollama run codellama. //for java, python, c++ etc

ollama list

http://127.0.0.1:11434/

https://ollamahub.com/

docker run -d -p 3000:8080 --add-host=host.docker.internal:host-gateway --name ollama-webui --restart always ghcr.io/ollama-webui/ollama-webui:main

ollama webui

cd /Users/[]/AI/ollama-webui/backend

2. sh start.sh

http://localhost:8080/

Open chrome with debug port

cd /Applications/Google Chrome.app/Contents/MacOS

./Google\ Chrome --remote-debugging-port=9222

Update 8-feb-2025

1. Installing qwen(best AI model so far by Alibaba, China) on ollama

ollama run qwen2.5-coder:7b

This will download 4.7GB AI model on your PC

TextGen

Manual installation text generation webui using Conda https://github.com/oobabooga/text-generation-webui

Recommended if you have some experience with the command-line.

0. Install Conda

https://docs.conda.io/en/latest/miniconda.html

On Linux or WSL, it can be automatically installed with these two commands (source):

curl -sL "https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh" > "Miniconda3.sh"

bash Miniconda3.sh

1. Create a new conda environment

conda create -n textgen python=3.11

conda activate textgen

2. Install Pytorch

System	GPU	Command
Linux/WSL	NVIDIA	pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
Linux/WSL	CPU only	pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu
Linux	AMD	pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm5.6
MacOS + MPS	Any	pip3 install torch torchvision torchaudio
Windows	NVIDIA	pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
Windows	CPU only	pip3 install torch torchvision torchaudio

The up-to-date commands can be found here: https://pytorch.org/get-started/locally/.

For NVIDIA, you may also need to manually install the CUDA runtime libraries:

conda install -y -c "nvidia/label/cuda-12.1.0" cuda-runtime

3. Install the web UI

git clone https://github.com/oobabooga/text-generation-webui

cd text-generation-webui

pip install -r <requirements file according to table below>

Requirements file to use:

GPU	CPU	requirements file to use
Apple	Intel	requirements_apple_intel.txt