Saturday, May 10, 2025

Remote jobs

Websites for job seekers to check out in 2025!

🏷 Save this post for later use:

1. SimplyHired (simplyhired.com)
2. Jobspresso (jobspresso.co)
3. Stack Overflow Jobs (stackoverflow.com)
4. Outsourcely (outsourcely.com)
5. Toptal (toptal.com)
6. Skip The Drive (skipthedrive.com)
7. NoDesk (nodesk.co)
8. RemoteHabits (remotehabits.com)
9. Remotive (remotive.com)
10. Remote4Me (remote4me.com)
11. Pangian (pangian.com)
12. Remotees (remotees.com)
13. Europe Remotely (europeremotely.com)
14. Remote OK Europe (https://lnkd.in/gr4C-mjp)
15. Remote of Asia (https://lnkd.in/ghrA_z9u)
16. FlexJobs (flexjobs.com)
17. Remote.co (remote.co)
18. We Work Remotely (weworkremotely.com)
19. RemoteOK (remoteok.com)
20. AngelList (angel.co)
21. LinkedIn (linkedin.com)

Friday, February 7, 2025

AI for Transcription (Voice to Text) and Text to Speech, with Voice Cloning


Requirement

Question-answer assist AI:

1. Listen to audio and retrieve questions using Whisper speech-to-text.

2. Send the question to an AI model to get an answer.

3. Speak the answer from step 2 using text-to-speech, with a voice close to the user's own voice.

VB cable setup 

https://www.youtube.com/watch?v=GC1aLL7cPY4 (run mmsys.cpl to open the Windows audio settings)
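To confirm that VB-Cable shows up as an input device, here is a quick sketch (my own check, using the same speech_recognition library as the script below) that lists every audio input by index. The exact device name is an assumption; look for something like "CABLE Output" once VB-Cable is installed.

import speech_recognition as sr

# Print the index and name of every audio input device;
# an entry like "CABLE Output (VB-Audio Virtual Cable)" should
# appear once VB-Cable is installed.
for index, name in enumerate(sr.Microphone.list_microphone_names()):
    print(index, name)

Pass the matching index as sr.Microphone(device_index=...) if the script should listen to the cable instead of the default microphone.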

Implementation 
1. Listen to audio and retrieve questions using Whisper speech to text

The Python code below converts voice to text. It uses the small model, but with a better GPU you can go large.


import argparse
import os
import numpy as np
import speech_recognition as sr
import whisper
import torch
from datetime import datetime, timedelta
from queue import Queue
from time import sleep


def main():
    parser = argparse.ArgumentParser()
    parser.add_argument("--model", default="small", help="Model to use",
                        choices=["tiny", "base", "small", "medium", "large"])
    parser.add_argument("--non_english", action="store_false",
                        help="Don't use the English model.")
    parser.add_argument("--energy_threshold", default=1000,
                        help="Energy level for mic to detect.", type=int)
    parser.add_argument("--record_timeout", default=2,
                        help="How real-time the recording is in seconds.", type=float)
    parser.add_argument("--phrase_timeout", default=5,
                        help="How much empty space between recordings before we "
                             "consider it a new line in the transcription.", type=float)
    args = parser.parse_args()

    # Initialization
    phrase_time = None
    data_queue = Queue()
    recorder = sr.Recognizer()
    recorder.energy_threshold = args.energy_threshold
    recorder.dynamic_energy_threshold = False

    # Microphone setup (default input device; 16 kHz to match Whisper)
    source = sr.Microphone(sample_rate=16000)

    # Check for GPU and set the device
    device = "cuda:0" if torch.cuda.is_available() else "cpu"
    print(f"Using device: {device}")
    print("Device Name:", torch.cuda.get_device_name(0) if torch.cuda.is_available() else "No GPU")

    # Load Whisper model on the appropriate device
    #model_name = f"openai/whisper-{args.model}"
    model_name = args.model + (".en" if args.model != "large" and not args.non_english else "")
    audio_model = whisper.load_model(model_name, device=device)

    record_timeout = args.record_timeout
    phrase_timeout = args.phrase_timeout
    transcription = [""]

    # Adjust for ambient noise safely
    try:
        with source:
            if source.stream:
                recorder.adjust_for_ambient_noise(source)
    except Exception as e:
        print(f"Error initializing microphone: {e}")
        return

    def record_callback(_, audio: sr.AudioData) -> None:
        """Threaded callback function for audio data."""
        try:
            data = audio.get_raw_data()
            data_queue.put(data)
        except Exception as e:
            print(f"Error in record callback: {e}")

    # Background recording
    recorder.listen_in_background(source, record_callback, phrase_time_limit=record_timeout)

    print("Model loaded and ready to transcribe...\n")

    while True:
        try:
            now = datetime.utcnow()
            if not data_queue.empty():
                phrase_complete = False
                if phrase_time and now - phrase_time > timedelta(seconds=phrase_timeout):
                    phrase_complete = True
                phrase_time = now

                # Process audio data
                audio_data = b"".join(data_queue.queue)
                data_queue.queue.clear()
                audio_np = np.frombuffer(audio_data, dtype=np.int16).astype(np.float32) / 32768.0

                # Transcribe using GPU or CPU
                result = audio_model.transcribe(audio_np, fp16=(device == "cuda:0"))
                text = result["text"].strip()

                # Start a new line only if enough silence (phrase_timeout) has
                # passed; otherwise keep refining the current line
                if phrase_complete:
                    transcription.append(text)
                else:
                    transcription[-1] = text

                # Display transcription
                os.system("cls" if os.name == "nt" else "clear")
                for line in transcription:
                    print(line)
                print("", flush=True) #end="", 

            sleep(0.25)  # Prevent CPU overuse
        except KeyboardInterrupt:
            break
        except Exception as e:
            print(f"Unexpected error: {e}")

    print("\nFinal Transcription:")
    for line in transcription:
        print(line)


if __name__ == "__main__":
    main()
    
python whisper-live.py

2. Send question to AI model to get answers
Trying the Qwen AI model. Let's see if I can run two models on a single GPU: an Nvidia GeForce GTX 1660 with 6 GB VRAM, on Windows with an Intel i5 CPU and 24 GB RAM.
1. Install Ollama on Windows (follow the instructions on their website).
2. ollama run qwen2.5-coder:7b
This will download the 4.7 GB model (if not already present) and run it.
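To glue steps 2 and 3 together, here is a minimal sketch (my own illustration; the question string is hypothetical, and the model name is the one from this post). It sends a transcribed question to Ollama's local HTTP API, which listens on port 11434 by default, and speaks the reply with pyttsx3, an offline text-to-speech library that uses the system voices. Proper voice cloning (a voice close to the user's own) would need a dedicated TTS model instead of pyttsx3.

import requests
import pyttsx3


def ask_ollama(question: str, model: str = "qwen2.5-coder:7b") -> str:
    # Ollama serves a REST API on port 11434 by default.
    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={"model": model, "prompt": question, "stream": False},
        timeout=120,
    )
    resp.raise_for_status()
    return resp.json()["response"]


def speak(text: str) -> None:
    # System text-to-speech; swap in a voice-cloning TTS for step 3 proper.
    engine = pyttsx3.init()
    engine.say(text)
    engine.runAndWait()


if __name__ == "__main__":
    answer = ask_ollama("What is a Python decorator?")  # hypothetical question
    print(answer)
    speak(answer)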

Saturday, December 28, 2024

Self driving AI cars

 The dangers of AI is real especially the technology incubated and harnessed by Zews through their companies. It will start with zew companies making money and destroying other races who are not their slaves as usual by financial imperialism (take away the bread giving jobs like car driving and transport). My friend in IT told they have fired 5 QA guys and replaced them woth AI automation models for testing.

https://www.ibtimes.co.uk/mom-dead-openai-whistleblower-speaks-out-he-was-too-innocent-this-dirty-corporate-world-1729783


Bharatha needs to work on DharmaAI to guard itself from Zews releasing the AI kraken. Russia, China and Bharatha have to collaborate and build this kalkiAI / DharmaAI


https://x.com/sanjeevs_iitr/status/1851900516390801587

Tesla's Full Self-Driving (FSD) software is currently classified as a Level 2 Advanced Driver-Assistance System (ADAS) under the Society of Automotive Engineers (SAE) levels of driving automation. This means the driver remains responsible for the vehicle's actions, even when FSD features are engaged. However, Tesla has stated that FSD is capable of Level 4 autonomy, which would mean the vehicle can drive itself under certain conditions without human intervention.


Swaayatt Robots (above video) claims to have achieved Level 5 autonomy for their autonomous driving system, meaning their vehicles can operate entirely independently, under any road conditions, without human intervention.

Yep, OK, nice. Indian roads need a specially trained AI model that can handle cows and dogs jumping in.


Btw, Flowpilot (flowdrive.ai's fork of comma's openpilot) seems to run fine on an Android Snapdragon phone and on cars that have ADAS (cruise control and lane assist). I plan to trial it on a rented car with ADAS support.

Setup seems straightforward:

 https://github.com/flowdriveai/flowpilot/wiki/Installation

 https://youtube.com/shorts/cQgf6EY6TCI?si=79fQAgO3swNptgSv


It depends on the car's ADAS features, mainly cruise control and lane assist, which are available in top-end cars.

The following models are supported without extra hardware:

https://github.com/commaai/openpilot/blob/master/docs/CARS.md

If you need a polished commercial option instead of openpilot on an Android phone, comma.ai sells their device for $1,000. https://youtu.be/0aq4Wi2rsOk?si=2546H342cuL2gwUE

https://youtu.be/6ikxBWUAjmI?si=0nakmxc1KqL7nqpI

comma.ai's device is an L2 system at $1,200.

Visit to Kerala: Kannur and Calicut

It was a quick day trip from Mysuru. We stayed at Sun n Tan homestay, run by Murali Mohan (he was an electrician in the navy). He was a lovely person; we played a quick game of chess, which I won, though he protested that I was distracted by my phone.

Beautiful beaches, nice people in lungis flaunting moustaches. The seashores were the best part. A two-minute ride on a Yamaha-powered motorboat gave us a sense of the sea's power.

Visited the Kantara museum and St. Angelo's Fort, built by the Portuguese and later held by the Dutch (it mostly houses prison cells).

i could see mostly muslim women and marriage functions near the place i stayed. Could relate to captain vadakayils blog how hyderali seighed kannur and Tippu converted majority to myuslim and murdering many. The myuslim women in hijab with wierd rounded long nose made me feel yuck. I hardly saw hindu woman walking around. The muslim woman in hijabs in SUVs were roaming freely, i even saw a teen myuslim girl dashing through the ticket counter pushin a man in lungi and his wife between.

ivesg a similar feel when we walk in uslism reaa in ysurum.

The lungi men were well built. Big moustachioed men in lungis were like Super Marios.

Saturday, May 25, 2024

AI tools for content creators

Update May 2025


Image and video creation review
Reference:
1. Kling AI
2. Minimax AI
3. Hedra AI: make any image talk
4. Krea AI: transition between images for video
5. Viggle AI: replace a character in any video
6. Vidu AI: make "two people hugging" style action videos
7. Crop a video online: https://www.canva.com/design/

1. Suno AI: give it lyrics and it will create a perfect song for you.
Sample song generated by Suno AI: https://youtube.com/shorts/WJ7ngCF6kgs


2. Turn your 3D animation dream into reality! An easy, code-free AI workflow; no coding or animation experience needed:
1. Midjourney ➜ [Midjourney](https://www.midjourney.com/) or a free tool like imagine.art
2. Runway ➜ [Runway](https://runwayml.com/)
3. Think Diffusion ➜ [Think Diffusion](https://wl.tools/thinkdiffusion)
4. Online Convert ➜ [Online Convert](https://www.online-convert.com/)
5. Lalamu ➜ [Lalamu](https://lalamu.studio/demo/)
6. ClipChamp ➜ [ClipChamp](https://clipchamp.com/en/)
7. ElevenLabs ➜ [ElevenLabs](https://wl.tools/elevenlabs)
8. Pixabay ➜ [Pixabay](https://pixabay.com/music/)
9. Vmake ➜ [Vmake](https://vmake.ai/)
10. Free tools for image generation:
    - Playground ➜ [Playground](https://playgroundai.com/)
    - Clipdrop ➜ [Clipdrop](https://clipdrop.co/)
    - Stability.ai ➜ [Stability.ai](https://stability.ai/stable-diffusion)
    - Leonardo ➜ [Leonardo](https://leonardo.ai/)

This is the video I generated using the above technique: https://www.youtube.com/watch?v=DFYEwsRQ2Ac

The question is: where will traditional content creators go as reality becomes synthetic with AI?

Monday, February 26, 2024

AI notes

Ollama provides a way to run AI models on your PC with around 16 GB of RAM. It is slow, but it works. I need to try connecting an eGPU, since my Mac supports Thunderbolt 4.

Ollama AI

https://ollama.com/


ollama run llama2


Uncensored models

ollama run llama2-uncensored

ollama run nous-hermes-llama2

ollama run wizard-vicuna


ollama run codellama   # for Java, Python, C++, etc.


ollama list



Ollama serves a local API at http://127.0.0.1:11434/
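As a quick sanity check (my own example, not from the original notes), you can query a model through that API directly; the prompt here is arbitrary:

curl http://127.0.0.1:11434/api/generate -d '{"model": "llama2", "prompt": "Why is the sky blue?", "stream": false}'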


https://ollamahub.com/  


docker run -d -p 3000:8080 --add-host=host.docker.internal:host-gateway --name ollama-webui --restart always ghcr.io/ollama-webui/ollama-webui:main


ollama webui

1. cd /Users/[]/AI/ollama-webui/backend

2. sh start.sh

3. Open http://localhost:8080/


Open Chrome with a debug port:

cd "/Applications/Google Chrome.app/Contents/MacOS"

./Google\ Chrome --remote-debugging-port=9222
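To confirm the debug port is live (my own note, not from the original), Chrome's DevTools protocol lists the debuggable tabs as JSON:

curl http://localhost:9222/json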



Update 8-Feb-2025

1. Installing Qwen (Alibaba's model, the best I have tried so far) on Ollama:

ollama run qwen2.5-coder:7b

This will download a 4.7 GB AI model to your PC.


TextGen

Manual installation of text-generation-webui using Conda: https://github.com/oobabooga/text-generation-webui

Recommended if you have some experience with the command line.


0. Install Conda

https://docs.conda.io/en/latest/miniconda.html

On Linux or WSL, it can be installed automatically with these two commands:

curl -sL "https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh" > "Miniconda3.sh"

bash Miniconda3.sh



1. Create a new conda environment

conda create -n textgen python=3.11

conda activate textgen



2. Install PyTorch

| System | GPU | Command |
|--------|-----|---------|
| Linux/WSL | NVIDIA | pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121 |
| Linux/WSL | CPU only | pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu |
| Linux | AMD | pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm5.6 |
| MacOS + MPS | Any | pip3 install torch torchvision torchaudio |
| Windows | NVIDIA | pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121 |
| Windows | CPU only | pip3 install torch torchvision torchaudio |

The up-to-date commands can be found here: https://pytorch.org/get-started/locally/.

For NVIDIA, you may also need to manually install the CUDA runtime libraries:

conda install -y -c "nvidia/label/cuda-12.1.0" cuda-runtime



3. Install the web UI

git clone https://github.com/oobabooga/text-generation-webui

cd text-generation-webui

pip install -r <requirements file according to table below>


Requirements file to use:

| GPU | CPU | requirements file to use |
|-------|-------|------------------------------|
| Apple | Intel | requirements_apple_intel.txt |



conda activate textgen

cd text-generation-webui

python server.py
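If the server starts without errors, the web UI is typically served at http://localhost:7860 (Gradio's default port); the exact address is printed in the console output.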