Igor's Techno Club

Speech to Text on WSL in one line

Let's assume you downloaded a voice recording from your iPhone and now want to convert it to text. You just need to run this one line:

Full command

MODEL=tiny.en FILE=YourVoiceFile.m4a docker run --rm --entrypoint /bin/sh -e MODEL -e FILE -v "$(pwd):/data" ghcr.io/ggml-org/whisper.cpp:main -lc 'BASE="${FILE%.*}"; test -f /data/models/ggml-$MODEL.bin || { mkdir -p /data/models && curl -L "https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-$MODEL.bin" -o "/data/models/ggml-$MODEL.bin"; }; ffmpeg -y -i "/data/$FILE" -ar 16000 -ac 1 -c:a pcm_s16le "/data/$BASE.wav" && whisper-cli -m "/data/models/ggml-$MODEL.bin" -t 8 -f "/data/$BASE.wav" -otxt -of "/data/$BASE"'
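The one-liner is dense, so here is the same pipeline spread over several lines as a rough sketch. The image, URL, and flags are copied from the command above. The names `WHISPER_SCRIPT`, `MODEL_PATH`, and `transcribe` are illustrative labels only, not part of whisper.cpp or Docker, and the default of `tiny.en` for the second argument is an assumption that mirrors the example.

```shell
# Inner script that runs inside the container; same steps as the one-liner.
# WHISPER_SCRIPT, MODEL_PATH and transcribe are illustrative names.
WHISPER_SCRIPT='
BASE="${FILE%.*}"
MODEL_PATH="/data/models/ggml-$MODEL.bin"

# Download the model once and keep it in ./models on the host.
test -f "$MODEL_PATH" || {
  mkdir -p /data/models &&
  curl -L "https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-$MODEL.bin" -o "$MODEL_PATH"
}

# Convert to 16 kHz mono 16-bit WAV, then transcribe to /data/$BASE.txt.
ffmpeg -y -i "/data/$FILE" -ar 16000 -ac 1 -c:a pcm_s16le "/data/$BASE.wav" &&
whisper-cli -m "$MODEL_PATH" -t 8 -f "/data/$BASE.wav" -otxt -of "/data/$BASE"
'

# Usage: transcribe YourVoiceFile.m4a [model]
transcribe() {
  FILE="$1" MODEL="${2:-tiny.en}" docker run --rm \
    --entrypoint /bin/sh \
    -e MODEL -e FILE \
    -v "$(pwd):/data" \
    ghcr.io/ggml-org/whisper.cpp:main \
    -lc "$WHISPER_SCRIPT"
}
```

After pasting this into a shell, `transcribe YourVoiceFile.m4a base.en` should behave like the one-liner with `MODEL=base.en`. The script is kept in single quotes so that `$FILE`, `$MODEL`, and `$BASE` are expanded inside the container rather than by your host shell.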

Description:

MODEL=tiny.en Selects the Whisper model. Change this to base.en, small.en, etc.

FILE=YourVoiceFile.m4a Names the input audio file, which must be in your current folder.

docker run --rm Starts a temporary container and removes it when done.

--entrypoint /bin/sh Forces the container to run a shell so the inline script works.

-e MODEL -e FILE Passes both environment variables into the container.

-v "$(pwd):/data" Mounts your current folder into the container as /data, so inputs and outputs are shared.

BASE="${FILE%.*}" Removes the file extension, so YourVoiceFile.m4a becomes YourVoiceFile.

test -f /data/models/ggml-$MODEL.bin || ... Checks whether the chosen model is already cached locally.

mkdir -p /data/models Creates the local models folder if it does not exist.

curl -L ... -o /data/models/ggml-$MODEL.bin Downloads the selected model if missing.

ffmpeg -y -i "/data/$FILE" -ar 16000 -ac 1 -c:a pcm_s16le "/data/$BASE.wav" Converts your audio file to 16 kHz, mono, 16-bit PCM WAV, which is a safe input format for Whisper. The -y flag overwrites an existing WAV of the same name without asking.

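whisper-cli -m "/data/models/ggml-$MODEL.bin" -t 8 -f "/data/$BASE.wav" -otxt -of "/data/$BASE" Runs transcription using 8 CPU threads. -otxt asks for plain-text output and -of gives the output path without its extension, so the transcript is written to /data/YourVoiceFile.txt, which shows up as YourVoiceFile.txt in your current folder.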
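Two pieces of the walkthrough above can be tried on their own, outside the container: the extension stripping and the download-if-missing guard. The sketch below isolates both. The function names `strip_ext` and `ensure_model` are made up for illustration. The `-f` flag on curl and the temporary `.part` file are additions that are not in the original command; they are there so that a failed or interrupted download is less likely to be mistaken for a cached model on the next run.

```shell
# strip_ext NAME: same expansion as BASE="${FILE%.*}".
# "%" removes the shortest matching suffix, so only the last extension goes.
strip_ext() {
  printf '%s\n' "${1%.*}"
}

# ensure_model MODEL DIR: fetch ggml-MODEL.bin into DIR unless it is cached.
# Hypothetical helper; -f and the .part file are additions to the original.
ensure_model() {
  model_path="$2/ggml-$1.bin"

  # Cached already: nothing to do, no network access.
  test -f "$model_path" && return 0

  mkdir -p "$2" &&
  # -f makes curl exit non-zero on an HTTP error; -L follows redirects.
  curl -fL "https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-$1.bin" \
    -o "$model_path.part" &&
  # Rename only after a complete download.
  mv "$model_path.part" "$model_path"
}
```

For example, `strip_ext memo.2024.m4a` prints `memo.2024`, and a name with no dot comes back unchanged. Calling `ensure_model tiny.en ./models` a second time returns immediately because the file is already there, which is the same effect the `test -f ... || { ... }` guard has in the one-liner.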

Output files: