Fork me on GitHub

TikToken + Unix Dictionary Shortener

Shorty tokenizes GitHub raw/blob URLs with TikToken, swaps every dictionary hit with a two-character marker sourced from the UNIX word list, and keeps everything on one static page. Paste a link to generate a deterministic shortcode, or drop in a shortcode/URL to restore the original download target.

Loading tokenizer…
Glyph Set

Accepts raw URLs or github.com blob URLs (main/master supported). Encoding starts automatically.

Shortened Output
READY
Encoded shortcode will appear here.

Paste a shortcode or URL to decode instantly.

Decoded Output
INFO
Decoded URL will appear here.

How it works

The encoder splits the canonical owner/repo/@branch/path string using TikToken’s subword vocabulary. Each alphabetic run is checked against a cached UNIX dictionary (plus a few custom tokens); if the dictionary token is shorter, it becomes a ^page+unicode symbol that captures both the entry index and the word’s case. Decoding reverses the process by reading those markers back into words before rebuilding both the raw download URL and the standard GitHub blob link. Everything lives in index.html and words.txt, so any host can serve it without a backend.

Detected shortcode in the URL…
Redirecting in 3