Purpose
Unlike many other extensions, this document updates the semantics of an existing extension, the Twt Hash v1 extension.
It aims to improve the security and scalability of the Twt Hash v1 extension by increasing the twt hash length from 7 to 12 characters, using the first 12 characters of the base32-encoded blake2b hash. This change aims to address two issues:
- Collision Prevention: By extending the hash length, the likelihood of hash collisions is significantly reduced. With the 7-character hash defined in v1, the chance of collision becomes increasingly probable as the number of twts grows. The longer 12-character hash defined in v2 provides a safer margin, making hash collisions a rare occurrence even with large datasets.
- Hash Uniqueness: The v1 specification has a noticeable issue where all of the hashes end in either the character “q” or “a”. This update will eliminate such occurrences and enhance the overall randomness and uniqueness of the hash values.
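The “q”/“a” pattern follows from the encoding: a 256-bit digest does not divide evenly into 5-bit base32 characters, so the 52nd character carries only one bit of data followed by four zero bits. The following is a minimal sketch that illustrates this, assuming Python's standard `hashlib` and `base64` modules; the helper names `encoded_digest` and `final_characters` are hypothetical and not part of any twtxt client.

```python
import base64
import hashlib


def encoded_digest(data: bytes) -> str:
    """Return the lowercase, unpadded base32 encoding of a 256-bit blake2b digest."""
    digest = hashlib.blake2b(data, digest_size=32).digest()
    return base64.b32encode(digest).decode("ascii").rstrip("=").lower()


def final_characters(samples: int = 1000) -> set:
    """Collect the last base32 character over a number of arbitrary inputs."""
    return {encoded_digest(str(i).encode("utf-8"))[-1] for i in range(samples)}


if __name__ == "__main__":
    # 256 bits = 51 full 5-bit characters + 1 character holding a single bit.
    print(len(encoded_digest(b"example")))
    # Only "a" (bit 0) and "q" (bit 1) can ever appear in the last position.
    print(sorted(final_characters()))
```

Because v2 takes characters from the start of the encoded string, every character of a v2 hash carries a full 5 bits.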
Format
Most details from the Twt Hash v1 specification remain intact for v2. This includes the selection of the feed URL and the timestamp format.
However, the v2 twt hash is the first 12 characters of the base32-encoded blake2b hash, whereas v1 uses the last 7 characters. As a result, v1 and v2 twt hash values are not compatible with each other.
Thus, the twt hash is calculated using the following procedure:
- Concatenate the author feed URL, the timestamp (in RFC 3339 format), and the twt text with newline characters between them.
- Perform a blake2b hash on the resulting string with a 256-bit digest size.
- Encode the result in base32 without padding.
- Convert the base32 string to lowercase.
- Use the first 12 characters of the base32 string as the twt hash.
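The five steps above can be sketched one by one as follows. This is an illustrative sketch rather than a normative implementation: the function name `twt_hash_v2` is hypothetical, the timestamp is assumed to be passed in as an already formatted RFC 3339 string, UTF-8 is assumed for the byte encoding, and the epoch rule is deliberately ignored here.

```python
import base64
import hashlib


def twt_hash_v2(feed_url: str, timestamp: str, text: str) -> str:
    """Compute a 12-character v2 twt hash from already formatted inputs."""
    # Step 1: join feed URL, timestamp and text with newline characters.
    payload = f"{feed_url}\n{timestamp}\n{text}"
    # Step 2: blake2b with a 256-bit (32-byte) digest; UTF-8 is an assumption.
    digest = hashlib.blake2b(payload.encode("utf-8"), digest_size=32).digest()
    # Step 3: base32-encode and drop the "=" padding.
    encoded = base64.b32encode(digest).decode("ascii").rstrip("=")
    # Step 4: convert to lowercase.
    encoded = encoded.lower()
    # Step 5: keep the first 12 characters.
    return encoded[:12]


if __name__ == "__main__":
    print(twt_hash_v2("https://example.com/twtxt.txt",
                      "2026-07-01T00:00:00Z",
                      "Hello World!"))
```

If the sketch matches the reference test vectors in this document, the call above should print the Hash v2 value listed for the first row of the table.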
Epoch and Version Compatibility
The epoch for Twt Hash v2 is 2026-07-01T00:00:00Z.
- Hash v1: The 7-character hash remains authoritative for any twt whose timestamp is before the epoch.
- Hash v2: Twts with timestamps on or after the epoch use the 12-character hash as defined above.
- Migration Guidance: Clients MUST NOT retroactively re-hash twts dated before the epoch with hash v2. Depending on each twt's timestamp, they SHOULD persist either the hash v1 or the hash v2 value to keep subjects, caches, and reply chains stable across the network.
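The version selection can be sketched as a comparison of points in time rather than of timestamp strings, so that timestamps with a UTC offset are handled consistently. The function name `hash_version` is hypothetical, and treating a timestamp without an offset as UTC is an assumption of this sketch.

```python
from datetime import datetime, timezone

# Epoch for Twt Hash v2 as stated above.
HASH_V2_EPOCH = datetime(2026, 7, 1, tzinfo=timezone.utc)


def hash_version(timestamp: str) -> int:
    """Return 1 or 2 depending on which hash version applies to the timestamp."""
    # datetime.fromisoformat() only accepts a trailing "Z" from Python 3.11 on,
    # so rewrite it as an explicit offset first.
    if timestamp.endswith("Z"):
        timestamp = timestamp[:-1] + "+00:00"
    created_at = datetime.fromisoformat(timestamp)
    if created_at.tzinfo is None:
        # Assumption: a timestamp without an offset is interpreted as UTC.
        created_at = created_at.replace(tzinfo=timezone.utc)
    return 1 if created_at < HASH_V2_EPOCH else 2


if __name__ == "__main__":
    print(hash_version("2025-04-29T12:00:00Z"))       # 1: before the epoch
    print(hash_version("2026-07-01T00:00:00Z"))       # 2: exactly at the epoch
    print(hash_version("2026-07-01T00:30:00+01:00"))  # 1: 2026-06-30T23:30:00Z
```

The last call shows the edge case: the local date is already 1 July 2026, but the instant still lies before the epoch, so hash v1 applies.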
Reference Test Vectors
| Twt Author Feed URL | Twt Timestamp | Twt Text | Hash v1 (last 7) | Hash v2 (first 12) |
|---|---|---|---|---|
| https://example.com/twtxt.txt | 2026-07-01T00:00:00Z | Hello World! | j5uwzcq | myzxbwxktuvs * |
| https://example.com/twtxt.txt | 2025-04-29T12:00:00Z | Hello World! | om5qesa * | ejnvat3u5tnr |
| https://twtxt.net/twtxt.txt | 2024-12-31T23:59:59Z | Happy New Year! | jcezvlq * | rgg4k7lv5gzr |
| https://example.com/hugo | 2026-12-28T14:00:00+01:00 | (#1234567890ab) Sounds good! | qjqa4nq | v4yu3xmr65z7 * |
Both Hash v1 and v2 values are shown for every row so that implementers can verify both code paths with pre- and post-epoch data. Only the value marked with “*” is authoritative for its row: twts created before the epoch must still be stored and compared using Hash v1, and twts created on or after the epoch using Hash v2.
Example using yarnc hash:
$ ./yarnc hash -u https://example.com/twtxt.txt -t 2026-07-01T00:00:00Z 'Hello World!'
myzxbwxktuvs
The same content in the same feed, but with a timestamp before the epoch, results in the Twt Hash v1 being selected:
$ ./yarnc hash -u https://example.com/twtxt.txt -t 2025-04-29T12:00:00Z 'Hello World!'
om5qesa
Security Considerations
Hash Collision: The new 12-character hash length significantly reduces the risk of hash collisions, even with a large number of twts. However, users should be aware that collisions remain possible and become more likely as the number of feeds and twts grows, although the likelihood is much lower with the 12-character hash.
A single character in the base32 alphabet requires 5 bits to be encoded (2⁵ = 32). With a length of 12 characters, the twt hash v2 represents 60 bits of information (5 bits per character × 12 characters = 60 bits). This results in 32¹² possible hash combinations, i.e. roughly 1.15×10¹⁸. Applying the birthday bound 1.1774 × sqrt(32¹²), a 50% collision probability does not occur until around 1.26 billion unique twts, making the change highly scalable for the foreseeable future.
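The arithmetic above can be reproduced with a few lines of Python. This is a sketch using the usual birthday approximation sqrt(2 · ln 2 · N) ≈ 1.1774 × sqrt(N); the function name `birthday_50` is hypothetical. For comparison it also evaluates a 31-bit space, on the assumption that a v1 hash carries only 6 × 5 + 1 = 31 bits because its last character is always “q” or “a”.

```python
import math


def birthday_50(bits: int) -> float:
    """Approximate number of random values needed for a 50% collision chance
    in a space of 2**bits possible values."""
    return math.sqrt(2 * math.log(2)) * math.sqrt(2 ** bits)


if __name__ == "__main__":
    # 12 base32 characters x 5 bits = 60 bits (Twt Hash v2).
    print(f"v2 space: {32 ** 12:.3e} values")
    print(f"v2 50% bound: {birthday_50(60):,.0f} twts")
    # Assumption: v1 effectively carries 31 bits (see lead-in).
    print(f"v1 50% bound: {birthday_50(31):,.0f} twts")
```

Under these assumptions the v2 bound comes out at about 1.26 billion twts, as stated above, while the v1 bound is only on the order of 55,000 twts.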
Reference Implementations
This section shows reference implementations of this algorithm.
Go
package twtxt_extensions

import (
	"encoding/base32"
	"time"

	"golang.org/x/crypto/blake2b" // e.g. v0.23.0
)

// lowerBase32WithoutPadding is base32 without padding using lowercase. The
// base32.StdEncoding uses uppercase, so we have to specify all characters
// ourselves.
var lowerBase32WithoutPadding = base32.
	NewEncoding("abcdefghijklmnopqrstuvwxyz234567").
	WithPadding(base32.NoPadding)

var hashV2Epoch = time.Date(2026, time.July, 1, 0, 0, 0, 0, time.UTC)

func HashMessage(url string, createdAt time.Time, text string) string {
	payload := url + "\n" + createdAt.Format(time.RFC3339) + "\n" + text
	sum := blake2b.Sum256([]byte(payload))
	hash := lowerBase32WithoutPadding.EncodeToString(sum[:])
	if createdAt.Before(hashV2Epoch) {
		// v1 takes the last seven digits
		return hash[len(hash)-7:]
	}
	// v2 takes the first 12 digits
	return hash[:12]
}
Python
import base64
import datetime
import hashlib

hash_v2_epoch = datetime.datetime(2026, 7, 1, 0, 0, 0,
                                  tzinfo=datetime.timezone.utc)


def hash_message(url: str, created_at: datetime.datetime, text: str) -> str:
    if created_at.tzinfo is None:
        created_at = created_at.replace(tzinfo=datetime.timezone.utc)
    created = created_at.isoformat().replace("+00:00", "Z")
    payload = f"{url}\n{created}\n{text}"
    sum256 = hashlib.blake2b(payload.encode("utf-8"), digest_size=32).digest()
    hash = base64.b32encode(sum256).decode("ascii").replace("=", "").lower()
    if created_at < hash_v2_epoch:
        # v1 takes the last seven digits
        return hash[-7:]
    # v2 takes the first 12 digits
    return hash[:12]