Thursday, May 09, 2019

[efkabuai] Always have an escape plan

We consider a situation in which an internet site anticipates that it might disappear from its current address, perhaps because of censorship.  In preparation for potentially moving to a yet undecided location, we propose a method of preemptively publishing "digital breadcrumbs" (referencing Hansel and Gretel) on its current site as a way for users to find the next location if or when the site moves.  The site encourages users to save a local copy of the breadcrumbs.

The published digital breadcrumbs consist of two parts: a public key and the specification of a deterministic pseudorandom word generator including its random seed.

As an example, we'll specify digital breadcrumbs for this blog.

Here is the public key for this blog, with fingerprint 2C95 B41D A4CE 7C5F 0110 C27A 561D DCBA 1BDA E2F7:

-----BEGIN PGP PUBLIC KEY BLOCK-----
Version: GnuPG v1

mQENBFUrkNQBCADRc2ia1qpiS8wwrsqnPQpUoGY8DC1+tyRs6xGTqkkdxADLkS6f
VPuSkMktr0D/NmUmyCVSPkITYMeDlZ09eG2DIl33zS5ZpxTLgbHau8o8QB2cXw4f
ldwDrt5UmQwc8jF6vwKqoXyxPxJIb59fxCQ5s6llurnUI9MdlhDMyRQ0rFHkXu8G
JX+49zisWep7ZLZRT7/zdlKNlw2mriMTavOajCXtfR4WnFbQ8oYBkYLJPZFk4bi6
p4pyX9/nwcKCF2yIs7d3GqkYuuYSpp3gBdK+rAmYAj52cWEm08dtgurkDbh9rD/t
ykiDqHxh2oCou63Tnjt9qrCdjy7f0AstS7qZABEBAAG0PEtlbiAoaHR0cDovL2tl
bnRhLmJsb2dzcG90LmNvbSkgPGRldm51bGxAa2VudGEuYmxvZ3Nwb3QuY29tPokB
OAQTAQIAIgUCVSuQ1AIbAwYLCQgHAwIGFQgCCQoLBBYCAwECHgECF4AACgkQVh3c
uhva4vddMQgAuDcuimghGXzhazv/S86oCfZ3vtwqh5aZzW8N3rz/0tB0o+hZjgCv
imu/N0m1Hv8IdFOexeHa6SgCuDg8xmRCSiFYgumDi6cQy9XCH4+mCfn5oiu1mmrg
leBnV4gRF0u5m7i4pzoBsdbRU0mmKUnRUV4KKkVEsOpZla48AOdkX4SaRGq8sPft
BRbUUoJf4/HVbZKLvJGqau270NbtHoM+AOe+Pk8X6AaPBl5vA6vep7zxRJayFiBm
mlxN6vU8FoH5sBYTdCrN84h0kQDtFszVoYXl1QEF0ek3LjlrEVJvXBJRGy4ZLnv5
juR7vPk1LhLcq28078ucnHo6Hh3uslFnHrkBDQRVK5DUAQgApksDkc2/iTCaFXbm
o1Ojb9VyOITquEzBjMMY/K58MuSKqw6X3PkzsIWVoUO1binlIWtoBHc8ooeWm2Ve
uhardx1SmpGE17UJZRwe+bPI6AXzGM2vFpa7JZTbFY90rgqIWOVrPbBL2+bZds54
ySX6xuVl70H5+NsJIqcqFH5bHoLQtzLfoqLQrIK4Azmv/2NP/nAVySECznyyQT0n
i25RNgiOjUwfKx2M4ICavQ/T1Of9YhnVSP3Cpz9+kDFV7VKiGCBKSSsahQbLOL3u
4dkG9QKY5vjU+dkjziZehHdNQrI9tMthCSopeLsUZ/ooQTj5IozdRTxVryxSbZKR
Tp0TDQARAQABiQEfBBgBAgAJBQJVK5DUAhsMAAoJEFYd3Lob2uL35CQH/RzXJVor
znGnWE/YPtb3qAFMit0APhOT0WWXCASJYjjzOJDMghjN70kqgPvmdjAL182wBVVz
pjEbZBtPDcEx9YYfKiRC3qCygiSjdKRLfSwHmzFeYzGM6DI4GIsJW+1q8+iA2DUs
lXLuyw8TJjH7jvY/cMhM6IzzdZQNnsaSZ9whCKM0t4yWDSxZ/cLk+ezPwEBJQmNR
+dOCvmLy3PbUlLOxN18gVezyFpHy0Bj6UGQ+hyTj2fp4tKiD9LF4iHsOYJdNcHrH
+CJJajhOKYmAB3RexlHunlLqAiLoGrcJUehi9hO4QnC52VvKa+2AryUKhQPPOZwa
iMz2Xdl91SRrAb8=
=mOe7
-----END PGP PUBLIC KEY BLOCK-----

Messages signed with a site's public key should contain the text Serial: NNN where NNN is a string in the format of a Debian software package version number, roughly epoch:version, where the epoch: is optional and version is a dotted number string.  (Inspired by the serial number of DNS zone records.  Some sites use serial numbers like 20190228 to ensure the serial number monotonically increases.  This would be nicer if dots were allowed: 2019.02.28.)  These serial numbers can be compared as version numbers as specified by Debian.  Having comparable serial numbers establishes a total ordering on messages, useful if later messages need to invalidate older messages.

Here is the first message signed with our public key, giving an example serial number.  This message is otherwise uselessly empty.

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Serial: 0.2015.6
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1

iQEVAwUBVeD8B1Yd3Lob2uL3AQI1kwgAmUIKTr6rfffyceIsOL2UJ/Zw+bcdsdHt
xlzRVAQUocCZVEMhnloRN6bj2PsEMWtO9PxN4y46EBmVImjPYRTPa3FRdVoB7tL9
Lo/ExpoQ93tkz/zMrhC3siq2dwe2FxehqIU1diqEORT3FKs3D7Zbvb3qunifzihH
QoXOgEg6sQQzjKmUzMpwbt3SV86cpYLNE5GK9aaJLaYKf9yC7FQ9YnA8Dd/pVqBb
OCk4PHW/iRhzbPJhBHTLmL4hBH1rVIqZvXPq4msa13tHxQce+2owzoyCQD9PXPyx
TUFzDCJzoz7eDM12oyBbXRHvyWroT6khavrn/meaWofiKvB9ytAwpQ==
=ZyLY
-----END PGP SIGNATURE-----

We leave unsolved the standard hard problems of Public Key Infrastructure: what to do if your private key is compromised, what to do if you lose access to your private key.

We now turn our attention to the random word generator.

To combat yet unknown countermeasures an adversarial censor might deploy, the digital breadcrumbs provide only hints toward the many possibilities where the next site might be.  Exactly how generated random words specify the site's next location is deliberately left unstated in order to flexibly route around the yet unknown censorship: perhaps DNS, perhaps it should be used as a web search term, perhaps a key into some censorship-resistant medium.  It is the responsibility of the user, the recipient of the breadcrumbs, to try things and figure it out.  The infinite stream of random words provides many choices in case some of the earlier words are not available (perhaps due to censorship).  The public key allows the user to verify, perhaps even in an automated fashion, that he or she has found the right new site.

(Previous vaguely related idea: search the entire internet for the latest "message" signed by a public key.)

(Unfortunately, the random words won't directly be usable as addresses for things like Onion hidden services or Freenet CHK because for those, you don't have the ability to choose your own address.  Freenet KSK might be usable, at least until spammers overwrite it.)

A cryptographically secure pseudorandom number generator (PRNG) is not strictly necessary; however, we use one in the sample implementation below because cryptographic primitives are standardized so widely available and easily portable.

Two unrelated entities (perhaps even adversaries) can use exactly the same random word generator, perhaps because they chose the same seed.  We expect this is only to be a slight inconvenience because their different public keys can be used to distinguish them.  They can also choose different words in the same random stream.

Here is the random word generator for this blog, suggesting a template for others.  The random word generator is implemented in Haskell, similar to this stream cipher example.

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

{- This code is public domain. -}
{- Serial: 0.2019.1 -}
{-# LANGUAGE PackageImports #-}
module Main where {
import "cipher-aes" Crypto.Cipher.AES as AES;
import qualified Data.ByteString as ByteString;
import Data.Word (Word8);
import Data.Maybe (catMaybes);
import qualified Crypto.Hash.SHA256 as SHA256;
import Data.Text.Encoding (encodeUtf8);
import qualified Data.Text as Text;
import qualified Data.List as List;

url :: String;
url = "http://kenta.blogspot.com";

main :: IO();
main = putStr $ catMaybes $ map filt27 $ to_bytes $ aes_stream
$ AES.initAES $ SHA256.hash $ encodeUtf8 $ Text.pack url;

aes_stream :: AES.AES -> [ByteString.ByteString];
aes_stream key = List.unfoldr (Just . (next_block key)) zero;

next_block :: AES.AES -> AES.AESIV ->
(ByteString.ByteString, AES.AESIV);
next_block key iv = AES.genCounter key iv 1; {- Although we ask
for just 1 byte, genCounter will return a full 16-byte block. -}

{- Initial counter value -}
zero :: AES.AESIV;
zero = AES.aesIV_ $ ByteString.replicate 16 0;

to_bytes :: [ByteString.ByteString] -> [Word8];
to_bytes = concat . map ByteString.unpack;

filt27 :: Word8 -> Maybe Char;
filt27 z = case mod z 32 of {
0 -> Just ' ';
x | x<=26 -> Just $ toEnum $ fromIntegral x + fromEnum 'a' - 1;
_ -> Nothing;
};
}
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1

iQEVAwUBXDtUc1Yd3Lob2uL3AQLb1QgAzeb/UebCk4cb0nSXdMSaSWwItxSbOvXK
qzBqE8EwCsg/uz3ry5MB24nFUd0puO9LqEy0okebCdZqj5qWdPK/PnLZj5Zx+ZG2
sUHNSe7pn6gfJfL9+JDoVLRJaJt2Cn/c4KUT2uC7Xsig6RAhKcIKMytCnU8jDG2P
S60Qdp3rk/GqgK6gHViTrLjckUAuV5nID+pxWzqE3Rx753w0W3wK/5f+giovWizk
CTuPkYGv7eFzO9zPN9tLo4na+MfaKskFpJ3PsAQFpJbcIM+RH80HNhJ06i0jjHuX
Rf2IcPnaBwtLYqjHK8WJ1dnhK2wHVLq2wD0/zNmCs1QPcm1Rbv9E6g==
=DL7c
-----END PGP SIGNATURE-----

Prose description of what the code does: The seed string http://kenta.blogspot.com is hashed with SHA-256, (yielding f78183140c842e8f4a550c3a5eb5663a33706fc07eeacc9687e70823d44511c4), which is then used (unsalted) as a key for AES-256 in counter mode (CTR) to generate a stream of bytes.  The counter starts at the zero block and increments as a 128-bit big-endian integer.  We read each AES output block as bytes in big-endian order.  Each byte is truncated to its least significant five bits (i.e., modulo 32), then encoded into letters by the following substitution: 0=space, 1=A, 2=B,... 26=Z.  Values 27 through 31 are ignored, encoding to nothing.  (They might be useful for future extensions.)  Occasionally there may be two or more consecutive spaces, a rare occurrence which currently has no special meaning but may be useful for future extensions.  For reference, our random word stream begins: d efkabuai yqwrmhspliasmvokhvwhvz fdvhdwhoxkcqfujilomqjubxfzjtug ...

This idea is similar to and kind of an elaboration of the Document ReFinding Key also deployed on this blog which helps in finding, via a search engine, new addresses of individual posts (for which you have an old URL) should this blog move.  Rather than associating just a single random word with an internet resource, we now associate an infinite stream of random words.  The template above can easily be extended to arbitrary URLs, such as those of individual posts.

The technical details were inspired by a mechanism used by some computer virus to call home and update itself, using a deterministic pseudorandom word generator to generate internet domains, one of which the virus author would anonymously purchase and post update code.

No comments :