jb6113 5b047e193e updates | 1 year ago | |
---|---|---|
README.md | 1 year ago | |
bips-39-eng.txt | 1 year ago |
Private keys are either 132bits or 264bits long. The BIP-39
would have one store this long number as a sentence of 12 words for the 132bit and 24 words for the 264bit length private keys. A strategy we could use to generate a list of words to use for BIP-39
might be to pick some number of words, give each an index 0..1000 or whatever. Then assign some number of words until we have 132bits of information. Well, BIP-39
is already a standard, everyone agrees has 2048 words. The number 2048 needs eleven bits to write in binary. And eleven evenly divides 132 into twelve. Perfect.
One more thing, the 132bits has 4bits of checksum, so to be accurate, the private key is only 128bits, but since everyone agrees to encode the checksum as part of the private key we still use twelve words. The last word just contains a throw-away checksum and 7 bits of private information. Also just because I said throw away, it doesn't really matter, but don't be passing your checksum around publicly.
Regarding Bitcoin, the 128 bits are meant to be entropy, or random data. And why that number of bits? Because 128 is a nice 16 computer characters (char), a character in computer terms is an utf-8
encoded byte. A byte is 8bits. The 16 chars cannot be printed for humans to read, for example, the NULL char (the first char 0x00) represents a non-existent value. How do we write that down. Under ASCII (est 1961), the predominant standard for converting chars to human understandable concepts, says that the BS char (the ninth char 0x08) represents a backspace. Any ideas for how to represent a backspace? Back to the 16 bytes or chars or whatever you want to call them. They are random, or as Bitcoin would call them, entropy. They represent an amount of information that is impossible to guess assuming the chars are truly random. One can guess one of 256 values. That is the first char. It is perceivable that one could guess two of 65,536 values, and maybe the determined could guess three of 16,777,216 values. At this point you are probably going to invest into using a computer. A computer could easily win the Power-ball lottery guessing all required numbers beating the odds one to about 300 million (a 4 char probability). Computers continue to be useful due to their extraordinary computing capability, but there are limits. A computer is only going to perform eight billion operations per second. Back in 2011 computer processors stopped doubling in speed at around 8Ghz. Now they put more computer processors in a computer to make it faster. In 2023 you might have 12 processors in a computer. Anyways with eight billion operations and several processors at your disposal you can win a lot of lottery prizes. But for guessing a private key, how far can we go? The lottery example is harder than a 3 char probability, but easier than 4 char. At 4 chars we actually need to guess four of 4,290,000,000. Let us just count the zeros, the numbers are getting too big, 4 chars is 9 zeros. The CPU can perform 8Ghz, pretend you and some friends pool your processors and you have 1,000 processors. That gives us 8,000,000,000 x 1000 is twelve zeros. Yay, we can guess 4 char Bitcoin private keys at a breakneck speed. How about 5 char? Barley, but yes we can do it. We are at five chars of 1 trillion possibilities. You know what is next right? Six char private keys. At this point to keep up we need to find 200 other friend groups that have 1,000 processors that will commit to the cause, probably doable right? Seven is next with 72 quadrillion possibilities. You will need 9 million processors. Not a problem. There are billions of processors in the world. We are still good. Eight char private keys are next with 18 quintillion possibilities and a shopping list for almost every single processor in the world in 2023. We can make more processors. And besides, we have 67% of a 16 char private key. Given several decades and the entire world producing processors for you, you might be able to continue successfully guessing private keys to 12 char private keys. It is doubtful that one could get the whole world to commit all processors to the cause and that you'd be able to keep them going for a hundred years but if that works for you, you can say you are able to guess 12 char private keys. Good job! While the number of possibilities a 16 char private key compares to the number of atoms in the observable universe we will never be able to guess a private key, but we can still exploit someone who is using one. The number of atoms in the known universe is a number containing between 27 and 85 zeros, while a 16 char private key has 75 zeros. This matters if we are using atoms to build our processors. If someone uses their private key to perform a Bitcoin transaction an adversary would have 10 minutes in 2023 to attempt to use some of the information in the transaction to guess a much easier key. A theoretical 2 million qbit quantum processor might be able to sign that transaction and fool the Bitcoin ledger into recording the adversary's record. A quantum computer does not use atoms to build it's processor. Fancy. In 2023 IBM theorized a 1,000 qbit processor. If nothing changes to the 10 minute transaction window, or users are unwilling to change how they sign transactions it is feasible to exploit transaction signing using a quantum processor. Worse, given a larger quantum processor and a modified Shore's algorithm the private key can be derived using the public key used in any transaction regardless of when that transaction occurred. These theoretical issues are only a problem if users allow them. I only detail them here because private keys are complicated to one whom has never made use of them. What we talked about are the facts that a private key is a random number in a space where that random number could be one of a extraordinarily large value, so large that a computer would not be able to guess your number given all the resources of the universe. A quantum computer does not use atoms and could therefore guess your private key. This fact does stand up to the argument that one should not use a 16 char private key for Bitcoin, but that is irrelevant for why Bitcoin chose to use a 16 char private key. I want to start a new paragraph explaining this.
If a quantum computer exploits a single transaction, or could be used to derive a private key from a public key, then why would anyone continue to use Bitcoin? Users would not stand by this and would adapt whatever scheme currently in use to prevent exploitation. Instead of recording assets indexed by a public key derived from private keys one might instead record assets using a key that does not have a quantum computer weakness. Weird. Bitcoin technology includes a public ledger. One cannot just use a quantum computer to go back in time and rewrite what has already occurred. They could affect future writes to the public ledger, but once the toothpaste is out, users are going to draw a line in the ledger, Q-Day
, and stop using the recording methods that are then exploitable. Though, a quantum computer could create a time travel vehicle and go back to before that line was drawn to do what it pleases. Sorry, maybe it is believable that a sentient quantum computer could time travel and that the only problem one would need to worry about is their financial value. What would new transactions be indexed with? There are several cryptography possibilities, eg: even larger private keys based, lattice-based, multivariate, new hash based, code-based, or isogeny-based schemes. Existing users would need to claim ownership of their private keys under the new scheme before Q-Day
or somehow prove ownership of a private key if Q-Day
had already occurred. Why is this important? It is to prevent fear, uncertainty, and doubt being used in a way to manipulate users into believing that a quantum computer is a bullet that can effectively kill Bitcoin. To be clear, it is a bullet, that might exist, that could disrupt the use of elliptic curve digital signature algorithm as a means to record transaction data in a public ledger. It cannot rewrite past transaction data, and it cannot prevent users from using a different scheme to record transaction data. If users continue to use a technology while an exploit for that technology exists then, okay yes, one should worry about pieces of the sky falling. When quantum computers are capable of exploiting the Bitcoin transaction signing scheme, probably one should update how transactions are signed. Later when a quantum computer can derive a private key from any public key, one should use different private keys. Probably. Assuming one wants to maintain the self sovereignty feature of Bitcoin. If you don't care about that then no changes are necessary.
If you roll two six-sided dice together and sum their outputs you end up with a Gaussian distribution after rolling n times. That is not random. How does one remove the bias for the rolls to tend towards a result of seven? Roll two six-sided dice, A and B. Choose which die is A
, which is B, before rolling. That is important to remove human bias. Take die result of A
, if odd take its value as one, otherwise as two. Take the value of B
as is. Input these values into an equation: (6 * (A - 1) + B - 1) + 1
. The result will be an evenly distributed random number between one and twelve. Roll sets of two dice recording the results to validate. This equation isn't simplified, the B - 1 + 1
is redundant, it is just meant to shift the results from 0..5 to 1..6.
Rolling six-dice will get us the information we need but we are going to need to do a bunch of maths and rolling. If we only we had a die that held eleven bits of information. That would be a 2048-sided die. A six-sided die holds 2.25 bits. Yep two and a quarter. That is weird. Well, two bits is 0..3, a four-sided die, and three bits gives 0..7, an eight-sided die. Six-sided die is three quarters of three bits, 2.25 bits. Lame. If we did a bunch of maths we might be able to pull our 128bits from about 57 die rolls.
We are going to take a quicker and less maths intensive approach. For one, we aren't going to try to carry fractional bits between rolls/words. We are also going to use die that contain more information. We assume that our users either play DND or MTG. What I mean, is that users have access to a twenty-sided die, D20
. A D20
technically also carries fractional bits, but we are going to trim those fractional bits. We get four bits per roll if we re-roll whenever we get a value 17 or above. This is okay to do. Roll a D20
several times recording any value under 17. This will prove even random distribution of values 1..16.
Roll a D20
three times, if you want to have three D20
die, label them A, B, C. Otherwise the first roll is A, second B, and third C. We will use four bits for die roll A and B. This means roll a die, eg: A, and re-roll if the value is 17 or more, take the value if between 1 and 16. When we get to die C, we only need three bits, so re-roll until we take a value between 1 and 8.
One more time:
D20
, roll it for 4bits of info, re-roll if the value is 17 or more, stop and keep the value for A if between 1 and 16D20
, roll it for 4bits of info, re-roll if the value is 17 or more, stop and keep the value for B if between 1 and 16D20
, roll it for 3bits of info, re-roll if the value is 9 or more, stop and keep the value for C if between 1 and 8Now we do a little bit of maths to get our 11bits of information. Our equation is going to be ((8 * ((16 * (A - 1)) + B - 1)) + C - 1) + 1
. Simplify this equation to 128A + 8B + C - 136
.
Let us do an example. We roll die A and get the value 13. We get 2 for B. And 8 for C. We didn't not need to re-roll A, B, or C because both A and B were under 17. C was under 9 and did not need to be re-rolled. We calculate, 128(13) + 8(2) + (8) - 136
and get 1552. Our first 11bits is done.
Assuming we do the example ten more times, we will have eleven sets of 11bits stored as integer indices. Or numbers. Whatever you want to call them.
# | A | B | C | result | BIP-39 |
---|---|---|---|---|---|
1 | 13 | 2 | 8 | 1552 | |
2 | |||||
3 | |||||
4 | |||||
5 | |||||
6 | |||||
7 | |||||
8 | |||||
9 | |||||
10 | |||||
11 |
So the last one is a bit odd. Calculating the checksum is extra hard, technically possible to explain in a day, and possible to calculate by hand given several hours. So we just won't do that. The checksum is deterministic, so the benefit of the checksum is probably only useful for computers when they are trying to help humans. So really not valuable for us meat-bags. Later when the 4bit checksum is calculated it can be combined with 7 random bits giving us a nice round 11bits. But we still need those random 7bits. We have several 11bit results and the scheme we used is only useful for generating random 11bits. How do we make random 7bits? We roll a 128-sided dice of course! Not again... We can just do this:
D20
, roll it for 4bits of info, re-roll if the value is 17 or more, stop and keep the value for A if between 1 and 16D20
, roll it for 3bits of info, re-roll if the value is 9 or more, stop and keep the value for B if between 1 and 8(8 * (A - 1) + B - 1) + 1
, or simply 8A + B - 8
This will give us a value between 1 and 128. We are not going to use this index the same way we will use the previous eleven BIP-39
. We need to add the checksum before we are able to do this. And calculating the checksum is a major pain. Knowing the last 7bits of our private key is enough for now. A computer can trivially brute force the last 11bits. Only 128 words will be valid as the twelfth word, and while we could easily access all of the 128 valid private keys, we know which of those is ours. Or, we could just say we have 128 private keys. Fun.
Once we calculate the eleven results and the twelfth smaller result we should convert the eleven results to BIP-39
words. If we can perform the checksum, using the secure hashing algorithm, SHA-256
then we could combine that with the 7bit twelfth result and determine our last twelfth BIP-39
word.
# | A | B | C | result | BIP-39 |
---|---|---|---|---|---|
12 | XXX |