Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Multiwallet Feature: Generate Seed from Dice Rolls #147

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

tdb3
Copy link
Contributor

@tdb3 tdb3 commented Dec 19, 2022

Added capability to generate a mnemonic from dice rolls. Also added a convenience feature to multiwallet, guiding a user through seed creation from dice rolls.
The feature can be accessed through the menu option generate_seed_from_dice.
The test cases are based on Coinkite's examples (https://coldcard.com/docs/verifying-dice-roll-math).

Happy to tweak as needed.

@mflaxman
Copy link
Collaborator

Very cool, thank you for the PR @tdb3! I'm partial to pulling words out of a hat (vs rolling dice) as it's one less step to verify (dice->words->xpub vs just words->xpub), but I like having support for both as long as it's done in a way that matches other implementations.

I think this should be part of the HDPrivateKey class, perhaps a new HDPrivateKey.from_dice() method? That way it's accessible to developers using buidl (not just multiwallet users) and also so it can have bulletproof unit testing (the testing harness for the CLI apps is admittedly a bit janky). What do you think of moving it there?

@tdb3
Copy link
Contributor Author

tdb3 commented Dec 20, 2022

Sounds good. How about something like the following?

Since HDPrivateKey.from_dice() is essentially a specific example of creating a seed from bytes, a more general HDPrivateKey.from_bytes() is created, which from_dice() calls. HDPrivateKey.from_bytes() incorporates similar arguments (e.g., optional password, etc.) as HDPrivateKey.from_mnemonic(), keeps the HMAC SHA512 approach (for integration of the password as salt, and stretching), and returns the HDPrivateKey as well as the associated mnemonic words generated from the bytes provided (for later re-use).

Modified approach adds dice_rolls_to_mnemonic() to mnemonic.

@mflaxman
Copy link
Collaborator

I'm not 100% sure I'm following what you're saying, maybe it would be easier if you wrote it as a PR and then it will be easy to comment on the code?

My concern with a HDPrivateKey.from_bytes() is developer ergonomics. buidl is very cautious about making it hard for regular uesrs to footgun, and only experts should go outside commonly accepted standards (i.e. bip39 mnemonics). If there's a need for a from_bytes() method, it should at least be a private method _from_bytes(), although python doesn't enforce that it's at least provides some signaling.

@tdb3
Copy link
Contributor Author

tdb3 commented Feb 19, 2024

Rebased, modified approach to address comments from @mflaxman, force pushed.

[
"123456123456123456123456123456123456123456123456123456123456123456123456123456123456123456123456123456",
"more matter caught bind tip twin indicate visa rifle angle defense lizard stock cave cradle injury always mule photo horse range opinion affair garlic",
"xprv9s21ZrQH143K4J7isGyg4whNPAMGdfNntamxzPk2qFDS4nNJtNwrZ6vaiv6Jk5JsNyXKfZT5X3QnEMaTg3uvcKszhdgbLxgMh6ETE3nrzPn",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this what coldcard outputs?

Copy link
Contributor Author

@tdb3 tdb3 Feb 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it matches the rolls.py script provided Coinkite https://coldcard.com/docs/verifying-dice-roll-math/
Hardware would be needed to compare to the Coldcard itself.

$ python3 ../Downloads/rolls.py 
123456123456123456123456123456123456123456123456123456123456123456123456123456123456123456123456123456
8fd128918b2e29d6dcbfa5b9a118e5c16d60498c7ba107922a8eb6eb1d36c112

   1: more
   2: matter
   3: caught
   4: bind
   5: tip
   6: twin
   7: indicate
   8: visa
   9: rifle
  10: angle
  11: defense
  12: lizard
  13: stock
  14: cave
  15: cradle
  16: injury
  17: always
  18: mule
  19: photo
  20: horse
  21: range
  22: opinion
  23: affair
  24: garlic

$ python3 ../Downloads/rolls.py
123456
8d969eef6ecad3c29a3a629280e686cf0c3f5d5a86aff3ca12020c923adc6c92

WARNING: Input is only 15 bits of entropy

   1: mirror
   2: reject
   3: rookie
   4: talk
   5: pudding
   6: throw
   7: happy
   8: era
   9: myth
  10: already
  11: payment
  12: own
  13: sentence
  14: push
  15: head
  16: sting
  17: video
  18: explain
  19: letter
  20: bomb
  21: casual
  22: hotel
  23: rather
  24: garment

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add this test vector for 523365252662366? It was the only other one I could find online.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure thing. Added in 6de0630.

@tdb3
Copy link
Contributor Author

tdb3 commented Feb 19, 2024

Force pushed to address linter error.

@mflaxman
Copy link
Collaborator

mflaxman commented Feb 19, 2024

Thanks for the PR @tdb3!

Why support only 24 words? It's the default/best choice, but I know lots of people with 12 word seed phrases that use passphrases for extra entropy. I don't think this is a requirement for merging, I'd just think you'd want to support it since you already did most of the work.

I added a few small comments, overall nicely done!

@@ -98,6 +98,12 @@ def bytes_to_mnemonic(b, num_bits):
return " ".join(mnemonic)


def dice_rolls_to_mnemonic(dice_rolls):
"""returns a mnemonic from a string of dice rolls (>=99 rolls recommended)"""
rolls_hash = sha256(dice_rolls.encode())
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's add a sanity check that dice_rolls is a string containing only the numbers 1-6. Anything else and it should throw an error.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. Added in 6de0630.

[
"123456123456123456123456123456123456123456123456123456123456123456123456123456123456123456123456123456",
"more matter caught bind tip twin indicate visa rifle angle defense lizard stock cave cradle injury always mule photo horse range opinion affair garlic",
"xprv9s21ZrQH143K4J7isGyg4whNPAMGdfNntamxzPk2qFDS4nNJtNwrZ6vaiv6Jk5JsNyXKfZT5X3QnEMaTg3uvcKszhdgbLxgMh6ETE3nrzPn",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add this test vector for 523365252662366? It was the only other one I could find online.

buidl/hd.py Outdated
priv_version=None,
pub_version=None,
):
"""Returns an HDPrivateKey object from a string of dice rolls (>=99 rolls recommended)"""
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need some protection for non-cryptographically inclined developers. From coldcard, "for 128-bit security, which we consider the absolute minimum, you need 50 rolls, and for 256-bits of security, 99 rolls"

I think the best move here is to add a keyword argument allow_low_entropy=False to this method signature. If True, then you can have entropy below the desired limit. If False, then you must have 50-99 rolls (depending on mnemonic length) or it will throw an error.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. Added in 6de0630

@tdb3
Copy link
Contributor Author

tdb3 commented Feb 20, 2024

Great comments.
Comments addressed inline. Force pushed.

Updated to support 12 through 24 words.

Overall, decided to simplify things. HDPrivateKey.from_dice() isn't an absolute necessity when dice_rolls_to_mnemonic() exists. dice_rolls_to_mnemonic() is sufficient to provide the mnemonic words that can then be used with HDPrivateKey.from_mnemonic(). This results in a smaller code footprint.

Added additional test cases for 12-word (and other) entries, and edge cases (such as only using values 1-6, string provided, minimum entropy checking). These were double-checked against https://iancoleman.io/bip39/ and https://jlopp.github.io/xpub-converter/

Multiwallet does some basic sanity checking (e.g. that the number of mnemonic words being generated from dice rolls is 12 to 24 words, inclusive), and receives raised exceptions from dice_rolls_to_mnemonic() when improper data is entered (e.g. wrong dice values, insufficient entropy). This way other applications can receive the same error handling through dice_rolls_to_mnemonic().

Below is the output from Coinkite's rolls12.py script for comparison (these are baked into the unit tests):

$ python3 rolls12.py 

e3b0c44298fc1c149afbf4c8996fb924

WARNING: Input is empty. This is a known wallet

WARNING: Input is only 0 bits of entropy

   1: together
   2: mail
   3: awful
   4: cradle
   5: scrub
   6: apart
   7: hip
   8: leader
   9: silk
  10: slice
  11: unusual
  12: embark


$ python3 rolls12.py 
123456
8d969eef6ecad3c29a3a629280e686cf

WARNING: Input is only 15 bits of entropy

   1: mirror
   2: reject
   3: rookie
   4: talk
   5: pudding
   6: throw
   7: happy
   8: era
   9: myth
  10: already
  11: payment
  12: owner


$ python3 rolls12.py 
12345612345612345612345612345612345612345612345612
ee72ae915a4e6ea7ccbeb8e5e5eecef2

   1: unveil
   2: nice
   3: picture
   4: region
   5: tragic
   6: fault
   7: cream
   8: strike
   9: tourist
  10: control
  11: recipe
  12: tourist

Copy link
Collaborator

@mflaxman mflaxman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a great PR @tdb3. If I were still working, I'd be trying to hire you!

Added some minor things. I'm holding this PR to such a high standard because it touches private key generation and exposes that to less-technical users (multiwallet), so developer ergonomics and UX are paramount.

if not isinstance(dice_rolls, str):
raise ValueError("Dice rolls must be provided as a string")
if (
len([roll for roll in dice_rolls if roll not in ("1", "2", "3", "4", "5", "6")])
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: this is fine, but can be made simpler (and with a clearer error message) without the list comprehension.

valid_dice_rolls = set(["1", "2", "3", "4", "5", "6"])
for roll in dice_rolls:
    if roll not in valid_dice_rolls:
        raise ValueError(f"Invalid dice roll: {roll}")

maximum=24,
)
if num_words not in (12, 15, 18, 21, 24):
raise ValueError(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This entire method needs a while True block, otherwise providing an invalid entry will break (good), but the UI of that for the end-user starts them over (bad).

'\n(e.g., "14625363...", >= 100 values recommended): '
)
).strip()
return dice_vals
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to handle bad entries (like someone entering a letter, space, a number not 1-6, etc) gracefully

print_yellow(f"\nPUBLIC KEY INFO ({network})")
print_yellow("Copy-paste this into Specter-Desktop:")
print_green(key_record)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has a lot of overlap with do_generate_seed(), what do you think about refactoring the code so that the shared code only lives in one place? For example, I noticed that yours doesn't include the footgun warning that is in do_generate_seed. DRYing it out would seem to make more sense.

entropy_per_roll = 2.585
mnemonic_entropy = {12: 128, 15: 160, 18: 192, 21: 224, 24: 256}
# check that the appropriate amount of entropy had been provided
min_rolls_needed = ceil(mnemonic_entropy[num_words] / entropy_per_roll)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While technically correct (and great to see the code on how it's done), this would be easier to read for developers who are not cryptographically inclined if you included the calculation results in the mnemonic_entropy dict above. Otherwise, if you get an error for an where num_words is 12 you don't know how many rolls are needed for another entry where num_words is 24.

Something like

entropy_calcs_dict = {
    # num_words: [bits_of_entropy, min_rolls_needed],
    12: [128, 50],
    15: [160, 62],
    18: [192, 75],
    21: [224, 87],
    24: [256, 100],
}
# min_rolls_needed is calculated as ceil(num_words / entropy_per_roll), where entropy_per_roll is log2(6)=2.585

@mflaxman
Copy link
Collaborator

Also, this is a substantial PR! Let's bump the version number in setup.py.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants