Brute Force BIP39 Passphrase Recovery. (25th Word, Hidden Wallet) Trezor, Keepkey, Ledger


so today we’re just going to do a short
follow-up on a video that I’d previously
done on passphrase recovery so this is
the process you can use to recover your
BIP39 passphrase if you’ve
forgotten it or maybe it’s called your
twenty-fifth word or the password for
your hidden wallet depending on the
hardware wallet you’re using and all of
the notes for what you need to start
this time are the same as my last video
on it so you will need to have a
correct twenty four word seed you will
need to know one address that was used with
your wallet so let’s say for all of
these examples you know we’d purchased
some Ethereum on Coinbase we looked at
our emails and we logged into Coinbase
to see where we had sent it to like all
the other ones you need to download
BTCRecover and create an air-gapped environment
if you want to do this securely and I’ve got
some videos that show you how to do that
so once you’ve set up your environment
there are two general approaches we’re
going to look at and the first one is
basically a 100% brute force based
approach so that’s going to be where we
have a token file that’s full of just
individual letters and we’re just
putting them together in different
combinations to brute force a password
and that works okay for shorter
passwords which some people might have
used particularly if you’re just
interested in like a plausible
deniability
style passphrase setup the second thing
we’re going to look at is different types of
dictionary attacks and these can work
effectively for long complex
passphrases we can use a few different types
of dictionaries with different types of
words in them and we can also start to
try different combinations of words in a
sort of brute force style so we’re gonna
look at two different ways that we can
use that so firstly via a password list
file so that’s where it just runs
through the dictionary basically from
start to finish and you know it could be
a dictionary list it could be a list of
frequently used English words it could
be a password dump that’s been leaked or
compiled from previous website hacks so
there’s like a RockYou list and a phpBB
list that we’ll be referencing later
which can be really useful because you
know they have a lot of passwords that
people commonly pick and you know it
could be something like using the diceware lists if you’d used diceware
or even just making a list that’s full
of you know names of family members
passwords you often use all those
sorts of things and you can download
some of them from here and that website
will be in the description so the other
thing we’re going to look at really
quickly is the way that you can use
dictionaries sort of in a brute-force
kind of way so you can look for multiple
words that have been connected together
and this does hit practical limits once
the dictionaries start getting quite large
you know for example I’ve said here you
know you can recover something that’s
three diceware words off the short word list
that they provide but you know any more
than that things start getting
time-consuming and the important thing
though is if you look at my previous
video there are rules that you can give
around these token files that will speed
things up dramatically so you can help
give BTCRecover some guidance around
the kinds of characters that might only
appear between words might only appear
at the start might be at the finish and
it comes down to I guess you thinking
about the kinds of passwords and the
kinds of phrases that you would have
strung together the other thing that’s
really important to consider when you’re
doing brute-force attacks like this is
the kind of hardware you’ve got at your
disposal and for these tests I was
using a mix of dual core and quad core
i5 processors and also spun up a 48 vCPU
Linode and the reason I added that Linode
in there is that you know as a consumer
you can now buy these
Ryzen and Threadripper CPUs and when I
ran the test and I’ll show you in a sec
you know this Ryzen processor would be
about ten times faster than the i5 that
I was using in these tests and they can
make a huge difference if you have
something that might say it’s going to
take you a year to brute-force on like
an i5 you know you could be looking at
a month with some of these more high-end
processors and that’s before we even
consider GPUs or FPGAs so you can see
here that the performance is pretty
consistent so you know this is running
the same test on the laptop so it had an
ETA of two days the desktop an ETA of
one day and as you can see it was fully
utilizing all the cores and then on the
Linode we can see it was going to take
a whole grand total of three hours with
all 48 vCPUs just running flat out so
that’s an important thing to consider
when you’re just looking at these
numbers and trying to work out what is
doable in your situation
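
A rough way to sanity-check ETAs like these is to divide the number of candidate passphrases by the rate at which your hardware tests them. The sketch below is only an illustration: the function name and the example rates are made up for this note, and the real rate should be read from what BTCRecover reports on your own machine.

```python
def eta_hours(candidates: int, guesses_per_second: float) -> float:
    """Worst-case hours needed to test every candidate at a steady rate."""
    if guesses_per_second <= 0:
        raise ValueError("guesses_per_second must be positive")
    return candidates / guesses_per_second / 3600


# Hypothetical relative rates, chosen only so the ratios resemble the ETAs
# mentioned in the video (laptop ~2 days, desktop ~1 day, 48 vCPU ~3 hours).
example_candidates = 1_000_000
example_rates = {"laptop": 1.0, "desktop": 2.0, "48 vCPU server": 16.0}

baseline = eta_hours(example_candidates, example_rates["laptop"])
for name, rate in example_rates.items():
    hours = eta_hours(example_candidates, rate)
    print(f"{name}: {hours:.1f} h, {baseline / hours:.0f}x the laptop's speed")
```

The point of the sketch is simply that the ETA shrinks in direct proportion to the guess rate, so a machine that is 16 times faster turns two days into about three hours.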
so we’re going to look at some examples
and for all of these examples we use
this same 24 word seed phrase here and
we’re trying to recover some different
addresses so the first one is looking at
a short password that we’re just going
to brute-force so this is a short four
character password I did do tests of you
know one two and three character
passwords but they were brute forced so
quickly that it wasn’t even worth
showing so this is one where we’re
looking for the word coin and these are
the commands we’re using with BTCRecover here so I’ve not worried about
any typos for BTCRecover to do it’s
just going to run through all the
different combinations of letters and
not worry about uppercase or lowercase
and we can see that it worked it out in
a minute and the full run would have
taken an hour and that’s the command
there all of these commands will also be
in the description down the very bottom
as well if you want to copy and paste
them and try them yourself a second
similar test is saying well maybe we
weren’t sure about some capitalization
so we would have run the same sort of
test again with the same token
list and we’ve added in here --typos-case
and --typos 1 so it’s going to do
all the tests from before but it’s also going to
run through and capitalize the different
letters and we can see that that test took
five minutes to run now the thing that’s
really important to understand when
we’re using BTCRecover this way is we
need to tell BTCRecover how many
letters we’re wanting to try to string
together and the way we do that is by
using this --max-tokens argument here so
basically we’ve said --max-tokens 4 and
because we’re using that with a token
file that’s just individual letters that
means it’s going to try a maximum of
four letters together likewise here
again that’s still only four tokens that
we’re using together whereas when we get
to some of these ones where we’re
chaining together sets of tokens out of
dictionaries we need to tell it how many
individual words we want to string
together so that only matters when we’re
using a token list not a password list
and we can see here that was the five
letter recovery so we had five tokens
and likewise with these later tests
where we’re stringing say two and three
words together we’re setting the max
tokens to be two for the two word one
and three
for the other you really need to make
sure you’re setting this --max-tokens
argument otherwise you’re just gonna end
up with like stupidly long and complex
tests almost straight away
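
To get a feel for why the token limit matters so much, here is a minimal sketch that counts and generates candidates from a token list. It is a simplified model, not BTCRecover's actual code: it assumes each token line is used at most once per guess and that order matters, and the function names are invented for this illustration.

```python
from itertools import permutations
from math import perm


def count_candidates(num_tokens: int, max_tokens: int) -> int:
    """Count guesses made of 1..max_tokens distinct tokens, in any order."""
    return sum(perm(num_tokens, k) for k in range(1, max_tokens + 1))


def generate_candidates(tokens, max_tokens):
    """Yield every guess built by joining 1..max_tokens distinct tokens."""
    for k in range(1, max_tokens + 1):
        for combo in permutations(tokens, k):
            yield "".join(combo)


if __name__ == "__main__":
    words = ["correct", "horse", "battery"]
    print(list(generate_candidates(words, 2)))
    # Raising the limit by one multiplies the work by roughly the list size.
    for limit in (1, 2, 3):
        print(limit, count_candidates(500, limit))
```

Under this model a 500 word list with a limit of three already produces well over a hundred million guesses, which is why leaving the limit unset gets out of hand so quickly.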
the other thing you’ll notice is again
this one down the bottom here for three
words strung together you know it would
have taken a day on the desktop CPU and
three hours on like a Ryzen Threadripper or a Linode 48 vCPU system
so the token list that I used for
examples 1 and 2 is really really
straightforward it’s just basically a
text file so tokens-BF-double.txt with just
one letter per line and it’s just a to z
0 to 9 and I’ve just gone through the
alphabet twice and again you just need
to work out whether you’re using dictionary
words or using completely random words
and letters and whether you’re including
maybe special symbols in there or not
because again the more of these you add
in the quicker you’re going to get into
the realm of pass phrases that are
difficult to brute-force but again if
you’re thinking you were likely just to
use you know normal alphanumeric sort of
stuff then you know a list like this can
be perfectly acceptable for that so the
other thing to consider is if you’re
wondering which letters and which
numbers and which symbols to include on
here it’s worth having a look at your
hardware wallet and seeing what it
supports so for example the Trezor
allows you to just enter in with your
keyboard on your computer the passphrase
every time and you can use just about
whatever symbols you like whereas with a
Ledger wallet because you’re entering it
all on the device all the time they
actually only offer a much smaller set
of symbols than is actually possible
within BIP39 passphrases so yeah just
have a look and see what are the valid
passphrases you could have even entered
in the first place because it might
be a smaller character set than you
think
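
A token file like the one described (one character per line, a to z then 0 to 9, with the whole set listed twice) can be generated rather than typed by hand. This is a minimal sketch: the helper names are invented here, and the reason for listing the set twice (presumably so the same character can appear twice in one guess) is an assumption rather than something stated in the video.

```python
import string


def build_tokens(repeats: int = 2) -> list[str]:
    """Return a-z then 0-9, one character per entry, repeated `repeats` times."""
    alphabet = string.ascii_lowercase + string.digits
    return list(alphabet) * repeats


def write_token_file(path: str, repeats: int = 2) -> int:
    """Write one token per line to `path` and return the number of lines."""
    tokens = build_tokens(repeats)
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("\n".join(tokens) + "\n")
    return len(tokens)


if __name__ == "__main__":
    # File name taken from the video; adjust the character set to whatever
    # your hardware wallet actually lets you type.
    print(write_token_file("tokens-BF-double.txt"), "lines written")
```

If your wallet only accepts a smaller set of symbols, trimming the alphabet here shrinks the search space before the recovery even starts.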
so we’ll just go into example 3 and
example 4 and essentially what we’re
looking at here is sort of where you get
to that boundary between where a
brute-force approach will work versus a
dictionary approach and we can see here
for this next test we were looking for
the word Smith you know super common
last name there’s no reason why someone
might not have just picked that as their
passphrase and we’re still using that
same tokens-BF-double.txt that I
talked about earlier
we’re also saying well maybe there’ll be
some capitalization in there
because it’s a last name so we run that
test and this is where we have a few
different things so firstly I ran it via
brute force with that same token list we
saw before it took you know four or five
hours but it could have taken five days
bearing in mind this is on an i5 desktop
so what you could achieve with a you
know multi-core monster is a lot more it
would have taken a lot less time about a
tenth of the time we can also see that
on a dictionary attack so now we’re
using english.txt so this is just an
English dictionary that I got that took
about 32 minutes to run and the other
dictionary that I used for this one
is this password
list called RockYou which comes with some
various security focused distributions
of Linux and you can see it’s basically
a whole list of passwords that people
have been using on websites and things
that were leaked or hacked and it
includes some fairly long and complex
passwords that you know you’d be really
unlikely to get using you know brute
force based approaches but you know this
this has got like a huge huge number of
passwords in there and you know if you
just happen as lots of humans do to have
picked something that you thought was
really random that just happens to be
the same thing that a whole bunch of
other people just picked because they
thought it was random you might very
well find that your super duper secure
password is actually on this list and
it’s really worth just checking it as
well because who knows you may have just
thought the same as many many other
hundreds of people before you in picking
a password so that’s RockYou so we can
see that RockYou found Smith as well
capitalized in a few seconds and so what
you can see we’ve done is also for this
password list you can use that in
conjunction with having typos that BTCRecover
will check so what we’ve done
here is we’ve said for each row in the
password list file it’s going to assume
that two of those letters might have
been an uppercase letter and it doesn’t
just have to be the first letter
or the last letter it can be any two and it’s just going to run
through and test all of those so we can
see an example here of that dual
capitalization happening here where the
password’s a bit longer and is just
YouTube for the sake of it and we can
see that we ran it with this command
against the rockyou list and that’s
what it came up with so it found it in
just a couple of minutes but the full
length run could have taken a couple of
days so I thought we’d do another test
as well which was just the correct horse
battery staple running that against the
tokens English list just to
demonstrate that but that ran out of
memory and didn’t work so it’s important
to realize that once you start using a
big long dictionary as a token list you
very quickly start hitting both memory
and computation limits so a three megabyte
token list file is just too long for
that sort of thing and we’ll see the
practical limits of that in a minute the
other two examples we have are examples
where we’ve used a dictionary list so in
this one we’ve used the token list zero
to a thousand and had a password that
might have strung two words together
that are on that list so correct
question and we can see here that that
ran and it found the result in a couple
of minutes and basically on this one
though it didn’t take nearly as long
because we weren’t bothering to check
for capitalization or anything like that
we’re just trying to illustrate connecting
multiple tokens together the
last example that I’m going to show is
basically looking at three words
together and we’re using the Google top
500 English words for this one just to
make the dictionary a bit shorter just
to illustrate the point and we can see
that that ran on the laptop it took
about two days to run so that would have
taken a day on the desktop and three
hours on the Ryzen CPU and I’m gonna
do another video on this but I think it
really illustrates the importance of
using a long dictionary if you’re going
to be just chaining words together if
you’re using a short dictionary file
like even the diceware short list you
really need to select a decent number of
words out of there before an attack on
that phrase becomes impractical that’s
probably a really good place to talk
about the computational limits for some
of this stuff and it’s important to
understand that the number of possible passwords when
using a password list like RockYou or
something like that scales in a
linear
fashion if you’re using a password list
that’s twice as long it’ll take twice as
long to run if you’re using a password
list that’s ten times as long it’ll take
ten times as long to run so time is
basically the list length divided by
the speed of your system once you start using a
token list the possible time for it to
process grows much faster than that so what
that means is that if you’re stringing two tokens together and you double the
size of the list the processing could
potentially take four times as long and if
you increase the size of the token list
by ten the test will take 100 times
longer to run the other thing to
understand is the number of combinations increases
exponentially with the number of tokens
you are looking for so for example a
four word password with a hundred word
dictionary will have 100 to the power of
four options a five word password with a
hundred word dictionary will be 100 to
the power of five so what we can see is
that adding one more word so increasing
the length of the phrase by 25% means it will
take 100 times as long to be processed
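
The scaling described above can be checked with a few lines of arithmetic. This is a simplified sketch that, like the spoken example, counts a token-list search as list length raised to the number of words; the function names are made up for this illustration.

```python
def password_list_guesses(list_length: int) -> int:
    """A password list is tried row by row, so work grows linearly."""
    return list_length


def token_list_guesses(list_length: int, words: int) -> int:
    """Chaining `words` tokens together gives list_length ** words guesses."""
    return list_length ** words


if __name__ == "__main__":
    # Doubling a password list doubles the work.
    print(password_list_guesses(200) // password_list_guesses(100))
    # Doubling a token list for a two-word phrase quadruples the work.
    print(token_list_guesses(200, 2) // token_list_guesses(100, 2))
    # Ten times the token list for a two-word phrase is 100 times the work.
    print(token_list_guesses(1000, 2) // token_list_guesses(100, 2))
    # Going from four to five words with a 100 word list is 100 times the work.
    print(token_list_guesses(100, 5) // token_list_guesses(100, 4))
```

The same functions can be fed your own list size and word count to see whether a planned run is in the range of hours or of years before you start it.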
I’m gonna look more into that in my next
video that looks at selecting a good BIP39 passphrase
so my suggested workflow for this would
be number one to set up the recovery
environment to make sure that you know
how to make it work and to try running
through some of the examples that are
provided here and the reason why I say
that that’s a good thing to do is just
to make sure you don’t waste a whole
month running a test that’s actually
never going to find the result you want
simply because you’ve got a typo
somewhere or have misunderstood some of
the arguments for BTCRecover I’m also
going to include the password lists that
I used for this on GitHub so you can
actually reproduce all of the tests that
I did here just to make sure you’ve got
a firm grasp on how to use these tools
before you start using them on your own
wallets and phrases
my suggestion is again if you’ve got no
idea what your passphrase was start with
trying to brute-force something short
because you know you might have just
picked a password that was easy to
remember that was mostly about plausible
deniability and you can create a token
list with letters in it or just
download the one I’ve put on GitHub my
next suggestion would be to try a couple
of different dictionaries just for
dictionary attacks like the RockYou lists
just as a password list and also if that
doesn’t give you a result create
yourself a token list with every
password every family member name every
nickname friends pets all of those
things
and stuff that you might use in
passwords and start doing a multiple
token sort of attack on that so if
you’re someone who might have strung
together two or three of those kinds of
things to make a password that can be a
really common thing and you only have to look
through the password lists in things
like RockYou to see that a lot of
people do that the other thing I’d say
is be really selective about the kinds
of typos that you want to try and have
BTCRecover check in my previous video
we sort of assumed that you had a
pretty good idea about what your
passphrase was and you were really just
trying to recover from a typo or a
slight misremembering or something like
that whereas if you’re trying to do
brute force and dictionary based stuff
having BTCRecover run through and like
check for missing letters check for
letters that have been pressed twice check
for caps lock and all that sort of stuff
can be extremely time-consuming so just
be really selective about the kinds of
typos you want BTCRecover to check the
other note that I think I would add is
concerning cloud servers you really need
to understand the risks so if you’re
presented with a test that you are
pretty sure will recover your passphrase
but it’s going to take months you know
paying for a cloud-based server or
finding someone or buying a high-end CPU
to run these tests can be cost effective
I just use Linode and I’ve put a link
to them in the description it’s just
important that you understand that the
process of using BTCRecover exposes
both your 24 word seed and
potentially your passphrase to the system
you’re running them on so if you do use
a cloud-based server to do these kinds
of recoveries one of your first
priorities should be to move all of your
funds onto a brand new wallet so brand
new 24 word seed brand new passphrase
because you should assume that the ones
that were exposed to this hired server
have been compromised or at least could
very easily be in the future I think
it’s important to acknowledge from the
outset that trying to do a brute force
recovery with your passphrase should be
considered a fairly low odds play and
it’s not something that has any
guarantees of success at all but it is
also a good thing to do just to help
satisfy yourself that you’ve done
everything you can to try and recover
your funds
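
Going back to the earlier advice about being selective with typos, the following sketch shows how much extra work a capitalization check adds to every row of a password list. It is a simplified model written for this note, not BTCRecover's own logic: it flips the case of up to two letters in any positions, and the function name is hypothetical.

```python
from itertools import combinations


def case_variants(password: str, max_changes: int = 2):
    """Yield the password with the case of 0..max_changes letters flipped."""
    letter_positions = [i for i, ch in enumerate(password) if ch.isalpha()]
    seen = set()
    for changes in range(max_changes + 1):
        for positions in combinations(letter_positions, changes):
            chars = list(password)
            for i in positions:
                chars[i] = chars[i].swapcase()
            variant = "".join(chars)
            if variant not in seen:
                seen.add(variant)
                yield variant


if __name__ == "__main__":
    variants = list(case_variants("youtube"))
    print(len(variants), "guesses for one list entry")
    print("YouTube" in variants)
```

Every extra kind of typo multiplies the guesses per entry in a similar way, so a list that runs in minutes without typos can take days once several checks are stacked on top of each other.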
thanks for watching I hope that was
helpful just hit subscribe if you’d like
to be kept in the loop about future
content I make to help people stay safe
in the crypto space and to recover if
they get into
trouble or if there’s a question you’d
like some more information about or
topic you’d like me to cover in the
future just leave a reply
