Wednesday, February 11, 2015

[nlxvloep] Machine learning cryptanalysis

Apply the most powerful machine learning algorithms backed by the most powerful computers against cryptographic primitives.  It is a shot in the dark, but has anyone tried this?  Cryptographic notions of security are often defined in terms of games in which the attacker tries to guess just one bit with greater than 50% accuracy.  This aligns well with machine learning.  And overfitting won't be a problem because we can generate practically an endless supply of new examples.
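To make the bit-guessing game concrete, here is a minimal sketch.  Everything in it is an assumption for illustration: `weak_output` is a made-up 8-bit "primitive" with a deliberately planted bias, not any real cipher, and the learner is a simple per-bit naive Bayes counter standing in for "the most powerful machine learning algorithms".  The attacker trains on generated examples, then is scored on fresh ones, so there is nothing to overfit to.

```python
import math
import random

def weak_output(rng):
    """Hypothetical weak primitive: 8 output bits, with bit 0 biased toward 1."""
    bits = [rng.getrandbits(1) for _ in range(8)]
    if rng.random() < 0.5:
        bits[0] = 1  # planted flaw: bit 0 ends up 1 with probability 0.75
    return bits

def ideal_output(rng):
    """What a perfect primitive would look like: 8 uniform bits."""
    return [rng.getrandbits(1) for _ in range(8)]

def challenge(rng):
    """One round of the game: a hidden bit b selects which source is sampled."""
    b = rng.getrandbits(1)
    return (weak_output(rng) if b else ideal_output(rng)), b

def train(rng, n):
    """Estimate P(bit i = 1 | b) from n labeled rounds, with Laplace smoothing."""
    ones = [[1] * 8, [1] * 8]
    totals = [2, 2]
    for _ in range(n):
        x, b = challenge(rng)
        totals[b] += 1
        for i, bit in enumerate(x):
            ones[b][i] += bit
    return [[ones[b][i] / totals[b] for i in range(8)] for b in (0, 1)]

def guess(model, x):
    """Guess the hidden bit by comparing log-likelihoods under each source."""
    score = [0.0, 0.0]
    for b in (0, 1):
        for i, bit in enumerate(x):
            p = model[b][i]
            score[b] += math.log(p if bit else 1 - p)
    return 1 if score[1] > score[0] else 0

def accuracy(model, rng, n):
    """Fraction of n fresh rounds in which the attacker guesses b correctly."""
    hits = 0
    for _ in range(n):
        x, b = challenge(rng)
        hits += guess(model, x) == b
    return hits / n

rng = random.Random(2015)
model = train(rng, 20000)
acc = accuracy(model, rng, 20000)
print(f"accuracy on fresh examples: {acc:.3f}")
```

Against this toy the best possible accuracy is 62.5%, because only bit 0 carries information.  Against a real primitive the open question is whether any learner gets measurably above 50% at all.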

The inspiration is the cases in which a computer was tasked with developing an algorithm and an immense amount of resources was thrown at the problem: in some fields the computer succeeded, emitting an algorithm that works, though why it works is beyond human comprehension.  The computer works in mysterious ways.  This mysteriousness suggests a computer can be smarter than the smartest human in certain fields.

Famous examples of mysteriousness: chess endgames (not machine learning but still mysterious), parallel sorting networks (mentioned in Knuth), tuning Threefish / Skein.

Multilayer neural networks can certainly reach incomprehensible complexity.  So can support vector machines, Bayesian networks, genetic algorithms, and genetic programming.

Do machine learning researchers fear outputs they cannot understand?

How large a neural network (say), with manually set weights, does it take to compute (not learn) a known cryptanalytic break of a weak cipher?  That could then be the size of an untrained network seeking new breaks of a currently unbroken cipher.
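As a tiny illustration of "manually set weights", here is a sketch of a threshold network that computes XOR, a building block of most ciphers, with no training at all.  The helper names (`neuron`, `xor_net`, `xor_byte`) and the particular weights are my own choices for illustration.  It is nowhere near a cryptanalytic break, but it shows the kind of size accounting one could do: 3 neurons per bit of XOR, so 24 neurons to XOR two bytes.

```python
def step(z):
    """Threshold activation: fire only if the weighted sum is positive."""
    return 1 if z > 0 else 0

def neuron(weights, bias, inputs):
    """A single threshold unit with hand-chosen weights and bias."""
    return step(sum(w * x for w, x in zip(weights, inputs)) + bias)

def xor_net(a, b):
    """Two-layer network computing a XOR b as (a OR b) AND NOT (a AND b)."""
    h_or = neuron([1, 1], -0.5, [a, b])    # fires if at least one input is 1
    h_and = neuron([1, 1], -1.5, [a, b])   # fires only if both inputs are 1
    return neuron([1, -1], -0.5, [h_or, h_and])

def xor_byte(x, y):
    """XOR two bytes by running eight copies of xor_net, one per bit."""
    out = 0
    for i in range(8):
        out |= xor_net((x >> i) & 1, (y >> i) & 1) << i
    return out

NEURONS_PER_BIT = 3
print("neurons needed to XOR two bytes:", 8 * NEURONS_PER_BIT)
```

Scaling this accounting up from one XOR to a full known attack on a weak cipher would give the network size the question above asks about.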

Even learning how to compute a cryptographic primitive in the forward direction, i.e., replicating the primitive from scratch just from examples, would be impressive.
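A sketch of what learning the forward direction looks like at the smallest possible scale: a single perceptron learning the 3-input majority function (the Maj function used in SHA-1 and SHA-2) purely from generated examples.  This is an assumption-laden toy: Maj happens to be linearly separable, so one neuron suffices, whereas a full primitive mixes in XORs and many rounds and would be vastly harder.  The function names and the example count are arbitrary choices of mine.

```python
import random

def maj(a, b, c):
    """Bitwise majority of three bits, as used in SHA-1 and SHA-2."""
    return (a & b) ^ (a & c) ^ (b & c)

def predict(w, bias, x):
    """Single threshold neuron."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + bias > 0 else 0

def train_perceptron(rng, n_examples):
    """Classic perceptron rule, fed a stream of freshly generated examples."""
    w, bias = [0, 0, 0], 0
    for _ in range(n_examples):
        x = [rng.getrandbits(1) for _ in range(3)]
        err = maj(*x) - predict(w, bias, x)  # -1, 0, or +1
        w = [wi + err * xi for wi, xi in zip(w, x)]
        bias += err
    return w, bias

rng = random.Random(17)
w, bias = train_perceptron(rng, 5000)

# Check the learned neuron against the true function on all 8 inputs.
inputs = [(a, b, c) for a in (0, 1) for b in (0, 1) for c in (0, 1)]
learned = all(predict(w, bias, x) == maj(*x) for x in inputs)
print("weights:", w, "bias:", bias, "matches Maj on all inputs:", learned)
```

The learner never sees the formula for Maj, only input and output pairs.  The impressive version of this experiment replaces Maj with an entire block cipher or hash function.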

We imagine we are getting a computer to do a cryptanalyst's job.  Computers often do well in artificial intelligence when the task is clearly defined: recognize this digit, win this chess position, predict this bit.

Even if unsuccessful, the infrastructure lessons learned from doing machine learning on a giant scale, perhaps larger than ever attempted before, will be useful.  Perhaps that infrastructure can be commoditized.
