The world would be a better place with free implementations of some algorithms available as a public good. Which algorithms? Can we measure which algorithms are most needed perhaps by some market mechanism? How much of a better place, i.e., how much public funding would it be worth it to spend?
Is free gratis all that is necessary, or how important is free libre? This might be one of the rare cases in which a black box is fine. How often will users need to modify the implementation, or examine its internals (perhaps to check for security properties)? Ideally we want formally proved correct implementations.
How important is the implementation of the fundamental part of the algorithm versus the "last mile" of binding it to the user's programming language and data structures? Perhaps we still need fundamental development in software engineering to support such a library. Do domain specific aspects of the problem dwarf the fundamental part?
An easy API has "batch" input, where all the input data are provided at once. But sometimes we want a "online" algorithm in which the input is streamed in, and sometimes wanting output before even all the input has been given. Sometimes we are solving multiple similar instances of the same problem and want to take advantage of the similarity for optimization.
Will we die a death of a thousand cuts in the numerous special cases?
Should it be a clean implementation designed for modifiability or a highly optimized one difficult to understand? Again, we may still seek advancements in software engineering. Or we might say, implement the simplest version within a constant time factor of optimal; i.e., constant factor improvements that make the code more complicated are rejected.
There are many different architectures an algorithm might need to run on (e.g., parallelization). Which should be targeted?
There seems to be a dichotomy of implementations for "production" or for "exploration". I have certainly had fun exploring number theory with Pari/Gp. Mathematica also claims to be good at the latter. For exploration, UI matters a lot, less so than performance, and we want things to robustly work despite special cases.
Is there an incremental approach toward attacking this problem? An incremental approach is already happening, in a highly disorganized fashion. Is this disorganization optimal? It might be, given how vague the problem is.
LAPACK fostering Matlab is the golden example.
No comments :
Post a Comment