It’s been more than 10 years since I got my first job working with scientific software for simulation and data analysis. In this post I will share some of my views about the scientific software industry.
In my first job I was working as an intern in an industrial R&D center. As I’m not sure of what I can publicly share about my experience, I will be omitting many details just to be on the safe side.
I was part of a team that used a commercial simulation package. However, as our use case was pretty niche, we needed to develop certain customizations to suit our needs.
While this commercial software offered great value for the company I was working for (saving countless man-hours of software development time), there were a number of problems and limitations we encountered:
- This commercial software was very easy to use in some predefined cases, but quite obscure to customize, and very hard to integrate with external tools. Also, new releases would introduce breaking changes which especially affected our customization efforts.
- The software paired a fine, well-integrated GUI with a largely outdated mathematical core, leading in some cases to extremely slow simulation times.
- There were occasional bugs here or there.
- Getting our model out of the software was nearly impossible, thanks to proprietary and undocumented file-formats, and other joys of vendor lock-in techniques.
What should be offered to users of scientific software?
Given this experience, I think the value proposition that should be offered to users of scientific software is as follows:
- Start with something easy to use. This can of course be challenging for novel implementations of cutting-edge techniques.
- Make it easy to customize, at least for a subset of power-users.
- Let users be the real owners of their work, for the long term (or in other words, just don’t be too greedy and avoid predatory vendor lock-in tactics).
With commercial software the first feature comes guaranteed: you start with something easy to use, indeed. But you have to give up substantially on customization, and accept that some degree of vendor lock-in will take place, so it will be hard to switch later on.
Academic open-source software, on the other hand, solves the second and third of these issues, ensuring transparency, reliability, and long-term value for users. In particular:
- Different open-source packages can be composed with each other
- File formats are well known and standard import/export is available
- It is based on the latest research, so the underlying methods are faster and more accurate
- There are fewer bugs, as there is a substantial amount of testing and bug-fixing by the community.
But of course, there are limitations to open-source scientific software, the main one being that it is generally harder to use, to the extent that it can even be hard to install.
Roads for sustaining open-source initiatives
Now, there are challenges in making a high-quality open-source scientific software project viable beyond academic use. Many interesting alternatives have been proposed over time to make working on those projects financially sustainable, each one with its pros and cons:
- Government support via grants. Of course, open-source projects can attract government funding via grants. But grants are scarce, highly competitive, and usually restricted to very early-stage development.
- Donationware. This is when developers ask for donations. It might work for Wikipedia, but the record shows that most projects that follow this path are severely underfunded.
- Sell paid support and training. This is perfectly fine, but when a project starts to get funded solely by this means, the incentive for making the software easy to use quickly vanishes. In fact, some of the projects which have followed this path have become what I call “proprietary obscureware”.
- Consultancy services. The main developers of an open-source package could sell consultancy services that go beyond support and training, like customizations for specific use cases, integrations with their clients’ proprietary software, or even work as a third-party R&D agency that can provide results for a client based on open-source software components.
- Offering premium features, or in other words, going for an open-core approach. This is challenging to get right, as the commercial components can get in the way of the open-source development. “Community edition” open-source also tends to fall under my definition of “proprietary obscureware”.
- Dual-licensing, that is, having the open-source software released under a restrictive license like GPL or AGPL, where the original creators of the software retain the right to sell commercial licenses. This involves asking non-core collaborators to sign a CLA (contributor license agreement) that waives all rights in favor of the founders, among other disincentives for wide-scale collaboration.
- Offering an official SaaS product. This could be combined with a dual-licensing approach (as MongoDB has done since AWS started competing against them).
What is the Julia Computing business model?
Let’s look at what some of the founding members of the Julia programming language have done: what they do and what they don’t.
I believe they attempted pretty much all of the above methods at once, while making an effort to ensure that nothing gets in the way of the open-source ecosystem.
- Donationware: The Julia Project is listed in NumFOCUS, a foundation that raises tax-deductible donations for scientific computing software.
- IT Consultancy and enterprise support. Importantly, this is only one of multiple sources of income for them.
- Premium packages: they offer some tools for finance and big pharmaceutical companies. Importantly, there seems to be no conflict in the use cases of the general public (like using Julia for research) and that of the commercial users (of a Bloomberg API for example).
- Simulation Agency and official SaaS. They are offering end-to-end solutions, for example to the pharmaceutical industry in the form of a SaaS.
- Cloud computing. The approach to leveraging the cloud has an interesting twist. They have built their own specialized cloud. This is capital intensive, though, and probably impossible to replicate without VC funding. They sell access to this cloud via JuliaHub.
- They’ve also experimented with dual-licensing, offering JuliaPRO, which was non-free and closed-source, but eventually dropped it as it was too much effort to sustain even the minor existing differences with the main branch.
So maybe we can draw a conclusion here about what a good approach to sustaining open-source scientific software looks like: try everything at once (maybe forgetting about dual-licensing and sticking to a permissive MIT license), but make sure that there are well-separated use cases for the open-source and the commercial components or associated services.