Open Source in the Cloud
I have always been a big fan of web (or cloud) applications because they make it easy to switch freely between computers and between platforms. Web applications don’t tie you to any one platform, leaving you free to choose whichever suits you best. Recently, though, a Linux.com article and a post on a Clipperz blog got me thinking about how open source and the cloud go together, or don’t go together.
When I first began to think about it, it seemed that putting the two (open source and web applications) together would be a challenge, but I soon realized that they are, in fact, a natural fit for each other.
Perhaps the best example of open-sourcing a web application is Reddit, which recently open-sourced all of its code. The biggest argument against open-sourcing an application like Reddit is that everyone can copy what you have done. That is true, but anything based on, for example, the Reddit code must be released back as open-source, assuming the original code was released under a GPL-like license that requires this. So for an application (if you can call it an application) like Reddit, open-sourcing seems to make a lot of sense. The same basic logic applies to most other, more traditional web applications, such as word processors.
For partially or fully business-oriented web applications, a further step would be to sell a subscription update service to businesses that want to host the application themselves.
You could pretty much summarize the last two paragraphs by saying that the arguments for open source in web applications are essentially the same as the arguments for open source in traditional applications. There is, however, at least one more reason for open-sourcing web applications:
One frequent concern with web applications is privacy. How do I know that Google is not reading everything I make in Google Docs? Done correctly, it should be completely possible to encrypt the data before it leaves the client’s computer such that Google (or <insert name of web application company>) cannot possibly read it. Even if a web application provider claims they have done this, though, how do you know for sure? The answer is that you really cannot - unless the application is open source. If it is open source, tech-savvy people will almost certainly go through the code and ensure that your data is safe, but if the application is closed source, you just have to trust the company.
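To make the idea of encrypting before the data leaves the client concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the function names are hypothetical, and the cipher is a standard-library-only construction (an HMAC-SHA256 keystream plus an HMAC tag) chosen so the example runs without extra packages. A real application would use a vetted authenticated cipher such as AES-GCM from an audited library. The point is simply that the passphrase and the derived keys stay on the client, and the provider only ever receives the opaque blob.

```python
import hashlib
import hmac
import os
from typing import Tuple


def derive_keys(passphrase: str, salt: bytes) -> Tuple[bytes, bytes]:
    """Derive an encryption key and a MAC key from a passphrase, on the client only."""
    material = hashlib.pbkdf2_hmac(
        "sha256", passphrase.encode("utf-8"), salt, 200_000, dklen=64
    )
    return material[:32], material[32:]


def _keystream(enc_key: bytes, nonce: bytes, length: int) -> bytes:
    """Illustrative keystream: HMAC-SHA256 over nonce + counter (not a vetted cipher)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        block = hmac.new(enc_key, nonce + counter.to_bytes(8, "big"), hashlib.sha256)
        out += block.digest()
        counter += 1
    return bytes(out[:length])


def encrypt_for_upload(passphrase: str, plaintext: bytes) -> bytes:
    """Return salt + nonce + ciphertext + tag; this blob is all the server ever sees."""
    salt = os.urandom(16)
    nonce = os.urandom(16)
    enc_key, mac_key = derive_keys(passphrase, salt)
    stream = _keystream(enc_key, nonce, len(plaintext))
    ciphertext = bytes(p ^ k for p, k in zip(plaintext, stream))
    tag = hmac.new(mac_key, salt + nonce + ciphertext, hashlib.sha256).digest()
    return salt + nonce + ciphertext + tag


def decrypt_after_download(passphrase: str, blob: bytes) -> bytes:
    """Verify the tag, then decrypt; raises ValueError on a wrong passphrase or tampering."""
    if len(blob) < 64:
        raise ValueError("blob is too short")
    salt, nonce = blob[:16], blob[16:32]
    ciphertext, tag = blob[32:-32], blob[-32:]
    enc_key, mac_key = derive_keys(passphrase, salt)
    expected = hmac.new(mac_key, salt + nonce + ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("wrong passphrase or tampered data")
    stream = _keystream(enc_key, nonce, len(ciphertext))
    return bytes(c ^ k for c, k in zip(ciphertext, stream))
```

Whether a given provider really works this way is exactly what you cannot check without seeing the code that runs on your machine.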
So for these reasons, I see the arguments for open source in the cloud as even stronger than the arguments for open source on the desktop. I just hope that the web application developers agree.


July 17th, 2008 at 11:35 am
> “Done correctly, it should be completely possible to encrypt the data before it leaves the client’s computer”
Yes. The pattern is called Host-Proof Hosting (http://ajaxpatterns.org/Host-Proof_Hosting). It requires that all data is encrypted before leaving the browser. The encryption key should never be sent to the server.
Since the encryption happens in client-side JavaScript, all the code is actually present in the browser and can be reviewed, and the calls to the server can be watched in the DOM in real time. That said, the code review would need to be done every single time the application was loaded so as to make sure that nothing had changed in the meantime.
There’s an ongoing discussion about this right now, and if/how we can overcome it.
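One partial mitigation, offered here as an assumption rather than anything that discussion has settled on, is to review the client-side code once, record a hash of it, and compare that hash on every later load instead of re-reading the code. A rough sketch in Python, with hypothetical function names:

```python
import hashlib
import hmac


def fingerprint(script_source: bytes) -> str:
    """Return the SHA-256 hex digest of the script exactly as it was served."""
    return hashlib.sha256(script_source).hexdigest()


def matches_reviewed_version(script_source: bytes, reviewed_fingerprint: str) -> bool:
    """True only if the served script is byte-for-byte the version that was reviewed."""
    return hmac.compare_digest(fingerprint(script_source), reviewed_fingerprint)


# Hypothetical usage: record the fingerprint at review time, check it on each load.
reviewed = fingerprint(b"function encrypt(data, key) { /* reviewed code */ }")
served = b"function encrypt(data, key) { /* reviewed code */ }"
print(matches_reviewed_version(served, reviewed))
```

This only detects a change; it does not say whether the change is harmless, and the check itself has to run somewhere the server cannot alter.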
For anyone looking to build their own HPH application, we’ve just released an MIT/LGPL library here: http://code.google.com/p/passpack/
July 17th, 2008 at 4:37 pm
> anything based on, for example, the Reddit code must be released back as open-source, assuming the original code was released under a GPL-like license that requires this
Yes, but there’s a problem wrt web-based apps - you only need to make the source to your changes available if you distribute the application - but putting it on a web server and allowing remote access is not considered the same as distribution.
As cloud computing becomes more popular this is potentially going to become a hot issue.
July 17th, 2008 at 8:50 pm
David Claughton - That is really interesting. I assume that you are talking mostly about the GPL in this case. I know Reddit chose the Common Public Attribution License (which they say is a modified version of the Mozilla license). Any idea if that has the same issue?
Source for Reddit license info:
http://blog.reddit.com/2008/06/reddit-goes-open-source.html
July 18th, 2008 at 10:06 am
As far as I know, the AGPL addresses the issue of code availability for web-services.
July 18th, 2008 at 4:29 pm
Well, IANAL, but the CPAL section 3 is headed “Distribution Obligations” and although it talks about “making available” executables and source code, I suspect it would be easy for someone to make the case that this is only talking about distribution and not remote access.