Hadoop on EC2 July 20, 2007
Posted by James Webster in: software, finance
A while ago I suggested that Hadoop and Amazon EC2 could be used to construct an open source derivatives risk & pricing grid platform as an alternative to commercial offerings such as DataSynapse or Digipede. Well, now there is a tutorial for running Hadoop on EC2 (via Andrew Newman’s More News). Between this and EC2’s upcoming support for paid AMIs there might be a business opportunity to set up a ‘Software as a Service’ risk management offering for hedge funds; Map/Reduce and the elastic response of EC2 ought to allow the numbers behind complicated trades (magic potion passport options anyone?) to be crunched quickly. I still feel that multicast IP will be an important feature for Amazon to add to EC2 to properly support grid distribution of processing and caching.
More open source software for capital markets April 11, 2007
Posted by James Webster in: software, finance
I came across Marketcetera a few months ago and they have recently released version 0.3. Marketcetera is an open source (Java) trading platform with client and server components. In addition to a trade entry front-end with a basic blotter it includes a back-end Order Management System compliant with the FIX protocol and a post-trade allocation and reporting system (known as Tradebase) built on Ruby on Rails.
They ship the platform as discrete components or as VMware/Parallels virtual machine images. I’m looking forward to seeing how this platform evolves, particularly whether they will integrate Tradebase more tightly with the OMS and an open-source app server when JRuby 1.0 is finalised. It would be interesting to see if it could be deployed as a Java EAR.
How to build a scalable derivatives risk & pricing platform using open source software March 30, 2007
Posted by James Webster in: software, finance, 6 comments
Just some brief thoughts on how a selection of open-source software might be used to build an architecture stack supporting a derivatives risk & pricing platform, a common system found on derivatives trading floors.
- QuantLib: A popular quantitative finance library implemented in C++ with multiple language bindings. It would provide a good base upon which to build more exotic (and proprietary) models. There are a number of commercial toolkits available as well.
- Hadoop: A Java implementation of Google’s MapReduce algorithm. Hadoop would support horizontal distribution across a server grid of the pricing calculations implemented by QuantLib within a Monte Carlo method framework. A commercial alternative (although much broader in scope) would be DataSynapse. Also have a look at the recently released IBM alphaWorks MapReduce Tools for Eclipse.
- OpenTick: You have to pay the exchange fees for the data but otherwise OpenTick provides an open API for accessing real-time and historical ticks for a broad range of US exchanges and ECNs.
- Ehcache: Although Hadoop includes a Distributed File System it is geared towards large files. A better solution for ensuring the prices received and the figures calculated are updated across the server grid would be a distributed cache. Ehcache has had basic distributed caching for a while now. A commercial alternative would be Tangosol Coherence.
- Esper: Event stream processing (aka complex event processing) is becoming a big part of trading systems. Plugging an ESP toolkit into our stack might provide a way to support algorithmic trading or identification of arbitrage opportunities. There are many commercial players vying for success in this market.
- Amazon EC2: And finally, as a deployment option why build your own grid of servers when you can rent one? There is already a HOWTO for running Hadoop on EC2. One drawback to using EC2 is the current lack of support for IP multicast, but I suspect this will change once they are able to effectively firewall off multicast packets from separate EC2 domains. Another drawback that may be harder to overcome is the latency arising from the Internet links between OpenTick, Amazon EC2 and the trading workstations that ultimately receive the output of this platform.
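To make the Hadoop and QuantLib pairing in the list above a little more concrete, here is a minimal Python sketch of the map/reduce shape of a Monte Carlo pricer. It is not Hadoop or QuantLib code: the function names (`map_simulate`, `reduce_partials`, `price_call`) are made up for illustration, the model is a plain European call under geometric Brownian motion rather than anything exotic, and the "grid" is just Python's built-in `map` running in a single process. The closed-form Black-Scholes price is included only as a sanity check.

```python
import math
import random
from functools import reduce


def map_simulate(task):
    """Map step (hypothetical): simulate one batch and return partial sums."""
    seed, n_paths, spot, strike, rate, vol, maturity = task
    rng = random.Random(seed)  # separate, reproducible stream per batch
    drift = (rate - 0.5 * vol * vol) * maturity
    diffusion = vol * math.sqrt(maturity)
    payoff_sum = 0.0
    for _ in range(n_paths):
        # Terminal price under geometric Brownian motion.
        terminal = spot * math.exp(drift + diffusion * rng.gauss(0.0, 1.0))
        payoff_sum += max(terminal - strike, 0.0)
    return payoff_sum, n_paths


def reduce_partials(left, right):
    """Reduce step (hypothetical): combine two (payoff_sum, n_paths) pairs."""
    return left[0] + right[0], left[1] + right[1]


def price_call(spot, strike, rate, vol, maturity,
               n_batches=8, paths_per_batch=20000):
    """Price a European call by mapping batches and reducing the partial sums."""
    tasks = [(seed, paths_per_batch, spot, strike, rate, vol, maturity)
             for seed in range(n_batches)]
    # On a real grid each task would be shipped to a different node;
    # here the built-in map simply runs them one after another.
    partials = map(map_simulate, tasks)
    payoff_sum, n_paths = reduce(reduce_partials, partials)
    return math.exp(-rate * maturity) * payoff_sum / n_paths


def black_scholes_call(spot, strike, rate, vol, maturity):
    """Closed-form price, used only to sanity-check the simulation."""
    sqrt_t = math.sqrt(maturity)
    d1 = (math.log(spot / strike)
          + (rate + 0.5 * vol * vol) * maturity) / (vol * sqrt_t)
    d2 = d1 - vol * sqrt_t

    def cdf(x):
        return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

    return spot * cdf(d1) - strike * math.exp(-rate * maturity) * cdf(d2)


if __name__ == "__main__":
    print("Monte Carlo:  ", price_call(100.0, 100.0, 0.05, 0.2, 1.0))
    print("Black-Scholes:", black_scholes_call(100.0, 100.0, 0.05, 0.2, 1.0))
```

Each map task carries only a seed and the contract parameters and hands back two numbers, so the batches can run anywhere and the reduce step stays cheap. That shape is what makes this kind of workload look like a reasonable fit for a rented grid, at least on paper.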
Whether some of the open-source alternatives I have suggested are appropriate substitutes for their commercial brethren is highly debatable. Nevertheless the commoditization of software continues.