Developing Software with Market Data – Part 1

Earlier this month, I was invited to take part in an international webinar panel discussion on Developing Software with Market Data, organised by FISD. Sinara have worked in the market data space for over 30 years, and built large-scale systems for clients around the world, so we are always pleased to be invited to share our thoughts and experiences with industry colleagues.

In this article, I’ll summarise some of the key topics we talked about in the webinar, from Sinara’s perspective. We’ll cover the critical elements you need to consider when building software with market data (whether as producer or consumer) and the skillset your development team or supplier needs to have. In Part 2, we’ll look at the issues to think about when replacing a legacy system and touch on the shift to cloud and how that affects your choices.

Critical elements in market data systems

First of all, when looking to consume a market data feed in your application, you need to make sure that it has the content you need. Are all the specific fields you need available? Might a critical field in fact be missing or unavailable (for commercial reasons, for example)? A consolidated feed may not include certain exchange-specific fields; an alternative data source may not have the coverage or granularity your application needs.

You also need to look at the format in which the data is delivered (e.g. FIX, XML or CSV) and determine how much effort it will take to extract usable information for your application. Unstructured data (as found in many alternative data sources) can prove especially challenging, and this needs to be taken into account when planning the project.
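To give a feel for the extraction effort involved, here is a minimal sketch of pulling a quote out of a FIX-style tag=value message. The tag numbers used (35 MsgType, 55 Symbol, 132 BidPx, 133 OfferPx) are standard FIX tags, but the message itself is abbreviated and made up for illustration, and the helper name parse_fix is our own invention rather than part of any library.

```python
SOH = "\x01"  # standard FIX field delimiter


def parse_fix(raw: str) -> dict:
    """Split a FIX tag=value message into a {tag: value} dict."""
    fields = {}
    for pair in raw.strip(SOH).split(SOH):
        tag, _, value = pair.partition("=")
        fields[int(tag)] = value
    return fields


# Illustrative, abbreviated quote message (header and trailer fields omitted).
raw = SOH.join(["35=S", "55=VOD.L", "132=72.14", "133=72.16"]) + SOH

msg = parse_fix(raw)
quote = {"symbol": msg[55], "bid": float(msg[132]), "ask": float(msg[133])}
print(quote)
```

A flat dictionary like this would not cope with repeating groups, where the same tag appears more than once in a message, which is one reason a real feed handler usually takes more work than a first look at the format suggests.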

Latency, like every other aspect covered here, is a huge topic in itself. Critically, you need to consider whether the latency offered by the feed in question is suitable for your application; clearly the needs of a real-time trading algorithm are going to be very different to those of an end-of-day compliance system. There is also little point in picking the feed with the lowest latency if your downstream systems are much slower; your system is only as fast as its slowest link. What do you really need to do the job?

On a related note, understand what volumes the feed will deliver and what resources your application will need to handle them. Can it cope with the volumes without being overwhelmed? Can your downstream systems? If not, consider what kind of throttling or mitigation mechanism you will need to introduce so that the larger volumes can be reduced to something more manageable. You don’t want to end up with the classic problem of a fast producer and slow consumer, with queues of messages building up or being lost. Indeed, dealing with missed messages (which may or may not be critical, depending on the feed and application) is a related issue to consider.
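One common mitigation for the fast producer and slow consumer problem is conflation: keeping only the latest update per instrument until the consumer is ready for it. The sketch below is a single-threaded illustration under that assumption; the class name ConflatingBuffer and the symbols are hypothetical, and a production version would need thread safety and a decision about which message types may safely be conflated.

```python
from collections import OrderedDict


class ConflatingBuffer:
    """Keeps only the most recent update per symbol until it is drained."""

    def __init__(self):
        self._latest = OrderedDict()
        self.conflated = 0  # updates overwritten before being consumed

    def publish(self, symbol, update):
        if symbol in self._latest:
            self.conflated += 1
        self._latest[symbol] = update

    def drain(self):
        """Hand the pending updates to the consumer and start afresh."""
        batch, self._latest = self._latest, OrderedDict()
        return list(batch.items())


buf = ConflatingBuffer()
for price in (100.0, 100.5, 101.0):  # three rapid ticks for one symbol
    buf.publish("ABC", price)
buf.publish("XYZ", 55.2)

print(buf.drain())      # one entry per symbol, latest value only
print(buf.conflated)    # how many intermediate ticks were dropped
```

This suits consumers that only care about the current price. It is not appropriate where every tick matters, such as a trade-by-trade record kept for compliance.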

As touched on already, it is likely the information extracted from the feed will be integrated into other business systems or even distributed to large numbers of downstream applications (internal or external). A key consideration is to know what data formats those systems in turn expect and how much work will be needed to perform that conversion. Finally, with market data of any kind, there will inevitably be issues around entitlements and how they are managed. You want to avoid the problem of data being passed between your systems without control and with no way of tracking where a piece of data originated and what it can be used for.
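As a rough illustration of the tracking idea, each value can carry its origin and permitted uses with it, so that any system passing it on can check before releasing it. Everything in this sketch is an assumption made for illustration: the names TaggedValue and release, the source label "EXCH_A" and the usage categories do not come from any particular entitlement system or licence.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaggedValue:
    value: float
    source: str                # where the data originated
    permitted_uses: frozenset  # hypothetical usage categories, e.g. {"display"}


def release(item: TaggedValue, intended_use: str) -> float:
    """Return the value only if the intended use is permitted."""
    if intended_use not in item.permitted_uses:
        raise PermissionError(
            f"{item.source} data is not permitted for {intended_use!r}"
        )
    return item.value


px = TaggedValue(101.25, source="EXCH_A", permitted_uses=frozenset({"display"}))
print(release(px, "display"))
```

Real entitlement schemes are considerably richer (per user, per application, per exchange), but the principle of keeping provenance attached to the data is the same.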

All these considerations apply equally when you are building a system that will actually produce market data. What will your consumers expect? What applications is it designed for? What kind of latency, format and volume can you offer?

The skillsets needed

The development team building your market data system, whether as a consumer or producer, needs to have a fundamental grasp of all the topics above. A good understanding of the associated data model and reference data is also important, both in terms of the technical aspects of the feed and relevant domain knowledge. This makes it much easier to have conversations with business users about what the system needs to do, and ensure that requirements aren’t ‘lost in translation’ between business and IT. At Sinara, we regularly have our development teams talk with our end-users, making sure we have good lines of communication and developers have a good understanding of the business processes they are implementing.

A vital skill in building many large-scale market data systems is being able to design a database that stores the data in such a way that it can be retrieved efficiently later on, whether by a software application or by a database user running a query tool. Can the design be successfully extended in future to incorporate new types of data as the model evolves? Can new fields or data types be added easily? What are the performance characteristics? Of course, the last thing you want to do is tie your design too closely to the specific market data feed you are consuming. What happens if it must be replaced by another feed in future? It is important to create a separation in design, so that if the format changes, changes to the rest of your application can be minimised.
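One way to achieve that separation is to put a thin adapter between each feed and a normalised internal model, so the rest of the application never sees the vendor format. The sketch below assumes two hypothetical vendors, one delivering CSV lines and one delivering JSON lines; the class names, field names and sample data are all invented for illustration.

```python
import csv
import json
from dataclasses import dataclass
from typing import Iterable, Iterator, Protocol


@dataclass(frozen=True)
class Quote:
    """Normalised internal model, independent of any feed format."""
    symbol: str
    bid: float
    ask: float


class FeedAdapter(Protocol):
    def quotes(self, raw: Iterable[str]) -> Iterator[Quote]: ...


class CsvVendorAdapter:
    """Hypothetical vendor A: lines of 'symbol,bid,ask'."""

    def quotes(self, raw):
        for row in csv.reader(raw):
            yield Quote(row[0], float(row[1]), float(row[2]))


class JsonVendorAdapter:
    """Hypothetical vendor B: one JSON object per line."""

    def quotes(self, raw):
        for line in raw:
            obj = json.loads(line)
            yield Quote(obj["sym"], float(obj["b"]), float(obj["a"]))


def widest_spread(adapter: FeedAdapter, raw: Iterable[str]) -> Quote:
    """Application logic: depends only on Quote, never on the feed format."""
    return max(adapter.quotes(raw), key=lambda q: q.ask - q.bid)


csv_lines = ["ABC,10.0,10.2", "XYZ,5.0,5.5"]
json_lines = ['{"sym": "ABC", "b": 10.0, "a": 10.2}',
              '{"sym": "XYZ", "b": 5.0, "a": 5.5}']

print(widest_spread(CsvVendorAdapter(), csv_lines))
print(widest_spread(JsonVendorAdapter(), json_lines))
```

Replacing the feed then means writing one new adapter; widest_spread and anything else built on Quote, including the database schema, stays as it is.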

It is also invaluable to have experience building high-performance, real-time systems, in which every millisecond (or indeed microsecond) counts. Building this kind of application is very different from building many other types of business system, where these considerations do not apply. Some market data distribution components have to process large amounts of data very quickly and push it out via different channels. A good example of this is Sinara’s own MDP, which has served as the foundation for many of the market data systems we have delivered.

Finally, as with any professional software build, good software engineering principles apply: good design in terms of the components/libraries; separation of concerns; abstraction; testing; good project management, and so on. We champion these at Sinara and they are a key part of everything we do.

Continued in Part 2.