Profit And Loss Report Generation
India has about 150 million active investors and 750 million demat accounts, which covers around 6 percent of the total population, as against 50% in the US and 30% in the UK. As this number grows in India, the number of transactions and holdings is also expected to grow, with individual users contributing at least 500 to 2,000 holdings / realized transactions yearly. This data is ever-increasing in nature: once users have squared off a position, they can come back at any time to see what profit or loss they made in a given tenure.
Maintaining this data from inception and sharing profit and loss statements with users is not an easy task. Moreover, it is a compute-intensive operation, as the details are returned as a report file. Initial system load tests and data-size analysis validated that an asynchronous workflow suited the design of the system. The goal was to process user requests as fast as possible and reliably deliver reports to recipients, while offering strong observability and telemetry.
In this blog, we will share some of the challenges we came across and the learnings we gained in the process of implementing a highly scalable Profit and Loss Statement Report Generation system.
- Generate Profit and Loss statements for holdings and realized details from data scattered across multiple datasources.
- Scalability: Make sure the system can handle spikes in user requests for statements during tax-filing months while remaining cost-effective.
- Resiliency: Design the system to degrade gracefully and recover in case of system failures.
Architecture: Our use case required the system to deal with ever-growing data while still providing high availability. After multiple load tests and data analysis, we determined that asynchronous processing satisfies our current and future requirements. The flow of the system was segregated into Incoming Request, Processing, and Outgoing Response tasks.
As a result, the architecture can be broadly divided into 3 systems
1. Request: The request is accepted from the user and passed on to the queueing system. If the user requested to download the report, a unique identifier (valid across application instances) is returned as the initial successful response. If an email report was requested, an acceptance response is returned instead. The accepted request triggers an AWS Lambda function responsible for the processing tasks.
2. Processing: The AWS Lambda function takes care of the processing system, which involves creating a Profit and Loss Excel report from multiple source databases, in our case Elasticsearch and Microsoft SQL Server. A single connection per Lambda function was created to each underlying database to avoid undue resource creation. Report requests were processed in parallel threads to improve compute speed. After a report is generated, it is placed in Amazon S3 and a pre-signed URL for the file is generated.
3. Response: With the pre-signed URL, the report is either emailed to the user or the URL is placed in Redis, which the client can poll. The polling interval grows in arithmetic progression until a set time limit is reached.
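As a minimal sketch, the arithmetic-progression polling described in the Response stage might look like this on the client side. The interval and cap values are illustrative, not the production settings:

```python
def poll_schedule(start=1.0, step=1.0, max_total=30.0):
    """Yield wait times that grow in arithmetic progression
    (start, start+step, start+2*step, ...) until their running
    sum would exceed max_total seconds."""
    delay, elapsed = start, 0.0
    while elapsed + delay <= max_total:
        yield delay
        elapsed += delay
        delay += step

# Delays of 1, 2, 3, ... seconds, capped at 30 seconds in total.
delays = list(poll_schedule())
```

After each delay the client would check Redis for the pre-signed URL, stopping as soon as the key appears or the schedule is exhausted.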
Using async processing and message queuing ensures that delivery is guaranteed: the request is always preserved in the queue, so if processing fails, the request is re-processed. This makes the system resilient. Segregating the request and processing systems also makes it easy to implement API versioning.
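The redelivery behaviour above can be sketched with an in-memory stand-in for the real broker. The retry cap and handler are hypothetical; a production queue (e.g. SQS) would handle redelivery and dead-lettering itself:

```python
from collections import deque

def process_with_redelivery(queue, handler, max_attempts=3):
    """Consume requests from a queue, requeueing any request whose
    processing fails so it is retried later (at-least-once delivery)."""
    results = []
    while queue:
        attempts, request = queue.popleft()
        try:
            results.append(handler(request))
        except Exception:
            if attempts + 1 < max_attempts:
                queue.append((attempts + 1, request))  # preserve the request
            # else: the request would go to a dead-letter queue (omitted)
    return results

# A handler that fails once for request "B", then succeeds on retry.
seen = set()
def flaky_handler(req):
    if req == "B" and req not in seen:
        seen.add(req)
        raise RuntimeError("transient failure")
    return f"report:{req}"

q = deque((0, r) for r in ["A", "B", "C"])
reports = process_with_redelivery(q, flaky_handler)
# "B" is retried after the transient failure, so all three reports are produced.
```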
High Throughput: The asynchronous design pattern lets us achieve high throughput with fewer application instances. Moreover, we were able to control the utilization of our downstream systems by modulating the number of requests processed at a given time, thereby preventing a cascading effect on over-utilized systems. The Microsoft SQL database was a single-instance read replica belonging to an external component we were integrating with, so we had no direct control over its architecture or data model; we had to work around this bottleneck in the flow and find a sweet spot.
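One simple way to modulate downstream pressure is to bound the number of in-flight calls with a fixed-size worker pool. This is a sketch, not our production code; the cap of 2 workers and the simulated query are illustrative:

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

MAX_DOWNSTREAM_CALLS = 2  # illustrative cap on concurrent DB queries
in_flight = 0
peak = 0
lock = threading.Lock()

def query_downstream(request_id):
    """Simulated read against the single-instance read replica;
    tracks how many calls are running at once."""
    global in_flight, peak
    with lock:
        in_flight += 1
        peak = max(peak, in_flight)
    time.sleep(0.01)  # stand-in for the real query
    with lock:
        in_flight -= 1
    return request_id

# The pool size, not the number of pending requests, bounds downstream load.
with ThreadPoolExecutor(max_workers=MAX_DOWNSTREAM_CALLS) as pool:
    results = list(pool.map(query_downstream, range(10)))
```

However many requests queue up, the replica never sees more than `MAX_DOWNSTREAM_CALLS` concurrent queries, which is the "sweet spot" tuning knob.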
Persistence: File URLs stored in Redis and reports stored in Amazon S3 each had a corresponding TTL (Time To Live).
Cost Effective Development: We used AWS Lambda, an event-driven, serverless computing service that is billed only for the duration the function runs. Moreover, we could control the number of messages consumed and the number of concurrent Lambda functions. For a compute-intensive process, the downsides of using Lambda were the cold-start time (initial code spin-up) and the lack of direct control over CPU allocation.
As per the AWS documentation, the CPU allocated to a Lambda execution is proportional to the configured RAM. Report generation, however, was not memory-intensive, so RAM was under-utilized, which led to lower CPU allocation. We did, however, see an improvement in the processing stage when the configured RAM was increased, which in turn raised the number of CPU cores.
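In AWS SAM terms, the two knobs discussed above, memory (which also scales CPU) and concurrency, might look like the fragment below. The resource names, queue, and values are illustrative assumptions, not our production configuration:

```yaml
Resources:
  PnlReportFunction:                     # hypothetical function name
    Type: AWS::Serverless::Function
    Properties:
      Handler: report.handler
      Runtime: python3.12
      MemorySize: 3008                   # more memory => proportionally more CPU
      Timeout: 300
      ReservedConcurrentExecutions: 20   # caps concurrent Lambda instances
      Events:
        ReportRequests:
          Type: SQS                      # assuming an SQS-style trigger
          Properties:
            Queue: !GetAtt ReportRequestQueue.Arn
            BatchSize: 1                 # one report request per invocation
```

Raising `MemorySize` is the only lever for CPU on Lambda, which is why a memory-light but CPU-heavy workload like report generation can still benefit from a larger memory setting.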
- A High throughput system ready for request bursts.
- Cost Effective Infrastructure utilization
- Guaranteed response delivery due to the usage of a message queue
- Generic implementation for API versioning
- Constant compute ability
- Cold start latency
Where have we reached so far!!
In this post, we walked through a system design that uses asynchronous flows for a compute-heavy workflow: generating a report file from multiple underlying data sources. In the future, the same Lambda function could be used to generate other reports as well.
Most importantly, valuable feedback on the blog is always welcome; you can share it at firstname.lastname@example.org. Users with a demat account with Paytm Money can find the utility on the Paytm Money website and the Paytm Money application.
Cheers and Happy Learning.
Blog Contribution : Ankit Singh