{"id":3397,"date":"2022-04-27T11:47:28","date_gmt":"2022-04-27T11:47:28","guid":{"rendered":"https:\/\/www.paytmmoney.com\/blog\/?p=3397"},"modified":"2022-04-28T06:38:26","modified_gmt":"2022-04-28T06:38:26","slug":"magic-of-portfolio-charts","status":"publish","type":"post","link":"https:\/\/www.paytmmoney.com\/blog\/magic-of-portfolio-charts\/","title":{"rendered":"Magic of Portfolio Charts"},"content":{"rendered":"<p><span style=\"font-weight: 400;\">An equity market is a market in which shares of companies are issued and traded. It is basically a meeting point for buyers and sellers of stocks. When an investor buys stocks that ultimately get credited in his \/ her Demat account, it forms part of their holdings in\u00a0 his \/ her portfolio. An investor typically needs\u00a0 a system by which, over a period of time, his \/ her portfolio performance could be tracked and also to get some insights. To support this, we have developed the <\/span><b>Portfolio Charts<\/b><span style=\"font-weight: 400;\"> system that would give a graphical representation of an investor&#8217;s total purchase value and current market value.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">In this blog post, we will give an overview of the <\/span><b>Portfolio Charts<\/b><span style=\"font-weight: 400;\"> system at Paytm Money and share some of the learnings we gained along the way.<\/span><\/p>\n<p><img decoding=\"async\" class=\"aligncenter size-full wp-image-3398\" src=\"https:\/\/paytmmoney.wpengine.com\/wp-content\/uploads\/2022\/04\/pc_image.png\" alt=\"Magic of Portfolio Charts \" width=\"756\" height=\"1174\" srcset=\"https:\/\/www.paytmmoney.com\/blog\/wp-content\/uploads\/2022\/04\/pc_image.png 756w, https:\/\/www.paytmmoney.com\/blog\/wp-content\/uploads\/2022\/04\/pc_image-193x300.png 193w, https:\/\/www.paytmmoney.com\/blog\/wp-content\/uploads\/2022\/04\/pc_image-659x1024.png 659w\" sizes=\"(max-width: 756px) 100vw, 756px\" \/><\/p>\n<p><b>Background Knowledge<\/b><\/p>\n<p><span style=\"font-weight: 400;\">First, let\u2019s talk about how the holdings information gets generated based on the stocks bought by an investor. Let\u2019s suppose below are the user\u2019s transactions.<\/span><\/p>\n<table>\n<tbody>\n<tr>\n<td><b>Date<\/b><\/td>\n<td><b>Stock<\/b><\/td>\n<td><b>Type<\/b><\/td>\n<td><b>Quantity<\/b><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400;\">03 Jan 2022<\/span><\/td>\n<td><span style=\"font-weight: 400;\">S1<\/span><\/td>\n<td><span style=\"font-weight: 400;\">Buy<\/span><\/td>\n<td><span style=\"font-weight: 400;\">10<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400;\">03 Jan 2022<\/span><\/td>\n<td><span style=\"font-weight: 400;\">S2<\/span><\/td>\n<td><span style=\"font-weight: 400;\">Buy<\/span><\/td>\n<td><span style=\"font-weight: 400;\">5<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400;\">04 Jan 2022<\/span><\/td>\n<td><span style=\"font-weight: 400;\">S1<\/span><\/td>\n<td><span style=\"font-weight: 400;\">Sell<\/span><\/td>\n<td><span style=\"font-weight: 400;\">5<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400;\">05 Jan 2022<\/span><\/td>\n<td><span style=\"font-weight: 400;\">S3<\/span><\/td>\n<td><span style=\"font-weight: 400;\">Buy<\/span><\/td>\n<td><span style=\"font-weight: 400;\">10<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400;\">05 Jan 2022<\/span><\/td>\n<td><span style=\"font-weight: 400;\">S2<\/span><\/td>\n<td><span style=\"font-weight: 400;\">Buy<\/span><\/td>\n<td><span style=\"font-weight: 400;\">5<\/span><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p><span style=\"font-weight: 400;\">So at the end of 05 Jan 2022 and onwards, we will have four records as the buy transactions get accumulated, if not sold.<\/span><\/p>\n<table>\n<tbody>\n<tr>\n<td><b>Holding Date<\/b><\/td>\n<td><b>Stock<\/b><\/td>\n<td><b>Type<\/b><\/td>\n<td><b>Quantity<\/b><\/td>\n<td><b>Buy Date<\/b><\/td>\n<\/tr>\n<tr>\n<td rowspan=\"2\"><span style=\"font-weight: 400;\">03 Jan 2022<\/span><\/td>\n<td><span style=\"font-weight: 400;\">S1<\/span><\/td>\n<td><span style=\"font-weight: 400;\">Buy<\/span><\/td>\n<td><span style=\"font-weight: 400;\">10<\/span><\/td>\n<td><span style=\"font-weight: 400;\">03 Jan 2022<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400;\">S2<\/span><\/td>\n<td><span style=\"font-weight: 400;\">Buy\u00a0<\/span><\/td>\n<td><span style=\"font-weight: 400;\">5<\/span><\/td>\n<td><span style=\"font-weight: 400;\">03 Jan 2022<\/span><\/td>\n<\/tr>\n<tr>\n<td rowspan=\"2\"><span style=\"font-weight: 400;\">04 Jan 2022<\/span><\/td>\n<td><span style=\"font-weight: 400;\">S1<\/span><\/td>\n<td><span style=\"font-weight: 400;\">Buy<\/span><\/td>\n<td><span style=\"font-weight: 400;\">5<\/span><\/td>\n<td><span style=\"font-weight: 400;\">03 Jan 2022<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400;\">S2<\/span><\/td>\n<td><span style=\"font-weight: 400;\">Buy<\/span><\/td>\n<td><span style=\"font-weight: 400;\">5<\/span><\/td>\n<td><span style=\"font-weight: 400;\">03 Jan 2022<\/span><\/td>\n<\/tr>\n<tr>\n<td rowspan=\"4\"><span style=\"font-weight: 400;\">05 Jan 2022<\/span><\/td>\n<td><span style=\"font-weight: 400;\">S1<\/span><\/td>\n<td><span style=\"font-weight: 400;\">Buy<\/span><\/td>\n<td><span style=\"font-weight: 400;\">5<\/span><\/td>\n<td><span style=\"font-weight: 400;\">03 Jan 2022<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400;\">S2<\/span><\/td>\n<td><span style=\"font-weight: 400;\">Buy<\/span><\/td>\n<td><span style=\"font-weight: 400;\">5<\/span><\/td>\n<td><span style=\"font-weight: 400;\">03 Jan 2022<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400;\">S2<\/span><\/td>\n<td><span style=\"font-weight: 400;\">Buy<\/span><\/td>\n<td><span style=\"font-weight: 400;\">5<\/span><\/td>\n<td><span style=\"font-weight: 400;\">05 Jan 2022<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400;\">S3<\/span><\/td>\n<td><span style=\"font-weight: 400;\">Buy<\/span><\/td>\n<td><span style=\"font-weight: 400;\">10<\/span><\/td>\n<td><span style=\"font-weight: 400;\">05 Jan 2022<\/span><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p><span style=\"font-weight: 400;\">The idea of the above table is to let you imagine how the records will grow over a period of time.<\/span><\/p>\n<p><b>Problem Statement<\/b><\/p>\n<p><span style=\"font-weight: 400;\">In Paytm Money, we have a third party legacy system which generates the users\u2019 holdings based on transactions on a daily basis and uses an MS SQL Server at the persistence layer. Database tables which contain the holdings have more than 25 columns and billions of rows as holdings get accumulated. They store only the last N(15) days\u2019 holdings because the latest date\u2019s record is sufficient to know a user\u2019s holdings and the last N days\u2019 holdings are good enough to correct the system in case any inconsistencies are found.\u00a0 This would also reduce the amount of storage used. Apart from this, the third party system suffered from the following problems &#8211;\u00a0<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Not robust enough\u00a0<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Not scalable and\u00a0<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Could not handle high traffic volumes.\u00a0<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">Improving the legacy system was not an option due to multiple constraints. Owing to those we decided to build our own scalable system by pulling out all the holdings on a daily basis from the MS SQL Server.<\/span><\/p>\n<p><b>Challenges<\/b><\/p>\n<p><span style=\"font-weight: 400;\">\u00a0We had four major challenges<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Processing billions of holding records in a reasonable time and build the charts\u2019 data points.<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Choosing the destination persistence layer and minimizing the storage utilization.<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Serving the data with low latency.<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Reprocessing of a chart\u2019s data points when incorrect holdings get pushed into the source database.<\/span><\/li>\n<\/ul>\n<p><b>Optimizing DB record process<\/b><\/p>\n<p><span style=\"font-weight: 400;\">Let&#8217;s talk about the data source which is MS SQL Server in our case. Fetching the millions of records was really a big challenge and we did multiple iterations to find out the sweet spot and that is, to fetch the rows in batches through a range query of sequential primary keys. We have followed the below steps to achieve this:<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Get the max and min primary key id.<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Hit the query with the primary key range condition in where clause in batches.<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">Why does this logic work?<\/span><\/p>\n<p><span style=\"font-weight: 400;\">SQL Server uses B-tree data structure to store clustered primary keys which makes the retrieval faster. It has three levels.\u00a0\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">\u00a0\u00a0<\/span><b>\u00a0\u00a0Root level: <\/b><span style=\"font-weight: 400;\">The root level is the starting point of the query<\/span><\/p>\n<p><span style=\"font-weight: 400;\">\u00a0\u00a0\u00a0<\/span><b>\u00a0Intermediate level:<\/b><span style=\"font-weight: 400;\"> This level provides a connection between root and leaf levels.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">\u00a0\u00a0\u00a0\u00a0<\/span><b>Leaf Level:<\/b><span style=\"font-weight: 400;\"> This level is the lowest level of the clustered index and all records are stored at \u00a0 this level<\/span><\/p>\n<p><span style=\"font-weight: 400;\">For example, when we want to query suppose 130 &lt;= id &lt;400, Query will begin its traversal from root level to intermediate level then reaches the leaf level. As our ids are in sequence there is no need of going back to the root and starting the search again. It will first read the data from the current leaf and go to the next leaf and so on until it meets the condition.<\/span><\/p>\n<p><img decoding=\"async\" class=\"aligncenter size-full wp-image-3399\" src=\"https:\/\/paytmmoney.wpengine.com\/wp-content\/uploads\/2022\/04\/Blog-Btree.drawio.png\" alt=\"Optimizing DB record process\" width=\"691\" height=\"301\" srcset=\"https:\/\/www.paytmmoney.com\/blog\/wp-content\/uploads\/2022\/04\/Blog-Btree.drawio.png 691w, https:\/\/www.paytmmoney.com\/blog\/wp-content\/uploads\/2022\/04\/Blog-Btree.drawio-300x131.png 300w\" sizes=\"(max-width: 691px) 100vw, 691px\" \/><\/p>\n<p><b>Choosing the persistence layer<\/b><\/p>\n<p><span style=\"font-weight: 400;\">Our second task was to choose the right destination storage entity which can scale well, provide quick searching capabilities and be able to store huge volumes of data quickly and in near real-time. We explored multiple options like MongoDB, Elasticsearch, Cassandra etc. and found that Elasticsearch suits our requirements very well. It&#8217;s a distributed NoSQL database system and analytics engine built on Apache Lucene. We\u2019ll talk a bit more about why we chose Elasticsearch.<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">It provides extensive REST APIs for storing and searching the data<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">It is a Document-based NoSQL database and the documents have a key and value pair, so we can keep all the holdings of users in one document and can search on some of the keys.<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">It&#8217;s horizontally scalable, meaning data can be distributed amongst multiple nodes.<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">It also has the concept of sharding and routing. Sharding is basically a mechanism to divide the data volume into smaller chunks and each shard itself is a lucene search engine. Shards will be distributed among the nodes uniformly.\u00a0 Routing is a way to tell Elasticsearch where to keep \/ search for the data.\u00a0<\/span><\/li>\n<\/ul>\n<p><b>Storage Optimization<\/b><\/p>\n<p><span style=\"font-weight: 400;\">The next task was to optimize the\u00a0 Elasticsearch storage. We did the following to achieve an optimized storage:<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Stored all nested details in the form of an array rather than storing them in a list of objects.<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Reviewed whether the variables fit into smaller data types.<\/span><\/li>\n<\/ul>\n<p><b>API Optimization<\/b><\/p>\n<p><span style=\"font-weight: 400;\">The\u00a0 next and most important challenge was the API\u00a0 performance optimization. We did the following to achieve it.<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Indexed only those fields which are needed to perform the search operation. This helped to\u00a0 reduce the memory pressure on Elasticsearch<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">As there is no date data type available and the date value typically gets converted into epoch long value before getting stored in\u00a0 Elasticsearch, we decided to use epoch long values for dates rather than using formatted dates. This saved a lot of operations while performing search operations.\u00a0<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Excluded unnecessary fields from the search response.<\/span><\/li>\n<\/ul>\n<p><b>Reprocessing<\/b><span style=\"font-weight: 400;\">\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Sometimes, due to wrong trade data or incorrect corporate action submission, inconsistent holdings get pushed to the third party legacy system and the same gets reflected in the Elasticsearch index as well. These inconsistencies typically get resolved on\u00a0 the third party system with some manual efforts. Now the challenge is to\u00a0 make the Elasticsearch index consistent.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">One easy but inefficient solution is to reprocess those records again when required. But we didn\u2019t choose this approach since it requires extra manual effort on\u00a0 both sides . So, we built\u00a0 a system that\u00a0 pulls <\/span><b>ALL<\/b><span style=\"font-weight: 400;\"> last N days\u2019 holdings on a daily basis which are available in the legacy system. That way, we do not have to worry about the corrections happening on the legacy system side.<\/span><\/p>\n<p><b>Conclusion<\/b><span style=\"font-weight: 400;\">\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">As some will read this blog to learn something new, our idea is to showcase how one can put the right technology in the right place to improve the scalability of a system that integrates with an existing third party legacy system that can\u2019t be improvised.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Team Members: <a href=\"https:\/\/www.linkedin.com\/in\/ganesh-kumar-a0a66834\/\" target=\"_blank\" rel=\"noopener\" data-saferedirecturl=\"https:\/\/www.google.com\/url?q=https:\/\/www.linkedin.com\/in\/ganesh-kumar-a0a66834\/&amp;source=gmail&amp;ust=1651046471947000&amp;usg=AOvVaw3Cwe-O5O0SNpcYOAUgTBEV\">Ganesh Kumar<\/a>, <a href=\"https:\/\/www.linkedin.com\/in\/rajkumarsaroj\/\" target=\"_blank\" rel=\"noopener\" data-saferedirecturl=\"https:\/\/www.google.com\/url?q=https:\/\/www.linkedin.com\/in\/rajkumarsaroj\/&amp;source=gmail&amp;ust=1651046471947000&amp;usg=AOvVaw1H9vzrXzhld-qvDowvCAXO\">Raj Kumar Saroj<\/a>, <a href=\"https:\/\/www.linkedin.com\/in\/koushik-savudam-26545b197\/\" target=\"_blank\" rel=\"noopener\" data-saferedirecturl=\"https:\/\/www.google.com\/url?q=https:\/\/www.linkedin.com\/in\/koushik-savudam-26545b197\/&amp;source=gmail&amp;ust=1651046471947000&amp;usg=AOvVaw2pcCxbqHqm0mwegvSQ1k5G\">Koushik Savudam<\/a><\/span><\/p>\n<p><span style=\"font-weight: 400;\">Blog contribution: <a href=\"https:\/\/www.linkedin.com\/in\/rajkumarsaroj\/\" target=\"_blank\" rel=\"noopener\" data-saferedirecturl=\"https:\/\/www.google.com\/url?q=https:\/\/www.linkedin.com\/in\/rajkumarsaroj\/&amp;source=gmail&amp;ust=1651046471947000&amp;usg=AOvVaw1H9vzrXzhld-qvDowvCAXO\">Raj Kumar Saroj<\/a><\/span><\/p>\n","protected":false},"excerpt":{"rendered":"<p>An equity market is a market in which shares of companies are issued and traded. It is basically a meeting point for buyers and sellers of stocks. When an investor buys stocks that ultimately get credited in his \/ her Demat account, it forms part of their holdings in\u00a0 his \/ her portfolio. An investor<a href=\"https:\/\/www.paytmmoney.com\/blog\/magic-of-portfolio-charts\/\">Continue reading <span class=\"sr-only\">&#8220;Magic of Portfolio Charts&#8221;<\/span><\/a><\/p>\n","protected":false},"author":31,"featured_media":3400,"comment_status":"open","ping_status":"open","sticky":false,"template":"single-classic-ns.php","format":"standard","meta":{"footnotes":""},"categories":[21],"tags":[],"class_list":["post-3397","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-technology"],"_links":{"self":[{"href":"https:\/\/www.paytmmoney.com\/blog\/wp-json\/wp\/v2\/posts\/3397","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.paytmmoney.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.paytmmoney.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.paytmmoney.com\/blog\/wp-json\/wp\/v2\/users\/31"}],"replies":[{"embeddable":true,"href":"https:\/\/www.paytmmoney.com\/blog\/wp-json\/wp\/v2\/comments?post=3397"}],"version-history":[{"count":0,"href":"https:\/\/www.paytmmoney.com\/blog\/wp-json\/wp\/v2\/posts\/3397\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.paytmmoney.com\/blog\/wp-json\/wp\/v2\/media\/3400"}],"wp:attachment":[{"href":"https:\/\/www.paytmmoney.com\/blog\/wp-json\/wp\/v2\/media?parent=3397"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.paytmmoney.com\/blog\/wp-json\/wp\/v2\/categories?post=3397"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.paytmmoney.com\/blog\/wp-json\/wp\/v2\/tags?post=3397"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}