Centralized Logging for AMRIT: Scaling Observability Across a Distributed Infrastructure

Ivor D'Souza
DevOps

Recently I had the opportunity to contribute to the AMRIT platform through the C4GT Community initiative, working on a mission-critical challenge: centralizing monitoring for its distributed services. With servers deployed nationwide, AMRIT faced significant observability hurdles, and designing a solution to address them was both technically rewarding and personally enriching. This open-source engagement pushed me to think holistically about system design, reliability, and the human side of infrastructure.

Getting Started: From GitHub Issue to Kickoff Call

I came across the opportunity through the C4GT community and applied via a GitHub issue that outlined the project goals. Having recently completed a project involving Elasticsearch, I felt confident that my skills aligned well with the challenge. I was thrilled when, a couple of weeks later, I received an email saying I had been selected.

What truly drew me to this project wasn’t just the technical stack; it was the mission behind it. AMRIT, supported by the team at PSMRI, is driven by a powerful vision: that digital health solutions should be accessible to everyone. As someone who believes in the power of open-source technology to bridge systemic gaps, I found that this principle resonated deeply with me. The idea that my contributions could help make public health systems more transparent and responsive, especially for underserved communities, made the work feel meaningful from day one.

We kicked things off with an introductory call with the team at PSMRI, where I was introduced to the system’s architecture and the project’s core objectives. It became clear that this wasn’t just about setting up a logging pipeline — it was about designing a solution that could scale, adapt, and ultimately support better healthcare outcomes through improved visibility and observability.

Diving In: Understanding the Landscape

Once onboarded, I began by familiarizing myself with AMRIT’s ecosystem, which consists of over 20 interdependent services, each playing a critical role in delivering healthcare functionality. It was a lot to absorb, but the complexity also made it exciting.

To get hands-on, I set up a reproducible local environment using Docker. This helped me simulate the distributed system in a controlled setup, allowing for rapid experimentation without impacting production.

Facing Complexity, One Step at a Time

As I delved deeper into the project, I found that understanding how the monitoring system would integrate with the existing API services wasn’t straightforward. Rather than trying to solve everything at once, I focused on building the system incrementally, making sure each part worked well before moving on to the next. It felt a lot like piecing together a giant puzzle, one that demanded both patience and curiosity.

There were plenty of challenges along the way, but what consistently helped was taking the time to build a solid mental model of the system. That clarity gave me the confidence to dig deeper, investigate issues thoroughly, and address problems at their root — not just apply temporary fixes.

The Outcome: A Scalable Solution with Real Impact

As the project came to a close, the results were clear. The centralized logging solution we implemented greatly improved how the team at AMRIT could track system performance and troubleshoot issues across their services. What once took a lot of manual effort to set up and maintain was now streamlined and automated, saving time and reducing the potential for errors.

Once all the services were integrated with the logging solution, we were able to find a call spanning multiple services that took too long to complete, something that would have gone unnoticed had we looked at each service in isolation.
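To make that idea concrete, here is a minimal sketch of why a slow cross-service call only shows up once traces are centralized. It is not the project's actual tooling: the span records, service names, field names, and the 3000 ms threshold are all hypothetical. Each span on its own stays under the threshold, but grouping spans by a shared trace ID exposes the slow end-to-end call.

```python
from collections import defaultdict

# Hypothetical span records; the field names and values are illustrative only.
SPANS = [
    {"trace_id": "t1", "service": "api-a", "start_ms": 0, "end_ms": 120},
    {"trace_id": "t1", "service": "api-b", "start_ms": 20, "end_ms": 100},
    {"trace_id": "t2", "service": "api-a", "start_ms": 0, "end_ms": 300},
    {"trace_id": "t2", "service": "api-b", "start_ms": 300, "end_ms": 2500},
    {"trace_id": "t2", "service": "api-c", "start_ms": 2500, "end_ms": 4100},
]


def trace_durations(spans):
    """Return {trace_id: end-to-end duration in ms} across all services."""
    grouped = defaultdict(list)
    for span in spans:
        grouped[span["trace_id"]].append(span)

    durations = {}
    for trace_id, items in grouped.items():
        first_start = min(s["start_ms"] for s in items)
        last_end = max(s["end_ms"] for s in items)
        durations[trace_id] = last_end - first_start
    return durations


def slow_traces(spans, threshold_ms):
    """Return the sorted IDs of traces whose total duration exceeds the threshold."""
    return sorted(
        trace_id
        for trace_id, duration in trace_durations(spans).items()
        if duration > threshold_ms
    )


if __name__ == "__main__":
    print(slow_traces(SPANS, threshold_ms=3000))
```

In this made-up data, no single span in trace `t2` lasts longer than 3000 ms, so a per-service view would not flag it; only the combined view does.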

Looking back, I am happy with the outcome of this project. It reinforced for me the importance of thoughtful, purpose-driven work. It wasn’t just about solving a technical problem; it was about contributing to a larger mission that could positively impact people’s lives.

Key Technical Details:

ECS Logging Configuration: Implemented standardized logging across all API services using Elastic Common Schema (ECS) for consistent log formatting.
Log Collection Infrastructure: Deployed Filebeat and APM agents to collect logs and traces from distributed servers, enabling comprehensive system monitoring.
Source Identification: Logs and traces were tagged with an environment property so that their source could be identified precisely across multiple server locations.
Intelligent Log Management: Implemented an effective log sharding strategy to optimize index management and improve query efficiency, with a log retention policy for systematic data pruning and storage management.
Secure Access Control: A role-based access control policy governs which logs and traces each development team can see.
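The first four points above can be illustrated with a small, hedged sketch. It does not reproduce AMRIT's actual configuration: the formatter class, the `amrit-logs` index prefix, the `site-a` environment name, the daily index layout, and the 30-day retention window are all assumptions made for illustration. The field names (`@timestamp`, `log.level`, `log.logger`, `message`, `service.name`, `service.environment`) follow ECS, and mapping the "environment property" onto `service.environment` is likewise an assumption. In a real deployment, retention would more likely be delegated to Elasticsearch's index lifecycle management than computed by hand.

```python
import json
import logging
from datetime import date, datetime, timedelta, timezone


class EcsLikeFormatter(logging.Formatter):
    """Render log records as JSON using a handful of ECS field names."""

    def __init__(self, service_name, environment):
        super().__init__()
        self.service_name = service_name
        self.environment = environment  # identifies the originating site (assumed mapping)

    def format(self, record):
        doc = {
            "@timestamp": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),
            "log.level": record.levelname.lower(),
            "log.logger": record.name,
            "message": record.getMessage(),
            "service.name": self.service_name,
            "service.environment": self.environment,
        }
        return json.dumps(doc)


def daily_index_name(prefix, environment, day):
    """Build a time-based index name (hypothetical naming scheme)."""
    return f"{prefix}-{environment}-{day:%Y.%m.%d}"


def expired_indices(index_names, today, retention_days):
    """Return the indices whose date suffix is older than the retention window."""
    cutoff = today - timedelta(days=retention_days)
    expired = []
    for name in index_names:
        # The last 10 characters hold the YYYY.MM.DD suffix added above.
        index_day = datetime.strptime(name[-10:], "%Y.%m.%d").date()
        if index_day < cutoff:
            expired.append(name)
    return expired


if __name__ == "__main__":
    handler = logging.StreamHandler()
    handler.setFormatter(EcsLikeFormatter("demo-api", "site-a"))
    logger = logging.getLogger("demo")
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.info("service started")
    print(daily_index_name("amrit-logs", "site-a", date.today()))
```

Keeping one index per environment per day means a retention policy can drop whole indices rather than delete individual documents, and queries scoped to a date range or a site touch fewer indices.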

About the Contributor

Ivor D’Souza is a master's student in Software Engineering at the University of Limerick. His passion for DevOps, data engineering, and open-source technologies led him to contribute to AMRIT. With experience at Juspay and expertise in developing monitoring and logging solutions, Ivor is dedicated to improving healthcare accessibility.

Conclusion

The deployment of the ELK stack for monitoring on the AMRIT platform marks a pivotal step in enhancing operational visibility and reliability. By automating log retention, centralizing data collection, and simplifying access to critical insights, we have significantly reduced manual overhead and improved system scalability. This setup not only ensures seamless monitoring across distributed servers but also empowers our teams to proactively address issues, contributing to the platform’s stability and performance. It demonstrates how thoughtful integration of technology can drive meaningful impact, enabling better service delivery at scale.