Discover how R is revolutionising data science with scalable solutions and cloud-native platforms. Xinye explores key takeaways from EARL 2024 and the innovations driving data-driven success across industries.
The Importance of Programming Skills in Modern Data Science.
A few weeks ago, I had the privilege of speaking at the EARL Conference 2024 in Brighton, where I observed several key themes shaping the future of data science. Programming has evolved from a specialised skill into a baseline requirement for organisations seeking to unlock the full potential of their data assets. Tools like R and Python, with their low barriers to entry and vibrant community support, are democratising data science and empowering sectors as varied as local authorities, national museums, journalism, and energy infrastructure. The value derived from data is not just about technology; it’s about creating real business benefits and, in many cases, improving lives.

As I took the stage as the headline speaker on day one of the conference, I reflected on the importance of communication in driving data’s impact. The ability to tell the right data story has never been more critical. Today, data scientists have access to an unprecedented range of tools to craft compelling stories, design strategies to tackle business challenges, and create engaging, publication-ready content. Another standout theme was the community’s growing embrace of the cloud, which allows organisations to scale their operations and adapt to modern workflows.

During my talk, I shared key examples to demonstrate how Ascent is leveraging R to deliver enterprise-scale, end-to-end solutions across industries. From data engineering and platform engineering to software development and design thinking, each example reflects the growing role of data science in solving real-world problems and delivering measurable outcomes. Below are some of the main insights from that session.
Data Engineering: Optimising for Performance.
Let’s start with data engineering, where R’s ability to streamline complex data transformations is invaluable. In one of our projects with a clinical research organisation (CRO), we were tasked with optimising a Shiny application for clinical trial data. The challenge? SAS datasets that were too large to load and query efficiently inside a Shiny app. We built a pre-processing pipeline based on the medallion architecture, splitting the data into bronze, silver, and gold layers to minimise what the app had to handle.
Bronze Layer: Raw data stored in its original SAS format, providing a fail-safe version of the dataset.
Silver Layer: Transformed datasets using dplyr functions to prepare the data for specific analyses.
Gold Layer: Ready-to-use, purpose-built datasets optimised for Shiny and machine learning models.
By shifting data transformation to the pipeline, we reduced both data size and application complexity, resulting in significant improvements in the app’s performance. In data science, good engineering practices like this are often the unsung heroes behind successful solutions.
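To make the layering concrete, here is a minimal sketch of what one step of such a pipeline might look like. The file paths, dataset, and column names (which follow CDISC ADaM conventions) are illustrative rather than taken from the project, and the sketch assumes the haven, dplyr, and arrow packages.

```r
library(haven)  # reads SAS datasets
library(dplyr)  # silver-layer transformations
library(arrow)  # compact columnar storage for the gold layer

# Bronze: the raw SAS file, kept untouched as the fail-safe copy.
bronze <- read_sas("bronze/adsl.sas7bdat")

# Silver: cleaned and reshaped with dplyr for a specific analysis.
silver <- bronze %>%
  filter(!is.na(TRT01A)) %>%              # drop records without a treatment arm
  select(USUBJID, TRT01A, AGE, SEX) %>%   # keep only what the app needs
  mutate(AGEGRP = if_else(AGE >= 65, "65+", "<65"))

# Gold: a small, purpose-built summary the Shiny app can load instantly.
gold <- silver %>%
  count(TRT01A, AGEGRP, SEX, name = "n_subjects")

write_parquet(gold, "gold/subject_counts.parquet")
```

The app then reads only the gold-layer file, a fraction of the size of the original SAS dataset.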
Platform Engineering: Trusting R in the Cloud.
Next, I explored how R can be harnessed in platform engineering to build scalable, cloud-native solutions. In another project with a biopharma company, we helped their Clinical Data Science Group reduce reliance on CROs by enabling in-house, real-time analysis during clinical trials. The goal? To produce regulatory-compliant analysis outputs using R. By building a cloud-native analysis platform, we addressed concerns about trust and compliance — common issues in industries like pharma. The platform featured:
Security and Compliance: Azure’s cloud security tools, ensuring that clinical and patient data met stringent regulatory requirements.
Data Lineage: Full traceability of every analysis step, from ingestion to reporting, ensuring reproducibility and accountability (see the sketch after this list).
Data Classification: Data is automatically classified and can be further organised into catalogues. For analysts, this makes it easier to find and explore data for analysis; for regulators, it provides confidence that we know what data is available, how it moves around, and how it is being used.
Cost Efficiency: Cloud-native environments that only charge for active resources, perfect for dynamic, trial-driven workloads.
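The lineage mechanics depend on the platform, but the core idea can be sketched in a few lines of R. The helper below is hypothetical: it fingerprints a dataset before and after each pipeline step and appends a lineage record to a log. The digest package is real; the function name, field names, and log path are illustrative, not from the client platform.

```r
library(digest)  # hashing for dataset fingerprints

# Hypothetical helper: run one pipeline step and record its lineage.
with_lineage <- function(data, step_name, fn, log_path = "lineage.csv") {
  result <- fn(data)
  record <- data.frame(
    step     = step_name,
    run_at   = format(Sys.time(), "%Y-%m-%dT%H:%M:%S"),
    in_hash  = digest(data),    # fingerprint of the input
    out_hash = digest(result),  # fingerprint of the output
    stringsAsFactors = FALSE
  )
  first <- !file.exists(log_path)
  write.table(record, log_path, sep = ",", append = !first,
              col.names = first, row.names = FALSE)
  result
}

# Every step now leaves an auditable trace from ingestion to reporting.
filtered <- with_lineage(mtcars, "filter_heavy_cars",
                         function(d) d[d$wt < 4, ])
```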
This project emphasised a key theme I saw at EARL 2024: the increasing embrace of the cloud in the R community. Cloud-native solutions offer scalability and flexibility while maintaining the trust that’s critical in regulated industries like pharmaceuticals.
Software Engineering: Modernising Reinsurance Pricing Systems.
We move on to a project in the reinsurance sector, where we were tasked with modernising an outdated pricing tool. The tool had been in use for over 20 years and was originally written in APL, a language that last saw a stable release more than two decades ago. It was integral to the client’s actuarial processes, but it had become difficult to maintain. The goal was to completely overhaul the system and bring it up to modern standards, making it scalable, secure, and efficient.

Our team of architects and engineers worked extensively with Azure technologies to create a solution that allowed actuaries, underwriters, and reviewers to collaborate seamlessly. The actuarial team, who were familiar with R, published their models as R packages, which were stored in Posit Package Manager. Plumber APIs deployed through Posit Connect exposed these models to underwriters via a new web portal built with Blazor, a framework based on C# and .NET.

The decision to use Blazor instead of Shiny, which we are known for, came down to skills, scalability, and cost. While Shiny could have been a solution, the scale and security requirements of this enterprise-level application led us to choose Blazor for optimal integration with Azure’s native cloud features. This approach ensured a robust system that provided a seamless experience for actuaries, underwriters, and reviewers.

Ultimately, the project demonstrated the importance of asking the right question: not just whether Shiny can be used, but whether it should be. By focusing on the client’s specific needs, we delivered a solution that balanced performance, security, and cost-effectiveness, showcasing the value of modern software engineering practices in delivering the best version of data science to end users.
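To give a flavour of the R side of that architecture, here is a minimal sketch of a plumber endpoint wrapping a pricing model. The model function, endpoint paths, and parameters are invented for illustration; in the real system, the models lived in versioned R packages served from Posit Package Manager and were deployed to Posit Connect.

```r
# plumber.R -- illustrative pricing endpoint, not the client's actual API
library(plumber)

# Stand-in for a model function exported by an internal actuarial R package.
price_treaty <- function(exposure, loss_ratio) {
  exposure * loss_ratio * 1.15  # toy loading factor
}

#* Health check used by the portal
#* @get /healthz
function() {
  list(status = "ok")
}

#* Price a reinsurance treaty
#* @param exposure:numeric Total exposure for the treaty
#* @param loss_ratio:numeric Expected loss ratio
#* @post /price
function(exposure, loss_ratio) {
  list(premium = price_treaty(as.numeric(exposure), as.numeric(loss_ratio)))
}
```

Locally, an API like this can be served with plumber::pr("plumber.R") |> plumber::pr_run(port = 8000); on Posit Connect, deployment, scaling, and authentication are handled by the platform, which made it straightforward for the Blazor portal to consume the models over HTTP.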
Strategic Design: The Bridge Between Data and Business Impact.
At Ascent, we treat design as a key component of data-driven solutions at the strategy, service, and product levels. Starting with Strategic Design, our work with BrewDog provides a great example. The project highlights the importance of design in data science, and specifically how good design can amplify the impact of even the most sophisticated models. BrewDog wanted to realise a true multi-channel vision by converting as many in-bar customers as possible into online consumers, and our machine learning models were key to driving that change. But, as we discovered, placing a machine-driven recommendation engine in the hands of bartenders, who are experts in their craft, wouldn’t work. So we deployed the engine in selected digital channels, where the personalised recommendations complemented existing customer touchpoints. The result? A significant increase in customer engagement and spend from marketing campaigns. This project underscores a vital lesson: data-driven solutions must align with the people who use them. It’s not enough to build great models; we need to consider the human experience that will ultimately define their success.
Service Design: Enhancing a CRO’s Shiny-Based Insight Application.
Returning to the CRO from the data engineering case, we found that technical challenges in their Shiny-based insight application sparked a broader discussion about service design. The application was intended to provide insights into clinical study data, but as we worked through issues like data security and reconfiguration for new studies, it became clear that there were gaps in the overall service offering. We initiated a service design process that produced a service blueprint. This clarified fundamental questions such as:
Does the service include data cleansing and remapping?
Is this a software-as-a-service (SaaS) or a consulting service?
What are the review points for configuration and data?
How do different personas within the organisation interact with the application?
What is the expected end-user support?
By systematically addressing these questions, we not only improved the application itself but also helped the CRO better articulate their service, internally and externally. The result was a more cohesive offering that aligned with their business goals and clarified user expectations.
Product UX/UI Design: Posit Monitoring App for a Reinsurance Data Science Team.
At one of our reinsurance clients, the Centre of Excellence Team for Data Science was using R and related infrastructure, including Posit. They asked us to help develop a Shiny application to monitor Posit usage, surfacing key metrics and trends. The challenge was to present a vast amount of information in a way that was both user-friendly and visually coherent.

Through a focused design sprint, we worked on one of the main dashboards to optimise the user experience for stakeholders. This included displaying departmental metrics clearly to highlight ROI, incorporating arrow indicators to show trends, adding call-to-action buttons for deeper insights, and providing tooltips with granular details, ensuring the interface was informative yet uncluttered. We also developed a design system based on the company’s branding to ensure consistency across the application.

The final product offered a clear, visually consistent interface that enhanced decision-making and provided stakeholders with easy access to critical insights. This project highlights the value of UX/UI design in data science, ensuring that complex information is communicated effectively and in a way that drives action.
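As an illustration of the kind of component this involved, here is a minimal sketch of a metric card with a trend arrow, built with shinydashboard; the metric names and values are invented, and the client’s actual dashboard used its own design system rather than stock shinydashboard styling.

```r
library(shiny)
library(shinydashboard)

ui <- dashboardPage(
  dashboardHeader(title = "Posit Usage"),
  dashboardSidebar(disable = TRUE),
  dashboardBody(valueBoxOutput("sessions"))
)

server <- function(input, output) {
  output$sessions <- renderValueBox({
    this_week <- 412  # illustrative metric values
    last_week <- 365
    up <- this_week >= last_week
    valueBox(
      value    = this_week,
      subtitle = "Active sessions this week (365 last week)",
      icon     = icon(if (up) "arrow-up" else "arrow-down"),  # trend indicator
      color    = if (up) "green" else "red"
    )
  })
}

shinyApp(ui, server)
```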
Conclusion.
Looking back at my EARL 2024 talk, I was reminded how far the R community has come and how much further we can go. Whether it’s in data engineering, platform engineering, or design, R’s versatility is unparalleled, and it’s positioned to continue driving innovation across industries. R is more than just a tool for analysis — it’s a key component in enterprise-scale solutions that are transforming industries and making a real-world impact. With the increasing adoption of cloud technologies and an ever-growing focus on effective data communication, there’s never been a better time to be working in this space.