The Pagos SpreadsheetWEB API is designed to execute spreadsheet calculations in a stateless manner where it can be utilized as a back-end calculation engine for applications with moderate-to-large user concurrency. In this article, we aim to compare SpreadsheetWEB’s performance with two other well-known web-based spreadsheet software platforms: Microsoft’s Office 365 Excel and Google Sheets.

Microsoft Office 365 Excel and Google Sheets play a vital role for the data processing needs of many enterprise and professional software solutions because their powerful and flexible features – such as formulas, calculations, and quick modelling potential – provide their user base with the means to generate calculation-driven applications with ease. It is also common practice for these users to utilize the built-in calculation logic offered by the Microsoft Office 365 Excel and Google Sheets platforms to handle their general business requirements, such as managing and maintaining their financial, accounting, planning, and forecasting objectives.

However, the task of seamlessly and accurately integrating your existing applications with these Excel/Spreadsheet-based calculation models is not a trivial one since this type of integration requires extensive research to provide the best possible solution prior to allocation of costly development resources. Therefore, we are providing a performance comparison of the calculation APIs for Pagos SpreadsheetWEB, Google Sheets, and Microsoft Office 365 Excel in this study:

  • We provide insights into how developers interact with these three platforms to run a model and retrieve output values from the spreadsheet model.
  • We show performance metrics in the form of response time across the three platforms and validate the accuracy of the calculated output.

Testing Methodology

For the purposes of this case study, we tested the following three platforms:

  1. Pagos SpreadsheetWEB accessed via the SpreadsheetWEB API [1]
  2. Google Sheets accessed via Google Sheets API [2]
  3. Microsoft Office 365 Excel accessed via Microsoft Graph API [3]

The Pagos SpreadsheetWEB API is one approach that allows developers to connect to their spreadsheets programmatically and use them as calculation engines through a set of exposed API methods. Developers send POST and GET requests to the exposed methods via the RESTful API to set values to in the target workbook and retrieve the calculated results. In the Pagos SpreadsheetWEB API, a single API call is sufficient to pass the inputs to the spreadsheet model and retrieve the outputs of the model after the calculation logic in the spreadsheet model is executed by the calculation engine.

The Google Sheets API enables developers to programmatically access and manipulate their Google spreadsheets, which are hosted on Google Drive. The Google Sheets API requires two operations to receive the final calculation output from the model in the spreadsheet: (1) an operation to write the values into the input cells and (2) an operation to read the values from the cells where the calculation logic stores output. Therefore, we provide aggregate response time values for both write and read operations in the experimental results section below.

Microsoft Graph API is a gateway for various Microsoft products such as Azure Active Directory (AD), Excel, Outlook, and OneDrive through a single endpoint (https://graph.microsoft.com). It allows developers to make calls to the underlying cloud services after authentication through OpenID Connect and OAuth, as provided by the Azure AD service. Like Google Sheets, the Microsoft Office 365 API also requires to make two separate calls to generate the calculated output: a write operation to set the values and a read operation to retrieve the output.

Having two separate API calls to achieve one operation may lead to some anomalies and result in inaccurate and inconsistent values during concurrent user scenarios. In other words, a user may get the results of another user’s inputs, depending on the order in which the requests are processed. The only way to resolve this concurrency issue is to place locks around consecutive user calls.

To test the response time and accuracy of the calculation results, we followed these three test scenarios: (1) single user, (2) five users, and (3) twenty-five users, each sending 20 different sequential requests.  The spreadsheet document (9.5 MB) comprises of a model that calculates the shipping cost of a package based on weight,  originating zip code, and destination zip code.

Experimental Setup

The experiments were conducted from an Amazon AWS environment. Table I provides the hardware and software configuration of the virtual machine where the requests to each endpoint were initiated.

Table I – Hardware and Software Configuration of the Virtual Machine Where the Requests to the APIs Were Initiated

vCPU  4
Memory  16 GB
Hard Disk (SSD)  30 GB
Processor Type & Speed  2.3 GHz Intel Xeon® E5-2686 v4 (Broadwell) processors or 2.4 GHz Intel Xeon® E5-2676 v3 (Haswell) processors
Operating System  64 Bit Windows Server 2012 R2 (ami-1562d075)
Virtualization Mode  HVM
Instance Type  Dedicated – m4.xlarge
Data Center Location  US-West (Oregon)
Availability Zone  us-west-2a

 

A dedicated instance was used, rather than on-demand instance, to post the API requests in order to minimize the resource contention between the virtual machines running on the same host machine. Dedicated instances in the Amazon AWS environment run on single-tenant hardware dedicated to a single customer. This can be interpreted as a testbed (i.e. virtual machine) that is physically isolated from other customer instances on the same host hardware.

Experimental Results

In this section, we provide the results of these performance tests, including the average response times  and output accuracy ratio for the three aforementioned user count scenarios. Again, it should be noted that the Pagos SpreadsheetWEB API requires a single API call for each operation, whereas Google Sheets and Microsoft Office 365 Excel API require two separate API calls (one for writing inputs and the other for retrieving the calculated outputs). As a result, the average response times for Google Sheets and Microsoft Office 365 Excel in Figure 1 and Figure 2 represents the aggregate of both write and read operations.

ResponseTimev2

Figure 1 – Comparison of Average Response Time Versus Concurrent User Count for 20 Sequential Requests

Figure 1 shows the comparison of average response time versus concurrent user count, where each concurrent user is sending 20 sequential calculation requests. Pagos SpreadsheetWEB provides the better average response time against Google Sheets and Microsoft Office 365 Excel.

Accuracy of the Model Output vs Concurrent User Count

Figure 2Accuracy of the Model Output Versus Concurrent User Count for 20 Sequential Requests

The accuracy of the calculated output retrieved from each API versus the concurrent user count (each sending 20 sequential requests) is illustrated in Figure 2. Pagos SpreadsheetWEB has 100% correctness rate for all the use cases and output received. However, neither Google Sheets nor Microsoft Office 365 Excel APIs returned 100% correct output ratio for all use cases.

For all three test scenarios, Pagos SpreadsheetWEB outperforms Google Sheets and the Microsoft Office 365 Excel APIs in both average response times and accuracy of output results ratio in Figures 1 and 2. We believe that the fundamental reason behind the low accuracy percentages for both Google Sheets and Microsoft Office 365 Excel APIs in Figure 2 is the result of their underlying design goal: both platforms are designed for ubiquitous access and providing a collaborative environment for shared documents, rather than performing strictly as calculation engines. This is in contrast to the Pagos SpreadsheetWEB API, which was designed for the specific purpose of performing as a calculation engine.

It should also be noted that both Google Sheets and Microsoft Office 365 Excel also return better accurate output ratio if a 2 second delay is placed between the write and read operations. Of course, including this type of sleep counter would throw off the results of the performance analysis, since each request would be padded with that additional wait time. We suspect that this is the result of calculation times required by both APIs, prior to reading the outputs from the calculation models; however, it is not a trivial task to configure the exact delay period as the number of consecutive users are increased, and there does not appear to be means for configuring the processing as synchronous (i.e. only respond to the write request after calculations are performed).

In conclusion, from a performance and accuracy perspective, the Pagos SpreadsheetWEB API is a strong contender in the realm of spreadsheet-based calculation engines and a viable alternative if your solution requires such a model, whereas Google Sheets and the Microsoft Office 365 Excel APIs are solid choices for a solution that aims to provide strong workbook collaboration and ubiquitous access to shared spreadsheets.

References

[1]: https://pagosinc.atlassian.net/wiki/display/SSWEB/API

[2]: https://developers.google.com/sheets/api/

[3]:  https://developer.microsoft.com/en-us/graph/