Web-Services Best Practice : Using Parallel queueing to streamline Web-service data loads

, , ,

Using Parallel queuing to streamline Web-service data loads


The key to moving data between applications that are in and out of the cloud is using web-service APIs.  This works great for individual transaction done in real-time, but what happens when you have a large batch of transactions to process, say 100000 transactions?   And 90% of the transactions take three seconds and the other 10% take an average of 45 seconds.   Well let’s do some math:

90000 transaction * 3 seconds = 270000 seconds = 4500 minutes = 75 hours

10000 transaction * 45 seconds = 450000 seconds = 7500 minutes = 125 hours

Total : 200 hours or 8.3 days.

As a data integrator, I worked at a healthcare provider, and as a team we found that this completely unacceptable.  The customer experience for processing EDI 834 Enrollment files for that project demanded a 24 hour turnaround time to printing enrollment cards.

However we knew that web-services are transaction independent.  Well pretty much.   An example, when processing these EDI 834 Enrollment files we needed to prioritize that the Subscribers are processed before dependents, but other than this, transactions could be processed in parallel within limits.   The limits for the HealthEdge system we were loading to seemed to have a threshold of about 75 parallel queues.   I have run into smaller thresholds in Oracle CX , Salesforce , and other cloud apps of 5 and 10 and 20.   Some of these queue limits may have come from the source side.   I was using SQL-Server Express, and it has a low threshold for open query sessions, while SQL-Server Enterprise is much larger.   We used SQL-Server to store messages to be processed.   This allowed us a lot of flexibility to be able to the queues, assign priorities based upon multiple criteria, such as message-type, or Subscriber/Dependent.   The database model is agnostic though, it will work in Oracle, MySQL, Postgres and well as the a fore mentioned SQL-Server.

The data model is quite simple.  Load data tables – these are tables where each column is a data value identified by the column header.  XML-Message views – These are views that translate the data into a single XML message per row from the Load data tables.  XML-Message tables take the generated message from the view and stores it in a Message request column, as well as having the URL to send the message to, and the credentials..

From here there are multiple ways to send the message and get its response and store it back into the original table.

So taking the above scenario applied to each threshold limit, please see the potential results below:

As you can see, the initial 200 hours was reduced to 2.7 hours when using 75 parallel queues.

The results of this strategy can be a critical approach for an organization to meet tactical objectives.

If you think this approach can help your company, please contact us to discuss your possibilities.


Related Content:

SFCG’s Skynet: A Small Introduction

First Impressions : Using Dell Boomi AtomSphere Like A Pro

BOOMI Integrations : The Coolest Feature yet – The Flow Control Step

Z.E.N. tip # 7863 (Using External Keys in OSC Web Services)

Defining a Web-services Parallel processing Controller

Parallel Messages Queues in SQL Server