REST APIs must guarantee strong data integrity
Akbar S. Ahmed | Feb 13, 2015

Validate input and output

Designing an API with strong data integrity improves data quality and overall system performance, and it allows businesses to make better data-driven decisions. Data integrity also helps improve development velocity at all levels of the engineering organization.

According to Wikipedia, data integrity refers to maintaining and assuring the accuracy and consistency of data over its entire life-cycle, and is a critical aspect of the design, implementation, and usage of any system that stores, processes, or retrieves data.

Why is data integrity important?

Data integrity ensures that your data is accurate, which simplifies life at all levels of an organization.

Managers

Accurate data allows managers to base decisions on data and to trust the numbers behind those decisions.

Data analysts

Data integrity allows data analysts to spend more time analyzing data and waste less time cleaning up data or interpolating missing data.

UI/UX developers

High-quality data also helps UI/UX developers: they need less code to detect and handle low-quality data, so the code that remains is simpler.
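As a rough illustration of the difference for client code, the sketch below formats a user's display name twice: once when the API makes no guarantees, and once when the API guarantees that both fields are non-empty strings. The field names `first_name` and `last_name` and both function names are hypothetical and chosen only for this example.

```python
def display_name_defensive(user):
    """Client code when the API makes no guarantees about its responses."""
    if not isinstance(user, dict):
        return "Unknown user"
    first = user.get("first_name")
    last = user.get("last_name")
    # Keep only values that are non-blank strings.
    parts = [p.strip() for p in (first, last) if isinstance(p, str) and p.strip()]
    return " ".join(parts) if parts else "Unknown user"


def display_name(user):
    """Client code when the API guarantees both fields are non-empty strings."""
    return f"{user['first_name']} {user['last_name']}"
```

The second function is only safe because the API has already rejected or repaired bad records before they reach the client.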

Data integrity and performance

Beyond data quality, data integrity directly affects application performance. How? Quality at every step of a data pipeline prevents backtracking, which reduces the number of requests and thereby directly reduces bandwidth and CPU utilization.

When data integrity is missing, each tier that finds a problem in the data must issue additional requests in an attempt to retrieve the correct data. The performance problems from re-issuing requests may stay hidden when volume is low, but the overhead becomes increasingly painful as volume grows.

For example, if 10% of responses from the Data Services API to the UI/UX API lack data integrity, and each bad response triggers one retry, then the load on the system increases as follows:

  • UI/UX API issues 10% more requests to the Data Services API
  • Data Services API receives 10% more requests
  • Data Services API issues 10% more database queries
  • Data Services receives 10% more responses from the database which it then turns into 10% more responses to the UI/UX API
  • UI/UX API receives 10% more responses

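Add all of that up and you can see that bad data in just 10% of responses does not cost 10% in one place: it adds 10% more work at each of the five steps above, across both APIs and the database. And because a retry can itself return bad data, the real overhead is somewhat higher than 10% and grows quickly as the share of bad responses rises.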
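The arithmetic can be sketched in a few lines of Python. This is a simplified model, not a measurement: it assumes that every bad response triggers a retry of the whole chain, that a retry is as likely to return bad data as the original request, and that the caller keeps retrying until it gets good data. The function names and the constant are illustrative; the five steps are the ones in the list above.

```python
def expected_attempts(bad_fraction: float) -> float:
    """Expected attempts per logical request when each attempt independently
    returns bad data with probability `bad_fraction` and the caller retries
    until it gets a good response (mean of a geometric distribution)."""
    if not 0 <= bad_fraction < 1:
        raise ValueError("bad_fraction must be in [0, 1)")
    return 1 / (1 - bad_fraction)


# The five steps from the list above; each one repeats on every retry.
STEPS_PER_ATTEMPT = 5


def extra_operations(logical_requests: int, bad_fraction: float) -> float:
    """Extra operations across all tiers that exist only because of retries."""
    extra_attempts = logical_requests * (expected_attempts(bad_fraction) - 1)
    return extra_attempts * STEPS_PER_ATTEMPT


if __name__ == "__main__":
    for p in (0.10, 0.25, 0.50):
        print(f"bad={p:.0%}  attempts per request={expected_attempts(p):.2f}  "
              f"extra operations per 1,000 requests={extra_operations(1000, p):.0f}")
```

Under these assumptions, a 10% bad-response rate means about 1.11 attempts per logical request, and a 50% rate means 2 attempts, so every tier does twice the work.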

Validate in, validate out

Validating data when it is received and again just before it is returned allows your API to provide strong data integrity guarantees.

The pseudocode below highlights a validate in/out pattern for an API that performs a simple database query. Similar logic applies to an API at any layer of the stack, such as the UI/UX API or a Platform Services API.

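# Get inputs to the API (via URL, query string, or request body)
inputs = get_inputs()

# Validate inputs; return a client error (such as 400) if invalid
try:
    validate(inputs)
except ValidationError as exc:
    log(exc)
    return error_response(400)

# Query the database using the validated inputs
out = query_db(inputs)

# Validate data before sending it to the client; a failure here is a
# server-side problem, so return a server error (such as 500)
try:
    validate(out)
except ValidationError as exc:
    log(exc)
    return error_response(500)

return out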

The client that calls this API can trust that the data it receives conforms to the data schema, so the client’s own validation code will fail less often. The client can also be simpler, because it needs less logic to work around problems in the data. This in turn results in lower maintenance costs.
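To make the validate in/out pattern concrete, here is a minimal runnable sketch that uses only the Python standard library. Everything specific in it is an assumption made for illustration: the `users` table, the `user_id` and `email` fields, the helper names, and the choice of status codes (400 for bad input, 500 for bad stored data). A real API would more likely use a schema validation library than hand-written checks.

```python
import logging
import sqlite3

logger = logging.getLogger("api")


class ValidationError(Exception):
    """Raised when data does not match the expected schema."""


def validate_input(params: dict) -> int:
    """Validate in: the request must carry a positive integer user_id."""
    user_id = params.get("user_id")
    if isinstance(user_id, bool) or not isinstance(user_id, int) or user_id <= 0:
        raise ValidationError(f"user_id must be a positive integer, got {user_id!r}")
    return user_id


def validate_output(row: dict) -> dict:
    """Validate out: every field the client relies on must be present and well typed."""
    if not isinstance(row.get("id"), int):
        raise ValidationError("id must be an integer")
    email = row.get("email")
    if not isinstance(email, str) or "@" not in email:
        raise ValidationError(f"email is missing or malformed: {email!r}")
    return row


def get_user(conn: sqlite3.Connection, params: dict) -> tuple:
    """Return (status_code, body) for a hypothetical user lookup endpoint."""
    try:
        user_id = validate_input(params)
    except ValidationError as exc:
        logger.warning("rejected request: %s", exc)
        return 400, {"error": str(exc)}

    found = conn.execute(
        "SELECT id, email FROM users WHERE id = ?", (user_id,)
    ).fetchone()
    if found is None:
        return 404, {"error": "user not found"}

    try:
        body = validate_output({"id": found[0], "email": found[1]})
    except ValidationError as exc:
        # Bad stored data is a server-side fault: log it, hide details from the client.
        logger.error("refusing to return invalid data: %s", exc)
        return 500, {"error": "internal data error"}
    return 200, body


def make_demo_db() -> sqlite3.Connection:
    """Build an in-memory database with one good row and one bad row."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)", [(1, "ada@example.com"), (2, None)]
    )
    return conn
```

With the demo database, `get_user(conn, {"user_id": 1})` returns status 200 with the user record, `get_user(conn, {"user_id": "1"})` returns 400 because the input is a string, and `get_user(conn, {"user_id": 2})` returns 500 because the stored email is missing. The client never sees the malformed row.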

Summary

A high-quality API should provide strong data integrity guarantees. To achieve this goal, an API should validate both inputs and outputs.

Strong data integrity yields high-quality data that improves business performance by allowing managers to trust the information they use to make decisions. Further, high-quality data improves engineering performance by simplifying development for engineers at all layers in the stack.



