Validate input and output
Designing an API with strong data integrity improves data quality and overall system performance, and allows businesses to make better data-driven decisions. It also improves development velocity at every level of the engineering organization.
According to Wikipedia, data integrity refers to maintaining and assuring the accuracy and consistency of data over its entire life-cycle, and is a critical aspect of the design, implementation, and usage of any system that stores, processes, or retrieves data.
Data integrity ensures that your data is accurate, which simplifies life at every level of an organization.
Managers can make decisions with confidence that the underlying data is trustworthy.
Data analysts can spend more time analyzing data and less time cleaning it up or interpolating missing values.
High-quality data also helps UI/UX developers, who can write less code to check for and handle malformed data.
Beyond data quality, data integrity directly affects application performance. How? Quality at every step of a data pipeline prevents backtracking, which reduces the number of requests and thereby directly reduces bandwidth and CPU utilization.
When data integrity is lacking, each tier that finds a problem in the data must issue additional requests to retrieve correct data. The performance cost of re-issued requests may stay hidden while volume is low, but the overhead becomes increasingly painful as volume grows.
For example, suppose 10% of the responses flowing from the Data Services API to the UI/UX API lack data integrity. Each bad response forces the caller to re-issue the request, which in turn re-triggers every downstream request in the chain, and those retries can fail too.
Add it all up and a lack of data integrity in just 10% of responses increases overall system load by considerably more than 10%.
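The compounding effect can be made concrete with a toy model. The numbers and the three-tier topology below are illustrative assumptions, not measurements: each of three tiers returns bad data 10% of the time, and a bad response anywhere forces the client to re-run the whole request chain until it gets a clean pass.

```python
# Toy model (hypothetical numbers): three tiers, each returning bad
# data 10% of the time; any bad response restarts the whole chain.
p_bad_per_tier = 0.10
tiers = 3

# Probability that one full pass through the chain is clean.
p_clean_pass = (1 - p_bad_per_tier) ** tiers      # 0.9 ** 3 = 0.729

# Expected number of full passes until a clean one (geometric distribution).
expected_passes = 1 / p_clean_pass                # ~1.37

# Each pass costs one request per tier.
baseline_requests = tiers                         # 3
expected_requests = expected_passes * tiers       # ~4.12

overhead = expected_requests / baseline_requests - 1
print(f"Expected load increase: {overhead:.0%}")  # ~37%
```

Under these assumptions, 10% bad data per tier inflates total request volume by roughly 37%, far more than the naive 10% you might expect.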
Validating data when it is received and again just before it is returned allows your API to provide strong data integrity guarantees.
The pseudocode below highlights a validate-in/validate-out pattern for an API that performs a simple database query. Similar logic applies to an API at any layer of the stack, such as the UI/UX API or a Platform Services API.
# Get inputs to API (via URL, query string, or request body)
inputs = get_inputs()

# Validate inputs, return an error (such as 400) if invalid
try:
    validate(inputs)
except ValidationError as exc:
    log(exc)
    return error_response(400)

# Query database
out = query_db(inputs)

# Validate data before sending it to the client
try:
    validate(out)
except ValidationError as exc:
    log(exc)
    return error_response(500)

return out
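To make the pattern concrete, here is a minimal runnable sketch in Python. The `validate` helper, the `get_user` handler, and the in-memory `FAKE_DB` are illustrative stand-ins invented for this example, not part of any particular framework.

```python
def validate(record, required_fields):
    """Raise ValueError if any required field is missing or empty."""
    missing = [f for f in required_fields if not record.get(f)]
    if missing:
        raise ValueError(f"missing fields: {missing}")

# Stand-in for a real database.
FAKE_DB = {42: {"id": 42, "name": "Ada", "email": "ada@example.com"}}

def get_user(params):
    # Validate inputs; reject bad requests early with a 400-style error.
    try:
        validate(params, ["user_id"])
    except ValueError as exc:
        return {"status": 400, "error": str(exc)}

    # Query the (fake) database.
    row = FAKE_DB.get(params["user_id"], {})

    # Validate outputs; never hand malformed data to the client.
    try:
        validate(row, ["id", "name", "email"])
    except ValueError as exc:
        return {"status": 500, "error": str(exc)}

    return {"status": 200, "data": row}

print(get_user({"user_id": 42})["status"])  # valid input, valid data -> 200
print(get_user({})["status"])               # invalid input           -> 400
print(get_user({"user_id": 7})["status"])   # missing/bad data        -> 500
```

Note that both failure paths return an explicit error instead of passing questionable data downstream, which is the heart of the validate-in/validate-out pattern.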
The client that calls this API can trust that the data it receives conforms to the data schema. The client's validation code will fail less often, and the client itself can be simplified because it needs less code to work around problems in the data. This in turn lowers maintenance costs.
A high quality API should provide strong data integrity guarantees. To achieve this goal an API should validate both inputs and outputs.
Strong data integrity yields high quality data that improves business performance by allowing managers to trust the information they use to make decisions. Further, high quality data improves engineering performance by simplifying development for engineers at all layers in the stack.