Secure Data Integration

Anyone working in data today is spoilt for choice when it comes to software and tools for moving and analysing data. The fantastic low-code and no-code movement also means technical integrations are readily accessible to those who don’t code. Whether you’re seeking Extract, Transform, Load (ETL) or Extract, Load, Transform (ELT), data visualisation, or database as a service, the availability of low-code and no-code software means non-techies can get stuck in with little to no administrator oversight.

More and more SaaS apps are coming to market each month, and at the end of 2022 there were almost 10,000 apps in the Martech Map, up almost 300% in the last 5 years. Data apps are growing fastest, but what is clear is that as businesses use more apps and software, the need for self-service tools and integrations will continue to grow.

Compliance, privacy and data security are (rightly) some of the hottest topics in marketing and data. When businesses interact with customers they gather data, and CRMs, ERPs, data warehouses and similar systems contain a lot of sensitive data: names, addresses, card details and payroll information, with more added every day. Access to this data is (or should be) strictly controlled, and security risks are minimised by keeping the data contained.

The problem comes when businesses want this data analysed. The analysis often means allowing software to connect to the closed system to pull data out. These external tools will likely have different access levels from the source system, so data that is tightly restricted at source can end up visible to a much wider audience.

Connecting apps to sensitive data is becoming increasingly common and necessary. But connecting apps can lead to leaks of sensitive data. This is a real risk to customer trust and can lead to breaches of regulations such as the GDPR or HIPAA, so it needs to be treated very carefully.

Keep your data integrations secure


Thankfully it is possible to minimise the risks by thinking strategically about your integrations and keeping a few questions front of mind when connecting your systems. 

  • How can we prevent unnecessary flow of data to other systems?
  • How can we secure the data if it doesn’t need to be shared?
  • How can we ensure proper access control?
  • How do we ensure that if there is a breach, damage is minimised?

 

Eliminating risk is nigh on impossible, but minimising it should be very high on your list of data priorities. Let’s look at how we can do that.

Data Security

Separate Your Data Assets

Data functions are split broadly into three categories: storage, processing and visualisation. These are often connected, but creating a degree of separation between them is a good place to start when minimising the risks.

Ecommerce businesses, software businesses and consulting businesses all have their own needs when it comes to databases. A database will likely contain everything from inventory to customer info. As businesses grow they’ll bring in people to make the data work harder, and one of the first things those people will ask for is access to the datasets so they can work the data to gather insights.

The easy response is to grant them access to the main database, but in reality this is a bad move. The main reason is that data could inadvertently be exported, to a dashboard for example, and seen by people who shouldn’t see it. On top of that, you’ll experience performance issues if you run analytical queries on a live database.

The key is to take a step back, look specifically at what needs to be analysed, and replicate this data (and only this data) into a secondary store. The best options for this are analytics-specific data warehouses such as Amazon Redshift or Google BigQuery. By creating the secondary store you ring-fence only what you need and keep everything else secure. The data you’re analysing is also split from the production database, so it won’t impact the performance of your live assets.
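As a rough illustration of replicating only what is needed, here is a minimal Python sketch. It uses two in-memory SQLite databases as stand-ins for the production database and the analytics store; the `orders` table, its columns and the `replicate_for_analytics` helper are all hypothetical, and a real pipeline would target a warehouse rather than SQLite.

```python
import sqlite3

# Hypothetical allow-list: the only columns analysts actually need.
ANALYTICS_COLUMNS = ["order_id", "order_date", "total"]

def replicate_for_analytics(source, target, table, columns):
    """Copy only the allow-listed columns of `table` from source to target.

    `table` and `columns` are assumed to come from trusted configuration,
    never from user input, because they are interpolated into the SQL.
    """
    column_list = ", ".join(columns)
    rows = source.execute(f"SELECT {column_list} FROM {table}").fetchall()
    placeholders = ", ".join("?" for _ in columns)
    target.execute(f"CREATE TABLE IF NOT EXISTS {table} ({column_list})")
    target.executemany(f"INSERT INTO {table} VALUES ({placeholders})", rows)
    target.commit()
    return len(rows)

# Stand-in for the live production database.
production = sqlite3.connect(":memory:")
production.execute(
    "CREATE TABLE orders (order_id INTEGER, order_date TEXT, total REAL, "
    "customer_name TEXT, card_number TEXT)"
)
production.execute(
    "INSERT INTO orders VALUES (1, '2023-01-05', 49.99, 'Ada Example', '4111111111111111')"
)

# Stand-in for the secondary analytics store.
analytics = sqlite3.connect(":memory:")
copied = replicate_for_analytics(production, analytics, "orders", ANALYTICS_COLUMNS)
```

The customer name and card number never leave the production database, so nothing downstream of the analytics store can expose them.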

Exclude and Mask Your Data

 

Prevention is better than cure, right? This is also true in data analysis, and using exclusion rules or masking can help prevent the flow of sensitive information in the first place.

A large chunk of your data and compliance issues can be solved by simply not extracting the sensitive information. It’s similar to the concept of “least privilege”, whereby users only have access to the minimum resources they need to do their job. If you don’t need to send someone’s postcode from your CRM to your database, then don’t.

Exclusion is a straightforward concept and one which is helped greatly by an ETL tool that allows you to select subsets of data to extract. If you have this in place you can very easily select only the data you need and leave sensitive data in place.
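Many ETL tools expose this as a field picker in their interface. If you are scripting an extract yourself, an exclusion rule might look something like the sketch below; the field names and the `exclude_fields` helper are hypothetical.

```python
# Hypothetical exclusion rule: fields that must never leave the CRM.
EXCLUDED_FIELDS = {"postcode", "email", "card_number"}

def exclude_fields(record, excluded=EXCLUDED_FIELDS):
    """Return a copy of `record` without any excluded fields."""
    return {key: value for key, value in record.items() if key not in excluded}

crm_record = {
    "customer_id": 42,
    "plan": "pro",
    "postcode": "AB1 2CD",
    "email": "ada@example.com",
}

safe_record = exclude_fields(crm_record)
```

An allow-list (naming the fields you do want) is usually the safer variant, because new sensitive fields added to the source later are then excluded by default.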

In reality though, there will be scenarios when sensitive data needs to be extracted for analysis. This is where data masking (or hashing) comes in. Masking helps with end-to-end security because it keeps each value unique but makes the sensitive data unreadable, so you can still count, group and join on it while keeping your data secure.
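One way to mask a value while keeping it unique is a keyed hash. The sketch below uses HMAC-SHA256 from Python’s standard library; this is one possible technique rather than the only way to mask data. The key shown is a placeholder and the `mask` helper is hypothetical. A secret key is used because plain hashes of guessable values, such as emails or postcodes, can be reversed by hashing likely candidates.

```python
import hashlib
import hmac

# Placeholder only: in practice, load the key from a secrets manager.
SECRET_KEY = b"replace-with-a-real-secret"

def mask(value: str, key: bytes = SECRET_KEY) -> str:
    """Return a keyed SHA-256 digest of `value` as a hex string.

    The value is trimmed and lower-cased first, so the same email typed
    with different casing or stray spaces still maps to the same token.
    """
    normalised = value.strip().lower()
    return hmac.new(key, normalised.encode("utf-8"), hashlib.sha256).hexdigest()

token_a = mask("Ada@Example.com")
token_b = mask(" ada@example.com ")
token_c = mask("grace@example.com")
```

Here `token_a` and `token_b` are identical, so the masked column can still be used to count distinct customers or join tables, while `token_c` differs and none of the tokens reveal the original address.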

Document and Log Routinely


Access to your sensitive data should be strictly controlled, but you should also keep detailed records of who is accessing it, when and where the data is going. This is important for two reasons: 

  1. If you know who is accessing the data and when, you can quickly detect anomalies and shut down suspicious behaviour to protect your sensitive data.
  2. Most regulators require you to show that you’re tracking the data, and if you want to achieve various accreditations you’ll need to have the necessary documentation. 
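To make the who, when and where concrete, here is a minimal sketch of an access record and a very simple anomaly check. The field names, the `ALLOWED_DATASETS` mapping and both helper functions are hypothetical; most teams would rely on the audit logs built into their database or data tools rather than rolling their own.

```python
import json
from datetime import datetime, timezone

# Hypothetical mapping of each user to the datasets they may read.
ALLOWED_DATASETS = {
    "analyst_1": {"orders_analytics"},
    "finance_1": {"orders_analytics", "payroll"},
}

def audit_entry(user, dataset, destination):
    """Build one access record: who, what, where it went, and when (UTC)."""
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "user": user,
        "dataset": dataset,
        "destination": destination,
    }

def find_unexpected_access(entries, allowed=ALLOWED_DATASETS):
    """Return entries where a user read a dataset outside their allow-list."""
    return [e for e in entries if e["dataset"] not in allowed.get(e["user"], set())]

entries = [
    audit_entry("analyst_1", "orders_analytics", "dashboard"),
    audit_entry("analyst_1", "payroll", "csv_export"),
]

suspicious = find_unexpected_access(entries)

# One JSON object per line is easy to ship to most log stores.
log_lines = [json.dumps(entry) for entry in entries]
```

In this example `suspicious` contains only the payroll export, which is the kind of event you would want flagged for review.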

 

Keeping records is both your responsibility and that of any data tools you use, so this should be a key consideration when selecting tools to bring into your organisation. I recommend assessing in detail how logs are stored, how access controls work (think least privilege) and whether fundamentals like two-factor authentication are in place. A good indicator of a tool’s credentials is whether it has a certification such as SOC 2 Type 2, as this means the company’s security controls have been independently audited over a period of time. Don’t be put off a tool that doesn’t have this though, as strong security measures might still be in place.

Secure Data Access

Security vs Access: You Can Have Both

Sharing data has become increasingly necessary for businesses to function these days. With that comes an increasing need to keep data secure while allowing access. Thankfully for data pros and businesses you can have your cake and eat it (to an extent).

If you think about your data needs strategically and employ sound management principles you can create a secure environment while allowing all the necessary access to do great things with your data.

Make sure you know what data you need to share and create secure copies that can be worked on away from your live database. Next, keep sensitive data out of pipelines as much as possible with a “least privilege” approach using exclusion. Always mask any sensitive data that does need to be shared. Finally, make sure you have detailed auditing and logging in place so you can spot issues and deal with them quickly.
