One of the most daunting challenges data scientists face, in the USA and globally, is rarely discussed in commercial research circles: getting their data into the cloud AND convincing information security to configure cloud-based tools so that critical data can be accessed.
We all know the benefits of the cloud are many, including less expensive ways to store data, the scalability of big data platforms, and an advanced toolkit with many AI and machine learning tools (data and machine learning operations). Risk and information security professionals still have many concerns about cloud data security, including whether they have adequate staffing and configuration expertise to make broad-scale use of cloud data a reality. There is often confusion about the current software stack and the scope of its cloud capabilities (on-prem versus cloud), which further complicates the maze of issues surrounding the use of data in the cloud.
Chief data officers should be joined at the hip with chief technology officers and information security professionals to enable the migration of analytics data to the cloud and the configuration of access tools (e.g., MS Azure or AWS).
The Challenge
- Chief risk officers have fears of data breaches. However, many may not be familiar with the latest cloud data security protocols and technologies that allow for creating highly secure data zones.
- Information security departments are often woefully understaffed compared with other parts of IT. They often lack the skill sets and talent to configure newer tools and to set up the security protocols and controls needed to get data to the cloud and allow access to it.
- Other IT departments may similarly lack configuration and API skills for secure cloud tool integration.
- Vendors are focused on matching the business problem to the software solution but may neglect to mention the level of skills needed for proper deployment.
Recommended Solutions
- Bring your risk and compliance teams in early on software or technology purchase decisions.
- Collaborate with information security to map the implementation journey before proceeding with large-scale cloud-based data science deployments.
- Before buying new cloud-based big data platforms or ML and DataOps tools, create a data science and cloud data literacy program. Most data literacy programs focus on data quality and governance but lack a focus on data science and cloud big data tools. Such a program should include curriculum or learning paths for cloud data security.
- Ensure the business case for new cloud tools includes the total cost of ownership, including the best fit-for-purpose resource costs for the first five years post-deployment. This is a journey, and the right skill sets are essential at each stage of it.
- Consider whether your market has the right talent pool to run your selected cloud ecosystem. Ensure information security has the right skill sets to configure and deploy the tools.
- Vendors will provide rapid cycle training, but the business must understand these programs and how quickly their team members will come up to speed.
- Understand which anonymization and generative AI solutions are available, as these can also help protect data in the cloud.
- Create an identity graph and keying system of standard identifiers to help protect data and ensure restricted data types are secured.
- Establish a cloud data risk committee and map all potential risks and controls to be implemented across all functional areas (e.g., information security, data science, and technology) to understand the unique issues related to AI/analytics in the cloud.
- Data scientists must map out their use cases, explaining the analytics and data matching to be performed:
  - Transaction analysis
  - Network analysis
  - Sentiment analysis
  - Risk analysis
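
To make the keying system of standard identifiers recommended above a little more concrete, here is a minimal Python sketch of one possible approach: replacing a raw identifier with a keyed hash before the data leaves for the cloud. This is an illustration under my own assumptions, not a prescribed method. The choice of HMAC-SHA256, the function names (`normalize`, `make_key`, `key_records`), the `email` field, and the secret value are all hypothetical, and in practice the secret would live in a managed key store rather than in code.

```python
import hashlib
import hmac


def normalize(identifier: str) -> str:
    """Trim and lowercase an identifier so trivial variants map to the same key."""
    return identifier.strip().lower()


def make_key(identifier: str, secret: bytes) -> str:
    """Derive a stable pseudonymous key from an identifier using HMAC-SHA256.

    The same identifier and secret always give the same key, so records can
    still be matched, but the raw identifier is not stored alongside the data.
    """
    message = normalize(identifier).encode("utf-8")
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def key_records(records, id_field, secret):
    """Return copies of the records with the raw identifier replaced by a key.

    `id_field` is the (hypothetical) name of the restricted column to remove.
    """
    keyed = []
    for record in records:
        safe = {k: v for k, v in record.items() if k != id_field}
        safe["entity_key"] = make_key(record[id_field], secret)
        keyed.append(safe)
    return keyed


if __name__ == "__main__":
    # Assumption: a placeholder secret for demonstration only.
    demo_secret = b"replace-with-a-managed-secret"
    demo_records = [
        {"email": "Alice@Example.com ", "amount": 120.0},
        {"email": "alice@example.com", "amount": 75.5},
    ]
    for row in key_records(demo_records, "email", demo_secret):
        print(row)
```

Both demo rows end up with the same `entity_key`, so transaction or network analysis can still link them, while the email address itself never reaches the cloud data set. Whether keyed hashing is sufficient for a given restricted data type is a question for your information security and compliance teams.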
I hope this post helps frame some of the issues and recommended solutions. I look forward to your thoughts and any additions you might have.