When computers ran on punched cards and information was stored and communicated using paper, suspicious individuals could sometimes be seen loitering close to the large rubbish bins or dumpsters used for corporate refuse.
The idea was to fish discarded documents out of the bin to glean information useful for hacking the enterprise that had thrown them out.
Even if documents considered highly confidential had been shredded, company phone books, business correspondence, and similar items might still be found intact. Today’s equivalent of the dumpster may be the data lake. If so, what should you do about it?
Data lakes are repositories that hold large volumes of digital data in its raw or native format. The data can, and frequently does, come from anywhere: structured records from enterprise systems, Excel spreadsheets, market surveys, customer feedback, social media conversations, and so on.
Because it stores data “as is” rather than in the more rigidly defined formats of data warehouses and data marts, a data lake offers great flexibility in mining and analysing the data afterwards.
Storage units containing data lakes are often selected for their high storage capacity; “infinitely” expandable cloud storage is also used, for the same reason.
The problem is security. Some organisations allow data to be placed in the data lake with little or no control over who can access the lake or what kind of data goes into it.
As a result, users may put in data that would normally be subject to privacy measures, but that is left visible to everyone with access to the data lake. Attacks on data lakes threaten many organisations, and the risk grows as data lakes multiply.
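One way to limit what goes into the lake is to check incoming text for obviously sensitive values before it is stored. The Python sketch below is a minimal illustration under stated assumptions, not a complete detector: the two patterns and the names `SENSITIVE_PATTERNS`, `find_sensitive`, and `admit_to_lake` are hypothetical, and real screening would need much broader coverage.

```python
import re

# Illustrative patterns only (assumption): real detection needs far more cases.
SENSITIVE_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "card_number": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
}


def find_sensitive(text):
    """Return the sorted names of the patterns that match the text."""
    return sorted(
        name for name, pattern in SENSITIVE_PATTERNS.items()
        if pattern.search(text)
    )


def admit_to_lake(text):
    """Hypothetical ingestion gate: accept only text with no flagged pattern."""
    return not find_sensitive(text)


if __name__ == "__main__":
    for sample in (
        "Quarterly sales rose 12 percent",
        "Contact jane@example.com for details",
    ):
        print(sample, "->", "admit" if admit_to_lake(sample) else "hold for review")
```

In practice, flagged items would more likely be routed to a restricted area or masked than rejected outright; the gate here only shows where such a check could sit.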
The solution? Protection of a data lake starts with the right user authentication and authorization controls, and continues with suitable data encryption, incident response plans, and audit processes.
If your enterprise is lacking any of these things, now is the time to put them in place.
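As a rough illustration of how authorization and auditing fit together, the Python sketch below checks a user's role against the zone of the lake being requested and records every decision. The roles, zone names, and the `AuditedLake` class are all assumptions made for the example; an actual deployment would rely on the access-control and logging features of its storage platform.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Hypothetical mapping of roles to the lake zones they may read (assumption).
PERMISSIONS = {
    "analyst": {"curated"},
    "data_engineer": {"raw", "curated"},
}


@dataclass
class AuditedLake:
    """Toy access checker that keeps an in-memory audit trail."""

    audit_log: list = field(default_factory=list)

    def can_read(self, user, role, zone):
        # Unknown roles get an empty permission set, so access is denied by default.
        allowed = zone in PERMISSIONS.get(role, set())
        # Record every decision, allowed or not, for later review.
        self.audit_log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "user": user,
            "role": role,
            "zone": zone,
            "allowed": allowed,
        })
        return allowed


if __name__ == "__main__":
    lake = AuditedLake()
    print(lake.can_read("ana", "analyst", "raw"))
    print(lake.can_read("eli", "data_engineer", "raw"))
    print(len(lake.audit_log), "decisions logged")
```

Denying by default and logging refusals as well as grants are the two design choices worth noting: the first keeps unclassified users out, and the second gives an audit process something to examine.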