In April 2020, SAP released SAP HANA Cloud services. No doubt this solution will deliver value; indeed, my favorite feature is its data virtualization options. However, SAP also claims low-cost storage options:
“SAP HANA Cloud offers low-cost storage options, including SAP HANA native storage extension and a built-in data lake. You can keep your current, business-critical (hot) data in memory for real-time processing and move data that you use frequently but not every day (warm) to the SAP HANA native storage extension. For older, but still important (cold) data, you can use the data lake and still retain access to your data when and where you need it. This tiering helps reduce cost and gives you the freedom to choose where you want to store your data based on when you need it.”
More information on the data lake feature may be found in this short video: https://youtu.be/8mbLyS0FmpA
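To make the hot/warm/cold tiering described in the quote more concrete, here is a minimal sketch in Python. It is not SAP code and does not call any SAP API: the `Tier` enum, the `choose_tier` function and both age thresholds are hypothetical, chosen only for illustration. It also simplifies "how often data is used" to "how long ago it was last accessed".

```python
from datetime import date, timedelta
from enum import Enum


class Tier(Enum):
    HOT = "in memory"
    WARM = "native storage extension"
    COLD = "data lake"


# Hypothetical thresholds for illustration only; SAP does not prescribe them.
WARM_AFTER = timedelta(days=90)
COLD_AFTER = timedelta(days=730)


def choose_tier(last_accessed: date, today: date) -> Tier:
    """Pick a storage tier from how long ago a record was last accessed."""
    age = today - last_accessed
    if age >= COLD_AFTER:
        return Tier.COLD
    if age >= WARM_AFTER:
        return Tier.WARM
    return Tier.HOT


if __name__ == "__main__":
    today = date(2020, 4, 30)
    for last_accessed in (date(2020, 4, 1), date(2019, 12, 1), date(2017, 1, 1)):
        tier = choose_tier(last_accessed, today)
        print(last_accessed, "->", tier.name, f"({tier.value})")
```

In a real project the rule would more likely be driven by business criteria (document status, retention period, reporting needs) than by a simple access date.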
Let’s clarify some topics
- When SAP talks about a data lake in HANA Cloud, it means a relational data lake. In practice, this is the SAP IQ database deployed in the cloud.
- SAP IQ (formerly Sybase IQ) is a column-based database: very fast and efficient, it supports SAP NLS (Near Line Storage) for SAP data warehouse solutions and is also used as an ILM store when implementing SAP ILM.
- For cloud developers, who usually do not come from an SAP background, a data lake is something different: it is what you’ll find in Wikipedia (see Data lake in Wikipedia). For them, ‘A data lake is a system or repository of data stored in its natural/raw format, usually Binary Large OBject (blob) or files’.
So, does the SAP data lake compete with classical SAP Data Volume Management (DVM), DVM being either SAP Data Archiving or SAP ILM? Today, archiving is not available in HANA Cloud, as you will hear in the Q&A session at the end of this SAP webinar. But let’s look ahead.
Well, in my opinion, the SAP data lake does not compete with classical SAP DVM. When you need to store an archive file, a file folder or cloud blobs will do the job, depending on whether you’re working on premise or in the cloud. When you run an SAP data warehouse, the SAP data lake (SAP IQ) is the solution of choice for SAP Near Line Storage (archived) data.
Data lake and SAP ILM
There is one area where they do compete: the ILM store (data privacy related projects).
The SAP relational data lake is the cold extension of the HANA database, and it’s cheaper than HANA. A cloud-native data lake is mostly a blob or file approach, which is obviously in a much lower price range. Both approaches make perfect sense.
Your architecture, your choices
At TJC, we have for years made folder storage as efficient as content server storage. Now we believe you will use the relational data lake as well as cloud-native data lakes. It’s up to you and your team to figure out what is best for your architecture. We will just make it work the way you want.
Sources of information:
SAP, SAP HANA Cloud services: https://saphanacloudservices.com/
SAP HANA Cloud Services YouTube channel, SAP HANA Cloud: Data Lake. https://www.youtube.com/watch?v=8mbLyS0FmpA&feature=youtu.be
SAP Digital, SAP HANA Cloud Overview: https://info.sapdigital.com/2020-04-21-sap-hana-cloud-overview.html
Wikipedia, Binary Large Objects: https://en.wikipedia.org/wiki/Binary_large_object
Wikipedia, Data library: https://en.wikipedia.org/wiki/Data_library