When designed well, a data lake is an effective data-driven design pattern for capturing a wide range of data types, both old and new, at large scale. By definition, a data lake is optimized for the quick ingestion of raw, detailed source data plus on-the-fly processing of such data for exploration, analytics, and operations. Even so, traditional, latent data practices are possible, too.
Organizations are adopting the data lake design pattern (whether on Hadoop or a relational database) because lakes provision the kind of raw data that users need for data exploration and discovery-oriented forms of advanced analytics. A data lake can also be a consolidation point for both new and traditional data, thereby enabling analytics correlations across all data. With the right end-user tools, a data lake can enable the self-service data practices that both technical and business users need. These practices wring business value from big data, other new data sources, and burgeoning enterprise data; these assets are not mere cost centers. Furthermore, a data lake can modernize and extend programs for data warehousing, analytics, data integration, and other data-driven solutions.