As data continues to grow in both volume and variety of formats across multiple deployments, performing analytics has become more complicated. By 2019, 75% of analytic solutions will incorporate 10 or more external data sources. Organizations able to glean insights from this diverse data will gain competitive advantages: deeper understanding of customers, better responsiveness to trends and more efficient operations, to name just a few.
This reality of data diversity has given rise to the “data lake”: a data management architecture that allows organizations to store and analyze a wide variety of structured and unstructured data.
A data lake is a method of data storage. What makes this approach unique is that all of the data is stored in its native format. This means that data in the lake might include everything from highly structured files to completely unstructured data such as videos, emails and images.
In addition, it is no longer only IT that integrates data; business users are also getting involved through new self-service data preparation tools. The question is: is this the only way to manage data? Is there another level we can reach that allows us to more easily manage and govern data across an increasingly complex data landscape?
This seminar looks at the challenges faced by companies dealing with an exploding number of data sources, data collected in multiple data stores (cloud and on-premises) and multiple analytical systems, and at the requirements for defining, governing, managing and sharing trusted, high-quality information in a distributed, hybrid computing environment.
It also explores a new approach in which IT data architects, business users and IT developers collaborate in building and managing a Logical Data Lake to get control of your data. This includes data ingestion, automated data discovery, data profiling and tagging, and publishing data in an information catalog.
It also involves refining raw data to produce Enterprise Data Services that can be published in a catalog for consumption across your company. We also introduce multiple Data Lake configurations, including a centralised Data Lake and a ‘logical’ distributed Data Lake, as well as the execution of jobs and governance across multiple data stores.
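To make the profiling-and-tagging step above concrete, here is a minimal sketch of the kind of column profile that might be computed before publishing a dataset to an information catalog. The column name, metrics and tagging rule are illustrative assumptions, not the speakers' actual tooling.

```python
# Toy data-profiling step of the kind run before catalog publication.
# The "candidate-key" tagging rule is an illustrative assumption.

def profile_column(name, values):
    non_null = [v for v in values if v is not None]
    profile = {
        "column": name,
        "null_ratio": 1 - len(non_null) / len(values),
        "distinct": len(set(non_null)),
    }
    # Auto-tagging: flag likely identifier columns as catalog metadata.
    if non_null and profile["distinct"] == len(non_null) and profile["null_ratio"] == 0:
        profile["tags"] = ["candidate-key"]
    else:
        profile["tags"] = []
    return profile

print(profile_column("customer_id", [101, 102, 103, 104]))
```

A real pipeline would persist such profiles alongside the dataset entry so that consumers can judge data quality before use.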
Once an organization has decided to leverage a Hadoop data lake, the next big concern is planning and sizing the Hadoop cluster.
This session covers, from a Hadoop administrator's perspective, the parameters involved in sizing a cluster: the size and structure of the data, the velocity of ingestion (batch and/or streaming) and the computational complexity of your use cases.
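As a rough illustration of how these parameters combine, the sketch below estimates a storage-driven node count. The numbers (daily ingest, retention, disk per node, overheads) are illustrative assumptions, not figures from the session; the default replication factor of 3 matches HDFS defaults.

```python
# Rough, storage-driven Hadoop cluster sizing sketch (illustrative only).
import math

def nodes_needed(daily_ingest_tb, retention_days, replication=3,
                 temp_overhead=0.25, disk_per_node_tb=48, usable_fraction=0.7):
    """Estimate data-node count from storage requirements alone."""
    raw_tb = daily_ingest_tb * retention_days        # raw data retained on the cluster
    stored_tb = raw_tb * replication                 # HDFS keeps 3 copies by default
    total_tb = stored_tb * (1 + temp_overhead)       # scratch space for shuffles etc.
    usable_per_node = disk_per_node_tb * usable_fraction
    return math.ceil(total_tb / usable_per_node)

# 2 TB/day kept for a year on 48 TB nodes at 70% usable capacity:
print(nodes_needed(daily_ingest_tb=2, retention_days=365))  # 82
```

A real sizing exercise would also check the compute and memory demands of the workloads, since CPU-bound use cases can require more nodes than storage alone suggests.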
Organizations have long struggled with the storage requirements of ever-growing data and the increased computational demand to process it. With the advent of Hadoop there was a way to handle structured and unstructured data at scale, but it came at the cost of maintaining a huge infrastructure. In recent years the leading public Cloud Service Providers (CSPs) have started offering many capabilities in the data engineering and data analytics domain. This has forced many organizations to rethink their data strategy, i.e. storage, processing and analytics, including ML and AI. CSPs offer many capabilities that can increase your execution pace and time to market for data products and applications. In this presentation we will look at an approach to creating and managing a Data Lake on Google Cloud Platform (one of the CSPs), and at how to leverage the Data Lake to build your data marts, data warehouse, predictive analytics and machine learning applications.
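One common building block of a cloud data lake of the kind described above is a consistent, partitioned object layout in cloud storage, from which downstream marts and warehouses load. The sketch below shows one such layout convention; the bucket name, zone names and partitioning scheme are assumptions for illustration, not GCP requirements.

```python
# Illustrative zone layout for a GCS-backed data lake: raw -> curated -> mart.
# Bucket name and conventions here are hypothetical.
from datetime import date

ZONES = ("raw", "curated", "mart")

def lake_path(zone, source, dataset, ingest_date):
    """Build a partitioned gs:// object prefix for a dated batch load."""
    if zone not in ZONES:
        raise ValueError(f"unknown zone: {zone}")
    return (f"gs://acme-data-lake/{zone}/{source}/{dataset}/"
            f"dt={ingest_date.isoformat()}/")

print(lake_path("raw", "crm", "orders", date(2019, 6, 1)))
# gs://acme-data-lake/raw/crm/orders/dt=2019-06-01/
```

Keeping zones and date partitions in the path makes it straightforward for query engines and load jobs to address only the slices of data they need.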
A data lake is a storage repository that holds a vast amount of raw data in its native format, including structured, semi-structured and unstructured data. An enterprise data lake provides the foundation needed to clear away the enterprise-wide data access problem for all groups and departments in an organization. We have seen many multi-billion-dollar organizations struggle to build an enterprise data lake and establish a culture of data-driven insight and innovation because of an unclear understanding of business requirements, data onboarding, analytics and data consumption policy; no proper design for the data lake, including technology and storage; and failure to follow proper architecture patterns for data ingestion, analytics and so on.
In this talk we will cover different architecture patterns for building an enterprise data lake, covering ingestion, analytics and data visualization, such as Kappa, Fabric, Lambda, Wind etc. We will also cover best practices and the open-source technology stack that can help build an enterprise data lake.
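To give a flavour of one of the patterns named above, here is a minimal Lambda-architecture sketch: a batch view periodically recomputed from the immutable master dataset, a speed layer over recent events, and a serving query that merges both at read time. The event names and in-memory views are illustrative assumptions, not the speakers' stack.

```python
# Minimal Lambda-architecture sketch (illustrative, in-memory only).
from collections import Counter

master_dataset = ["signup", "click", "click"]   # immutable history (batch layer input)
recent_events = ["click", "purchase"]           # not yet covered by a batch run

def batch_view(events):
    # Periodically recomputed from the full history.
    return Counter(events)

def speed_view(events):
    # Incremental view covering only events since the last batch run.
    return Counter(events)

def serve(event_type):
    # Serving layer merges the batch and speed views at query time.
    return batch_view(master_dataset)[event_type] + speed_view(recent_events)[event_type]

print(serve("click"))  # 3: two clicks from the batch view, one from the speed layer
```

The Kappa pattern, by contrast, drops the separate batch layer and recomputes views by replaying a single event log.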
During the session we shall cover the people, process, architecture and technology elements involved in setting up data lakes. The session shall also include the data lake lifecycle, stakeholder management and positioning with respect to data lakes. Finally, we shall cover best practices, mistakes to avoid and pitfalls in building data lakes.
Data generated within enterprises with large infrastructures and applications is itself critical and provides a huge opportunity to create a data lake for exploration and analytics. It consists of critical business data: tactical, strategic and operational.
An intelligent, self-evolving system can be built on this data lake to provide data insights in various ways: analytics, reporting, real-time notifications and a chatbot through which users get all relevant, mission-critical details immediately. This will help in policy decisions, simplify the business by removing redundancy and save costs. In this session we will cover enterprise data characteristics and data opportunities, and then a strategy for building the enterprise data lake.
We will also discuss some of the architectures, using architecture diagrams at different stages, e.g. integrating a chatbot with the enterprise data lake to access the data, and the opportunities this opens up. We will also cover some of the products that can help at various stages.
More and more customers struggle not only to manage the growth of big data, but also to reap timely business insights from it using their existing data infrastructure. Over the past two years, Red Hat has worked closely with customers to solve these problems with a cost-effective, open hybrid-cloud infrastructure that can scale to support not only massive amounts of data but also the influx of requests from data scientists needing access to a proliferation of different analytic tools.
Infrastructure teams can now provide self-service, workload-isolated compute environments either through Red Hat OpenStack Platform or by containerizing analytic tools on Red Hat OpenShift Container Platform. Please join us to learn first-hand how this modern, provisioned data analytics experience is helping customers improve TCO and analytic performance while modernizing their data analytics infrastructure.
Trend 1 - How organizations are becoming agile by making use of fast data
Trend 2 - How data-driven organizations are managing data
Trend 3 - The emerging data ecosystem and what it means for us
Enterprises have started adopting a 'Cloud First' strategy to become 'Data Driven'. In the era of the cloud, the architectural patterns for setting up a Data Lake need to change dramatically. This presentation covers the unlearning of traditional Data Lake implementation practices and the learning cycle for a cloud-based Data Lake setup. We will talk about best practices for architecting a serverless Data Lake.
Marisoft - I, Annexe Building, Kalyani Nagar, Pune, Maharashtra 411014
020 4000 3000
Confirm your CANCELLATION in writing up to 15 working days before the event and receive a refund less a 10% service charge. Regrettably, no refunds can be made for cancellations received less than 15 working days prior to the event.
However, SUBSTITUTIONS are welcome at any time and are made at no extra cost. The organisers reserve the right to amend the programme if necessary.
Important Disclaimer: The organisers reserve the right to make substitutions or alterations and/or cancel a speaker(s) if deemed necessary by circumstances beyond their control.
INDEMNITY: Should, for any reason outside the control of UNICOM Training & Seminars (P) Ltd (hereafter called UNICOM), the venue or the speakers change, or the event be cancelled due to industrial action, adverse weather conditions or an act of terrorism, UNICOM will endeavour to reschedule, but the client hereby indemnifies and holds UNICOM harmless from and against any and all costs, damages and expenses, including attorneys' fees, incurred by the client. The construction, validity and performance of this Agreement shall be governed in all respects by the laws of India, to the exclusive jurisdiction of whose courts the Parties hereby agree to submit.