+34 91 904 7138
Big Data: State of the art
Seminar, in person or remote class
Ref. BGA
Duration: 2 days (14 hours)
Price: Contact us
Teaching objectives
At the end of the training, the participant will be able to:
Understand the main concepts of Big Data.
Identify the economic issues.
Evaluate the pros and cons of Big Data.
Understand the main problems and their potential solutions.
Identify the main methods and areas of application for Big Data.
Course schedule
1. Introduction
The origins of Big Data: A world of digital data, e-health, timeline.
The four Vs definition: Origins of the data.
A breakthrough: Changes in quantity, quality, and habits.
The value of data: A change in importance.
Data as a raw material.
The fourth paradigm of scientific discovery.
2. Big Data: Processing, from acquisition to result
The sequence of operations. Acquisition.
Data collection: crawling, scraping.
Managing event flows (Complex Event Processing, CEP).
Indexing incoming flows.
Integration with old data.
Data quality: A fifth V?
Different types of processing: Searching, learning (Machine Learning, transactional learning, data mining).
Other sequencing models: Amazon, e-Health.
One or more data repositories? From Hadoop to in-memory processing.
From sentiment (tone) analysis to knowledge discovery.
3. Relationships between the Cloud and Big Data
The architecture model of public and private Clouds.
XaaS services.
The goals and benefits of Cloud architectures.
Infrastructure.
Similarities and differences between the Cloud and Big Data.
Storage clouds.
Classification, security, and privacy of data.
Structure as a classification criterion: Unstructured, structured, semi-structured.
Classification by life cycle: Temporary or permanent data, active archives.
Security difficulties: Increased volumes, distribution.
Potential solutions.
4. Introduction to Open Data
Philosophy of open data and goals.
Releasing public data.
Implementation difficulties.
Essential features of open data.
Areas involved. Expected benefits.
5. Hardware for storage architectures
Servers, disks, networks, and the use of SSD drives; importance of the network infrastructure.
Cloud architectures and more traditional architectures.
Benefits and difficulties.
Total cost of ownership (TCO). Power consumption: Servers (IPNM), drives (MAID).
Object storage: principle and benefits.
Object storage compared to traditional NAS and SAN storage.
Software architecture.
Levels at which storage management is implemented.
Software-Defined Storage.
Centralized architecture (Hadoop Distributed File System, HDFS).
Peer-to-peer and hybrid architectures.
Interfaces and connectors: S3, CDMI, FUSE, etc.
Future of other storage types (NAS, SAN) relative to object storage.
6. Data protection
Preservation over time in the face of increased volumes.
Online or local backups?
Traditional archiving and active archiving.
Links with storage hierarchy management: Future of magnetic tape.
Multisite replication.
Damage to storage media.
7. Processing methods and areas of application
Classification of analysis methods based on data volume and processing power.
Hadoop: The MapReduce processing model.
The Hadoop ecosystem: Hive, Pig. The difficulties of Hadoop.
OpenStack and the Ceph data manager.
Complex Event Processing: An example, Storm.
From BI to Big Data.
Return to decisional and transactional models: NoSQL databases. Types and examples.
Data ingestion and indexing. Two examples: Splunk and Logstash.
Open-source crawlers.
Search and analysis: Elasticsearch.
Learning: Mahout. In-memory.
Visualization: Real-time or not, in the Cloud (Bime), comparison of QlikView, Tibco Spotfire, and Tableau.
A general architecture of data mining via Big Data.
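As an illustration of the MapReduce processing model listed above, here is a minimal single-machine sketch in plain Python. It is not Hadoop code and does not use the Hadoop API; the function names (map_phase, shuffle_phase, reduce_phase, word_count) are hypothetical and chosen only to mirror the three stages of the model on a word-count example.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(document: str) -> Iterator[Tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle_phase(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group all emitted values by key."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(groups)


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce: collapse the values of one key into a single count."""
    return key, sum(values)


def word_count(documents: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle and reduce in sequence (in memory, on one machine)."""
    mapped = (pair for doc in documents for pair in map_phase(doc))
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data", "Big Data big"]))
```

In a real cluster the map and reduce calls would run in parallel on different nodes and the shuffle would move data across the network; this sketch only shows the data flow between the stages.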
8. Use cases through examples, and conclusion
Anticipation: Needs of users within companies, equipment maintenance.
Security: People, fraud detection (mail, taxes), the network.
Recommendation. Marketing analysis and impact analyses.
Path analyses. Distribution of video content.
Big Data for the automotive industry? For the oil industry?
Should you begin a Big Data project?
What future is there for data?
Governance of data storage: Roles and recommendations, Data Scientists, skills involved in a Big Data project.