AWS RoadShow 2013 Curitiba

Post on 06-May-2015


José Papo - @josepapo

Thank you Sponsors!

The Roadshow

Keynote - AWS and AWS Partners, 14:00 - 15:30

Break

Keynote - AWS and AWS Partners, 16:00 - 18:00

Presentations and Videos

http://awshub.com.br

Personal

and fully transferable

Let's write the history of cloud computing in Brazil together

#awssummit

Summary of the Day:

Benefits of the AWS Cloud

AWS Cloud Use Cases

Software Architectures for the 21st Century (and demos!)

The cloud is the lever for new technology trends

“To survive in the market, companies must continually reallocate resources from maintenance processes to innovation processes.”

“Cloud is like a fertilizer that creates Startups”

Eric Ries

“Amazon Web Services is probably the most important thing that has happened to mobile and web app developers that the press just misses. Jeff Bezos has accidentally or maybe on purpose powered a whole generation of applications.” Steve Blank

Accelerating the boom in startups and new devices

Optimizing large corporations

Tens of thousands of customers in Latin America

Amazon S3: more than 2 trillion objects

1.1M requests/sec at peak

Amazon Elastic MapReduce: clusters created by customers

5.5M clusters since May 2010

2007: 9

2008: 24

2009: 48

2010: 61

2011: 82

2012: 159

Amazon FPS

Red Hat EC2

SimpleDB

CloudFront

EBS

Availability Zones

Elastic IPs

Relational Database Service

Virtual Private Cloud

Elastic Map Reduce

Auto Scaling

Reserved Instances

Elastic Load Balancer

Simple Notification Service

Route 53

RDS Multi-AZ

Singapore Region

Identity Access Management

Elastic Beanstalk

Simple Email Service

CloudFormation

RDS for Oracle

ElastiCache

DynamoDB

Simple Workflow

CloudSearch

Storage Gateway

Route 53 Latency Based Routing

More than 125 announcements already made in 2013

Rapid, customer-driven technical innovation

Jul 28, 2013

Announcing Amazon SNS for Mobile Push: 1 million free notifications per month and $1.00 for each additional million

Mar 11, 2013

Announcing AWS Elastic Beanstalk for Node.js

Feb 18, 2013

Announcing AWS OpsWorks

Jan 28, 2013

Announcing Amazon Elastic Transcoder

“I really wanted to send push notifications to mobile apps simply and cheaply”

“It would be so good if I could use Node.js on Beanstalk!”

“It's hard to manage my Chef recipes”

“It would be nice if AWS made the transcoding process easier”

Samsung drastically reduced its costs with AWS

Saved $34 million: 85% savings versus the traditional model

[Figure: traditional versus cloud capacity planning; labels: waste, dissatisfied customers, predicted demand, actual demand, AWS]

Cost reduction through elasticity

Cost reduction through low prices

Scale lets us reduce prices continuously

We are used to working in a high-volume, low-margin business

We always pass our economies of scale on to our customers

37 price reductions

We are obsessed with helping our customers save money

Shell uses AWS to significantly increase its agility

Remote team

In-house team

Extra resources

Outsourcing team

By increasing agility, IT stops being seen as…

And starts being seen as having…

Nasdaq uses AWS to create a new service for hedge funds

Innovation: rapid experimentation with low cost and low risk

On-Premises

Fewer experiments

Failing is expensive

Less innovation

More experiments

Fail fast at low cost

More innovation

$ Millions

Hundreds or thousands

Gartner Magic Quadrant for Cloud Infrastructure as a Service (August 19, 2013)

Gartner “Magic Quadrant for Cloud Infrastructure as a Service,” Lydia Leong, Douglas Toombs, Bob Gill, Gregor Petri, Tiny Haynes, August 19, 2013. This Magic Quadrant graphic was published by Gartner, Inc. as part of a larger research note and should be evaluated in the context of the entire report. The Gartner report is available upon request from Steven Armstrong (asteven@amazon.com). Gartner does not endorse any vendor, product or service depicted in its research publications, and does not advise technology users to select only those vendors with the highest ratings. Gartner research publications consist of the opinions of Gartner's research organization and should not be construed as statements of fact. Gartner disclaims all warranties, expressed or implied, with respect to this research, including any warranties of merchantability or fitness for a particular purpose.

Summary of the Day:

Benefits of the AWS Cloud

AWS Cloud Use Cases

Software Architectures for the 21st Century

Big Data

Analysis of Data Can Transform Society

Create new business models and improve organizational processes.

Enhance scientific understanding, drive innovation, and accelerate medical cures.

Increase public safety and improve energy efficiency with smart grids.

Democratizing Analytics gets Value out of Big Data

Unlock Value in Silicon

Support Open Platforms

Deliver Software Value

Intel at the Intersection of Big Data

HPC: Enabling exascale computing on massive data sets

Cloud: Helping enterprises build open interoperable clouds

Open Source: Contributing code and fostering ecosystem

Intel at the Heart of the Cloud

Server

Storage

Network

Reinventing Supercomputing

On Demand

Scale-Out Platform Optimizations for Big Data

Cost-effective performance

• Intel® Advanced Vector Extension Technology

• Intel® Turbo Boost Technology 2.0

• Intel® Advanced Encryption Standard New Instructions Technology

Power of the Platform built by Intel

Richer user experiences

[Figure: TeraSort for 1TB sort, from 4HRS to 10MIN. Bars: previous Intel® Xeon® Processor; Intel® Xeon® Processor E5 2600; Solid-State Drive; 10G Ethernet; Intel® Apache Hadoop. Reduction labels: 50%, 80%, 50%, 40%.]

Cloud

Intelligent Systems

Clients

Virtuous Cycle of Data-Driven Experience

BIG DATA

ANALYTICS

ON AWS

Michel Pereira

LET’S TALK

ABOUT DATA

Data-Obese,

Digital-Fast

DATA SUPPLY CHAIN

BIG

The data is too big, moves too fast, or doesn’t fit the strictures of your database architectures

27 TB per day Large Hadron Collider – CERN

LET’S TALK

ABOUT TOOLS

Generation

Collection & storage

Analytics & computation

Collaboration & sharing

EC2 & S3, CloudFormation, Elastic MapReduce, RDS, DynamoDB, Redshift

EC2, Elastic MapReduce & Redshift

S3, Glacier, Storage Gateway, DynamoDB, Redshift, RDS, HBase

AWS Data Pipeline

AMAZON EMR HADOOP + AWS

What is EMR?

Map-Reduce engine

Integrated with tools

Hadoop-as-a-service

Massively parallel

Cost effective AWS wrapper

Integrated to AWS services

2 million+ Hadoop clusters last year

Amazon EMR is the #1 Enterprise Hadoop Solution

AWS is “the most prominent Hadoop cloud service provider” and “leads the pack (of Leaders) due to its proven, feature-rich Elastic MapReduce service…”

-The Forrester Wave™: Enterprise Hadoop Solutions Q1 2012

LET’S TAKE A DIP

HADOOP

HDFS

HIVE

AWS Elastic MapReduce

Start an EMR cluster using the console or CLI tools

Master instance group created that controls the cluster

Core instance group created for life of cluster

Core instances run DataNode and TaskTracker daemons

Optional task instances can be added or subtracted to perform work

S3 can be used as underlying ‘file system’ for input/output data

Master node coordinates distribution of work and manages cluster state

Core and Task instances read-write to S3
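The cluster layout walked through above (one master group, a core group, and an optional task group) can be written down as a request payload. The sketch below only builds a dictionary in the general shape of the EMR RunJobFlow request; the instance types, counts, cluster name, and log bucket are illustrative assumptions, not values from the talk, and no AWS call is made.

```python
# Sketch only: builds an EMR cluster definition as a plain dictionary.
# Instance types, counts, and names below are hypothetical.

def instance_group(name, role, instance_type, count):
    """Describe one EMR instance group (role is MASTER, CORE, or TASK)."""
    return {
        "Name": name,
        "InstanceRole": role,
        "InstanceType": instance_type,
        "InstanceCount": count,
    }

def build_cluster_request(core_nodes, task_nodes=0):
    """Return keyword arguments in the shape of an EMR RunJobFlow request."""
    groups = [
        # One master node controls the cluster.
        instance_group("master", "MASTER", "m1.large", 1),
        # Core nodes exist for the life of the cluster and hold HDFS data.
        instance_group("core", "CORE", "m1.large", core_nodes),
    ]
    if task_nodes > 0:
        # Task nodes are optional and can be added or removed for extra work.
        groups.append(instance_group("task", "TASK", "m1.large", task_nodes))
    return {
        "Name": "example-cluster",              # hypothetical name
        "LogUri": "s3://example-bucket/logs/",  # hypothetical bucket
        "Instances": {"InstanceGroups": groups},
    }

request = build_cluster_request(core_nodes=2, task_nodes=4)
roles = [g["InstanceRole"] for g in request["Instances"]["InstanceGroups"]]
print(roles)
```

A real request would likely need further fields (for example IAM roles and a software release), so this is a starting shape rather than a complete call.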

Working with EMR

THE BIGGER

THE BETTER

PETABYTES AND EVEN EXABYTES

GIGABYTES AND TERABYTES

MEGABYTES

KILOBYTES

Amazon Redshift

Design Objectives

A petabyte-scale data warehouse service that was…

Amazon Redshift

A Whole Lot Simpler

A Lot Cheaper

A Lot Faster

Redshift Dramatically Reduces I/O

• Direct-attached storage

• Large data block sizes

• Columnar storage

• Data compression

• Zone maps
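To illustrate the last bullet: a zone map keeps the minimum and maximum value of each storage block, so a range scan can skip blocks that cannot contain a match. The sketch below is a toy model of that idea; the block size and data are made up, and it does not reflect Redshift's actual on-disk format.

```python
# Hypothetical sorted column split into fixed-size blocks.
values = list(range(0, 1000, 10))  # 100 values: 0, 10, ..., 990
block_size = 25
blocks = [values[i:i + block_size] for i in range(0, len(values), block_size)]

# The zone map keeps only the minimum and maximum of each block.
zone_map = [(min(b), max(b)) for b in blocks]

def scan_range(low, high):
    """Return matching values and the number of blocks actually read."""
    matches, blocks_read = [], 0
    for (lo, hi), block in zip(zone_map, blocks):
        if hi < low or lo > high:
            continue  # block cannot contain a match, so skip reading it
        blocks_read += 1
        matches.extend(v for v in block if low <= v <= high)
    return matches, blocks_read

matches, blocks_read = scan_range(300, 340)
print(matches, blocks_read, len(blocks))
```

In this toy data set the range query reads one block out of four, which is the I/O saving the slide refers to.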

Id   Age  State
123  20   CA
345  25   WA
678  40   FL

Row storage versus column storage
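A small sketch of the difference, using the three records from the slide. The flat lists stand in for the order in which values would sit on disk; this is an illustration of the concept, not of Redshift internals.

```python
# The three rows from the slide, as (Id, Age, State) tuples.
rows = [(123, 20, "CA"), (345, 25, "WA"), (678, 40, "FL")]

# Row storage: values of one record sit next to each other.
row_layout = [value for row in rows for value in row]

# Column storage: values of one column sit next to each other.
columns = {
    "Id": [r[0] for r in rows],
    "Age": [r[1] for r in rows],
    "State": [r[2] for r in rows],
}
column_layout = columns["Id"] + columns["Age"] + columns["State"]

# An aggregate over Age touches 3 contiguous values in the column layout,
# instead of stepping through all 9 values of the row layout.
average_age = sum(columns["Age"]) / len(columns["Age"])

print(row_layout)
print(column_layout)
print(average_age)
```

Because analytic queries often read a few columns across many rows, the column layout lets them skip the data they do not need.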

Redshift Runs on Optimized Hardware

• Optimized for I/O intensive workloads

• HS1.8XL available on Amazon EC2

• Runs in HPC - fast network

• High disk density

HS1.8XL: 128GB RAM, 16 Cores, 24 Spindles, 16TB Storage, 2GB/sec scan rate

HS1.XL: 16GB RAM, 2 Cores, 3 Spindles, 2TB Storage

Click to grow to 1.6PB

Redshift Parallelizes and Distributes Everything

Load

Query

Resize

Backup

Restore

10 GigE (HPC)

Ingestion Backup Restore

JDBC/ODBC

Resize your cluster while remaining online

New target provisioned in the background

Only charged for source cluster

• Fully automated

Data automatically redistributed

• Read only mode during resize

• Parallel node-to-node data copy

• Automatic DNS-based endpoint cut-over

• Only charged for one cluster

Amazon Redshift has security built-in

• SSL to secure data in transit

• Encryption to secure data at rest

AES-256; hardware accelerated

All blocks on disks and in Amazon S3 encrypted

• No direct access to compute nodes

• Amazon VPC support

10 GigE (HPC)

Ingestion Backup Restore

Customer VPC

Internal VPC

JDBC/ODBC

Continuous Backup, Automated Recovery

• Replication within the cluster and backup to Amazon S3 to maintain multiple copies of data at all times

• Backups to Amazon S3 are continuous, automatic, and incremental

Designed for 99.999999999% durability

• Continuous monitoring and automated recovery from failures of drives and nodes

• Able to restore snapshots to any Availability Zone within a region

Redshift is Priced to Analyze All Your Data

$0.85 per hour for on-demand (2TB)

$999 per TB per year (3-yr reservation)
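The two prices are quoted in different units, so a quick conversion helps compare them. The sketch below assumes a single 2TB node running continuously and a 365-day year; both are assumptions for the arithmetic, not statements from the slide.

```python
HOURS_PER_DAY = 24
HOURS_PER_YEAR = 24 * 365  # assumes a non-leap year

on_demand_per_hour = 0.85   # US$ per hour for one 2TB node (from the slide)
node_storage_tb = 2
reserved_per_tb_year = 999  # US$ per TB per year, 3-year reservation (from the slide)

on_demand_per_day = on_demand_per_hour * HOURS_PER_DAY
on_demand_per_tb_year = on_demand_per_hour * HOURS_PER_YEAR / node_storage_tb

print(f"On-demand per day: ${on_demand_per_day:.2f}")
print(f"On-demand per TB per year: ${on_demand_per_tb_year:,.0f}")
print(f"Reserved per TB per year: ${reserved_per_tb_year:,}")
```

The daily on-demand figure comes out at $20.40, which matches the per-day server cost quoted for dw.hs1.xlarge in the benchmark results later in the deck.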

CASE STUDY

HAPYRUS

Data

TSV files, gzip compressed

Data set sizes: 1) for 1 month 2) for 4 months

imp_log: 1) 300GB / 300M records 2) 1.2TB / 1.2B records

    date            datetime
    publisher_id    integer
    ad_campaign_id  integer
    bid_price       real
    country         varchar(30)
    attr1-4         varchar(255)

click_log: 1) 1.4GB / 1.5M records 2) 5.6GB / 6M records

    date            datetime
    publisher_id    integer
    ad_campaign_id  integer
    country         varchar(30)
    attr1-4         varchar(255)

ad_campaign: 100MB / 100k records

publisher: 10MB / 10k records

advertiser: 10MB / 10k records

We use 5 tables and run a query that joins them to create a report.

Sample Query

select
    ac.ad_campaign_id as ad_campaign_id,
    adv.advertiser_id as advertiser_id,
    cs.spending as spending,
    ims.imp_total as imp_total,
    cs.click_total as click_total,
    click_total/imp_total as CTR,
    spending/click_total as CPC,
    spending/(imp_total/1000) as CPM
from
    ad_campaigns ac
join
    advertisers adv
    on (ac.advertiser_id = adv.advertiser_id)
join
    (select
        il.ad_campaign_id,
        count(*) as imp_total
    from
        imp_logs il
    group by
        il.ad_campaign_id
    ) ims on (ims.ad_campaign_id = ac.ad_campaign_id)
join
    (select
        cl.ad_campaign_id,
        sum(cl.bid_price) as spending,
        count(*) as click_total
    from
        click_logs cl
    group by
        cl.ad_campaign_id
    ) cs on (cs.ad_campaign_id = ac.ad_campaign_id);

The query generates a basic ad campaign performance report: impression and click counts, advertiser spending, CTR, CPC, and CPM.
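The three derived columns are simple ratios. The sketch below recomputes them in Python for one campaign; the function name and the sample numbers are hypothetical and are not taken from the benchmark data.

```python
def campaign_metrics(imp_total, click_total, spending):
    """Compute CTR, CPC, and CPM with the same ratios the report query uses."""
    ctr = click_total / imp_total        # click-through rate
    cpc = spending / click_total         # cost per click
    cpm = spending / (imp_total / 1000)  # cost per thousand impressions
    return {"CTR": ctr, "CPC": cpc, "CPM": cpm}

# Hypothetical campaign: 200,000 impressions, 1,000 clicks, $500 spent.
metrics = campaign_metrics(imp_total=200_000, click_total=1_000, spending=500.0)
print(metrics)
```

Python's `/` always returns a float. In SQL engines where dividing two integers truncates, the counts may need a cast before dividing to get a fractional CTR.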

1. Query Speed

• Redshift takes 155 seconds to complete our query for 1.2TB

• Hadoop takes 1491 seconds to complete our query for 1.2TB

• Redshift is about 10 times faster than Hadoop for this query

Here, we are comparing Hadoop and Redshift servers of the same cost. (Hadoop: c1.xlarge vs Redshift: dw.hs1.xlarge).
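The "about 10 times" claim can be checked directly from the two timings quoted for the 1.2TB data set. A minimal sketch:

```python
redshift_seconds = 155  # 1.2TB query on Redshift (from the slide)
hadoop_seconds = 1491   # 1.2TB query on Hadoop (from the slide)

speedup = hadoop_seconds / redshift_seconds
print(f"Redshift is {speedup:.1f}x faster for this query")
```

The ratio is a little under 10, so "about 10 times" is a fair rounding for this single query and configuration.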

[Figure: Query Speed. Processing time (seconds) by data size. 300GB: Redshift 38sec, Hadoop 672sec. 1.2TB: Redshift 155sec, Hadoop 1491sec.]

* The query used can be referenced in our Appendix

2. Total Cost

• Redshift costs $20 per month to run queries every 30 minutes

• Hadoop costs $210 per month to run queries every 30 minutes

• Redshift is about 10 times cheaper than Hadoop to run this job

Here, we are comparing Hadoop and Redshift servers running the same query for the same duration of time.

[Figure: Cost Per Day (query for 300GB data size). Cost per day (US$) versus queries per day; series: Redshift, Hadoop.]

Redshift Query Result

Data Size  Instance Type  Number of Instances  Trial  Processing Time  Average  Server Cost Per Day
300GB      dw.hs1.xlarge  1                    1      58s              38s      $20.40
                                               2      43s
                                               3      31s
                                               4      30s
                                               5      30s
1.2TB      dw.hs1.xlarge  1                    1      164s             155s     $20.40
                                               2      149s
                                               3      158s
                                               4      156s
                                               5      150s
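The reported averages can be reproduced from the five trials per data size. A short sketch using only the numbers in the table:

```python
from statistics import mean

# Per-trial processing times in seconds, copied from the table.
trials = {
    "300GB": [58, 43, 31, 30, 30],
    "1.2TB": [164, 149, 158, 156, 150],
}

averages = {size: mean(times) for size, times in trials.items()}
for size, avg in averages.items():
    print(f"{size}: {avg:.1f}s average over {len(trials[size])} trials")
```

The means round to the 38s and 155s shown in the Average column. The first 300GB trial is noticeably slower than the later ones; the slides do not say why.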

Hadoop Query Result

Data Size  Instance Type  Number of Instances  Processing Time  Server Cost Per Day
300GB      c1.xlarge      1                    1h 23m 2s        $0.80
300GB      c1.medium      10                   37m 48s          $0.89
300GB      c1.xlarge      10                   11m 12s          $1.06
1.2TB      m1.xlarge      1                    6h 43m 24s       $3.22
1.2TB      c1.medium      4                    5h 14m 0s        $3.04
1.2TB      c1.xlarge      10                   37m 7s           $3.58
1.2TB      c1.xlarge      20                   24m 51s          $4.64

Elastic MapReduce

and/or

Redshift

• Used by analysts and data scientists to explore raw data before some, all, or none of it is added to the data warehouse

• Structured OR Unstructured Data

EMR for Exploratory Analytics

analytic sandbox

All data fed into EMR data store

EMR

Exploratory Analytic Environment

Structured Data and Redshift

reporting warehouse

RDS (Relational)

Data Pipeline

Redshift

OLTP ERP

Reporting and BI

S3

Structured Data and Redshift

live archive

DynamoDB (NoSQL)

Redshift

OLTP Web Apps

Reporting and BI

Data Pipeline

Unstructured Data and Redshift

transform and enrich

S3 S3 EMR

Redshift

logs / files

Data Pipeline

Reporting and BI

exploratory analytics

AWS Big Data Overview

Redshift

CRM

ERP

Billing

OLTP

Web Apps

Business Apps

Reporting and BI

Dashboarding

Ad Hoc Analysis

RDS

DynamoDB

S3

EBS

EMR

Data Pipeline

Get 600 Free Hours of Supercomputing Time!

Stop by the Intel booth to get your 600 hours of compute time

www.powerof60.com

Security

Built to high security standards

AWS Security Infrastructure

SOC 1/SSAE 16/ISAE 3402, ISO 27001, PCI DSS, HIPAA, ITAR, FISMA Moderate, FIPS 140-2, FedRAMP

Your Apps

Global Infrastructure

AWS Regions:

US East (Northern Virginia)

US West (Northern California)

US West (Oregon)

GovCloud (US ITAR Region)

EU (Ireland)

Asia Pacific (Singapore)

Asia Pacific (Tokyo)

Asia Pacific (Australia)

South America (Sao Paulo)

AWS Edge Locations

AWS Regions and Availability Zones

Customers can decide where their applications and data reside

Amazon VPC

AWS Region

Public subnet

Private subnet

Data Center

Headquarters

Availability Zone 1

Availability Zone 2

Subsidiaries

VPN Gateway

Gateway

Internet Gateway

Amazon S3, Amazon SimpleDB, Amazon SES, Amazon SQS

“A private cloud doesn't need to be in-house” - Gartner

10G

Hybrid Architecture with AWS

Dedicated link

Open, flexible, and supported by the leading vendors: operating systems, languages and libraries, certified applications

Android, iOS, Java, nodeJS, .NET, PHP, Python, Ruby

Rich set of APIs and development kits for the main languages and platforms

And tools and plugins integrated into your development environment

Eclipse, Visual Studio, CLI, PowerShell

Support for many languages and tools

Summary of the Day:

Benefits of the AWS Cloud

AWS Cloud Use Cases

Software Architectures for the 21st Century (and demos!)

With AWS, grow from one server

…to thousands

Fully automated!

Beyond server scalability, you can:

Add billions of objects with Amazon S3

Select the performance you want from your databases

Process and analyze petabytes of data easily

‘Cost Aware Architecture’

Reduce the cost of … by using:

Compute

1. S3/CloudFront to optimize static content

2. Load balancing and Auto Scaling from the start

3. On-Demand, Reserved, and Spot pricing models

Storage

4. Store derived objects in S3 ‘Reduced Redundancy’ and use Glacier whenever possible

Database

5. Read Replicas and/or ElastiCache for database performance and cost reduction

Dev & Test

6. Dev/Test/CI environments created and shut down on demand

7. Cheaper A/B testing and load testing

What does this mean in terms of costs?

An Example

Usual Architecture (per month)

EC2 Medium instances (4): $485
AWS data transfer out (1 TB): $194
TOTAL: $679

Optimized Architecture (per month)

EC2 Medium instances (1): $121
CloudFront data transfer out (1 TB): $168
CloudFront requests: $1.89
TOTAL: $291

57% lower cost - up to 6x faster
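The 57% figure can be reproduced from the monthly line items. The sketch below assumes that the table with CloudFront and a single instance is the optimized architecture and the four-instance table is the usual one; that pairing is inferred from the totals, since the transcript lost the slide layout.

```python
# Monthly line items in US$, copied from the two cost tables.
optimized = {
    "EC2 Medium instances (1)": 121.00,
    "CloudFront data transfer out (1 TB)": 168.00,
    "CloudFront requests": 1.89,
}
usual = {
    "EC2 Medium instances (4)": 485.00,
    "AWS data transfer out (1 TB)": 194.00,
}

optimized_total = sum(optimized.values())
usual_total = sum(usual.values())
savings = (usual_total - optimized_total) / usual_total

print(f"Optimized: ${optimized_total:.2f}")
print(f"Usual:     ${usual_total:.2f}")
print(f"Savings:   {savings:.0%}")
```

The optimized total of $290.89 rounds to the $291 on the slide, and the relative saving rounds to 57%.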

S3 storage volume

Number of EC2 server hours

February 2013

48.7 million users

Raised $338M in capital

Valuation of $2.5B

Web Application Servers

Sharded Database

Cache Servers

Internal Web Services

File Storage

Python application servers: 150 EC2 High-CPU instances

Python web services: 35 EC2 High-CPU instances

Memcache and Redis caches: 90 EC2 High-Memory instances

MySQL database servers: 70 master/slave pairs

Storage on Amazon S3: 8 billion objects, 410 terabytes

Auxiliary services: 60 EC2 instances

Development

Logging

Operational Tools

Asynchronous Task Workers

Search

Data Analysis

Elastic MapReduce

Continuous Integration

• Most traffic occurs in the afternoon and early evening, so they reduce the number of servers by 40% overnight.

• At peak they spend $52 per hour on EC2; at night, off-peak, spending is $15 per hour.

Savings of up to 71%
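The 71% figure follows from the two hourly spend numbers. A minimal check:

```python
peak_per_hour = 52      # US$ per hour on EC2 at peak (from the slide)
off_peak_per_hour = 15  # US$ per hour on EC2 overnight (from the slide)

savings = (peak_per_hour - off_peak_per_hour) / peak_per_hour
print(f"Off-peak hourly spend is {savings:.0%} lower than peak")
```

This compares hourly rates only; the saving over a whole day depends on how many hours are spent at each level, which the slide does not state.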

Dev / Test environments

Staging version

Beta / MVP version

Production architecture 1.0

Production architecture 2.0

Production architecture 3.0

Production architecture 4.0

"Startups are all about focus. AWS enables focus" Ray Bradford, Kleiner Perkins Caufield & Byers

Your application

Your business and your competitive differentiator

Innovation, not managing hardware / data centers / software

Invest IT professionals' time in what matters

Automate as much as you can

(Deep insight: IT professionals' time = a lot of money!)

Automation = Focus!

…grew to 14 million users in less than a year

…reached 150 million photos and terabytes of data

…1 million users within 12 hours of launching the Android version

…more than 100 million active users in January 2013

on the AWS Cloud… with 3 engineers

Full automation: Elastic Beanstalk

“I just want my application running, with access to servers only if necessary”

Full control: CloudFormation

“I want to put the entire definition of my data center under version control”

Build Applications, Not Infrastructure

AWS OpsWorks

“I want to use Chef with simplicity and recipe orchestration”

EC2, EBS, RDS, ELB

Elastic Beanstalk

Upload your application

Beanstalk deploys it

You still have control

Don't build your own…

1. Email sender

2. Message queue

3. Notifications

4. Transcoding

5. Search

6. Monitoring

7. Workflow management

…Use ready-made services whenever possible

…but use them as a service

Amazon Simple Email Service

Amazon Simple Queue Service

Amazon Simple Notification Service

Amazon Elastic Transcoder

Amazon CloudSearch

Amazon CloudWatch

Amazon Simple Workflow Service

DEMOS!!!

White Papers

4x more reliable and as little as 1/4 of the cost of traditional infrastructure

http://media.amazonwebservices.com/idc_aws_business_value_report_2012.pdf

Storage costs less in the cloud than in-house

Recap:

Benefits of the AWS Cloud

AWS Cloud Use Cases

Software Architectures for the 21st Century

THANK YOU! awshub.com.br

slideshare.net/AmazonWebServicesLATAM

José Papo

AWS Tech Evangelist

@josepapo
