Robson Motta | robson@chaordic.com.br Machine Learning and Information Visualization for Optimizing Recommender Systems

Post on 27-Jan-2017


312,000,000,000 (312 billion) recommendations in 2014

Get to know our solutions

How do we present the best recommendation for each client and context?

[Diagram: data → preprocessing → processing → postprocessing → recommendations]

Input data:

● products
● pageviews
● clicks
● buy orders
● etc.

Machine Learning

“All models are wrong, but some are useful”

(George E. P. Box)

1. Collaborative Filtering

Customers Who Bought This Item Also Bought, PaulsHealthBlog.com, 11.04.2014

user-based

[Figure: example interaction vector 10 5 7 0 2 3 4 1 ...]

item-based

[Figure: example interaction vector 10 5 7 0 2 3 4 1 ...]
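The user-based and item-based variants can be illustrated with a minimal sketch. It assumes a small, hypothetical interaction matrix (rows are users, columns are items) and uses cosine similarity; the slides do not say which similarity measure is used in production, so treat both the data and the measure as assumptions.

```python
from math import sqrt

# Hypothetical interaction counts: rows are users, columns are items.
interactions = [
    [10, 5, 0, 0],
    [7, 4, 0, 1],
    [0, 0, 3, 4],
    [0, 1, 2, 5],
]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def column(matrix, j):
    """Return column j of the matrix as a list (the vector of one item)."""
    return [row[j] for row in matrix]


def similar_items(matrix, item, top_n=2):
    """Item-based: compare item columns and return the closest items."""
    target = column(matrix, item)
    scores = [
        (other, cosine(target, column(matrix, other)))
        for other in range(len(matrix[0]))
        if other != item
    ]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:top_n]


def similar_users(matrix, user, top_n=2):
    """User-based: compare user rows and return the closest users."""
    scores = [
        (other, cosine(matrix[user], matrix[other]))
        for other in range(len(matrix))
        if other != user
    ]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:top_n]


print(similar_items(interactions, 0))
print(similar_users(interactions, 0))
```

In the item-based case the neighbours of an item feed an "also bought" list directly; in the user-based case the neighbours' items would still need to be aggregated into a recommendation.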

Challenges

● popular items
● outliers
● incompatible items
● principal-accessory pairs
● new items (???)
● ...

How do we guarantee quality to our clients?

● subjective evaluation: Visualization
● objective evaluation: Quality measures
● online evaluation: A/B test
● online optimization: Bandit

Multidimensional Projection (t-SNE technique)

Stability, purity and coverage measures
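The slides name the measures without defining them. As an illustration only, the sketch below computes one common notion of coverage: the fraction of the catalog that appears in at least one recommendation list. The function name, the data, and the definition itself are assumptions, not the measure used by the author.

```python
def catalog_coverage(recommendations, catalog):
    """Fraction of catalog items that appear in at least one recommendation list.

    `recommendations` maps a reference item to its list of recommended items.
    This is one common definition of coverage, assumed here for illustration.
    """
    if not catalog:
        raise ValueError("catalog must not be empty")
    recommended = set()
    for items in recommendations.values():
        recommended.update(items)
    return len(recommended & set(catalog)) / len(catalog)


# Hypothetical catalog and recommendation lists.
catalog = ["a", "b", "c", "d", "e"]
recommendations = {
    "a": ["b", "c"],
    "b": ["a", "c"],
    "c": ["a", "b"],
}

print(catalog_coverage(recommendations, catalog))
```

A low value here would suggest that recommendations concentrate on a few items, which relates to the "popular items" challenge listed earlier.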

2. Content-based Filtering

$$w_{n,d} = f_{n,d} \cdot \mathrm{idf}_n$$

● $w_{n,d}$: weight of term n within document d
● $f_{n,d}$: frequency of term n in document d
● $\mathrm{idf}_n$: IDF factor of term n
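The term weighting described here is TF-IDF. A minimal sketch follows, assuming hypothetical product descriptions, whitespace tokenization, and the plain IDF variant log(N / df); the slides do not specify which IDF variant or similarity measure is used, so cosine similarity is also an assumption.

```python
from collections import Counter
from math import log, sqrt

# Hypothetical product descriptions keyed by product id.
docs = {
    "p1": "red running shoes",
    "p2": "blue running shoes",
    "p3": "stainless steel pan",
}


def tf_idf(documents):
    """Return {doc_id: {term: weight}} with weight = term frequency * IDF."""
    tokenized = {doc_id: text.split() for doc_id, text in documents.items()}
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in tokenized.values() for term in set(tokens))
    weights = {}
    for doc_id, tokens in tokenized.items():
        tf = Counter(tokens)
        weights[doc_id] = {term: tf[term] * log(n_docs / df[term]) for term in tf}
    return weights


def cosine(a, b):
    """Cosine similarity between two sparse term-weight dictionaries."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = sqrt(sum(w * w for w in a.values()))
    norm_b = sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


weights = tf_idf(docs)
print(cosine(weights["p1"], weights["p2"]))  # shares "running" and "shoes"
print(cosine(weights["p1"], weights["p3"]))  # no shared terms
```

Because it needs only item descriptions, this kind of similarity can be computed for items that have no interaction history yet.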

3. Clustering

… main issues

● the number of clusters
● false positives (pairs of products wrongly assigned to the same cluster)
● false negatives (pairs of products wrongly assigned to different clusters)
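The pairwise definitions of false positives and false negatives above can be counted directly when a reference grouping is available. The sketch below assumes a hypothetical set of products, a predicted cluster assignment, and a hand-labeled expected grouping; none of these come from the slides.

```python
from itertools import combinations


def pair_errors(predicted, expected):
    """Count pairwise clustering errors.

    false positive: a pair placed in the same predicted cluster
                    although the expected groups differ.
    false negative: a pair placed in different predicted clusters
                    although the expected group is the same.
    """
    false_positives = 0
    false_negatives = 0
    for a, b in combinations(sorted(predicted), 2):
        same_predicted = predicted[a] == predicted[b]
        same_expected = expected[a] == expected[b]
        if same_predicted and not same_expected:
            false_positives += 1
        elif same_expected and not same_predicted:
            false_negatives += 1
    return false_positives, false_negatives


# Hypothetical products, predicted cluster ids, and expected groups.
predicted = {"tv_a": 0, "tv_b": 0, "cable": 0, "shoe_a": 1, "shoe_b": 2}
expected = {
    "tv_a": "tv",
    "tv_b": "tv",
    "cable": "accessory",
    "shoe_a": "shoes",
    "shoe_b": "shoes",
}

print(pair_errors(predicted, expected))
```

The two counts pull in opposite directions as the number of clusters changes: fewer, larger clusters tend to raise false positives, and more, smaller clusters tend to raise false negatives.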

4. Classification

… main issues

● unbalanced classes
● unlabeled areas
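The slides do not say how unbalanced classes are handled. One common mitigation, shown here purely as an assumption, is to weight each class inversely to its frequency so that rare classes count more during training. The labels below are hypothetical.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Weight per class: n_samples / (n_classes * class_count).

    This is the usual "balanced" heuristic; a class with fewer examples
    receives a proportionally larger weight.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Hypothetical, unbalanced training labels: 8 examples of one class, 2 of another.
labels = ["tv"] * 8 + ["cable"] * 2

print(inverse_frequency_weights(labels))
```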

Circular connected chart: alternatives

Circular connected chart: complementary products

Tabular information

A/B tests

+16% clicks

final result: 10 days, 95% significance
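A significance check of this kind can be sketched with a two-proportion z-test. The slides do not state which test was used, and the click and view counts below are hypothetical numbers chosen only so that variant B shows a 16% relative lift.

```python
from math import erfc, sqrt


def two_proportion_z_test(clicks_a, views_a, clicks_b, views_b):
    """Two-sided z-test for the difference between two click-through rates."""
    rate_a = clicks_a / views_a
    rate_b = clicks_b / views_b
    pooled = (clicks_a + clicks_b) / (views_a + views_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / views_a + 1 / views_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value under the standard normal distribution.
    p_value = erfc(abs(z) / sqrt(2))
    return z, p_value


# Hypothetical counts: control (A) vs. new recommendation variant (B).
clicks_a, views_a = 1000, 20000
clicks_b, views_b = 1160, 20000

lift = (clicks_b / views_b) / (clicks_a / views_a) - 1
z, p_value = two_proportion_z_test(clicks_a, views_a, clicks_b, views_b)

print(f"lift: {lift:+.1%}")
print(f"significant at 95%: {p_value < 0.05}")
```

With less traffic per variant the same lift would take longer to reach the 95% threshold, which is why such tests run for days.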

5. Multi-armed Bandit

Exploration-Exploitation trade-off

… case 1

● algorithm 1
● algorithm 2
● …
● algorithm N

… case 2

● order 1
● order 2

[Animation: each option has a chance to be picked; user feedback (a click) updates that chance]

Bandit - Beta Distribution

http://www.distributome.org/js/sim/BetaSimulation.html

0 successes and 0 attempts

0 successes and 10 attempts

5 successes and 10 attempts
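The three cases above can be turned into Beta distributions and summarized numerically. The sketch assumes a uniform Beta(1, 1) prior, so an option with s successes in n attempts gets Beta(s + 1, n - s + 1); the slides do not state the parameterization, so this is an assumption.

```python
def beta_posterior(successes, attempts):
    """Return (alpha, beta, mean, variance) of the Beta posterior.

    Assumes a uniform Beta(1, 1) prior over the click probability.
    """
    alpha = successes + 1
    beta = attempts - successes + 1
    total = alpha + beta
    mean = alpha / total
    variance = alpha * beta / (total ** 2 * (total + 1))
    return alpha, beta, mean, variance


# The three cases shown on the slide: (successes, attempts).
for successes, attempts in [(0, 0), (0, 10), (5, 10)]:
    alpha, beta, mean, variance = beta_posterior(successes, attempts)
    print(f"{successes}/{attempts}: Beta({alpha}, {beta}) "
          f"mean={mean:.3f} variance={variance:.4f}")
```

With no attempts the distribution is flat (high uncertainty). After 10 attempts it narrows, either toward zero (0 successes) or around one half (5 successes).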

Bandit - Thompson Sampling

http://www.distributome.org/js/sim/BetaSimulation.html

successes and attempts: [(0, 10), (0, 7), (0, 7), (0, 6), (0, 4), (0, 3), (0, 4), (0, 3), (0, 0), (0, 0), ...

successes and attempts: [(1, 44), (10, 398), (0, 66), (1, 57), (2, 25), (14, 324), (0, 3), (1, 46), ...

successes and attempts: [(103, 1183), (64, 1138), (48, 900), (25, 524), (56, 527), (37, 546), (11, 216), …

successes and attempts: [(143, 2227), (8, 299), (119, 1706), (28, 889), (146, 1288), (86, 1646), (63, 1272) ...
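Thompson Sampling over (successes, attempts) pairs like the ones above can be sketched as follows: draw one sample from each option's Beta distribution, show the option with the largest draw, then update its counts with the user feedback. The uniform Beta(1, 1) prior, the simulated click rates, and the function names are assumptions made for illustration, not the production implementation.

```python
import random


def thompson_pick(stats, rng):
    """Pick the arm whose Beta sample is largest.

    `stats` is a list of (successes, attempts) pairs, one per arm.
    A uniform Beta(1, 1) prior is assumed.
    """
    best_arm, best_draw = 0, -1.0
    for arm, (successes, attempts) in enumerate(stats):
        draw = rng.betavariate(successes + 1, attempts - successes + 1)
        if draw > best_draw:
            best_arm, best_draw = arm, draw
    return best_arm


def simulate(true_rates, rounds, seed=0):
    """Simulate clicks with hypothetical true click rates per arm."""
    rng = random.Random(seed)
    stats = [(0, 0) for _ in true_rates]
    for _ in range(rounds):
        arm = thompson_pick(stats, rng)
        successes, attempts = stats[arm]
        clicked = rng.random() < true_rates[arm]  # simulated user feedback
        stats[arm] = (successes + int(clicked), attempts + 1)
    return stats


# Hypothetical click rates for three competing algorithms.
stats = simulate([0.02, 0.05, 0.10], rounds=5000)
print("successes and attempts:", stats)
```

Early on, all arms are tried because their distributions are wide (exploration); as evidence accumulates, most attempts go to the arm with the highest observed click rate (exploitation).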

Bandit convergence

A/B tests

+3.5% purchases

final result: 25 days, 95% significance

Robson Motta | robson@chaordic.com.br