Sketch-Finder: an approach for effective and efficient sketch-based image retrieval...


Upload: big-data-week-sao-paulo

Post on 01-Dec-2014


DESCRIPTION

Sketch-Finder: an approach for effective and efficient sketch-based image retrieval for large image databases. Carlos Alberto Fraga Pimentel Filho. Among the various existing approaches to image retrieval, using a sketch image lets the user express what they want to search for in a visual, simple and fast way. The main challenge of this kind of search is finding a representation of the visual content that allows the user's sketch to be compared efficiently with the images in the database, while preserving the precision of the results and remaining scalable. The sketch-finder is a sketch-based image retrieval approach in which both the sketch and the edges of the database images are represented and compared in the wavelet transform domain. Thus, only the most relevant data from the sketch and from the images are represented. An inverted index indexes this information in order to provide an efficient and fast way to compare the sketch with the images in the database. Moreover, the proposed solution allows the index size to be adjusted based on the data compression rate. This adjustment reflects the trade-off between efficiency and precision and can easily be adapted to the available computational resources. A comparative evaluation against the state of the art, using a Paris image dataset and a subset of ImageNet with 535 thousand samples, shows that the present solution preserves the same levels of precision while being considerably faster at query time.

TRANSCRIPT

Page 1: Sketch-Finder: an approach for effective and efficient sketch-based image retrieval for large image databases

Sketch-Finder – A New Approach for Sketch-Based

Image Retrieval

Carlos Alberto F. Pimentel Filho [email protected]

Arnaldo de Albuquerque Araújo (UFMG) Michel Crucianu (CNAM)

Page 2

Introduction

Content-Based Image Retrieval (CBIR)

Sketch-Based Image Retrieval (SBIR)

Mind-Finder Approach

Sketch-Finder Approach

Experiments

Conclusion

Future Work

Page 3

Content-Based Image Retrieval

Query-by-Sketch:

Query-by-Painting:

Query-by-Example:

Query-by-Icon:

Query-by-Text:

Page 4

Sketch-Based Image Retrieval

SBIR fills two gaps in image retrieval:

(i) It allows the specification of details such as object position, scale and rotation.

(ii) It allows image retrieval when there is no example image to use.

Our goal is to retrieve, from large datasets, images visually similar to the shape of the object in the query sketch, at a similar scale, position and rotation.

Page 5

Sketch-Based Image Retrieval

Why Do We Care?

Page 6

Web Image Retrieval

Personal Image Retrieval

Mobile Image Retrieval

Video Retrieval

Page 7

Sketch-Based Image Retrieval

Query-by-Sketch:

Position Sensitive:

Object's shape at similar scale, position and rotation. Approaches:

Mind-Finder (EI)

Sketch-Finder

Compact Hash Bits

Object Sensitive:

Object's shape at any scale, position and rotation

Approaches (BoW):

HOG

GF-HOG

FISH

SYM-FISH

Page 8

Mind-Finder (Edgel-Index)

Page 9

Edgel-Index

* Compares sketch and image by matching their edgels

* Huge number of edgels for a big dataset

* Edgel:

Page 10

Sketch-Finder

Image processing flow (dataset):

Query flow:

Page 11

Contour Detection & Threshold

Page 12

Orientation and Dilation

Page 13

Wavelet Transform

Wedgel:

Contour signature: set of wedgels
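The idea of a contour signature built from a few wavelet coefficients can be sketched as follows. This is only an illustration under stated assumptions: the slides do not name the wavelet family, so a plain Haar transform is used as a stand-in, and the helper names (`haar_1d`, `haar_2d`, `signature`) and the (row, column, sign) key layout are hypothetical. The split into half positive and half negative coefficients follows the description of parameter e) in the paper excerpt on slide 20.

```python
def haar_1d(values):
    """Full 1D Haar decomposition; len(values) must be a power of two."""
    out = list(values)
    n = len(out)
    while n > 1:
        half = n // 2
        avg = [(out[2 * i] + out[2 * i + 1]) / 2.0 for i in range(half)]
        diff = [(out[2 * i] - out[2 * i + 1]) / 2.0 for i in range(half)]
        out[:n] = avg + diff  # averages first, then details
        n = half
    return out


def haar_2d(image):
    """Standard 2D decomposition: transform every row, then every column."""
    rows = [haar_1d(row) for row in image]
    cols = [haar_1d(col) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]


def signature(image, n_coeffs):
    """Keep the n_coeffs/2 largest positive and n_coeffs/2 most negative
    coefficients, returned as a set of (row, col, sign) keys.

    Assumption: the overall average at (0, 0) is skipped because it carries
    no shape information; the source does not say how it is handled.
    """
    coeffs = haar_2d(image)
    flat = [(value, r, c)
            for r, row in enumerate(coeffs)
            for c, value in enumerate(row)
            if (r, c) != (0, 0)]
    half = n_coeffs // 2
    positive = sorted((t for t in flat if t[0] > 0), reverse=True)[:half]
    negative = sorted(t for t in flat if t[0] < 0)[:half]
    return ({(r, c, +1) for _, r, c in positive}
            | {(r, c, -1) for _, r, c in negative})


# Toy 4x4 "edge map" with a vertical edge between columns 1 and 2.
edge_map = [[1, 1, 0, 0] for _ in range(4)]
print(signature(edge_map, 2))
```

Only the positions and signs of the strongest coefficients are kept, which is what lets the index size be traded against precision by changing `n_coeffs`.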

Page 14

Similarity Measure

Page 15

Why Wavelet Transform?

Page 16

Indexing Structure
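The slide itself carries no text, but the description above says that an inverted index is used to compare the sketch with the database images. A minimal sketch of that idea follows, assuming each image and each sketch has already been reduced to a set of hashable signature keys. The function names and the vote-counting score are assumptions for illustration; the actual similarity measure is not given in this transcript.

```python
from collections import defaultdict


def build_index(signatures):
    """Map every signature key to the set of image ids that contain it.

    signatures: dict of image_id -> set of hashable keys (hypothetical layout).
    """
    index = defaultdict(set)
    for image_id, keys in signatures.items():
        for key in keys:
            index[key].add(image_id)
    return index


def query(index, sketch_keys, top=20):
    """Rank images by how many signature keys they share with the sketch."""
    votes = defaultdict(int)
    for key in sketch_keys:            # one inverted list is read per key
        for image_id in index.get(key, ()):
            votes[image_id] += 1
    # Highest vote count first; ties broken by image id for determinism.
    return sorted(votes.items(), key=lambda kv: (-kv[1], kv[0]))[:top]


signatures = {
    "img_a": {(0, 1, +1), (2, 3, -1)},
    "img_b": {(0, 1, +1)},
    "img_c": {(5, 5, +1)},
}
index = build_index(signatures)
print(query(index, {(0, 1, +1), (2, 3, -1)}))
```

In this layout a query reads one inverted list per key in the sketch signature, so the number of lists touched is bounded by the signature size rather than by the number of edge pixels.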

Page 17

Dataset Evaluation

For the evaluation, we compare Edgel-Index [1] with Sketch-Finder on two datasets:

* Paris Dataset: 6412 images [2]

* ImageNet Dataset: subset of 535K images [3]

[1] Yang Cao et al., Edgel index for large-scale sketch-based image search.

[2] Visual Geometry Group - http://www.robots.ox.ac.uk/~vgg/data/parisbuildings/index.html

[3] ImageNet: http://www.image-net.org/

Page 18

Genetic Algorithm

Page 19

Genetic Algorithm

Population: a set of 100 parameter configurations

Selection of the best results (mAP20)

Crossover

Mutation
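One generation of the loop listed on this slide can be sketched as below, using the details given in the paper excerpt on slide 20: the best configuration is carried over, above-average configurations become parents, crossover swaps three of the six parameters with probability 90%, and mutation redraws two parameters with probability 10%. The function names and the toy fitness are hypothetical; in the source, fitness comes from building an index and running the 110 Paris sketch queries.

```python
import random

# (L, H) limits of parameters a) to f), taken from Table 1 of the excerpt.
LIMITS = [(3, 15), (15, 30), (30, 45), (0, 45), (20, 50), (0.18, 0.30)]
P_CROSSOVER = 0.9
P_MUTATION = 0.1


def sample_param(i, rng):
    """Draw parameter i uniformly inside its limits (integers stay integers)."""
    lo, hi = LIMITS[i]
    if isinstance(lo, int) and isinstance(hi, int):
        return rng.randint(lo, hi)
    return rng.uniform(lo, hi)


def random_config(rng):
    return [sample_param(i, rng) for i in range(len(LIMITS))]


def next_generation(population, fitness, rng):
    scored = sorted(population, key=fitness, reverse=True)
    scores = [fitness(c) for c in scored]
    mean = sum(scores) / len(scores)
    # Configurations at or above the average fitness are kept as parents.
    parents = [c for c, s in zip(scored, scores) if s >= mean]
    children = [scored[0][:]]          # the best configuration survives as is
    while len(children) < len(population):
        a, b = rng.choice(parents)[:], rng.choice(parents)[:]
        if rng.random() < P_CROSSOVER:
            for i in rng.sample(range(len(LIMITS)), 3):
                a[i], b[i] = b[i], a[i]
        for child in (a, b):
            if rng.random() < P_MUTATION:
                for i in rng.sample(range(len(LIMITS)), 2):
                    child[i] = sample_param(i, rng)
        children.extend([a, b])
    return children[:len(population)]


def toy_fitness(config):
    """Placeholder only: rewards configurations close to an arbitrary target."""
    return -abs(config[0] - 14) - abs(config[5] - 0.18)


rng = random.Random(0)
population = [random_config(rng) for _ in range(100)]
for _ in range(25):
    population = next_generation(population, toy_fitness, rng)
print(max(population, key=toy_fitness))
```

Keeping the best configuration unchanged means the best fitness never decreases from one generation to the next, whatever the crossover and mutation draws are.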

Page 20

Some user sketches

Group (VGG) (footnote 1) from Flickr. To collect the sketches for the Paris dataset queries, we asked some volunteers to draw one sketch for each of the 11 categories of Paris landmarks present in the dataset (La Defense, Tour Eiffel, Hotel des Invalides, Musee du Louvre, Moulin Rouge, Musee d'Orsay, Notre Dame, Pantheon, Pompidou, Sacre Cœur and Arc de Triomphe). We selected 10 users, composing a set of 110 sketches. The sketches and the ground truth are available (footnote 2). Fig. 7 presents some sketches collected for the Paris dataset query evaluation.

Also, the Paris dataset was used to compare the effectiveness of our approach with the sketch-finder [9] and the mind-finder [3]. This efficiency was evaluated considering the precision at the z best rank positions, and in this paper we used the 20 best positions as in [17].

Figure 7: Examples of the Paris sketch dataset.

To evaluate the efficiency of [9], [3] and our approach, we used the CPU time and I/O on a large dataset with more than 535K images. These images were taken from ImageNet (footnote 3) as described in [9], and we performed the experiments with 75 queries.

Footnotes:

1 Visual Geometry Group – http://www.robots.ox.ac.uk/~vgg/data/parisbuildings/index.html

2 Paris sketches and ground-truth – https://sites.google.com/site/sketchretrieval/

3 ImageNet – http://www.imagenet.org/

5.1 Parameter Relevance Evaluation

To discover the most relevant parameters of our approach, we applied the $2^k$ Factorial Design [12]. We have six parameters, or factors, and since in the $2^k$ Factorial Design each of the k factors has two alternative levels, the higher and the lower value of the factor, our analysis has $2^6$, or 64, possible configurations of parameter combinations.

The parameters and the effect of each one are described in the following:

a) radius map growth for the first wavelet transform: this parameter corresponds to the first grown edgel map pattern $G_\theta^{r_1}$ used for the wavelet transform;

b) radius map growth for the second wavelet transform: this parameter corresponds to the second $G_\theta^{r_2}$ used for the wavelet transform;

c) radius map growth for the third wavelet transform: this parameter corresponds to the third $G_\theta^{r_3}$ used for the wavelet transform;

d) radius map growth for pixel matching (OCM): this parameter corresponds to the radius $r_{OCM}$ used in the OCM. In our approach, 0 means that the pixel matching is not used, i.e., only the wavelet similarity ($W_{Q,T}$) is measured;

e) number of wavelet coefficients: this parameter corresponds to the number of wavelet coefficients used to represent the most discriminant information of each grown edgel map in the compressed domain of the wavelet. For this parameter, half of the coefficients are negative and the other half positive;

f) threshold of the image contours: this parameter is used to threshold the UCM image for indexing.

Table 1: $2^k$ Factorial design model

Factor                 L       H       R
a) Edgel map 1         3       15      13.3%
b) Edgel map 2         15      30      0.5%
c) Edgel map 3         30      45      0.2%
d) Edgel map (OCM)     0       45      36.2%
e) Coefficients        20      50      13.5%
f) Threshold           0.18    0.30    36.3%

Table 1 presents the lowest (L) and the highest (H) values used to test each factor, together with the relevance (R) of each factor as a percentage.

The two most important parameters in our set are the threshold of the image contours and the radius size of the OCM similarity comparison, which together account for more than 70% of the impact on precision. The variation in the number of wavelet coefficients from the lowest to the highest value and the first radius of the grown edge map each account for almost 14% of the impact, while the variation of the second and third radius maps is almost insignificant.

The difference between our approach and the one presented in [9] is the addition of the oriented chamfer matching for comparing sketch and image contours. The effect of this parameter shown by the $2^k$ Factorial Design indicates an improvement in the effectiveness of our proposal over [9].
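The size of the design in Section 5.1 can be checked with a short enumeration. The sketch below lists every low/high combination of the six factors using the L and H values from Table 1, and adds up the reported relevance of the two dominant factors. The dictionary keys are hypothetical labels; computing the relevance percentages themselves would need the measured precision of each run, which is not reproduced in this excerpt.

```python
from itertools import product

# (L, H) levels from Table 1; the key names are illustrative labels only.
FACTORS = {
    "a_edgel_map_1": (3, 15),
    "b_edgel_map_2": (15, 30),
    "c_edgel_map_3": (30, 45),
    "d_edgel_map_ocm": (0, 45),
    "e_coefficients": (20, 50),
    "f_threshold": (0.18, 0.30),
}

# Relevance (R) column of Table 1, in percent.
RELEVANCE = {
    "a_edgel_map_1": 13.3,
    "b_edgel_map_2": 0.5,
    "c_edgel_map_3": 0.2,
    "d_edgel_map_ocm": 36.2,
    "e_coefficients": 13.5,
    "f_threshold": 36.3,
}


def factorial_configurations(factors):
    """Return every combination of low/high levels as a list of dicts."""
    names = list(factors)
    levels = (factors[name] for name in names)
    return [dict(zip(names, combo)) for combo in product(*levels)]


configs = factorial_configurations(FACTORS)
print(len(configs))                      # 2**6 = 64 configurations

top_two = sorted(RELEVANCE.values(), reverse=True)[:2]
print(round(sum(top_two), 1))            # contour threshold plus OCM radius
```

Each of the 64 configurations corresponds to one experimental run of the design.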

5.2 Parameter Tuning with Genetic Algorithm

The parameters described in Section 5.1 need to be well set in order to obtain the best performance of our approach. Genetic algorithms are robust search and optimization techniques for finding the global optimum in a multimodal landscape. In this section, we show how the parameters of our approach were chosen with a genetic evolution approach [15].

To set the parameters, due to the large number of experiments, we used a small dataset, the Paris dataset, with a ground truth for the sketches that we collected.

The parameters and their intervals were chosen as described in Section 5.1. We started the genetic algorithm with a population of one hundred different random configurations, where each parameter had a random value between the lowest and highest limits (L and H) presented in Table 1. On each "evolution", the solution with the best fitness was preserved for the next iteration or "generation". The other solutions with fitness above the average were preserved for crossover and mutation, while solutions with below-average fitness were discarded. Random pairs of the best solutions were used to generate two new solutions with crossover and mutation of parameters. For each high-fitness pair, with a probability Pc of 90%, we applied a crossover in three of the six parameters, chosen at random, and with a probability Pm of 10% we applied a mutation in two random parameters inside the limits. Table 1 presents the lowest and highest limits (L and H) used in the mutation for each parameter.

To represent and evaluate each individual parameter configuration in a single fitness value, we used the sub area

Page 21

Some Results (Paris Dataset)

under the precision×recall [5] curve of the first 20 results. We considered this sub area as a summary of the best rank positions and used it as the criterion because a good retrieval solution must present the expected result in the very first positions.
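A fitness of this kind can be sketched as below. The excerpt does not give the exact formula, so this is an assumption: the area is accumulated as precision times the recall step at every relevant result among the first 20, which is the usual truncated average precision. The function name and arguments are hypothetical.

```python
def pr_area_at_k(relevance, total_relevant, k=20):
    """Area under the precision x recall curve over the first k results.

    relevance: ranked list of 0/1 flags, 1 if the result at that rank is relevant.
    total_relevant: number of relevant images in the ground truth for the query.
    """
    if total_relevant <= 0:
        return 0.0
    hits = 0
    area = 0.0
    for rank, is_relevant in enumerate(relevance[:k], start=1):
        if is_relevant:
            hits += 1
            precision = hits / rank
            area += precision * (1.0 / total_relevant)  # recall grows by 1/R
    return area


# Relevant images at ranks 1 and 3, two relevant images in the ground truth.
print(pr_area_at_k([1, 0, 1, 0], total_relevant=2))
```

A relevant image at rank 1 contributes more than the same image at rank 15, which matches the stated goal of rewarding results in the very first positions.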

Extensive experiments were conducted to achieve the best rank of the proposed approach. More than 25 generations of genetic evolution, each one with 100 individual sets of parameters, were performed, which gave more than 2,500 experiments. Each experiment built one index and performed 110 queries by sketch on it, i.e., 275,000 queries in total.

5.3 Evaluation of Sketch Retrieval Effectiveness and Efficiency

The experiments on sketch retrieval used the best parameters obtained by the genetic algorithm, on average over the 110 queries. In the present approach, this configuration is respectively 14, 27, 45, 29, 46 and 0.18 for the parameters a, b, c, d, e and f described in Section 5.1.

For the sketch-finder, we used the same values for the parameters it has in common with our approach. This configuration is respectively 14, 27, 45, 46 and 0.18 for the parameters a, b, c, e and f presented in Section 5.1.

Regarding the evaluation of the Mind-Finder, we applied the same parameters to the steps in common with the sketch-finder and our approach, i.e., an image resolution of 256×256, the same contour detection (UCM) and threshold = 0.18. For the radius r we experimented with several configurations between 25 and 65, finding that r = 45 gives the best fitness on the 110 queries of the Paris dataset, using the same fitness criterion as in our approach. The selected parameters were used as default for the Mind-Finder in the comparisons with the sketch-finder and our approach.

The experiments were carried out on a machine with an Intel Xeon X5670 CPU at 2.93 GHz and 72 GB of RAM.

For the effectiveness evaluation, we considered the 20 best rank positions. According to the results, our approach outperforms Sketch-Finder (SF) and Mind-Finder (MF) in terms of effectiveness, as shown by the precision curve of the 20 best rank positions presented in Fig. 8.

Figure 8: Best z ranked images.

Fig. 9 shows some queries performed by our approach using the Paris dataset. The first image on each line is the query image and the following images are the four best ranked images. Also, some of these results on the Paris dataset are available (footnote 4).

To compare the efficiency of the three approaches, we evaluated the query CPU time, the I/O and the size of the indexes. To evaluate the CPU and I/O of the queries, we used the subset of ImageNet with 75 queries.

4 Query results of our approach on the Paris Dataset – https://sites.google.com/site/sketchretrieval/

Figure 9: Query results for the Paris dataset.

Table 2: CPU: query time in seconds

            Sketch-Finder    Mind-Finder    Our
CPU AVG     6.56             394.43         31.66
CPU SD      1.13             401.57         11.73

Table 3: I/O in bytes

            Sketch-Finder    Mind-Finder    Our
I/O AVG     2.89 × 10^8      2.96 × 10^9    1.52 × 10^9
I/O SD      1.58 × 10^7      2.79 × 10^8    4.08 × 10^8

Table 4: Index size in bytes

            Sketch-Finder    Mind-Finder    Our
Index size  2.14 × 10^9      2.61 × 10^10   2.05 × 10^11

Regarding the CPU cost, we did not consider the I/O time used to retrieve the inverted file lists of IDs. With this strategy we can simulate the CPU time without considering I/O variations and main memory limitations. The benchmark of the CPU cost is presented in seconds in Table 2, with the average time (AVG) and standard deviation (SD) of the 75 queries.

Regarding the I/O, we measured it in bytes, considering the size of all inverted lists of IDs used in each of the 75 queries. Table 3 presents the average I/O and the standard deviation.

The index size of our approach is the biggest among the evaluated approaches due to the storage of the preprocessed grown edgel maps used in the OCM similarity comparison; however, as an advantage, this index makes possible a faster query than the approach of [3] (see Table 2). The total index size of each approach is presented in Table 4.
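The relative costs behind Tables 2 to 4 are easier to read as ratios. The short sketch below recomputes them from the averages reported in the tables, taking the Sketch-Finder column as the baseline. The helper name `ratios` is hypothetical and the figures are copied from the tables, not measured again.

```python
# Average values copied from Tables 2, 3 and 4.
CPU_AVG_SECONDS = {"Sketch-Finder": 6.56, "Mind-Finder": 394.43, "Our": 31.66}
IO_AVG_BYTES = {"Sketch-Finder": 2.89e8, "Mind-Finder": 2.96e9, "Our": 1.52e9}
INDEX_BYTES = {"Sketch-Finder": 2.14e9, "Mind-Finder": 2.61e10, "Our": 2.05e11}


def ratios(table, baseline):
    """Express every entry of a table as a multiple of the baseline entry."""
    base = table[baseline]
    return {name: value / base for name, value in table.items()}


for title, table in [("CPU time", CPU_AVG_SECONDS),
                     ("I/O", IO_AVG_BYTES),
                     ("Index size", INDEX_BYTES)]:
    relative = ratios(table, "Sketch-Finder")
    line = ", ".join(f"{name}: {value:.1f}x" for name, value in relative.items())
    print(f"{title}: {line}")
```

Read this way, the tables show the trade-off stated in the text: the approach labelled "Our" has the largest index, yet its average query CPU time is well below that of Mind-Finder.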

6. CONCLUSION

This work presented an approach for SBIR using both the compressed-domain and the pixel-domain indexes. The compressed-domain index allows the comparison between the sketch and the image dataset contours using a small amount of data, while the pixel domain is used to improve the precision by applying a spatial pixel consistency verification using the

Page 22

Effectiveness Precision vs. Recall

We used the VGG ground-truth for the Paris dataset and built one for the ImageNet.

The same sketches were used to evaluate Mind-Finder and Sketch-Finder

Page 23

Average - Precision vs. Recall (75)

Page 24

Efficiency Comparison - CPU

Page 25

Efficiency Comparison – I/O

Page 26

Conclusion

Sketch-Finder:

• The number of retrieved inverted files is reduced to a small and fixed number;

• The volume of indexed data is 5% of the Edgel-Index;

• Retrieval is faster because there is less data to read.

Page 27

Future Work

Build an Android Sketch-Finder application

Page 28

Thank You!

Carlos Alberto Fraga Pimentel Filho [email protected]