Sketch-Finder: an approach for effective and efficient sketch-based image retrieval...

1

Sketch-Finder – A New Approach for Sketch-Based

Image Retrieval

Carlos Alberto F. Pimentel Filho [email protected]

Arnaldo de Albuquerque Araújo (UFMG) Michel Crucianu (CNAM)

2

Introduction

Content-Based Image Retrieval (CBIR)

Sketch-Based Image Retrieval (SBIR)

Mind-Finder Approach

Sketch-Finder Approach

Experiments

Conclusion

Future Work

3

Content-Based Image Retrieval

Query-by-Sketch:

Query-by-Painting:

Query-by-Example:

Query-by-Icon:

Query-by-Text:

4

Sketch-Based Image Retrieval

SBIR fills two gaps in image retrieval: (i) it allows the specification of details such as object position, scale and rotation; (ii) it allows image retrieval when there is no example image to use.

Our goal is to retrieve, from large datasets, images visually similar to the shape of the sketched object at a similar scale, position and rotation.

5

Sketch-Based Image Retrieval: Why Do We Care?

6

Web Image Retrieval

Personal Image Retrieval

Mobile Image Retrieval

Video Retrieval

Sketch-Based Image Retrieval

Query-by-Sketch:

Position Sensitive:

Object's shape at similar scale, position and rotation.

Approaches:

Mind-Finder (EI)

Sketch-Finder

Compact Hash Bits

Object Sensitive:

Object's shape at any scale, position and rotation

Approaches (BoW):

HOG

GF-HOG

FISH

SYM-FISH

Mind-Finder (Edgel-Index)

9

Edgel-Index

* Compares matching of edgels

* Huge number of edgels for big datasets.

* Edgel:
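As a rough illustration of why the number of edgels grows so quickly, the sketch below builds a toy edgel inverted index. It assumes that an edgel is an edge pixel described by its position and a quantized orientation, (x, y, theta_bin), and that the index maps each edgel to the set of images containing it. The function names and the simple hit-count score are illustrative assumptions, not the Mind-Finder implementation.

```python
from collections import defaultdict


def build_edgel_index(images):
    """Build an inverted index from edgels to image IDs.

    images: {image_id: iterable of (x, y, theta_bin)} (assumed edgel format).
    """
    index = defaultdict(set)
    for image_id, edgels in images.items():
        for edgel in edgels:
            index[edgel].add(image_id)
    return index


def query(index, sketch_edgels):
    """Rank images by the number of sketch edgels they share (hit count)."""
    hits = defaultdict(int)
    for edgel in set(sketch_edgels):
        # .get() avoids creating empty posting lists for unseen edgels.
        for image_id in index.get(edgel, ()):
            hits[image_id] += 1
    # Highest hit count first; ties broken by image ID for determinism.
    return sorted(hits.items(), key=lambda kv: (-kv[1], kv[0]))
```

Every distinct (x, y, theta_bin) triple becomes its own posting list, so the index grows with both the image resolution and the number of indexed images, which is the cost this slide points at.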

10

Sketch-Finder

Image processing flow (dataset):

Query flow:

11

Contour Detection & Threshold


12

Orientation and Dilation



13

Wavelet Transform


Wedgel:

Contour signature: set of wedgels

14

Similarity Measure

15

Why Wavelet Transform?


16

Indexing Structure



17

Dataset Evaluation

For the evaluation, we compare Edgel-Index [1] with Sketch-Finder on two datasets:

* Paris Dataset: 6412 images [2]

* ImageNet Dataset: subset of 535K images [3]

[1] Yang Cao et al., Edgel index for large-scale sketch-based image search.

[2] Visual Geometry Group - http://www.robots.ox.ac.uk/~vgg/data/parisbuildings/index.html

[3] ImageNet: http://www.image-net.org/

Genetic Algorithm

Population: a set of 100 parameter configurations

Selection of the best results (mAP20)

Crossover

Mutation

Some user sketches

Group (VGG¹) from Flickr. To collect the sketches for the Paris dataset queries, we asked some volunteers to draw one sketch for each of the 11 categories of Paris landmarks present in the dataset (La Defense, Tour Eiffel, Hotel des Invalides, Musee du Louvre, Moulin Rouge, Musee d'Orsay, Notre Dame, Pantheon, Pompidou, Sacre Cœur and Arc de Triomphe). We selected 10 users, composing a set of 110 sketches. The sketches and the ground-truth are available². Fig. 7 presents some sketches collected for the Paris dataset query evaluation.

Also, the Paris dataset was used to compare the effectiveness of our approach with the sketch-finder [9] and the mind-finder [3]. Effectiveness was evaluated by the precision at the z best rank positions; in this paper we used the 20 best positions, as in [17].

Figure 7: Examples of the Paris sketch dataset.

To evaluate the efficiency of [9], [3] and our approach, we used the CPU time and I/O on a big dataset with more than 535K images. These images were taken from ImageNet³ as described in [9], and we performed the experiments with 75 queries.

¹ Visual Geometry Group – http://www.robots.ox.ac.uk/~vgg/data/parisbuildings/index.html

² Paris sketches and ground-truth – https://sites.google.com/site/sketchretrieval/

³ ImageNet – http://www.image-net.org/

5.1 Parameter Relevance Evaluation

To discover the most relevant parameters of our approach, we applied the 2^k Factorial Design [12]. We have six parameters, or factors, and since in the 2^k Factorial Design each of the k factors has two alternative levels (the highest and the lowest value of the factor), our analysis covers 2^6 = 64 possible parameter configurations.

The parameters and the effect of each one are described in the following:

a) radius map growth for the first wavelet transform: this parameter corresponds to the first grown edgel map pattern G_θ^{r1} used for the wavelet transform;

b) radius map growth for the second wavelet transform: this parameter corresponds to the second G_θ^{r2} used for the wavelet transform;

c) radius map growth for the third wavelet transform: this parameter corresponds to the third G_θ^{r3} used for the wavelet transform;

d) radius map growth for pixel matching (OCM): this parameter corresponds to the radius r_OCM used in the OCM. In our approach, 0 means that pixel matching is not used, i.e., only the wavelet similarity (W_{Q,T}) is measured;

e) number of wavelet coefficients: this parameter corresponds to the number of wavelet coefficients used to represent the most discriminant information of each grown edgel map in the compressed domain of the wavelet. For this parameter, half of the coefficients are negative and the other half positive;

f) threshold of the image contours: this parameter is used to threshold the UCM image for indexing.

Table 1: 2^k Factorial design model

Factor                 L       H       R
a) Edgel map 1         3       15      13.3%
b) Edgel map 2         15      30      0.5%
c) Edgel map 3         30      45      0.2%
d) Edgel map (OCM)     0       45      36.2%
e) Coefficients        20      50      13.5%
f) Threshold           0.18    0.30    36.3%

Table 1 presents the lowest (L) and the highest (H) values used to test each factor, as well as the relevance (R) of each factor as a percentage.

The two most important parameters in our set are the threshold of the image contours and the radius size of the OCM similarity comparison, which together account for more than 70% of the impact on precision. The variation of the number of wavelet coefficients from the lowest to the highest value and the first radius of the grown edgel map each account for almost 14% of the impact, while the variation of the second and third radius maps is almost insignificant.

The difference between our approach and the one presented in [9] is the addition of the oriented chamfer matching for comparing sketch and image contours. The effect of this parameter shown by the 2^k Factorial Design indicates an improvement in the effectiveness of our proposal over [9].
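A minimal sketch of how such a 2^k analysis could be organized is given below. The low and high levels come from Table 1; everything else is an assumption made for illustration: the function names, the normalization of relevance over main effects only, and `toy_response`, which is a hypothetical stand-in for building an index and measuring retrieval precision.

```python
from itertools import product

# Low and high levels of each factor, taken from Table 1.
LEVELS = {
    "a": (3, 15),
    "b": (15, 30),
    "c": (30, 45),
    "d": (0, 45),
    "e": (20, 50),
    "f": (0.18, 0.30),
}


def full_factorial(levels):
    """Yield (signs, config) for every low/high combination: 2^k runs."""
    names = list(levels)
    for signs in product((-1, +1), repeat=len(names)):
        config = {n: levels[n][0 if s < 0 else 1] for n, s in zip(names, signs)}
        yield signs, config


def main_effect_shares(levels, response):
    """Percentage of main-effect variation attributed to each factor.

    Assumption: relevance is normalized over the main effects only,
    so the shares add up to 100%.
    """
    runs = [(signs, response(cfg)) for signs, cfg in full_factorial(levels)]
    n = len(runs)
    # q_j = (1 / 2^k) * sum_i sign_ij * y_i; variation is proportional to q_j^2.
    q = [sum(s[j] * y for s, y in runs) / n for j in range(len(levels))]
    total = sum(v * v for v in q)
    return {name: 100.0 * v * v / total for name, v in zip(levels, q)}


def toy_response(cfg):
    """Hypothetical precision model, used only to exercise the code."""
    return 0.3 + 0.004 * cfg["d"] - 1.5 * (cfg["f"] - 0.18) + 0.001 * cfg["e"]


if __name__ == "__main__":
    for name, share in main_effect_shares(LEVELS, toy_response).items():
        print(f"{name}: {share:.1f}%")
```

With six two-level factors the generator produces 64 configurations, matching the count given above. In a real run, `toy_response` would be replaced by the measured precision of each configuration.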

5.2 Parameter Tuning with genetic algorithm

The parameters described in Section 5.1 need to be well set in order to obtain the best performance of our approach. Genetic algorithms are robust search and optimization techniques for finding the global optimum in a multimodal landscape. In this section, we show how the parameters of our approach were chosen with a genetic evolution approach [15].

To set the parameters, due to the large number of experiments, we used a small dataset, the Paris dataset, with a ground-truth for the sketches that we collected.

The parameters and their intervals were chosen as described in Section 5.1. We started the genetic algorithm with a population of one hundred different random configurations, where each parameter had a random value between the lowest and highest limits (L and H) presented in Table 1. On each "evolution", the solution with the best fitness was preserved for the next iteration, or "generation". The other solutions with above-average fitness were preserved for crossover and mutation, while solutions with below-average fitness were discarded. Random pairs of the best solutions were used to generate two new solutions with crossover and mutation of parameters. For each high-fitness pair, with a probability Pc of 90%, we applied a crossover in three random parameters out of the six, and with a probability Pm of 10% we applied a mutation in two random parameters inside the limits. Table 1 presents the lowest and highest limits (L and H) used in the mutation for each parameter.

To represent and evaluate each individual parameter configuration in a single fitness value, we used the sub area

21

Some Results (Paris Dataset)


under the precision×recall [5] curve of the first 20 results. We used this sub area as a summary of the best rank positions because a good retrieval solution must present the expected result in the very first positions.

Extensive experiments were conducted to achieve the best rank of the proposed approach. More than 25 genetic generations, each with 100 individual sets of parameters, were performed, which gave more than 2,500 experiments. Each experiment built one index and ran 110 sketch queries on it, i.e., 275,000 queries in total.
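The evolution loop described in this section can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: the bounds are the L and H values of Table 1, parameters a to e are assumed to be integers, the function names are hypothetical, and `fitness` stands in for the expensive step of building an index and scoring its 110 queries.

```python
import random

# Lower and upper limits (L and H) from Table 1.
BOUNDS = {"a": (3, 15), "b": (15, 30), "c": (30, 45),
          "d": (0, 45), "e": (20, 50), "f": (0.18, 0.30)}
NAMES = list(BOUNDS)


def draw(name, rng):
    """Draw a random value inside the limits (integers for a-e, float for f)."""
    lo, hi = BOUNDS[name]
    return rng.randint(lo, hi) if isinstance(lo, int) else rng.uniform(lo, hi)


def random_config(rng):
    return {n: draw(n, rng) for n in NAMES}


def crossover(p1, p2, rng):
    """Swap three randomly chosen parameters between the two parents."""
    c1, c2 = dict(p1), dict(p2)
    for n in rng.sample(NAMES, 3):
        c1[n], c2[n] = p2[n], p1[n]
    return c1, c2


def mutate(cfg, rng):
    """Redraw two randomly chosen parameters inside their limits."""
    out = dict(cfg)
    for n in rng.sample(NAMES, 2):
        out[n] = draw(n, rng)
    return out


def evolve(fitness, generations=25, size=100, pc=0.9, pm=0.1, seed=0):
    rng = random.Random(seed)
    population = [random_config(rng) for _ in range(size)]
    for _ in range(generations):
        scored = sorted(((fitness(c), c) for c in population),
                        key=lambda t: t[0], reverse=True)
        mean = sum(s for s, _ in scored) / size
        # Above-average solutions become parents; the rest are discarded.
        parents = [c for s, c in scored if s >= mean] or [scored[0][1]]
        next_pop = [scored[0][1]]  # the best solution is always preserved
        while len(next_pop) < size:
            p1, p2 = rng.choice(parents), rng.choice(parents)
            if rng.random() < pc:
                c1, c2 = crossover(p1, p2, rng)
            else:
                c1, c2 = dict(p1), dict(p2)
            next_pop.append(mutate(c1, rng) if rng.random() < pm else c1)
            next_pop.append(mutate(c2, rng) if rng.random() < pm else c2)
        population = next_pop[:size]
    return max(population, key=fitness)
```

Because the best configuration is copied unchanged into every new generation, the best fitness in the population can never decrease from one generation to the next.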

5.3 Evaluation of sketch retrieval effectiveness and efficiency

The experiments on sketch retrieval used the best parameters obtained by the genetic algorithm, averaged over 110 queries. In the present approach, this configuration is respectively 14, 27, 45, 29, 46 and 0.18 for the parameters a, b, c, d, e and f described in Section 5.1.

For the sketch-finder, we used the parameters it has in common with our approach. This configuration is respectively 14, 27, 45, 46 and 0.18 for the parameters a, b, c, e and f presented in Section 5.1.

Regarding the evaluation of the Mind-Finder, we applied the same parameters to the steps in common with the sketch-finder and our approach, i.e., an image resolution of 256×256, the same contour detection (UCM) and threshold = 0.18. For the radius r we tried several configurations between 25 and 65, finding that r = 45 gives the best fitness on the 110 queries of the Paris dataset, using the same fitness criterion as in our approach. The selected parameters were used as default for the Mind-Finder in the comparisons with the sketch-finder and our approach.

The experiments were carried out on a machine with an Intel Xeon X5670 CPU at 2.93 GHz and 72 GB of RAM.

For the effectiveness evaluation, we considered the 20 best rank positions. According to the results, our approach outperforms Sketch-Finder (SF) and Mind-Finder (MF) in terms of effectiveness, as shown by the precision curve of the 20 best rank positions presented in Fig. 8.

Figure 8: Best z ranked images.
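The precision at the z best rank positions, the measure plotted in Fig. 8, can be computed as in the sketch below. The function names and the input format (a ranked list of image IDs plus a set of relevant IDs per query) are assumptions made for illustration.

```python
def precision_at(ranked_ids, relevant_ids, z):
    """Fraction of the first z ranked results that are relevant."""
    if z <= 0:
        raise ValueError("z must be a positive integer")
    relevant = set(relevant_ids)
    return sum(1 for image_id in ranked_ids[:z] if image_id in relevant) / z


def precision_curve(ranked_ids, relevant_ids, max_z=20):
    """Precision at z for z = 1 .. max_z (one curve per query)."""
    return [precision_at(ranked_ids, relevant_ids, z) for z in range(1, max_z + 1)]


def mean_precision_curve(runs, max_z=20):
    """Average the per-query curves; runs is a list of (ranked_ids, relevant_ids)."""
    curves = [precision_curve(ranked, relevant, max_z) for ranked, relevant in runs]
    return [sum(curve[i] for curve in curves) / len(curves) for i in range(max_z)]
```

Averaging the per-query curves over all sketch queries gives one curve per approach, which is the form in which the three approaches are compared.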

Fig. 9 shows some queries performed by our approach using the Paris dataset. The first image on each line is the query image and the following images are the four best ranked images. Also, some of these results on the Paris dataset are available⁴.

To compare the efficiency of the three approaches, we evaluated the query CPU time, the I/O and the size of the indexes. To evaluate the CPU and I/O of the queries, we used the subset of ImageNet with 75 queries.

⁴ Query results of our approach on the Paris Dataset – https://sites.google.com/site/sketchretrieval/

Figure 9: Query results for the Paris dataset.

Table 2: CPU: query time in seconds

             Sketch-Finder    Mind-Finder      Our
CPU AVG      6.56             394.43           31.66
CPU SD       1.13             401.57           11.73

Table 3: I/O in bytes

             Sketch-Finder    Mind-Finder      Our
I/O AVG      2.89 × 10^8      2.96 × 10^9      1.52 × 10^9
I/O SD       1.58 × 10^7      2.79 × 10^8      4.08 × 10^8

Table 4: Index size in bytes

             Sketch-Finder    Mind-Finder      Our
Index size   2.14 × 10^9      2.61 × 10^10     2.05 × 10^11

Regarding the CPU cost, we did not consider the I/O time used to retrieve the inverted file lists of IDs. With this strategy we can simulate the CPU time without considering I/O variations and main memory limitations. The benchmark of the CPU cost is presented in seconds in Table 2, with the average time (AVG) and standard deviation (SD) over the 75 queries.

Regarding the I/O, we measured it in bytes, considering the size of all inverted lists of IDs used in each of the 75 queries. Table 3 presents the average I/O and the standard deviation.

The index of our approach is the biggest among the evaluated approaches due to the storage of the preprocessed grown edgel maps used in the OCM similarity comparison; however, as an advantage, this index allows faster queries than the approach of [3] (see Table 2). The total index size of each approach is presented in Table 4.
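To make the trade-off concrete, the short sketch below derives a few ratios from the averages in Tables 2 to 4. It assumes that Tables 3 and 4 use the same column order as Table 2 (Sketch-Finder, Mind-Finder, Our), which agrees with the statement that the index of the present approach is the largest. The dictionary and function names are illustrative.

```python
# Averages copied from Tables 2-4 (assumed column order:
# Sketch-Finder, Mind-Finder, Our).
CPU_AVG_SECONDS = {"sketch_finder": 6.56, "mind_finder": 394.43, "ours": 31.66}
IO_AVG_BYTES = {"sketch_finder": 2.89e8, "mind_finder": 2.96e9, "ours": 1.52e9}
INDEX_BYTES = {"sketch_finder": 2.14e9, "mind_finder": 2.61e10, "ours": 2.05e11}


def ratio(table, numerator, denominator):
    """How many times larger table[numerator] is than table[denominator]."""
    return table[numerator] / table[denominator]


if __name__ == "__main__":
    print("CPU, Mind-Finder vs. ours:",
          round(ratio(CPU_AVG_SECONDS, "mind_finder", "ours"), 2))
    print("CPU, ours vs. Sketch-Finder:",
          round(ratio(CPU_AVG_SECONDS, "ours", "sketch_finder"), 2))
    print("I/O, Mind-Finder vs. ours:",
          round(ratio(IO_AVG_BYTES, "mind_finder", "ours"), 2))
    print("Index, ours vs. Mind-Finder:",
          round(ratio(INDEX_BYTES, "ours", "mind_finder"), 2))
```

Under that column assumption, the present approach answers queries roughly an order of magnitude faster than Mind-Finder on average, at the price of an index several times larger.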

6. CONCLUSION

This work presented an approach for SBIR using both the compressed-domain and the pixel-domain indexes. The compressed-domain index allows the comparison between the sketch and the image dataset contours using a small amount of data, while the pixel domain is used to improve the precision by applying a spatial pixel consistency verification using the

22

Effectiveness Precision vs. Recall

We used the VGG ground-truth for the Paris dataset and built one for the ImageNet.

The same sketches were used to evaluate Mind-Finder and Sketch-Finder.

23

Average - Precision vs. Recall (75)


24

Efficiency Comparison - CPU

25

Efficiency Comparison – I/O

26

Conclusion

Sketch-Finder:

•  The number of retrieved inverted files is reduced to a small and fixed number;

•  The volume of indexed data is 5% of the Edgel-Index;

•  The speed of retrieval is faster due to the smaller amount of data.

27

Future Work

Build an Android Sketch-Finder application

28

Thank You!

Carlos Alberto Fraga Pimentel Filho [email protected]
