
TECHNICAL MEMORANDUM (MEMORANDO TÉCNICO)

BDAgro – CTBE Database of Agricultural Experiments

Angélica O. Pontes, Liu Yi Ling, Guilherme M. Sanches, Paulo S. G. Magalhães, João E. Ferreira and Carlos E. Driemeier

Laboratório Nacional de Ciência e Tecnologia do Bioetanol – CTBE

Centro Nacional de Pesquisa em Energia e Materiais – CNPEM

Campinas, November 2014


CTBE is part of CNPEM, a Social Organization (Organização Social) qualified by the Ministry of Science, Technology and Innovation (MCTI).


Summary

1. Introduction
2. Standard tools for software development
3. Conceptual data model
4. Logical data model
5. Physical data model
6. Examples of query and table
7. Analytical module
8. Conclusion
9. References

Figures

Figure 1: Conceptual data model of the CTBE Database of Agricultural Experiments
Figure 2: Logical data model of the CTBE Database of Agricultural Experiments
Figure 3: List of events from one selected experiment, as retrieved through the SQL query
Figure 4: Table of types of attributes specifying the applied linearization and filtering functions, as visualized in pgAdmin
Figure 5: Table of data preparation for analysis, as visualized in pgAdmin. The table contains three data states, raw (bruto), linearized (linearizado), and filtered (filtrado), for all types of attributes


1. Introduction

Contemporary scientific and technological research is becoming increasingly data-intensive and collaborative. Adequate computing capabilities for data acquisition, storage, sharing, modelling, and analysis are pivotal in this new research perspective (HEY & TREFETHEN, 2003; TANSLEY & TOLLE, 2009). The pervasive role of computing is observed across virtually all disciplines. In particular, this applies to agricultural experiments that aim to enhance biomass production in an environmentally benign way. Such experiments are within the scope of the Brazilian Bioethanol Science and Technology Laboratory – CTBE. With this motivation, we developed the CTBE Database of Agricultural Experiments (Banco de Dados de Experimentos Agrícolas – BDAgro), which is described in the present Technical Memorandum.

BDAgro was developed with the following specific aims: (i) to store data from CTBE agricultural experiments in structured form, ensuring long-term data readability; (ii) to enable statistical analysis and knowledge discovery integrated with the database; and (iii) to pave the way for data-driven collaboration with other research groups in Brazil and abroad.

BDAgro was developed jointly by the CTBE e-Science and sugarcane precision agriculture research groups, drawing on the experience of the e-Science group of the Institute of Mathematics and Statistics of the University of São Paulo (IME-USP). Although database development and the initial data sets were associated with research in sugarcane precision agriculture, BDAgro was modelled to support all CTBE agricultural experiments.

2. Standard tools for software development

BDAgro, like other databases and software developed within the CTBE e-Science group, will preferentially be built with a selected set of tools. Free, open-source platforms will always be preferred and adopted whenever possible.

Concretely, BDAgro was developed with the following tools:

- PostgreSQL as the relational database management system;
- pgAdmin as the database administration and development platform;
- the R programming language for statistical computing integrated with the database;
- Python as an auxiliary programming language, employed primarily to create SQL scripts that input data into the database.


In addition to these selected tools, it is worth noting that the BDAgro glossary is in Portuguese. Furthermore, BDAgro was developed on an IME-USP server and will soon be migrated to a local CNPEM server.
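The role described for Python, creating SQL scripts that input data into the database, can be illustrated with a minimal sketch. The table name `produtividade`, its columns (`id_evento`, `x`, `y`, `valor`), and the sample measurements are hypothetical and chosen only for illustration; the actual BDAgro tables may be defined differently.

```python
import csv
import io


def sql_literal(value):
    """Render a Python value as an SQL literal (numbers bare, text quoted)."""
    if value is None:
        return "NULL"
    if isinstance(value, (int, float)):
        return repr(value)
    # Double any single quote so the text cannot terminate the SQL string.
    return "'" + str(value).replace("'", "''") + "'"


def build_insert_script(table, columns, rows):
    """Return one INSERT statement per row, joined into a single script."""
    column_list = ", ".join(columns)
    statements = []
    for row in rows:
        values = ", ".join(sql_literal(v) for v in row)
        statements.append(
            f"INSERT INTO {table} ({column_list}) VALUES ({values});"
        )
    return "\n".join(statements)


# Hypothetical field measurements: event id, x and y coordinates, value.
SAMPLE_CSV = """id_evento,x,y,valor
1,100.5,200.25,85.2
1,101.5,200.25,90.1
"""


def rows_from_csv(text):
    """Parse CSV text into (column names, typed rows)."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    rows = [(int(r[0]), float(r[1]), float(r[2]), float(r[3])) for r in reader]
    return header, rows


columns, rows = rows_from_csv(SAMPLE_CSV)
script = build_insert_script("produtividade", columns, rows)
print(script)
```

Generating a script, rather than writing to the database directly, leaves a plain-text record that can be reviewed and executed later, for example from pgAdmin. When Python connects to the database itself, parameterized queries through a database driver are generally a safer choice than building SQL strings.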

3. Conceptual data model

The BDAgro conceptual model, i.e., the entity-relationship model (ELMASRI & NAVATHE, 2010), is shown in Figure 1. The model comprises the following entities:

- Project (projeto): One project is defined by the contract terms with a research granting agency (CNPq, FAPESP, etc.) or a company, or is an internal project of CTBE/CNPEM.

- Experiment (experimento): One experiment is defined by a certain land area during a certain period of time. The land area is most often an open agricultural field, but may also be inside enclosed environments such as greenhouses.

- Event (evento): One event is an important fact within one experiment. Events may be of three types: (i) intervention, associated with a change in the experimental land area (e.g., harvest); (ii) characterization, associated with data acquisition without change in the land area (e.g., characterization of soil granulometry); and (iii) planning, representing a record associated with neither a physical change in the land area nor new data acquisition (e.g., nutrient application recipes).

- Person (pessoa): One person is an individual who may be responsible for a project, an event, or the data of an event.

- Static data (dado estático): Data generated by events are termed static because events are defined at specific moments within one experiment. Static data have x and y spatial coordinates as attributes. Additional attributes depend on the type of static data, with each type stored in a dedicated table. Soil granulometry (granulometria solo), soil apparent electrical conductivity (condutividade elétrica aparente), and harvest yield (produtividade) are examples of static data types.

- Dynamic data (dado dinâmico): Data acquired continuously during the course of one experiment are termed dynamic data. Date is one attribute of dynamic data. Additional attributes depend on the type of dynamic data. Meteorological information is one example of dynamic data.
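The entities listed above can be sketched as plain Python data classes. This is only an illustration of the descriptions in the text: all class and field names are hypothetical, and the relationships assumed here (experiments grouped under a project, events and dynamic data grouped under an experiment, static data grouped under an event) are one reading of those descriptions, not a transcription of Figure 1.

```python
from dataclasses import dataclass, field
from datetime import date
from enum import Enum
from typing import Dict, List, Optional


class EventType(Enum):
    INTERVENTION = "intervention"          # changes the land area (e.g., harvest)
    CHARACTERIZATION = "characterization"  # acquires data, land area unchanged
    PLANNING = "planning"                  # record only (e.g., nutrient recipe)


@dataclass
class Person:
    name: str


@dataclass
class StaticDatum:
    x: float                      # spatial coordinates are common to all types
    y: float
    attributes: Dict[str, float]  # type-specific attributes (assumed layout)


@dataclass
class DynamicDatum:
    day: date                     # date is the common attribute
    attributes: Dict[str, float]


@dataclass
class Event:
    day: date
    kind: EventType
    responsible: Optional[Person] = None
    static_data: List[StaticDatum] = field(default_factory=list)


@dataclass
class Experiment:
    name: str
    land_area: str                # a land area over a period of time
    start: date
    end: date
    events: List[Event] = field(default_factory=list)
    dynamic_data: List[DynamicDatum] = field(default_factory=list)


@dataclass
class Project:
    name: str
    responsible: Optional[Person] = None
    experiments: List[Experiment] = field(default_factory=list)


# Hypothetical example: one harvest event with one yield measurement.
harvest = Event(day=date(2014, 5, 20), kind=EventType.INTERVENTION,
                responsible=Person("Field technician"))
harvest.static_data.append(
    StaticDatum(x=10.0, y=20.0, attributes={"produtividade": 85.2})
)
experiment = Experiment(name="Example field", land_area="open field",
                        start=date(2013, 10, 1), end=date(2014, 9, 30),
                        events=[harvest])
project = Project(name="Example project", experiments=[experiment])
```

In the database, each type of static data is stored in a dedicated table, so the generic `attributes` dictionary used here stands in for what would be separate, typed tables.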


Figure 1: Conceptual data model of the CTBE Database of Agricultural Experiments


4. Logical data model

The logical data model of BDAgro is presented in Figure 2, representing data tables, their attributes, and relationships.

Figure 2: Logical data model of the CTBE Database of Agricultural Experiments


5. Physical data model

The physical model is encoded in the SQL script that created BDAgro. The script is available to CNPEM personnel at the following internal location:

Central de Documentos > Central de Documentos > Programa de Pesquisa Básica > e-Science > BDAgro > Memorando Tecnico > BDAgro_141002.sql

The script will be provided to external researchers upon request.

6. Examples of query and table

Information is retrieved from BDAgro through explicit SQL queries. As an example, the following query returns the table of events from one selected experiment.

SELECT id_evento,
       E.data,
       AT.descricao AS atividade,
       RD.nome_pessoa AS responsavel_dado,
       RC.nome_pessoa AS responsavel_campo,
       OB.descricao AS objeto,
       TE.descricao AS evento,
       TS.descricao AS laboratorio_sensor,
       EX.nome_experimento AS experimento,
       DB.descricao AS tp_modo_aquisicao_dado,
       detalhamento_evento
FROM evento E
INNER JOIN tp_atividade AT ON AT.id_tp_atividade = E.id_tp_atividade
INNER JOIN pessoa RD ON RD.id_pessoa = E.id_responsavel_dado
INNER JOIN pessoa RC ON RC.id_pessoa = E.id_responsavel_campo
INNER JOIN tp_objeto OB ON OB.id_tp_objeto = E.id_tp_objeto
INNER JOIN tp_evento TE ON TE.id_tp_evento = E.id_tp_evento
LEFT OUTER JOIN tp_laboratorio_sensor TS ON TS.id_tp_laboratorio_sensor = E.id_tp_laboratorio_sensor
INNER JOIN experimento EX ON EX.id_experimento = E.id_experimento
LEFT OUTER JOIN tp_modo_aquisicao_dado DB ON DB.id_tp_modo_aquisicao_dado = E.id_tp_modo_aquisicao_dado
ORDER BY id_evento;

Figure 3: List of events from one selected experiment, as retrieved through the SQL query
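The query above mixes INNER JOIN and LEFT OUTER JOIN. With a LEFT OUTER JOIN, an event that has no associated laboratory/sensor or data acquisition mode is still listed, with an empty value in that column, whereas an INNER JOIN would drop it. The sketch below reproduces this pattern on a reduced set of tables using Python's built-in SQLite module rather than PostgreSQL. Table and column names follow the query; the column types, the sample rows, and the WHERE clause that selects one experiment are assumptions made for illustration (the query as printed has no such filter).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE experimento (
    id_experimento INTEGER PRIMARY KEY,
    nome_experimento TEXT
);
CREATE TABLE tp_evento (
    id_tp_evento INTEGER PRIMARY KEY,
    descricao TEXT
);
CREATE TABLE tp_laboratorio_sensor (
    id_tp_laboratorio_sensor INTEGER PRIMARY KEY,
    descricao TEXT
);
CREATE TABLE evento (
    id_evento INTEGER PRIMARY KEY,
    data TEXT,
    id_tp_evento INTEGER NOT NULL
        REFERENCES tp_evento (id_tp_evento),
    id_tp_laboratorio_sensor INTEGER  -- optional: may be NULL
        REFERENCES tp_laboratorio_sensor (id_tp_laboratorio_sensor),
    id_experimento INTEGER NOT NULL
        REFERENCES experimento (id_experimento)
);
-- Hypothetical sample rows.
INSERT INTO experimento VALUES (1, 'Experiment A'), (2, 'Experiment B');
INSERT INTO tp_evento VALUES (1, 'intervention'), (2, 'characterization');
INSERT INTO tp_laboratorio_sensor VALUES (1, 'soil sensor');
INSERT INTO evento VALUES
    (1, '2014-03-01', 2, 1, 1),
    (2, '2014-05-20', 1, NULL, 1),
    (3, '2014-06-02', 2, 1, 2);
""")

QUERY = """
SELECT E.id_evento,
       E.data,
       TE.descricao AS evento,
       TS.descricao AS laboratorio_sensor,
       EX.nome_experimento AS experimento
FROM evento E
INNER JOIN tp_evento TE ON TE.id_tp_evento = E.id_tp_evento
LEFT OUTER JOIN tp_laboratorio_sensor TS
    ON TS.id_tp_laboratorio_sensor = E.id_tp_laboratorio_sensor
INNER JOIN experimento EX ON EX.id_experimento = E.id_experimento
WHERE E.id_experimento = ?
ORDER BY E.id_evento;
"""

# Select the events of experiment 1 only; the id is passed as a parameter.
rows = conn.execute(QUERY, (1,)).fetchall()
for row in rows:
    print(row)
```

Event 2 has no laboratory or sensor, yet it remains in the result with `None` in the `laboratorio_sensor` column. Replacing the LEFT OUTER JOIN with an INNER JOIN would remove that row from the listing.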


7. Analytical module

Data analysis is performed through analytical steps recorded in tables distinct from those holding raw data. This strategy creates a separate analytical environment within BDAgro, which is not represented in the conceptual and logical data models of Figures 1 and 2, respectively. Two tables associated with the analysis of static data are shown in Figures 4 and 5. The table in Figure 4 (analise_tp_atributo) lists the types of attributes, each identified by a primary key (id_tp_atributo). This table also identifies the functions for linearization (column linearizacao) and filtering (column filtro) applied to raw data. These functions create two data states, linearized (linearizado) and filtered (filtrado), in addition to the raw (bruto) data state.

Each data state is recorded as one attribute of the data preparation table (analise_preparacao_pontosxy) shown in Figure 5. Note that this data preparation step maps raw data from distinct tables and attributes into a single column (bruto) of the data preparation table. Such data preparation tables will be the basis for constructing data analysis workflows. In a recent publication (DRIEMEIER et al., 2014), we described a version of the analysis workflow for experiments in sugarcane precision agriculture.
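The data preparation step can be sketched in Python as follows. The sketch only illustrates the idea of mapping raw values from different attribute types into one table with three data states. The specific functions are placeholders, not those used in BDAgro: the identity and the natural logarithm stand in for linearization, and simple range checks stand in for filtering. Applying the filter to the linearized value, recording a rejected value as `None`, and carrying x and y coordinates in each row are likewise assumptions.

```python
import math
from typing import Callable, Dict, List, NamedTuple, Optional, Tuple


class AttributeType(NamedTuple):
    id_tp_atributo: int
    name: str
    linearizacao: Callable[[float], float]  # placeholder linearization
    filtro: Callable[[float], bool]         # placeholder filter: True = keep


class PreparedPoint(NamedTuple):
    id_tp_atributo: int
    x: float
    y: float
    bruto: float                  # raw value
    linearizado: float            # value after linearization
    filtrado: Optional[float]     # linearized value, or None if rejected


# Hypothetical attribute types with placeholder functions.
ATTRIBUTE_TYPES: Dict[int, AttributeType] = {
    1: AttributeType(1, "produtividade",
                     lambda v: v, lambda v: 0.0 < v < 200.0),
    2: AttributeType(2, "condutividade_eletrica_aparente",
                     math.log, lambda v: -1.0 < v < 5.0),
}

# Hypothetical raw data keyed by attribute type: (x, y, raw value).
RAW_SOURCES: Dict[int, List[Tuple[float, float, float]]] = {
    1: [(10.0, 20.0, 85.2), (11.0, 20.0, 950.0)],
    2: [(10.0, 20.0, math.e)],
}


def prepare(raw_sources):
    """Map raw values of all attribute types into one list of prepared rows."""
    points = []
    for id_tp_atributo, measurements in raw_sources.items():
        attr = ATTRIBUTE_TYPES[id_tp_atributo]
        for x, y, raw in measurements:
            linear = attr.linearizacao(raw)
            filtered = linear if attr.filtro(linear) else None
            points.append(
                PreparedPoint(id_tp_atributo, x, y, raw, linear, filtered)
            )
    return points


points = prepare(RAW_SOURCES)
for point in points:
    print(point)
```

Keeping the raw value next to the derived states lets each analysis step be traced back to the original measurement, and a change in a linearization or filtering function only requires recomputing the derived columns.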

Figure 4: Table of types of attributes specifying the applied linearization and filtering functions, as visualized in pgAdmin


8. Conclusion

The CTBE Database of Agricultural Experiments (BDAgro) was created and is described in the present Technical Memorandum. This database will store data from CTBE agricultural experiments in structured form, enabling long-term data readability, statistical analysis integrated with the database, and data-driven collaboration with other research groups.

Figure 5: Table of data preparation for analysis, as visualized in pgAdmin. The table contains three data states, raw (bruto), linearized (linearizado), and filtered (filtrado), for all types of attributes


9. References

DRIEMEIER, C. E.; LING, L. Y.; PONTES, A. O.; SANCHES, G. M.; FRANCO, H. C. J.; MAGALHÃES, P. S. G.; FERREIRA, J. E. Data analysis workflow for experiments in sugarcane precision agriculture. In: E-Science (e-Science) IEEE 10th Int. Conf. São Paulo, 2014.

ELMASRI, R.; NAVATHE, S. B. Fundamentals of Database Systems. 6. ed. Addison-Wesley, 2010.

FERREIRA, J. E.; FINGER, M. Controle de concorrência e distribuição de dados: a teoria clássica, suas limitações e extensões modernas. XII Escola de Computação, IME-USP, 2000.

HEUSER, C. A. Projeto de Banco de Dados. 6. ed. Bookman, 2008.

HEY, A. J. G.; TREFETHEN, A. E. The data deluge: An e-science perspective. 2003.

TANSLEY, S.; TOLLE, K. M. (Ed.). The fourth paradigm: data-intensive scientific discovery. 2009.