Introducing

Prezi AI.

Your new presentation assistant.

Refine, enhance, and tailor your content, source relevant images, and edit visuals quicker than ever before.

BIG DATA

EDB Software Italia srl

Updated April 4, 2017

Transcript

postgres=# EXPLAIN (COSTS off) SELECT *

FROM pgbench_accounts

WHERE filler LIKE '%a%';

QUERY PLAN

---------------------------------------------

Gather

Number of Workers: 5

-> Parallel Seq Scan on pgbench_accounts

Filter: (filler ~~ '%a%'::text)

(4 rows)

postgres=# EXPLAIN ANALYSE SELECT * FROM pgbench_branches b JOIN pgbench_accounts a ON a.bid=b.bid WHERE a.filler LIKE '%a%';

QUERY PLAN

------------------------------------------------------------------------------------------------------------------------------

Nested Loop (cost=0.00..288938.25 rows=1 width=461) (actual time=7218.031..7218.031 rows=0 loops=1)

Join Filter: (b.bid = a.bid)

-> Seq Scan on pgbench_accounts a (cost=0.00..288935.00 rows=1 width=97) (actual time=7218.029..7218.029 rows=0 loops=1)

Filter: (filler ~~ '%a%'::text)

Rows Removed by Filter: 10000000

-> Seq Scan on pgbench_branches b (cost=0.00..2.00 rows=100 width=364) (never executed)

Planning time: 0.476 ms

Execution time: 7218.088 ms

(8 rows)

Time: 7219,480 ms

postgres=# SET max_parallel_degree = 5;

SET

postgres=# EXPLAIN ANALYSE SELECT * FROM pgbench_branches b JOIN pgbench_accounts a ON a.bid=b.bid WHERE a.filler LIKE '%a%';

QUERY PLAN

---------------------------------------------------------------------------------------------------------------------------------------------

Gather (cost=1000.00..189937.35 rows=1 width=461) (actual time=2820.992..2820.992 rows=0 loops=1)

Number of Workers: 5

-> Nested Loop (cost=0.00..188937.25 rows=1 width=461) (actual time=2806.913..2806.913 rows=0 loops=6)

Join Filter: (b.bid = a.bid)

-> Parallel Seq Scan on pgbench_accounts a (cost=0.00..188935.00 rows=0 width=97) (actual time=2806.909..2806.909 rows=0 loops=6)

Filter: (filler ~~ '%a%'::text)

Rows Removed by Filter: 1666667

-> Seq Scan on pgbench_branches b (cost=0.00..2.00 rows=100 width=364) (never executed)

Planning time: 0.416 ms

Execution time: 2823.231 ms

(10 rows)

$ EXPLAIN ANALYZE SELECT sum(length(txt)) FROM test;

QUERY PLAN

---------------------------------------------------------------------------------------------------------------------------------------------

Finalize Aggregate (cost=165005.37..165005.38 rows=1 width=8) (actual time=730.439..730.439 rows=1 loops=1)

-> Gather (cost=165005.15..165005.36 rows=2 width=8) (actual time=730.401..730.435 rows=3 loops=1)

Number of Workers: 2

-> Partial Aggregate (cost=164005.15..164005.16 rows=1 width=8) (actual time=728.536..728.536 rows=1 loops=3)

-> Parallel Seq Scan on test (cost=0.00..143171.43 rows=4166743 width=31) (actual time=0.012..217.970 rows=3333333 loops=3)

Planning time: 0.177 ms

Execution time: 731.256 ms

(7 rows)

Quando si può parlare di "Big Data"?

non esistono metriche esatte per discriminare i casi

dipende dalle tipologie di carico e dalle necessità operative

OLTP/OLAP

Giuseppe Broccolo

DBA e consulente PostgreSQL

BigData

@giubro

gbroccolo7

giuseppe.broccolo@2ndquadrant.it

con PostgreSQL

Francesco Canovai

DBA e consulente PostgreSQL

#bigdata

#postgresql

#postgres

#pycon7

@fcanovai

francesco.canovai

francesco.canovai@2ndquadrant.it

prestazioni

risorse

PostgreSQL

&

Big Data

Risorse

Performance

Parallelizzazione

BRIN

Partizionamento

Tablespace

FDW

Aggregazioni/OLAP

Postgres-XL

Indici & Big Data

Le dimensioni possono essere talvolta superiori a quelle dei dati stessi

Gli indici di notevoli dimensioni sono difficilmente usabili

Indici

strutture binarie su disco, complementari ai dati

organizzazione ad albero

complessità della ricerca dei dati ridotta da ~O(N) a ~O(logN)

funzionano tanto meglio quanto più sono contenibili in RAM

Indici & Big Data

S. Riggs, A. Herrera

range

a| |

d| |

b|c|

a| |

d| |

b|c|

BitMapHeapScan

...WHERE col<vall AND col>val2;

indice "non specifico"

popolamento dei nodi basato sul range di distribuzione dei dati all'interno dei singoli blocchi
estremi di un intervallo
contorno delle geometrie, ...

blocco: insieme di pagine da 8kB di una tabella fisicamente adiacenti

SeqScan

Block Range INdex

(PostgreSQL 9.5)

Partizionamento

SELECT * FROM parent;

QUERY PLAN

---------------------------------------------------------------------------

Result (cost=0.00..84.00 rows=4400 width=44)

-> Append (cost=0.00..84.00 rows=4400 width=44)

-> Seq Scan on parent (cost=0.00..21.00 rows=1100 width=44)

-> Seq Scan on partition1 (cost=0.00..21.00 rows=1100 width=44)

-> Seq Scan on partition2 (cost=0.00..21.00 rows=1100 width=44)

-> Seq Scan on partition3 (cost=0.00..21.00 rows=1100 width=44)

inserimento

nuovi dati

SELECT * FROM parent WHERE col1=condition1;

dati "storici"

inutilizati

QUERY PLAN

---------------------------------------------------------------------------

Result (cost=0.00..84.00 rows=4400 width=44)

-> Append (cost=0.00..84.00 rows=4400 width=44)

-> Seq Scan on parent (cost=0.00..21.00 rows=1100 width=44)

-> Seq Scan on partition2 (cost=0.00..21.00 rows=1100 width=44)

indici poco utili se le tabelle sono molto grandi

problemi nella eliminazione dei dati storici che non servono

PRO

CONTRO

ESEMPIO

Routing non automatico in caso di INSERT
UPDATE non spostano i record compatibilmente ai CHECK

I trigger possono essere usati come workaround, ma hanno un costo

Check su più colonne
Condizioni miste:

Indici diversi tra partizioni
Diversi stats & optimizers costs

CREATE TABLE parent (

id serial PRIMARY KEY,

col integer,

...

);

CREATE TABLE partition1 (

CHECK(col>1 AND col<9)

) INHERITS (parent);

CREATE TABLE partition2 (

CHECK(col=10)

) INHERITS (parent);

ROUTING

INSERT INTO parent

VALUES (

...

);

CREATE TRIGGER router

INSTEAD OF INSERT ON parent

FOR EACH ROW

EXECUTE PROCEDURE route_func();

Ereditarietà

parent

CREATE TABLE parent (

pkey integer PRIMARY KEY,

col1 datatype1 NOT NULL,

col2 datatype2

);

CREATE TABLE partition1 (

CHECK (col1=condition1)

) INHERITS (parent);

CREATE TABLE partition2 (

CHECK (col2=condition2)

) INHERITS (parent);

...

partitions

tablespace

è possibile allocare tabelle, viste, indici, etc. in specifiche locazioni del filesystem

Big Data

storage

CREATE TABLESPACE tbls

LOCATION '/some/specific/path';

CREATE TABLE table1 (

...

) TABLESPACE tbls;

ALTER TABLE table2

SET TABLESPACE tbls;

Possibilità di partizionare

i dati logicamente...

...è possibile farlo fisicamente?

dedicare specifici dischi a differenti oggetti del database

Casi d'uso

Crescita dei dati

new created tables

CREATE TABLESPACE increased

LOCATION '/new/mountpoint/';

ALTER DATABASE mydb

SET TABLESPACE increased;

ALTER TABLE the_increasing_one

SET TABLESPACE increased;

Big indexes

dedicare spazio ai soli indici (es. SSD)
dedicare diversi tipi di filesystem per dati e indici

Partizionamento

dedicare dischi diversi per diverse partizioni logiche dei dati
es. dischi meno performanti per dati storici

CONCLUSIONI

Domande?

PostgreSQL offre molte soluzioni in ambito Big Data, e molte sono in via di sviluppo

Alcune di queste coprono alcuni aspetti piuttosto che altri - scegliete quella più vicina al vostro caso d'uso

Non sempre problemi complessi richiedono soluzioni complesse

Parallelismo

Postgres-XL

Foreign Data Wrapper

Aggregazioni parallele

Sequential scan parallelo

Join parallele

Sequential scan parallelo

Grouping sets

Tablesample

Grouping sets

Viste materializzate

Foreign Data Wrapper

Robert Haas - 20/01/2016
Gather rimandato a dopo la JOIN

Insiemi di campi su cui eseguire GROUP BY e unire i risultati

David Rowley - 21/03/2016
Calcolo finale delle aggregazioni rimandato

Estensibile:

Attenzione: query su foreign table possono avere problemi di ottimizzazione differenti

Benchmark: cluster con 16 nodi AWS i2.xlarge

Gestione oggetti esterni al database:

CUBE e ROLLUP sono shortcut per particolari GROUPING SETS

Un problema ignorato a lungo

CREATE MATERIALIZED VIEW pgmatview AS SELECT * FROM pgtab;

Robert Haas - 11/11/2015
Nuovi parametri:
max_parallel_degree
parallel_tuple_cost
parallel_setup_cost

Migliorati piani e tempi di esecuzione

Data node:

Esecuzione query
Sharding vs Replication
Comunicazione dati
Possibilità di aggiungere nodi

Non sempre sono necessari tutti i dati di una tabella

Metodi di campionamento:
SYSTEM
BERNOULLI

Dalla 9.5

PostgreSQL 9.5:

GROUPING SETS
CUBE
ROLLUP

CREATE EXTENSION postgres_fdw;

CREATE SERVER pg_srv FOREIGN DATA WRAPPER postgres_fdw

OPTIONS (connstring);

CREATE USER MAPPING FOR user SERVER pg_srv

OPTIONS (user 'username', password 'secret');

CREATE SCHEMA srv_schema;

IMPORT FOREIGN SCHEMA public FROM SERVER pg_srv INTO srv_schema;

Fino alla 9.4: UNION ALL

tsm_system_rows
Numero di record preciso
tsm_system_time
Numero di record ottenibili nel tempo

Global Transaction Manager:

XID e snapshot centralizzati
GTM standby
GTM proxy

CUBE: tutte le combinazioni possibili dei campi

Coordinatori:

Gestione connessioni in arrivo
Query planning
Gestione 2PC
Aggregazione dei risultati parziali

ETL
Database federati
Sharding:
9.5 - Foreign Table Inheritance

Numerosi aggiornamenti nella 9.6

Sequential scan parallelo
Join parallele
Aggregazioni parallele

Dalla versione 9.3
9.4: REFRESH CONCURRENTLY
Supporto agli indici

SELECT l.regione, p.categoria, sum(prezzo) AS totale

FROM vendite v

JOIN luoghi l ON (v.id_luogo=l.id)

JOIN prodotti p ON (v.id_prodotto=p.id)

GROUP BY GROUPING SETS ((l.regione, p.categoria), ())

ORDER BY regione, categoria;

regione | categoria | totale

-------------+---------------+---------

Toscana | cat1 | 104

Toscana | cat2 | 100

Umbria | cat1 | 14

Umbria | cat2 | 55

| | 273

ROLLUP: tutti i campi su cui è stato dichiarato l’operatore, tutti meno l’ultimo, tutti meno gli ultimi due...

SELECT expression

FROM table

TABLESAMPLE sampling_method ( argument [, ...] ) [ REPEATABLE ( seed ) ]

CREATE TABLE products (

product_no integer,

name text,

price numeric

) DISTRIBUTE BY HASH(product_no);

CREATE TABLE products (

product_no integer,

name text,

price numeric

) DISTRIBUTE BY REPLICATION;

http://www.tpc.org/information/benchmarks.asp#tpc-h

Non solo zucchero sintattico

9.6(?): pushdown JOIN

www.depesz.com

32 core

range

Database cluster scalabile basato su PostgreSQL

Soluzione ibrida OLTP/OLAP
AXLE: Advanced Analytics for Extremely Large European Databases

range

Choose a template

Modular - Dark (AI Assisted)

Revolutionize your presentations with our Modular Prezi AI-assisted presentation template, a versatile and customizable solution that adapts to your unique content, providing a visually stunning and cohesive framework for professionals, educators, and creatives.

Sheet Music (AI Assisted)

Elevate your presentations with our Sheet Music Prezi AI-assisted presentation template, seamlessly blending aesthetics and functionality for a harmonious visual experience.

Colorful Nature - Dark (AI Assisted)

A whimsical flower motif sets the fun tone for this Prezi AI-assisted presentation template. Just add your own text, images, videos, or other content to create a memorable and engaging presentation your audience will love. Like all Prezi templates, it’s easily customizable.

See more templates →

Presentations from around the world

Epistemologia - Fase 2

Yasser Echavez

Эсийн мембрагаар ион болон бодисын шилжил

Э. Номин

PSICOLOGÍA CULTURAL

maria palmer

See staff picks →

Learn more about creating dynamic, engaging presentations with Prezi

Why Prezi is better