SQL Extensions for Decision Support (Cube, Aggregation sets, Pivot, etc.) Jens-Peter Dittrich. Yovisto Academic Video Search. Node Storage join table Data Model Decomposition size Partitioning ship Index Fact Assume Approach Semijoin Eidgenössische Technische Hochschule Zürich (ETHZ) work that lin cod hard mar siz pictur integration information merg project explored part converging techniqu exampl good merged evidenc ther cidr this personal som already databas specific domain they futur does functionality add-on with bloated engin oltp current application well different history long laes mostly olap structured writ read model operator pull-based oltp engin prediction current siz pictur stream impractical strategy front-end featur market includ market both from pressur system selling fact vendor faction syst singl illusion con common behind engin multipl hid vendor approach current siz pictur outbound processing inbound gull stream exampl siz pictur icd abstract gon com tim whos fat ugur stonebrak management information personal search text scientific includ exampl engin separat work does this application processing data every idea application oth cop abl over that extended architectur relational centric oltp with started vendor phras singl summed dbms year last siz pictur laes favor therefor tim indexing long unexpectedly cardinaliti high dimension many with problem report actually they peopl tell analysis essbas tool based off compani som rolap industry fight discussion molap applied well lectur this taught techniqu cas business wheth questonabl charecteristics current giv recommendation anoth data part olap hybrid pen systerm conferenc roussopoulos nick deligiannakis antonios yannis literatur stout index tabl fact dwarf plez exam molap scenarios certain work howev well scal guaranteed approach this therefor exampl high cardinaliti larg hug dominat leav giv nod numb level then cardinality dimension each assum analysis cas worst dwarf blli pen syst conferenc 
roussopoulos nick deligiannakis antonios yannis literatur stout index tabl fact dwarf plez exam molap light-weight valu look-up tim mann compressed highly stor aggregat almost materializ that structur index clev invent processing query dbms idea cor approach non-relational olap multidimensional molap thiel prof lectur hardwar architectur siz hierachy memory about knowledg hull approach specialized widmay tailored level obliviousness algorithm cache-oblivious optimiz approach rat cach accelerator applix sybas product factor tremendous gain performanc waist designed algorithm stall spac waist els everything touched that into should data relevant only world ideal onlin storag memory main byt regist goal rat cach directiv pragma through pattern guessing automatically prefetch abl cpus many into loaded until wait accessing than slow much access from read miss data memory main considerably disk hard low compression partition vertical learned lesson onlin storag thin rmin rat cach fragmentation possibl whenev operation avoid answ read amoun queri data high seek iot requir syst larg olap siz pag small with best work oltp partition vertical compressed book gigabyt managing compressing exist schem encoding different many bit-sliced effect sam singl directly compress bit-slic into cutting instead alternativ partition binary tabl fact true valu each list typ numb countabl column bit need limited numb domain assumption index reading gigabyt doz possibly read mdex projection rows million assuming improvement performanc tim into roughly translat partition vertical required spac original only this bit-sliced representation binary tabl fact exampl index calculated numb calculation simpl with valu appropriat retriev slot prop access num giv pag disk kbyt singl length byt numb unused exist hol ordered from valu sequenc consist then column assum detail index projection quass representation otherwis read hav valu used partition vertical represent attribut singl projection redundant 
additional creat with stick idea index projection vldb databas tradeoff performanc literatur tabl attribut width tupl byt selected wjjj stor column study recent som projection redundant additional creat with stick idea index projection quass representation otherwis read hav valu used partition vertical represent attribut singl vldb databas tradeoff performanc literatur tabl attribut width tupl byt selected stor column study recent som disk hard tap what student trend hardwar current giv important mor becom accelerator memory main monetdb early sinc sybas product several stor column system sigmod setrag copeland georg databas tran fil transposed searching batory distributed rows many inefficient disadvantag accessed need only when very advantag attribut accessing optimized mit model storag decomposition disk hard tap what student trend hardwar current giv important mor becom accelerator memory main monetdb early sinc sybas product several stor column system sigmod setrag copeland georg databas tran fil transposed searching batory literatur distributed rows many inefficient disadvantag accessed need only when very advantag attribut accessing optimized model storag decomposition storing one-column lnam tabl two-column into tabl split schmidt fnam hgds frank model storag decomposition representation array-l implicitly fil operator user syst operating cache-controll compil program granularity control omin nearlin archiv ttis onlin main memory byt bwgs tim access hierarchy storag capacity maim frank tbii meier lnam model storag decomposition representation array-l implicitly storing one-column lnam tabl two-column into tabl split schmidt fnam hgds frank model storag decomposition sigmod setrag copeland georg databas tran fil transposed searching batory distributed rows many inefficient disadvantag accessed need only when very advantag attribut accessing optimized mit model storag decomposition maim frank tbii meier lnam model storag decomposition representation array-l 
implicitly storing one-column lnam tabl two-column into tabl split schmidt fnam hgds frank model storag decomposition onlin maln byt main cach regist tim access hierarchy storag capacity ful operator user syst cache-controll compil program granularity control offlin nearlin archiv ttis hierarchy memory happ question togeth record attribut sequentially stored record frank sinon hik squad hugo schrnidt schmidt meier sifl ffat hung model storag n-ary standard rat clock l986 malt year increas much tim access year over considerably improved ratio memory main development hardwar blli capacity much tim access disk year over considerably improved ratio disk hard development hardwar zundt system nod holding machin result intermediat send step second nod replicat step2 techniqu mor many parallelism data machin redundant tim queri machin sam cub multipl distribution load challeng optimization furth zundt system nod holding machin result intermediat send step second nod locally join step first bull tabl dimension replicat exampl performing machin separat result intermediat send exploited with index join howev ignoring cub partitoned query process steps with directly start could algorithm semijoin index join providing them optimiz join distributed allow remedy high becom point som replicating cost schema galaxy schem multipl among shared that tabl tabl dimension larg than mor having consid limited partitioning discussion nod replicat step2 dimension tabl fact partition step1 schema star simpl exampl combination practic dimension small good query parallel sid multipl tabl replicat degre som allow idea relation small much help does partitioning join presenc replication data instanc singl join hash analogi sam this avoid machin distribut repartition non repartitioned shipped then runtim tabl directed option best slid previous partitioned both join co-located partitioning data variant dbms architectur implementation from join hash grac rememb join presenc partitioning data that 
paralleliz serv this wheth looking worth algorithm som hashing apply whenev fact tim indexing don phas wher doing thus sam idea cor lectur labl nod join presenc partitioning data exampl becom effect which element wher groups into mbles column multipl singl eith column hash local that plac hrst partition idea join presenc partitioning data hull good choic sensitiv very method this gain filt tupl superset probl sent need that keys numb than short network reduc list small send only goal bitmaps valu function hash techniqu tilt bloom column foreign instead bit-vector ship approach semijoin sam algorithm nod local perform predicat join with assum semijoin optimization cost-based selectivity based sens mak method wheth decid need consequenc plac entir shipping than costly mor approach this then discarded non anyway needed rows assum shipped tabl siz reduc allows selection well work discussion assum semijoin nod local perform predicat join with optimization cost-based selectivity semi-join based sens mak method wheth decid need consequenc plac entir shipping than costly mor approach this then discarded non anyway needed rows assum shipped tabl siz reduc allows selection well work discussion result send that tupl select perform ship nod predicat join with assum semijoin send nod projection tam comput ship predicat join with assum semijoin aft becom bigg not sending befor apply selection contain query ssde oth small always ship optimization scan full equivalent disk hard sam roughly network entir shipping cost high tabl nod problem discussion aharu versa vic locally join sam aand both oth tabl nod sid ship approach naiv operator join standard execut data ship approach naiv very small really work only idea good relation entir shipping basically pag access random high this cost probl that cach from pag fetch whenev index machin machin resid wher comput nod general processing join distributed small really work only idea good relation entir shipping basically pag access random 
high this cost probl that cach from pag fetch whenev index machin operator join standard execut data ship approach naiv very machin resid wher comput nod general processing join distributed warehous data parallel indexing processing query idea expensiv still could this each siz bitmaps operation requir intersect index bitmap using plan probl possibl tim processing memory main sav effect tupl redundant eliminat join following corrrect result overall not superset produc typ replacing join algorithm optimiz filt bloom week last slid them exist they keys generat abl anyway input scan applied only therefor selection pass tupl which determin cannot returned from reversibl hlter used function hash general data required sup retum bitmap small allows bit bloom join star bloom-filt apply wher question llll material additional indexing processing query yawning dialect oracl this wher from select som data only consid restricted query tupl creat year contain assum tim partition exampl certain belong element that partition into split attribut idea rang partitioning horizontal split idea partitioning horizontal queri computation concurrent allow machin multipl distribut operation singl data remov processing considered from tupl less certain skip allows that predicat contain query equal schema sam hav piec partition disjoint into tabl fact system information institut ethz jen dittrich jens-pet sos warehousing data