Autore Topic: Azure Cosmos DB – A polymorphic database for an expanding data universe  (Letto 117 volte)

0 Utenti e 1 Visitatore stanno visualizzando questo topic.

Offline Flavio58

Azure Cosmos DB – A polymorphic database for an expanding data universe
« Risposta #1 il: Novembre 01, 2018, 06:08:05 pm »
Advertisement
Azure Cosmos DB – A polymorphic database for an expanding data universe

You may have been hearing a type of debate that has been gaining popularity recently in the database world. Database Administrators and NoSQL Developers have been voicing their concerns and you may have seen the following.
     

There is a discussion that occurs frequently in the data world today, which centers around comparisons between traditional databases that are based on relational theory e.g. Oracle or SQL Server and a more modern wave of platforms commonly referred to as “NoSQL” databases. Proponents from both types of databases tend to get into disputes concerning, “which database is best?” However, this can be a misguided point of contention. To understand why, it can help to trace back through history, and reflect on how NoSQL databases first rose to prominence.



In the past 15 years, database technology has radically expanded beyond what could be described, to use a physics analogy, as a singularity in the initial conditions of our data universe: transactional processing using relational databases. This expansion has grown with improved technology and adoption, fueled by demand for the capability to process more data, as well as different kinds of data. There has been a revolution in the exchange of data, precipitated by the social media and mobile age. This has given rise to the increased popularity of different transient, flexible storage mediums, and protocols such as XML and JSON. While these became de facto standards in various forms of web publishing and messaging, methods of building applications have also evolved and matured. Object-oriented design in applications has increased in popularity, which has given rise to object-relational impedance mismatch. This further throttled the way in which we can build and maintain applications using relational databases. In addition, we have started to store various kinds of unstructured data, log files, binary images, text, sensory data, and more. This has given rise to distributed computing architectures like Hadoop and Spark that have allowed us to perform big queries on large data sets, without the need to apply structure such as schema to it at design time. In short, the variety of data structures we now need to manage has changed dramatically.



These changing approaches to modelling data in response to demands for greater flexibility can be thought of as emerging trade-offs between structural integrity vs agility/productivity when it comes to building applications that store any kind of data. This need to be more flexible has been characterized as a paradigm known as schema on read. The idea that data structures can be self-describing or semi-structured, and therefore the schema for applying meaning to data can in some sense live within the consumer/client code rather than being tight-coupled to databases at design time. This leaves databases free to be more flexible in how they ingest data.



In parallel to this shift in consumer demand for more flexible data structures, there has been a phenomenal increase in the volume and velocity of data we are handling in databases. This has given rise to the need to balance transactional integrity with physical availability, latency, and concurrency. Volumes of data being processed have become so large that traditional relational databases can sometimes struggle to offer the levels of overall performance that end users demand. In the case of ACID transactions, this has led to a common practice of relaxing the isolation element of ACID semantics in order to provide greater concurrency. In the case of availability and latency, this has given rise to the emergence of distributed database architectures, which in turn require us to balance trade-offs between consistency (of replicas) and availability. Learn more by reading about the CAP Theorem and PACELC theorem



In the context of deciding which type of database engine to use, we now have an emerging set of spectrums for data storage and persistence in an “expanding data universe”. The below diagram illustrates this, and suggests where we might place some of the emergent paradigms that have been solidifying within them.



Data Universe



Ultimately these spectrums have emerged and solidified through the increasing variety, velocity, and volume of data that modern day applications need to handle, and the advances in technology to support them.



The isolation trade-offs in ACID databases have been known for some time, and CAP Theorem/PACELC theorem have received a lot of recent press in exposing some of the trade-offs that relate to replication consistency spectrum in distributed databases. The emergence of a data structure spectrum is perhaps less discussed, but just as important in understanding the shifting paradigms in the database world. This proliferation of data paradigms means that the questions we ask about database technologies should really be centered around the business use case, and where along these spectrums that use case would be optimally served by applying the appropriate paradigm, rather than asking which database is the best. We now live in an “expanding data universe”, in which there is no single paradigm that fits best for all data structures or data persistence scenarios.



Of course, there is one database that covers more ground than other databases across these expanding and maturing paradigms, Microsoft’s polymorphic Azure Cosmos DB!



Azure Cosmos DB



The data structure spectrum



On the data structure spectrum, Azure Cosmos DB provides a revolutionary common type system referred to as atom-record-sequence (ARS). This facilitates multiple data models at an API and wire protocol level, each representing the different data models shown in the earlier diagram. Although these models may seem unrelated, they conceptually occupy points along a spectrum, as each represents a different level of trade-off between applying structure/meaning to data at design-time vs query time, or schema-on-write vs schema-on-read. From left to right, in the case of column-family, this is provided in the form of the open source Cassandra API. For the document data model, users have a choice between the native SQL API, or the open source MongoDB API. For graph data, users can adopt the Gremlin API. Finally for a key-value store, users can opt for the Table API.



The data persistence spectrums



Similarly on the data persistence spectrum, Azure Cosmos DB is one of the only databases in the world to offer multiple consistency abstractions with turnkey enablement for replication, which can be overridden per request. Azure Cosmos DB also offers the ability to do ACID transactions with snapshot isolation.



As shown in the earlier diagram, all points along the data persistence spectrum for replication consistency are supported, and were in fact uniquely pioneered as well-defined abstractions in Cosmos DB. Read my blog title “Azure Cosmos DB – Tunable Consistency!” for a discussion of the benefits and use cases for each consistency setting, along with descriptions using real world examples. For a more in-depth exploration of the data consistency models we created for Cosmos DB, please take a look at our e-book.



Global distribution and SLAs



In addition to this wide coverage across all these burgeoning spectrums, Azure Cosmos DB is also one of the only turnkey globally distributed databases in the world, enabling ultra-low latencies for both reads and writes in geo-distributed applications through seamless replication and multi-master write region capability, with automatic conflict resolution. Through a tightly controlled resource governance model made possible through a cloud-native software architecture, it also offers financially backed SLAs across consistency, availability, latency and throughput.



Flexible options



Azure Cosmos DB does not circumvent the need to make informed decisions between different points along each spectrum, nor does it completely abdicate the need in some cases to choose a different database platform entirely such as a relational database. However, as illustrated above it does provide very wide support across a growing number of spectrum points in an “expanding data universe”, with turnkey convenience and efficiency. It is thus very strongly placed to give excellent coverage across a high number of real world business use cases.


Source: Azure Cosmos DB – A polymorphic database for an expanding data universe


Consulente in Informatica dal 1984

Software automazione, progettazione elettronica, computer vision, intelligenza artificiale, IoT, sicurezza informatica, tecnologie di sicurezza militare, SIGINT. 

Facebook:https://www.facebook.com/flaviobernardotti58
Twitter : https://www.twitter.com/Flavio58

Cell:  +39 366 3416556

f.bernardotti@deeplearningitalia.eu

#deeplearning #computervision #embeddedboard #iot #ai

 

Related Topics

  Oggetto / Aperto da Risposte Ultimo post
0 Risposte
76 Visite
Ultimo post Luglio 14, 2018, 06:01:19 pm
da Flavio58
0 Risposte
162 Visite
Ultimo post Luglio 16, 2018, 08:01:34 pm
da Flavio58
0 Risposte
53 Visite
Ultimo post Agosto 08, 2018, 02:01:23 am
da Flavio58
0 Risposte
98 Visite
Ultimo post Agosto 29, 2018, 12:03:46 pm
da Flavio58
0 Risposte
85 Visite
Ultimo post Settembre 29, 2018, 10:04:10 pm
da Flavio58

Sitemap 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326