I have written a number of posts in the past working on shared databases and the concepts around these can be read at Sharding or No Sharding of Database – Working on my Weekend Project for the starters point of view. In a recent discussion at a user group, I had someone ask me what is the rationale around building a sharding key and what should be used for sharding their database. The concept, though common, it is not in black-and-white to what should be used as an sharding key. There are guiding principles which can be applied. Let us learn about Database Sharding.

As the discussion matured, I thought to pen down some of the thoughts that were discussed. These can be used as guiding principles in general.

Database Sharding – How to Identify a Shard Key? shard-key-decision-01-800x390

Identifying an Appropriate Shard Key

One of the most difficult tasks in designing a data model is to identify an appropriate shard key. Many of the modeling decisions depend on this choice, and once committed, you cannot easily change the key. Hence, getting this right at the initial design phase is critical.

A best practice is to choose the most granular level of detail for the shard key.

Consider a SaaS solution provider that offers a service to multiple companies, each of which has a division with numerous assets. Each asset may generate a large amount of data or be used as the pivot point for many database transactions. One data modeler may choose to shard based on company, another on division, and yet another on asset. According to the best practice, asset is a good starting point.

Consider whether any DML queries will traverse the shards

In an ideal data model, no DML actions traverse across shards. As this ideal is very unlikely, the goal is to keep such requirements to a minimum. Such requirements can add complexity to the Data Access layer, reduce the usefulness and availability of RDBMS semantics, and expose your solution to greater risk should a shard become unavailable.

Depending on the database queries, you can decide to have multiple, logical groupings of shards, each one capable of been sharded

A logical grouping of shards is referred to as a shard set. A shard set is a collection of entities and database objects that are identical across shards within the shard set. For instance, a logical data model may have distinct functional areas, such as Inventory, Sales, and Customers, each of which could be considered a shard set. Each shard set has a shard key, such as ProductID for inventory and CustomerID for both Sales and Customers. A less common alternative for the Sales shard set is a shard key based on SalesOrderID. The choice depends on whether cross-shardlet queries can be handled.

It is common to encounter a case where logical relationships exist among shard sets—a big consideration when defining appropriate boundaries for the functional areas. When a relationship exists, the application tier must compensate for cross-area transactions. In this example, the Sales shard set has a logical relationship with Products shard set and a reference to ProductID. The Products shard set owns the metadata of the product. Of course, a reasonable option is to treat the Products table in the Sales shard set as a reference table. But this cannot be always possible because there can be a reference for Products even in the Orders shard and Shipment/Delivery shards etc. Think before you take a decision.

Finally, consider the data type of your shard key

The choice of shard key data type impacts database maintenance, troubleshooting, and resource consumption. The most efficient data type has an acceptable domain, is small, has a fixed storage size, and is well-suited for the processor. These factors tend to constrain the candidates for data types to integers (smallint, int, and bigint), char (4 -> 8), or binary (4 -> 8). Of these, bigint (Int64) is the best trade-off, but you can use smaller integer types if your business rules require.

The shard key must uniquely separate shardlets from one other. For example, if CustomerID is the shard key, then its value is unique among customers. In an entity that has child records of Customer, the shard key serves as part of the record’s primary key.

I am sure some of these discussion points have brought some insights that made a great write for me too. Do let me know if there are areas that I missed in my considerations. I would love to learn from you about them too.

Reference: Pinal Dave (http://blog.sqlauthority.com)

First appeared on Database Sharding – How to Identify a Shard Key?

Database Sharding – How to Identify a Shard Key?

Identifying an Appropriate Shard Key

A best practice is to choose the most granular level of detail for the shard key.

Consider whether any DML queries will traverse the shards

Depending on the database queries, you can decide to have multiple, logical groupings of shards, each one capable of been sharded

Finally, consider the data type of your shard key

Trending Articles

Practice Sheet of Right form of verbs for HSC Students

Download: FK ft Shenky – Nakuyewa ”Prod by: Shenky”

How to win at Markstrat (Markstrat Tips and Tricks) – Vodites

Ominde Commission Report and Recommendations – Ominde Report of 1964

Bureau of Internal Revenue: Regional Offices (Directory)

GO 53 on Enhancement of Ex-gratia upto 5 Lakhs Toddy Tappers in Telangana

Cakewalk CA-2A Leveling Amplifier v2.0.1.97 WiN, v2.0.1.96 OSX Incl Keygen

Mp3 Download: Mdu - Kunjenjenjena

How the kill the job , when DTP request running for long hours.

Microsoft Intune から展開しているアプリのアップデートについて

18-year-old girl was beaten for half an hour by two Northampton men in 'an...

Car crash in Dunton Bassett leaves driver in critical condition

Macky 2, Two Others In Road Accident

Application log 00000000000000089514: Could not convert queue DLVST90CLNT

Detroit mafia: D’Anna Brothers agree to plea deal

Delivery block field greyed out using VA02

Muloraki Au

【個人撮影】スマホのプライベート映像♪「中に出さないで///」カラオケ屋での生ハメ撮りが流出ｗ【リベンジポルノ】＠PornHub

BREAKING NEWS: Diamond Platnumz Is Reported Dead After Ghastly Car Accident

FIAT 500 B0111 B0112