Horizontally Scaling NoSQL data across multiple servers with MongoDB sharded clustering for High availability.
When we start dealing with too much data our current system can handle we have 2 options, scale horizontally or Vertically. Vertically scaling is often way more expensive because we have to put too much power in a single machine. it also became vulnerable for a term called a single point of failure. Adtech firms like ours needs MongoDBs clustering with HA for better data reliability and blazing fast speed.
Most of the example in the web show example using a single machine, it means they setup everything in the single machine but here I'll explain everything how to setup in actual production using multiple cloud servers that communicate between private but mongos router is open to the public to receive connections from anywhere in the world. Our Setup will look like the below image.
I’ll do the MongoDB clustering on multiple servers in digital ocean. In “mongo-shard-01” I’ll setup 3 shard replica sets and in each replica set, there will 3 replica servers running, In “mongo-config-server” I’ll put my configuration of all my shards. and “mongo-router-01” will have mongos setup for data routing.
Now first lets setup mongo in all the servers first by following this instruction.
Step 1- Setting up the Sharding server
We will setup 9 replica or 3 replica sets and start them as 3 shards in “mongo-shard-01” using the below commands.
After executing this command check using “ps -ef | grep mongod”. You’ll see 9 replica is running. Now we have 3 shards every shard is consist of 3 replicas now we have to configure every shard.
I chose 27017,37017,47017 as primary replicas, where shard1 primary replica is 27017, for shard2 37017, and for shard3 is 47017.
login into mongodb with port 27017 and run these codes. i’ve commented out which code executes where.
After every primary shard setup you’ll have something like this.
And after a few seconds, the node will become primary and other nodes will become secondary. Just like the next image. Do this for all 3 ports mentioned and you’ll get results like mentioned in screenshots. using rs.status() you’ll can you’re connected to other nodes as well all the nodes send each other heartbeat so that each node knows which one is alive and which isn’t. And don’t forget to run rs.secondaryOk() in all other secondary nodes.
Now You finally Completed setting up the hard part all you have to do now is to setup a config server cluster for three shards and connect them to mongos router.
Config servers are the backbone of this clustered structure, they seem to do nothing but they actually control and know everything about available shards, that’s why I put the config instances in a different machine. In “Config-server” the machine I executed the below command to spawn new 2 new config servers 1 primary and the other is secondary.
Check using “ps -ef | grep mongod” command that 2 instances are running. Then run this below config in 57040 mongo port. Your config server setup will be done.
Now let move into the Mongos which is mongo data router. Lets execute this so you’ll have your mongo
And There you go! mongos is connected and configured.
After adding all the shards into the mongos router you’ll get something like this
To see full data of shards including their replica set you can check using “db.adminCommand({listShards:1})”
Congratulations! We finally setup a multi-server clustering with 2 replication servers on standby!