SQL Server Interview Questions on Replication

We have compiled most frequently asked SQL Server Interview Questions which will help you with different expertise levels.

SQL Server Interview Questions on Replication

SQL server replication interview questions

Question 1.
What’s the best way to update data between SQL Servers?
Answer:
By using Replication we can solve this problem. Many of the developers end up saying DTS, BCP, or distributed transaction management. But this is one of the most reliable ways to maintain consistency between databases.

Question 2.
What are the scenarios you will need multiple databases with schema?
Answer:
Following are the situations you can end up into multi-databases architecture:-

24×7 Hours uptime systems for online systems
This can be one of the major requirements for duplicating SQL Servers across networks. For instance, you have a system that is supposed to be 24 hours online. This system is hosted in a central database which is far away in terms of Geographic’s. As said first that this system should be 24 hours online, in case of any break over from the central server we hosted one more server which is inside the premises. So the application detects that it can not connect to the online server so it connects to the premises server and continues working. Later in the evening using replication all the data from the local SQL Server is sent to the central server.

License problems

SQL Server per user usage has a financial impact. So many of the companies decide to use MSDE which is free so that they do not have to pay for the client licenses. Later every evening or in some specific interval this all data is uploaded to the central server using replication.

Note MSDE supports replication.

Geographical Constraints

It is if the central server is far away and speed is one of the deciding criteria.

Reporting Server

In big multinational sub-companies are geographically far away and the management wants to host a central reporting server for the sales, which they want to use for decision making and marketing strategy. So here the transactional SQL Server’s entire database is scattered across the sub-companies and then weekly or monthly we can push all data to the central reporting server.

SQL Server Interview Questions on Replication chapter 10 img 1

You can see from the above figure how data is consolidated into a central server that is hosted in India using replication.

Question 3.
(DB)How will you plan your replication?
Answer:
Following are some important questions to be asked before going for replication.

Data planning

It’s not necessary that you will need to replicate the complete database. For example, you have a Sales database that has Customer, Sales, Event logs, and History tables. You have a requirement to host a centralized reporting server which will be used by top management to know “Sales by Customer”. To achieve this you do not need the whole database on reporting server, from the above you will only need “Sales” and “Customer” tables.

Frequency planning

As defined in the top example let’s say management wants only “Sales by Customer weekly”, so you do not need to update every day, rather you can plan weekly. But if the top management is looking for “Sales by Customer per day” then probably your frequency of updates would be every night.

The schema should not have a volatile “baseline”

Note I like the word “baseline” it really adds weight while speaking as a project manager. It’s mainly used to control change management in projects. You can say “Baseline” is a process by which you can define a logical commit to a document. For example, you are coding a project and you have planned different versions for the project. So after every version, you do a baseline and create a setup and deploy it to the client-side. Any changes after this will be a new version.

One of the primary requirements of replication is that the schemas which should be replicated across should be consistent. If you are keeping on changing the schema of the server then replication will have huge difficulty in synchronizing. So if you are going to have huge and continuous changes in the database schema rethink over replication option. Or else proper project management will help you solve this.

Question 4.
What arc publisher, distributor, and subscriber in “Replication”?
Answer:
The publisher is the one who owns the database and is the main source of data. Publisher identifies what data should be distributed across.
The distributor is a bridge between publisher and subscriber. The distributor gathers all the published data and holds it until it sends it across to ail subscriber. So as it’s a bridge that sits in between publisher and subscriber, it supports multiple publishers and subscriber concepts.
Subscriber is the end source or the final destination to which data has to be transmitted.

SQL Server Interview Questions on Replication chapter 10 img 2

Question 5.
What is “Push” and “Pull” subscription?
Answer:
Subscription can be configured in two ways:-

  • Push subscription

In push subscription, the publisher has full rights when updating data to subscribers. Subscriber completely plays a passive role in this scenario. This model is needed when we want full control of data in hands of the publisher.

  • Pull subscription

In pull subscription the subscriber requests for new or changed data. Subscriber decides when to update himself. This model is needed when we want the control to be in the hands of subscribers rather than publishers.

Question 6.
(DB)Can publication support push and pull at one time?
Answer:
A publication mechanism can have both. But a subscriber can have only one model for one publication. In short a subscriber can either be in push mode or pull mode for a publication, but not both.

Question 7.
What are different models/types of replication?
Answer:

  • Snapshot replication
  • Merge replication
  • Transactional replication
Note  Below I will go through each of them in a very detailed way.

Question 8.
What is Snapshot replication?
Answer:
A complete picture of data to be replicated is taken at the source. Depending on the schedule defined when the replication should happen, destination data is completely replaced by this. So over a period of time changes are accumulated at the publisher end and then depending on the schedule it’s sent to the destination.

Note In a snapshot you still also be sending data that has not changed.

Question 9.
What are the advantages and disadvantages of using Snapshot replication?
Answer:
Advantages:

  • Simple to set up. If the database is small or you only want to replicate master data (State code, Pin code, etc) it’s the best approach, as these values do not change heavily.
  • If you want to keep a tight control over when to schedule the data this is the best approach. For example, you will like to replicate when the network traffic is low (probably during Saturday and Sunday).

Disadvantages:

Note  This is probably the least used approach. So definitely the interviewer is expecting the 
         disadvantages points to be clearer, rather than advantages.
  • As data start growing the time taken to complete the replication will go on increasing.

Question 10.
What type of data will qualify for “Snapshot replication”?
Answer:

  • Read-only data are the best candidates for snapshot replication.
  • Master tables like zip code, pin code, etc are some valid data for snapshot replication.

Question 11.
What’s the actual location where the distributor runs?
Answer:
You can configure where the distributor will run from SQL Server. But normally if it’s a pull subscription it runs at the subscriber end and for push subscription, it runs on the publisher side.

Question 12.
Can you explain in detail how exactly “Snapshot Replication” works?
Answer:
Following are the basic steps for “Snapshot Replication” to work:-

Note There are two important components “Snapshot Agent “and “Distribution Agent” which we will define first. Snapshot agent creates an image of the complete published data and copies it to the distributor. The distribution Agent sends the copied image and replaces the data on the subscriber side.

  • Snapshot agent places a shared lock on the data to be published.
  • The whole snapshot is then copied to the distributor end. There are three files that are created one for database schema, BCP files, and the index data.
  • Finally, the snapshot agent releases a lock over the published data.
  • The distribution agent then replaces the subscriber data using files created by the snapshot agent.

SQL Server Interview Questions on Replication chapter 10 img 3

Question 13.
What is merge replication?
Answer:
If you are looking forward to managing changes on multiple servers which need to be consolidated merge replication is the best design.

Question 14.
How does merge replication works?
Answer:
The merge Agent component is one of the important components which makes merge replication possible. It consolidates all data from subscribers and applies them to publishers. Merge agent first copies all data from publishers to the subscribers and then replicates them vice-versa so that all stakeholders have consistent data.

SQL Server Interview Questions on Replication chapter 10 img 4

Merge agent stands in between subscriber and publisher. Any conflicts are resolved through a merge agent in turn which uses conflict resolution. Depending on how you have configured the conflict resolution the conflicts are resolved by the merge agent.

Question 15.
What are the advantages and disadvantages of Merge replication?
Answer:

Advantages:

  • This is the only way you can manage consolidating multiple server data.

Disadvantage:

  • It takes a lot of time to replicate and synchronize both ends.
  • There is low consistency as a lot of parties have to be synchronized.
  • There can be conflicts while merging replication if the same rows are affected in more than one subscriber and publisher. Definitely, there is conflict resolution in place but that adds complication.

Question 16.
What is conflict resolution in Merge replication?
Answer:
There can be practical situations where the same row is affected by one or many publishers and subscribers. During such critical times, the Merge agent will look at what conflict resolution is defined and make changed accordingly.

SQL Server uniquely identifies a column using a globally unique identifier for each row in a published table. If the table already has a unique identifier column, SQL Server will automatically use that column. Else it will add a rowguid column to the table and create an index based on the column.

Triggers will be created on the published tables at both the Publisher and the Subscribers. These are us^d to track data changes based on a row or column changes.

Question 17.
What is a transactional replication?
Answer:
Transactional replication as compared to snapshot replication does not replicate full data, but only replicates when anything changes or something new is added to the database. So whenever on the publisher side we have INSERT, UPDATE and DELETE operations, these changes are tracked and only these changes are sent to the subscriber end. Transactional Replication is one of the most preferred replication methodologies as they send the least amount of data across the network.
Can you explain in detail how transactional replication works?

  • Any change made to the publisher’s database is logged in to a log file.
  • Later log reader agent reads the changes and sends it to the distribution agent.
  • The distribution agent sends the data across to the subscribers.

SQL Server Interview Questions on Replication chapter 10 img 5

Question 18.
What are data type concerns during replications?
Answer:

  • If it’s a transactional replication you have to include a “Timestamp” column.
  • If it merges replication you will need a “unique identifier” column. If you do not have one replication creates one.

Note As this is an interview question book we will try to limit it only to a theoretical basis. The best way is to practically do one sample of replication with a sample project. But just for your knowledge, I will show some important screens of replication vizard.

  • In the “SQL Server Management Studio,” you can see the publication folder. When you right-click on it you can see the “New Publication” menu.

SQL Server Interview Questions on Replication chapter 10 img 6

This wizard will be used to specify the “Distribution”.

SQL Server Interview Questions on Replication chapter 10 img 8