SQL Server follows client server architecture. When ever user performs any action on client machine, it converts in the form of query. This query moves from client to server in the form of network packets using protocols for connection and communication between source and destination servers.
SQL server is mainly divided into two Engines,
- Relational Engine: (Query Processor) prepares execution plan and hand over to storage Engine.
- Storage Engine: It is a central repository, responsible in execution of query using execution plan, response sent to user.
Buffer pool is another important component contains plan cache and data cache which is used for query execution.
SQL OS is a core to SQL Server architecture, used for scheduling, I/O completion, Memory Management and resource management. It is a thin layer between windows OS and SQL server.
Components of SQL Server
Here you can find the different SQL server components,
- SQL Server Network Interface (SNI)
- Command Parser
- Query Executor
- Access Methods
- Buffer Manager
- Plan Cache
- Data Cache
SQL Server Network Interface (SNI)
SNI is a Protocol layer that establishes the network connection between the client and the server. It uses TCP/IP protocol to send queries in the form of TOS packets.
Command Parser first checks for syntax errors, then it generates query plan (or) find an existing plan query plan contains detail steps how query is going to execute. Command parser checks whether a plan already exists in plan cache of buffer pool. If finds plan passes to query executor for execution. If it does not find then query passes to optimizer.
Optimizer prepares query plans for one query in that SQL server select best plan based on response time, the query plan passes to query executor for execution.
Query executor requires data to read the query plan it passes to access methods of storage engine.
Access methods requires data to complete the query it asks buffer managers to provide data page. Once it receives required data, the query results passes back to relational engine and there to user.
Buffer manager checks in data cache of buffer pool to see if it has page already in cache memory. If page exists, it passes results to Access methods. If not exists it pulls required pages from mdf datafile put in data cache and passes it back to Access methods.
Part of sql servers buffer pool used to store previously executed execution plans in case they are needed later.
Datacache is largest part of buffer pool. Every data page that is read from disk is written a copy here before using.Under memory pressure these pages are flushed from cache using LRU (Least recently used) policy.
Steps in executing a query:
- Server Network Interface (SNI) of user establish connection between client and server using TCP/IP protocol, send query in TDS packets.
- Query at command parser checks syntax errors then checks plan in plan cache of buffer pool. If plan not exists, passes the query to optimizer.
- Optimizer generates best plan and pass to query executor, it reads the plan and passes to access method of storage engine through OLEDB.
- Access method requests buffer manager to provide the data.
- Buffer manager checks in data cache of buffer pool for existing page. If page not exists it pulls the required pages from data (mdf) file, put in data cache and pass to access method.
- Finally Access method passes the results back to relational engine, from there it sent back to user who executed the query.
Protocols available in SQL Server
SQL Server Network Interface (SNI) is a protocol layer that establishes the network connection between the client and sever. Sql server supports 4 protocols.
- Shared memory
- Named pipes
Shared Memory: It is default protocol used to connect client and sql server on the same machine
Named Pipes: Client and server will connect with in a LAN. It has certain limitation.
TCP/IP: TCP/IP is most used protocol for SQL Server client establishes connection with sql server using an IP Address and a port number 1433 we can access the databases using internet hence there are no boundaries for this protocol.
VIA (Virtual Interface Adapter): VIA is a wireless internet protocol for connecting client and server with in a certain range.
To establish a secure sql connection we need a port number along with protocol. Default port number for TCP/IP protocol is 1433 we can change the port number from configuration manager — sql server network configure — protocols we can change.
SQL SERVER STORAGE ARCHITECTURE
In sql server all the data will be stored in the form of records, these records also called as row data. All these records further grouped into a page.
Page is a default storage unit of sql server. The size of page is 8kb.
Page consists of 3 sections,
- Page Header
- Actual Data
- Row offset array
Page Header – It consists of Page ID, Page Type, Object ID Header version.
Page ID – To identify particular page using unique page ID.
Page Type – What type of page it is either data page or Index page.
In Row offset location of record will be stored (2 bytes).
Types of Pages:
- Data Page – stores data entered by user.
- Index Page – Indexes are pointer which store address of original pages for quickly locating data
- Free space page – It stores page allocation information and unused space available on pages.
- Text/Image – It stores large object data (LOB) like Text, Image and XML Data.
- GAM (Global Allocation Map) or SGAM (Shared Global Allocation Map) – It stores extent allocation information.
- BCM (Bulk Changed Map) – Stores extents information in a Bulk Operation
- DCM (Differential Change Map) – It stores modified extents information after Full BackUp.
- I AM (Index Allocation Map) – Stores extents information that are used by a table (or) Index.
- These are important types of pages. All these pages are further grouped into a Extent.
SQL Server Extents:
Extent is a storage structure consists of 8 consecutive SQL Server pages. Pages in a Extent can be one table (or) upto Eight tables.
There are 2 types of Extents
- Uniform Extent: If all pages are going to store same table data
- Mixed Extent: If the pages shared by 2 (or) more tables.
When a table is created and a row is inserted table gets 1 page in mixed extent, when a table grows then these tables moved to uniform extent. This is to manage space efficiently.
SQL Server File
All the extents further group into a File. A file we will have better control in SQL Server.
There are 2 types of files mainly,
- MDF (Master Data File)
- LDF (Log Data File)
MDF – Stores Permanent Data
LDF – Stores changes information will be recorded later this changes apply on MDF Data.
SQL Server Database
Files combine to form database. We require minimum 2 files 1 MF and 1 LDF to create a database. Maximum we can ‘n’ number of files means No limit.
Some files stored system data and some store user database data. Logically dividing databases into groups called File Groups.
SQL Server data mainly in 2 types of files,
- Data File (MDF)
- Log File (LDF)
Data file stores actual data with .mdf extension. It stores permanent data.
Log files stores modified recorded information with .ldf extension.
We have another file called secondary data file .ndf file extension. A database may or may not have these secondary data files.
SQL Server Transactions
A Transaction is a set of T-sql statements that read and wt=rite data into the database. There are 2 types of Transactions.
- Implicit Transactions
- Explicit Transactions
Implicit Transaction: Implicit transactions are these without begin transaction
Explicit Transaction: Explicit transactions are started using begin transaction and are controlled by using T-SQL Command commit transaction (or) Rollback transaction. Any transaction should process ACID properties then only changes more from LDF to MDF.
ACID Properties in SQL Server
Atomicity means all the statements of a transaction must complete successfully or rolled back completely means either all (or) None to updated.
Consistency means a transaction never leaves database in half-finished state, whenever any change happen on parent object it should automatically reflect on dependent child object to ensure that database in a consistent state.
Isolation keeps changes of incomplete transactions independent from one another.
Once a transaction is committed, it must be permanent even if there is a system failure means it cannot be rolled back.
SQL Server Database Architecture
SQL Server stores data mainly in two types of files.
- Data File (MDF)
- Log File (LDF)
MDF – It contains Permanent Data
LDF – LDF contains what ever changes we are performing on database all the change related information will be recorded in LDF file.
SQL Server Buffer
Buffer is a ram to perform modifications on a copy of permanent page. Once it commits record the information will record in LDF and same changes apply on MDF when checkpoint runs.
How Buffer Works:
SQL Server will not allow to do modifications directly on MDF. SQL Server will make a copy of pages from MDF to buffer. Once transaction is full committed it records the information that what type of data he is inserting, Number of pages affecting, what he is performing all these change related information will record in same sequential way in Log File. Pages will stay some time in buffer for faster retrieval read and write operations from buffer will be very faster comparing to operations from MDF Data. Using recorded information whenever check point runs on log file. It applies same changes permanently on MDF file.
SQL Server Checkpoint Process
Check point is internal mechanism performs regular based on number of transaction (or) number of pages there is no time interval for running this. Checkpoint scans log file, checks how many committed transaction are there, how many failed and how many still running committed transactions more to MDF, failed transactions will be rolled back. Currently running transactions will not to be touched by checkpoint.
Advantages of checkpoint in SQL Server,
- Checkpoint help in speeding recovery process
- Checkpoint helping in committing data permanently
SQL Server Recovery Process
Whenever sql server restarts checkpoint verifies pending transactions before restart, sql server will perform recovery process. This process will analyze what is the state of log file and perform 2 properties.
- Redo (or) Roll forward – committed changes will be moved from LDF to MDF permanently.
- Undo (or) Roll Back – failed transactions and running transactions will be deleted from log file.
Once this recovery process complete then only users can able to access the database.
Lazy Writer in SQL Server
Modified pages will be in buffersome time, whenever buffer is about to fill with these modified pages, Lazy writer is another internal mechanism usually in sleep mode invokes and clear the buffer pages.
It uses LRU algorithm in clearing, LRU stands for L. Recently used pages, on page header of page there will e reference counter means how many times this page being used, based on counter least used pages will be deleted in buffer.
Pages commit in log file and waiting for check point to more mdf, those called dirty pages.
SQL Server Transaction Log Architecture
SQL Server uses LSN (Log Sequential Number) in identifying the transaction. Each and every transactions that comes to log file will associate with a LSN number. Roll forward and roll back will be done internally using these LSN numbers only.
WAL (Write Ahead Logging):
Before commit in MDF every transaction should written a entry in log file is called WAL. Transactions never comes to mdf directly.
Log file divided into 2 parts.
- Active portion (or) Physical Log
- Inactive Portion (or) Virtual Log
Active Log Portion: when ever performs transactions it will have 3 states.
- Committed in log file and waiting for check point.
- Failed in the middle
- Transactions still running
All these 3 states transactions will be in Active Portion of Log file. When checkpoint runs committed transactions make a copy in inactive portion and moves to mdf.
SQL Server maintains fully committed transaction in these Inactive portion. This portion only used for taking the backup of log. Whenever we take log backup it copies the inactive portion and truncates the inactive portions.
We have 2 types of backup for log portions.
Full backup takes backup of MDF and Active log portion log backup takes backup of inactive log portion. This portion we call as virtual log. SQL Server not uses these records that’s why it call as Inactive virtual logs.
SQL Server Log Architecture
Inactive portion further divided into more virtual logs we have a property called log reusability. Log backup copy inactive portion to a file and truncates the log data. Same space can be used multiple times called log reusability concept.
Transaction log is a cyclic process of writing log record into virtual log file by sql server. When ever one virtual log is filled up it will goes to next virtual log. If all virtual logs files are filled up the inactive portion will grow further and creates more virtual logs, till we have log space allocated. If it cannot grow further it will throw an error “ Transaction log for database is full and transaction will fail”.
The only way to clear inactive virtual log is to take log backup released logs. After truncation this space will be released. Backup will not active portion.
Advantages of T-Log:
- IT provides Transactional consistency.
- It provides transactional recoverability
- It provides log reusability.
Enhancement in SQL Server 2012 (Indirect checkpoint):
Checkpoint occurs automatically based on work hard (or) by a certain operations internally. We can set sql server level recovery interval to run checkpoint by using SP_configure (or) manually issue a checkpoint using checkpoint T-SQL command.
In SQL Server 2012, we have enhancement on checkpoint at database level TARGET T- RECOVERY-TIME option. Increasing time of recovery to seconds (or) minutes.
Alter database sql test 2012 set Target-recovery-time = 5 seconds.
Here every 5 seconds checkpoint will occur.