Showing posts with label POSTGRESQL. Show all posts
Showing posts with label POSTGRESQL. Show all posts

Jul 3, 2023

PostgreSQL Architecture and Back-end Process

Hey guys, in this blog I am going to explain to you about PostgreSQL Architecture and Back-end Process.

PostgreSQL-

PostgreSQL is an open-source general-purpose object-relational database management system that is one of the most advanced. Because it is open-source software, its source code is available under the PostgreSQL license, which is a broad open-source license. Anyone with the necessary knowledge can use, edit, and distribute PostgreSQL in any form. This DBMS requires extremely little work to maintain because of its great stability.

Why should you want to use PostgreSQL-

  • Customized types
  • Inheritance in tables
  • Complex locking mechanism
  • Integrity of foreign key references
  • Rules, views, and subquery
  • Save points transactions that are nested.
  • MVCC or multi-version concurrency control
  • In-between-time replication
  • Native Version of Microsoft Windows Server
  • Tablespaces
  • Moment-by-moment recovery.  

Why is PostgreSQL special in its own right?

  • PostgreSQL introduced the multi-version concurrency control (MVCC) capability first.
  • PostgreSQL can be expanded using custom functions created in languages like C/C++, Python, Java, etc.
  • One can define their own data types, index types, functional languages, etc. because PostgreSQL is expandable by design.
  • One can always create a unique plugin to customize the system to match their own needs if they choose to delete any aspect of it. 

Large Scale users of PostgreSQL-

PostgreSQL has been used by a lot of businesses to create goods and solutions. Apple, Fujitsu, Red Hat, Cisco, Juniper Network, and a few other companies are among them.

PostgreSQL's architecture is client-server. A PostgreSQL service consists of two processes-

  • Server-side process
  • Client-side process (Front-end applications)

Server-side process-

The "Postgres" programmers manage connections, operations, and static and dynamic assets.

Client-side process (Front-end applications)-

Users utilize these programmers to communicate with the database. It often has a straightforward user interface (UI) and uses APIs to facilitate communication between the user and the database.

Client-side Process –

When a user runs PostgreSQL queries, the client application can communicate with the Postmaster Daemon Process (the PostgreSQL server) and submit queries using one of the many Databases Client Application Programmed Interfaces supported by PostgreSQL, such as JDBC, Perl DBD, ODBC, etc., which helps to provide client-side libraries. As seen in the following diagram, the Client Process uses Library API to facilitate communication between the Client Application and Client Application library.

PostgreSQL System Architecture


Postmaster Daemon Process –

PostgreSQL's system design is built on the Process-Per-Transaction Model (Client/Server Model). Postmaster, a central coordinating procedure, manages a functioning PostgreSQL site. It is sometimes referred to as the Server Process.

The postmaster daemon process is in charge of-

  • Setting up the server
  • Turning off the server
  • Handling new client connection requests.
  • Carry out Recovery.
  • Execute Background Processes.

Shared Memory-

Shared memory is memory that is accessible by numerous programmers at the same time in order to give fast and efficient results with minimal redundancy. This RAM is set aside for database caching and transactional log caching. In PostgreSQL, shared disc buffers and shared tables are used, and their operation is described below.

Shared Disc Buffer-

The shared disc buffer's aim is to reduce disc input/output. If it is not used, disc input/output takes longer, resulting in redundancy and an inefficient system. The following are the benefits of employing a shared buffer:

  • Time should be reduced.
  • Can easily access a big amount of data.
  • Reduce heating when numerous users access the system at the same time.

Shared Tables-

This method involves using the same set of tables to hold data from numerous clients. The following are the primary benefits of employing this approach:

  • The Most Affordable Hardware
  • The Most Affordable Backup
  • It enables working with massive amounts of data in a single database.

UNIX System-

The Kernel Disc Buffer in UNIX System maintains a Memory buffer and provides physical storage to data in disc storage. Also, PostgreSQL instructions are evaluated to ensure that the syntax is valid and produce an error message indicating what is lacking in the command, etc.

Back-end process-

The Postmaster is in charge of initial client connections. As a known port, it constantly waits for new connections. Following an initialization process, such as user authentication, the postmaster will launch a new backend server process to service the new client. The client solely interacts with the Backend server process, such as query submission and query results. This demonstrates that PostgreSQL employs the Process-per-Transaction model.

The Backend Server is in charge of executing client-submitted queries by conducting particular processes. Each backend server will only handle one query at a time. Many clients are connected to the system at the same time, resulting in many backend servers running queries concurrently. The back-end server accesses data from the shared memory main-memory buffer pool.

The obtained result is then supplied to the Client Process by the Back-end Process.

 WAL(Write Ahead Log)Writer                   

This procedure WAL data is written to and flushed from the WAL buffer.                                                                                                                                              

logging collector

This procedure is also known as logger. An error notice will be written to the log file.

Auto vacuum launcher

When auto vacuum is enabled, this process is in charge of the auto vacuum daemon performing vacuum operations on bloated tables. For perfect table analysis, this process relies on the stats collector process.

Archiver

When Achiever is enabled, the process is in charge of copying the WAL log file to the specified directory.

stats collector

This statistics collects data such as pg_stat_activity and pg_stat_all_tables.

check pointer 

The dirty buffer is written to the file when a checkpoint occurs.

writer

It will write the dirty buffer to the file on a regular basis.



Shared Pool-

A shared pool is a RAM region within the RAM Heap that is established at startup. SGA (System Global Area) includes a common pool. If the Shared Pool is not present in RAM or is not used, high library cache reloads and high row cache reloads occur.

Why did PostgreSQL not employ Shared Pool?

PostgreSQL does not have a shared pool, although in most database systems, such as Oracle, the shared pool is a key component of the structure. It does not have it since, unlike Shared Pool, PostgreSQL will give a mechanism to share SQL information at the process level. Simply put, if a user executes the same SQL query multiple times in one process, it will hard-parse only once, which is advantageous over other Database systems because, in another database system that uses a shared pool, hard-parsing occurs for a single SQL statement that is loaded from the Shared pool. More loads will be generated if the user executes a single SQL query numerous times at the same time.

OID in PostgreSQL -

In PostgreSQL, OID stands for Object Identifier types. PostgreSQL uses OID as a Primary Key in several system tables. It's implemented as an unsigned four-byte integer. We can also utilize OID in the user-defined table as "WITH OIDS ", but this is discouraged because it is not large enough to give uniqueness in a large user-defined table. OIDs are typically appropriate for system tables. It essentially provides a constructed id for each row contained in the system column.

The feature of OID for User tables has been removed indirectly in PostgreSQL 12, which means that we can utilize OID explicitly.

 

PostgreSQL Advantages-

  • PostgreSQL is a highly risk-tolerant database with low maintenance costs.
  • To run dynamic websites and web applications, it employs the LAMP (Linux, Apache, MySQL, PHP) stack.
  • The source code for PostgreSQL is publicly available under an open-source license. This gives you the ability to use, adapt, and implement it as needed for your business.
  • Postgres requires little training to learn because it is simple to use.
  • Geographic objects are supported by PostgreSQL, allowing it to be used for location-based services and geographic information systems.

PostgreSQL Disadvantages-

  • It is a little slower than a commercial database.
  • In comparison to MYSQL, it does not support many open-source apps.
  • Many open-source applications support MySQL, but not PostgreSQL.
  • Changes for performance improvement necessitate more work than in MySQL since PostgreSQL focuses on compatibility.