< Back To All Road Maps

Database Administrator Roadmap

A complete guide to becoming a PostgreSQL Database Administrator in 2022

Database Administrator Roadmap

A complete guide to becoming a PostgreSQL Database Administrator in 2022

Learn basics of RDBMS

Learn basics of RDBMS

To become a database administrator, start your learning journey with relational database management systems (RDBMS) and relational databases. The relational database model is one of the most widely used across organizations to meet a wide range of information needs. 

  • What is SQL
  • What is SQL

    SQL is an acronym for Structure Query Language. It is a standard language for manipulating and retrieving data stored in relational databases. In short, SQL lets us communicate with relational databases. With SQL, we can create databases, views, stored procedures, and tables and insert, update, and delete records from a database. 
     

  • What is RDBMS
  • What is RDBMS

    RDBMS stands for a relational database management system, which is a software program for storing, manipulating, querying, and retrieving data from relational databases. Relational databases store only structured data in the form of tables, i.e., rows and columns. They leverage SQL to query data. 
     

  • Object models in RDBMS
  • Object models in RDBMS

    Object Models database is a combination of Relational Database and Object Oriented Data Models. These types of databases support the class, objects, inheritance, etc. properties of Object Oriented Modals and Tabular Like structures of Relational Database. 

    This type of Database is used with the frameworks of Object-oriented Programming languages. So the developer can write the code in object-oriented programming languages and it can build a database in relational databases. 

     

  • Relational model
  • Relational model

    Proposed by E.F. Codd in 1969, a relational model is an approach to organizing and managing data in the form of relations called tables. It represents a database as a set of tables or relations, where each relation stores data in tuples (rows) and attributes (columns). 
     

  • Database Concepts
  • Database Concepts

    It is obligatory for every database administrator to have a strong understanding of all the database concepts, from the basics to the advanced. Some important database concepts include a database schema, constraint, metadata, instance, query, data engine, and data manipulation. 
     

Pick a SQL

Pick a SQL

Once you master database concepts and learn about RDBMS, now it is time to choose among the available relational database management systems. After you decide on the RDBMS, download, install, and configure it so that you can use it and practice database concepts. 

  • PostgreSQL
  • PostgreSQL

    PostgreSQL, also called Postgres, is a free and open-source relational database management system. It is an object-relational database that has been popular since its inception for its reliability, performance, and reliability. It is available for Windows, macOS servers, FreeBSD, OpenBSD, and Linux. 
     

  • MySQL
  • MySQL

    MySQL is an open-source and free relational database management system. It is also available under a wide range of proprietary licenses. It is compatible with Windows, macOS, FreeBSD, Linux, and Solaris. In addition, it is one of the components of the LAMP (Linux, Apache, MySQL, PHP/Perl/Python) stack. 
     

  • Oracle
  • Oracle

    Oracle or OracleDB is a relational database management system developed and marketed by the Oracle Corporation. It was the first RDBMS designed for data warehousing and enterprise grid computing. It comes in five editions: Enterprise, Standard, Express, Oracle Lite, and Personal. 
     

  • Microsoft SQL Server
  • Microsoft SQL Server

    Microsoft SQL Server, a Microsoft product, is a relational database management system. Along with Oracle and IBM’s DB2, SQL Server is one of the three popular and leading database technologies. It is compatible with Linux, Microsoft Windows, and Windows Server. 
     

  • MariaDB
  • MariaDB

    MariaDB is a free and open-source relational database management system and is the fork of MySQL. It is available for Windows, Linux, and macOS platforms. In spite of the fact that it is a fork of MySQL, it incorporates features that differ from MySQL. 
     

Install and Run a SQL (PostgreSQL)

Install and Run a SQL (PostgreSQL)

Among the available RDBMS, in this roadmap, we shall help you with the PostgreSQL RDBMS. So, the first step would be installing PostgreSQL on your system. In the following section, we shall discuss how to install PostgreSQL on different platforms. 
 

  • Install PostgreSQL on Windows
  • Install PostgreSQL on Windows

    The installation process of PostgreSQL has been easier than before since the release of PostgreSQL 8.0. This version comes with the installer for Windows to make the installation process quick. To install PostgreSQL on Windows, you need to follow the three simple steps: 

    1. Download the PostgreSQL installer for Windows. 
    2. Install PostgreSQL.
    3. Verify the installation. 
       
  • Install PostgreSQL on Linux
  • Install PostgreSQL on Linux

    To install PostgreSQL on Linux, initially, it is essential for you to choose your Linux distribution and then download the corresponding file. However, on most Linux platforms, PostgreSQL comes integrated with package management.  So, if it is available in package management, install it from there itself. 
     

  • Install PostgreSQL using Docker
  • Install PostgreSQL using Docker

    Docker is a containerization platform for developing, running, and deploying software applications. To use PostgreSQL with Docker, you need to set up a Docker PostgreSQL environment.

    You can do this in three simple steps: 

    1. Download and install Docker
    2. Download Docker PostgreSQL Image
    3. Install PGAdmin on Docker
       
  • Learn how to configure PostgreSQL on Servers
  • Learn how to configure PostgreSQL on Servers

    There are various server providers on the internet and they all have some steps to configure the PostgreSQL database on their servers. As a Database Administrator, you are also supposed to know how to set up a PostgreSQL database on different servers. 

    1. How to Create and Connect PostgreSQL on AWS
    2. Setup PostgreSQL on Azure
    3. Setup PostgreSQL on Digital Ocean
    4. Setup PostgreSQL on Google Clouds

     

Learn Basic SQL concepts

Learn Basic SQL concepts

As PostgreSQL is a relational database management system, learning SQL is a must to query and manipulate data stored in PostgreSQL databases. The following are some essential SQL concepts that you must master in order to use the PostgreSQL database. 
 

  • Basic of SQL data types
  • Basic of SQL data types

    A data type in SQL describes the type of value that you can store in a table column. SQL data types are broadly categorized into the following categories: 

    • Numeric
    • Date and time
    • Character and string
    • Unicode character string
    • Binary 
    • Miscellaneous
       
  • Data Manipulation Language (DML) Queries
  • Data Manipulation Language (DML) Queries

    The data manipulation language (DML) queries are SQL commands for manipulating or modifying data stored in a database. They control access to data and databases. The following is the list of data manipulation language (DML) commands: 

    • INSERT: Add new data to a database. 
    • UPDATE: Modify existing data in a database. 
    • DELETE: Delete specific data from a database. 
       
  • Data Query Language (DDL) Queries
  • Data Query Language (DDL) Queries

     
    Data Definition Language is the SQL commands that define the database schema. 

    Here is a List of DDL commands: 

    • CREATE: To create the database
    • DROP: To delete the objects from the database.
    • ALTER: To alter the structure of the database.
    • TRUNCATE: To remove the records from the Database table. 
    • RENAME: To Rename the existing table.
  • Data Query Language (DQL) Queries
  • Data Query Language (DQL) Queries

    There is a single data query language (DQL) command in SQL used for fetching the required data from a database. This command is the SELECT statement. You can leverage the WHERE clause with SELECT to specify conditions for data retrieval.

  • Data Control Language (DCL) Queries
  • Data Control Language (DCL) Queries

    The data control language (DCL) queries are useful in providing the rights and permissions to users to access databases. There are two DCL commands, namely GRANT and REVOKE. 

    • GRANT: Provides users with the access privilege to databases. 
    • REVOKE: Takes back the database access privilege from users.  

Learn how to Configure Postgres with other services

Learn how to Configure Postgres with other services

If you want to use PostgreSQL for your application, it becomes necessary to configure it first with the programming language and other related services you leverage. So, you need to learn how to configure Postgres with programming languages, frameworks, and other development environments. 
 

  • Configure Postgres with Java
  • Configure Postgres with Java

    To use PostgreSQL with Java programs, it is essential to have PostgreSQL JDBC and Java installed and set up on your system. The PostgreSQL JDBC driver will help you use PostgreSQL in your Java programs. 
     

  • Configure Postgres with Python
  • Configure Postgres with Python

    To connect to the PostgreSQL database using Python, you will need to use Psycopg2. Psycopg2 is a PostgreSQL connector library in Python. This library implements the Python DB API 2.0 specification and wraps the libpq PostgreSQL library using the C programming language. 
     

  • Configure Postgres with PHP
  • Configure Postgres with PHP

    PHP is a scripting language well-known for its faster connectivity with databases. The two popular methods to connect to the PostgreSQL database through PHP are as follows: 
    PostgreSQL native functions
    PDO (PHP Data Objects)
     

  • Configure Postgres with Ruby on Rails
  • Configure Postgres with Ruby on Rails

    It is pretty simple to use Ruby on Rails with the PostgreSQL database. You need Ruby on Rails and PostgreSQL installed on your system, and by running a few commands, you will be able to use the database with Ruby on Rails applications. 

  • Configure Postgres with Node
  • Configure Postgres with Node

    The ecosystem of Node.js offers an array of tools to connect with databases. To connect with PostgreSQL, it comes with node-postgres containing modules. This tool allows the Node.js interface to communicate with the PostgreSQL database. 
     

Learn PostgreSQL security concepts

Learn PostgreSQL security concepts

Data is the lifeblood of any organization, and protecting it from being accessed by unauthorized users has been a major concern. It will be your responsibility as a database administrator to keep data and databases safe from any dangers. Therefore, it is imperative that you understand the fundamentals of database security. 
 

  • Authentication, permissions, and SSL settings
  • Authentication, permissions, and SSL settings

    Authentication implies an individual trying to log in to a database is authorized or has the right to access the database. 
    PostgreSQL makes it easy to manage permissions with the help of roles. Different users can be assigned different roles, and each role has different privileges. For example, a superuser is a role in Postgres with all privileges. 
    The PostgreSQL database supports SSL (Secure Socket Layer) connection that allows secure connection with databases. To access databases securely, learn how to enable SSL in PostgreSQL. 
     

  • Object previleges
  • Object previleges

    In PostgreSQL, whenever an object is created, an owner is assigned to it. For most objects, only their owners can access and modify them initially. However, PostgreSQL allows other users or roles to use or access those objects through privileges or permissions. 
     

  • SQL injection
  • SQL injection

    SQL injection is one of the most common web hacking approaches that can destroy all data stored in databases. It involves injecting malicious SQL statements through web page input that get executed and destroys databases. 
     

  • Row-Level Security
  • Row-Level Security

    The open-source PostgreSQL database provides a security feature called row-level security. As a database administrator, you will be tasked with leveraging this feature to define security policies on tables so that they can control the viewing and modification of data on a per-user basis. 
     

Learn Infrastructure Database Administrator Skills

Learn Infrastructure Database Administrator Skills

Data assets, the bodies that maintain those data assets, and guides that describe the ways to use the collected data all collectively are called database infrastructure. In short, we can say that it is a collection of processes, technology, and organization. 
 

  • Data Replication
  • Data Replication

    Data replication is the mechanism of creating multiple copies of data and storing them in different locations. It ensures the high availability of data across a network and no data loss. Even if the data at one location gets lost, its copies are available at different locations. 
     

  • Backup and Data Recovery
  • Backup and Data Recovery

    A backup is generally a copy of data, which safeguards organizations against unexpected data losses. Recovery entails operations for restoring, rolling forward, and rolling back a backup. In short, both backup and recovery involve several operations to protect data stored in databases. 
     

  • Upgrading precedures
  • Upgrading precedures

    Procedures, often referred to as stored procedures, in DBMS are sub-routines containing a set of SQL statements to be executed to accomplish a specific task. 
     

  • Connection Pooling
  • Connection Pooling

    Connection pooling in a database is an approach to keeping databases open with the intent of reusing them by others. Generally, opening a database connection is an expensive operation. Connection pooling helps in reducing the cost required to create a new connection with a database. 
     

  • High Data cluster management
  • High Data cluster management

    A data cluster is a small group of data that shares common characteristics and significantly differs from other clusters in a database. Cluster management entails a variety of operations, such as node management, configuration management, upgrades, backup management, schema and user management, and private keys and certificates management. 
     

  • Resources usages and planning
  • Resources usages and planning

    As a database administrator, you must be able to outline a plan for resource usage and management effectively. According to DBA’s resource planning, resources are allocated to each user of a database. 
     

Learn Automation

Learn Automation

Database automation refers to the employment of processes and self-updating procedures to accomplish various administrative tasks in a database. It significantly reduces errors on deployments and boosts the reliability and speed of implementing changes. 
 

  • Automate the SQL shells script using Bash
  • Automate the SQL shells script using Bash

    Bash is a shell or command line interpreter for GNU operating systems. It is an acronym for Bourne-Again SHell. More often, as a database administrator, you will be tasked with checking a few details of a database timely, updating or inserting data, or executing a SQL file.  In such cases, you can write a script and automate the execution of SQL files using Bash. 
     

  • Automate the SQL shells script using Python
  • Automate the SQL shells script using Python

    Python is one of the most popular script languages. And with its robust libraries and frameworks, we can easily connect with any relational database. 

    1. Python PostgreSQL tutorial
    2. How to use MySQL with Python

Learn DBA Application Skills and Advance Topics

Learn DBA Application Skills and Advance Topics

Once you master basic database and SQL concepts, you can now move on to learning and mastering the following advanced concepts. 
 

  • Data Migration
  • Data Migration

    Data migration is the method of transferring data from one location to another, from one application to another, or from one format to another. In short, it is a method for introducing a new location or system for data. 
     

  • Data import, export, loading and processing
  • Data import, export, loading and processing

    Understand how to import, export, load, and process data with PostgreSQL. 
    Import: Copying data into a database from external files created by exporting from another database. 
    Export: Copying data from a database to external files that need to import to another database. 
    Load: Copying data into a database from external files that are in any formats supported by a database. 
     

  • Data partitioning and sharding patterns
  • Data partitioning and sharding patterns

    Data partitioning, also sometimes known as data sharding, is the process of distributing large amounts of data across multiple tables, each table holding a small chunk of data called a shard, with the aim of improving query processing performance. 
     

  • Processes and memory architecture
  • Processes and memory architecture

    There are four types of processes in the PostgreSQL database: the Postmaster (Daemon) Process, Background Process, Backend Process, and Client Process. 
    The PostgreSQL database consists of a memory classified into two different areas, namely the local memory area and the shared memory area. Every backend process allocates itself to the local memory area, while all the processes use the shared memory area. 
     

  • Vacuum processing
  • Vacuum processing

    In PostgreSQL, databases require regular maintenance called vacuuming. The vacuum command processes each table in a PostgreSQL database for various reasons, such as: 

    1. Recover or reuse the disk space occupied by deleted or updated rows. 
    2. Upgrade data statistics the PostgreSQL query planner leverages. 
    3. Update the visibility map. 
    4. Protect the old data from getting lost. 
       
  • Physical storage and file layout
  • Physical storage and file layout

    In databases, physical storage contains all the files containing data and is independent of logical storage, such as tables, views, and indexes. On the flip side, the file layout in databases is the storage format at the level of files and directories. 
     

  • Storage parameters
  • Storage parameters

    For database tables, you can set storage parameters that affect the way the tables are managed in databases. For instance, a storage parameter defines the space and size in which the table should be created. 
     

  • Workload-dependant tuning: OLTP, OLAP, HTAP
  • Workload-dependant tuning: OLTP, OLAP, HTAP

    OLTP is an acronym for Online Transaction Processing. It is a kind of data processing that involves executing transactions that occur concurrently. 
    OLAP is the short form for Online Analytical Processing. It is an effective approach to answering multi-dimensional analytical queries in computing. 
    HTAP stands for Hybrid Transactional/Analytical Processing. It is the ability of a database to perform OLTP and OLAP for real-time operational intelligence processing. 
     

  • Recursive CTE
  • Recursive CTE

    CTE stands for Common Table Expression. It is simply known as the ‘with’ clause. It returns a temporary data set that you can use in another SQL query. A recursive CTE references itself, i.e., it returns the subset of the result and continuously references itself until it returns the entire result. 
     

Learn SQL Troubleshooting and Optimization techniques

Learn SQL Troubleshooting and Optimization techniques

The next step is to learn various SQL troubleshooting and optimization techniques to improve the query performance significantly. 
 

  • Experience with operating system tools
  • Experience with operating system tools

    As a DBA, you should be familiar with operating systems tools. This is because a database server depends on the operating system of a host computer to provide users with access to resources, such as  CPU, memory, files, and unbuffered I/O disks. Also, it is important to note that every operating system comes with its own utilities to address the usage of system resources. 
     

  • Work with pg_stat_activity and pg_stat_statements
  • Work with pg_stat_activity and pg_stat_statements

    The pg_stat_statements table contains information about queries that were executed in the past. On the other hand, the pg_stat_activity table entails information about the queries that are currently running. 
     

  • Query Analyzing
  • Query Analyzing

    In the database, query analysis is the process of leveraging SQL statements to determine the ways to optimize queries for better performance. It assists in boosting the overall performance of query processing that speeds up various database functions. 
     

  • Log Analyzing
  • Log Analyzing

    Log analysis is a procedure of analyzing data from log entries and translating it into data that is knowledgeable and usable. This knowledgeable data can be utilized for making business decisions. 
     

  • Troubleshooting with USE, RED, Golden signals
  • Troubleshooting with USE, RED, Golden signals

    USE stands for utilization, saturation, and errors, whereas RED is an acronym for rate, errors, and duration. There are four golden signals, namely latency, traffic, errors, and saturation. All these aspects are ideal for monitoring and troubleshooting database solutions. 
     

  • Optimize SQL database with B-tree, Hash, GiST, SP-GiST, GIN, BRIN
  • Optimize SQL database with B-tree, Hash, GiST, SP-GiST, GIN, BRIN

    A B-tree or multi-way search tree is a tree whose all the leaves are at the same level. It is also referred to as a balanced stored tree. 
    In databases, hashing is a process of finding the desired data on the disc without using the index structure. 
    GiST is an acronym for Generalized Search Tree. It is a balanced, tree-structured access technique that serves as a basis for the implementation of various arbitrary indexing techniques.
    SP-GiST is a short form for space-partitioned GiST. It supports partitioned trees that help in the development of non-balanced data structures. 
    GIN is an inverted index, whereas BRIN is Block Range INdexes. 
     

  • Optimized SQL schema design patterns
  • Optimized SQL schema design patterns

    A database schema is generally a formal description of the structure of a database. Only relational databases require to have the schema defined before creating tables. For faster data retrieval, it is essential to choose the optimized schema design pattern. 
     

Get Certified

Get Certified

After you learn all the basic and advanced concepts of SQL and databases, you can move on to validate your skills by pursuing a certification. A plethora of database certifications is out there that you can choose from to demonstrate your expertise in databases. 
 

  • PostgreSQL v12 Database Administration on Windows and Linux
  • PostgreSQL v12 Database Administration on Windows and Linux

    It is a certification program from Udemy for absolute beginners intended to make them well-versed in PostgreSQL v12. This course will help individuals understand every concept of PostgreSQL and how to install it on Windows and Linux systems. You just need to have a basic understanding of database concepts and working knowledge of Windows and Linux. 
     

  • Oracle Certified Associate - Oracle9i Database Administrator (OCA)
  • Oracle Certified Associate - Oracle9i Database Administrator (OCA)

    Oracle 9i DBA OCA is an entry-level certification issued by Oracle Corporation. It validates individuals for their foundational skills in database administration. One who holds this certification can work as a junior team member with other database administrators in the team. It serves as a prerequisite for many professional-level Oracle DBA certifications. 
     

  • Oracle Certified Professional - Oracle 9i Database Administrator (OCP)
  • Oracle Certified Professional - Oracle 9i Database Administrator (OCP)

    Oracle9i DBA OCP is a high-level certification that demonstrates individuals’ in-depth knowledge of Oracle database design and implementation skills. Professionals with this certification are capable of designing, developing, and implementing Oracle-based solutions. 
     

  • Microsoft Certified Database Administrator (MCDBA)
  • Microsoft Certified Database Administrator (MCDBA)

    MCDBA is a professional who has cleared the certified database certification exam on Microsoft Windows 2000 or 2003, SQL Server Design, Microsoft SQL Server Administration, and the elective approved by Microsoft. However, the Microsoft Certified IT Professional (MCITP) Database certification has replaced MCDBA. 

  • Pro: Windows Server 2008, Enterprise Administrator (MCITP)
  • Pro: Windows Server 2008, Enterprise Administrator (MCITP)

    Professionals holding this certification are in charge of managing the overall Windows Server 2008 R2 environment and architecture. They are also responsible for translating business goals into technology decisions. Also, they make significant decisions regarding the network infrastructure, security policies, business continuity, and many other aspects.
     

  • TS: SQL Server 2008, Implementation and Maintenance (MCTS)
  • TS: SQL Server 2008, Implementation and Maintenance (MCTS)

    For individuals having two to three years of experience working with Windows Server 2008 or any other previous versions, this certification is a perfect choice. It tests applicants’ knowledge of installing, maintaining, monitoring, troubleshooting, and optimizing the performance of Windows Server 2008.