Splunk is a popular extensible data platform that finds use in the internet of things. In this article, we discuss some of the most important Splunk interview questions. But before that, let's learn a little more about Splunk.
What is Splunk?
Splunk is a popular tool for storing and analyzing machine-generated data. It comes in both free and paid versions. In fact, it is available with many licenses. It uses machine data to identify patterns, offer metrics, diagnose issues, and provide intelligence for business operations.
Moreover, Splunk is not just a tool but also a platform that offers many features, which makes it easy for organizations to deal with machine-generated data. It is used for application management, business analytics, compliance, security, and web analytics.
Splunk Interview Questions and Answers
We have grouped the Splunk interview questions into three sections: beginner, intermediate, and advanced.
Beginner-Level Interview Questions
1: What is Splunk?
Splunk is the Google Search of machine-generated data. It helps users to monitor, report, search and visualize enterprise data. With alerts, charts, reports, and so on, Splunk helps in getting real-time insights from the collected data.
2: Enumerate some important features of Splunk.
Some of the most notable features of Splunk are:
- Splunk turns hidden patterns within data into real-time business insights that help in making informed business decisions.
- It offers operational visibility.
- Splunk monitors systems in real-time, which helps to detect issues and vulnerabilities in them.
3: What is the difference between Splunk and Spark?
Parameter | Splunk | Spark |
Type | Splunk is a platform that allows users to analyze machine-generated data. | Apache Spark is an open-source unified analytics engine for large-scale data processing. |
Used for | Gathering and analyzing vast amounts of machine-generated data. | In-memory processing and iterative applications. |
License | Proprietary | Open-source |
Working mode | Streaming mode | Batch and streaming modes |
4: Can you list some of the common port numbers used by Splunk?
Splunk uses a plethora of port numbers. Some of the most common ones are as follows:
- 514 - Network port for receiving UDP data (typically syslog)
- 8000 - Splunk Web port
- 8080 - Splunk Index Replication port
- 8089 - Splunk Management port
- 8191 - KV Store
- 9997 - Splunk Indexing port
5: Enumerate the components of Splunk.
There are four components of Splunk:
- Search Head - It provides a graphical user interface for searching.
- Indexer - It indexes the machine data.
- Forwarder - It forwards logs to the Indexer.
- Deployment Server - It helps to manage Splunk components in a distributed environment.
6: What are the functions of Splunk Indexer?
Splunk Indexer is responsible for creating and managing indexes. It indexes incoming data and searches the indexed data.
7: Explain the types of Splunk Forwarders.
A Splunk Forwarder can work as an intermediate forwarder, data filter, and remote collector. It is of two types:
- HWF (Heavy Forwarder) - It is a full instance of Splunk with advanced features, including the ability to parse data before forwarding it.
- UF (Universal Forwarder) - It is the Splunk agent installed on a non-Splunk system for gathering data locally. It is unable to index or parse data.
Because the Heavy Forwarder parses data, it uses more resources than the Universal Forwarder and is generally not recommended on production systems.
8: What is the summary index in Splunk?
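A summary index stores the results of scheduled reports, so that searches over long time ranges can run against a smaller, pre-computed data set instead of the raw events. The default summary index is the one that Splunk Enterprise uses if no other summary index is specified. Users must create additional summary indexes if they want to run many summary index reports.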
9: Explain Splunk DB Connect.
Splunk DB Connect is a SQL database plugin for Splunk. It allows users to integrate database information with Splunk reports and queries easily.
10: Can you name some important Splunk search commands?
- abstract
- addtotals
- anomalies
- erex
- filldown
- rename
11: What is a license violation in Splunk?
When the user exceeds the licensed data limit, a license warning is displayed on the dashboard, and it remains for 14 days. In the free version of Splunk, users get only three warnings. For commercial licenses, users have five warnings in a rolling 30-day window. After that, the license is in violation, and search results and reports on the Indexer will not trigger.
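The rolling-window rule can be sketched in a few lines of Python. This is only an illustration of the counting logic, not Splunk's licensing code: the function name, the sample dates, and the assumption that the violation starts as soon as the warning count reaches the limit are all made up for this example.

```python
from datetime import date, timedelta


def in_violation(warning_days, today, max_warnings=5, window_days=30):
    """Return True when the warnings inside the rolling window reach the limit."""
    window_start = today - timedelta(days=window_days)
    recent = [day for day in warning_days if window_start < day <= today]
    return len(recent) >= max_warnings


# Hypothetical warning dates, invented for illustration.
warning_dates = [date(2021, 6, day) for day in (1, 5, 9, 14, 20)]

print(in_violation(warning_dates, today=date(2021, 6, 21)))  # all five fall in the window
print(in_violation(warning_dates, today=date(2021, 7, 10)))  # older warnings have aged out
```

Passing max_warnings=3 models the smaller allowance of the free license.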
12: What are some alternatives to Splunk?
Loggly, LogLogic, Logstash, and Sumo Logic are some of the best alternatives to Splunk.
13: Enumerate the important configuration files in Splunk.
- indexes.conf
- inputs.conf
- props.conf
- server.conf
- transforms.conf
14: How many licenses does Splunk have?
Splunk is available in a variety of licenses to meet different user requirements. These are as follows:
- Free license
- Beta license
- Enterprise license
- Forwarder license
- Licenses for cluster members/index replication
- Licenses for search heads/distributed search
15: What is the latest version of Splunk?
At the time of writing, the latest version of Splunk Enterprise was 8.2, released in May 2021.
16: Where is the Splunk default configuration stored?
$SPLUNK_HOME/etc/system/default
17: How will you extract an IP address from logs?
The following are the two ways to extract an IP address from logs:
rex field=_raw "(?<ip_address>\d+\.\d+\.\d+\.\d+)"
or
rex field=_raw "(?<ip_address>([0-9]{1,3}[\.]){3}[0-9]{1,3})"
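To try the first pattern outside Splunk, a minimal Python sketch follows. Note that Python's re module writes named groups as (?P<name>...) rather than the (?<name>...) form used by rex. The function name and the sample log line are hypothetical, and, like the rex patterns above, the regex does not check that each octet is in the 0-255 range.

```python
import re
from typing import Optional

# Python writes named groups as (?P<name>...); rex uses (?<name>...).
IP_PATTERN = re.compile(r"(?P<ip_address>\d+\.\d+\.\d+\.\d+)")


def extract_ip(raw_event: str) -> Optional[str]:
    """Return the first IPv4-looking substring in raw_event, or None."""
    match = IP_PATTERN.search(raw_event)
    return match.group("ip_address") if match else None


# Hypothetical log line, invented for this example.
sample_event = "Failed password for root from 10.1.2.3 port 22 ssh2"
print(extract_ip(sample_event))
```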
Intermediate-Level Splunk Interview Questions
18: Explain Buckets and Splunk Bucket Lifecycle.
Splunk stores indexed data in directories, which we refer to as buckets. A bucket in Splunk is a directory storing events of a certain period. By default, the maximum bucket size is 750MB; the high-volume setting raises it to 10GB on 64-bit systems. The buckets of the default index reside in $SPLUNK_HOME/var/lib/splunk/defaultdb/db, which holds the hot and warm buckets. A bucket goes through four phases during its life, known as the Splunk Bucket Lifecycle. These are:
- Hot - A hot bucket is open for writing and contains newly indexed data. For each index, there can be more than one hot bucket.
- Warm - A warm bucket has data rolled out from a hot bucket.
- Cold - A cold bucket is one that stores data rolled out from a warm bucket.
- Frozen - A bucket that has data rolled out from a cold bucket is a frozen bucket. Data available in a frozen bucket is not searchable. Although the indexer deletes frozen buckets by default, we can archive them. This archived data can be thawed at some later time.
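The lifecycle can be pictured as a one-way sequence of stages. The following Python sketch models only that ordering; the class and function names are invented for illustration and do not correspond to any Splunk API.

```python
from enum import Enum


class BucketStage(Enum):
    HOT = "hot"
    WARM = "warm"
    COLD = "cold"
    FROZEN = "frozen"


# Each stage rolls to exactly one later stage; frozen is the end of the line.
NEXT_STAGE = {
    BucketStage.HOT: BucketStage.WARM,
    BucketStage.WARM: BucketStage.COLD,
    BucketStage.COLD: BucketStage.FROZEN,
}


def roll(stage):
    """Return the stage a bucket moves to when it rolls."""
    if stage not in NEXT_STAGE:
        raise ValueError(f"{stage.value} buckets do not roll any further")
    return NEXT_STAGE[stage]


def is_searchable(stage):
    """Hot, warm, and cold buckets are searchable; frozen ones are not."""
    return stage is not BucketStage.FROZEN


stage = BucketStage.HOT
while stage in NEXT_STAGE:
    print(stage.value, "->", roll(stage).value)
    stage = roll(stage)
```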
19: How is the stats command different from the eventstats command?
Both the stats and eventstats commands compute summary statistics over fields in the search results. The stats command returns only the aggregated results as a table. The eventstats command instead adds each computed statistic to the original events as a new field, so the raw events are kept.
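The difference is easiest to see on a tiny data set. The Python sketch below imitates the two behaviours for an average grouped by one field; the events and the helper names are hypothetical, and this is an analogy rather than how Splunk implements either command.

```python
from collections import defaultdict

# Hypothetical events, invented for illustration.
events = [
    {"host": "web1", "bytes": 100},
    {"host": "web1", "bytes": 300},
    {"host": "web2", "bytes": 50},
]


def stats_avg(events, field, by):
    """Like `stats avg(field) by <by>`: one output row per group, raw events gone."""
    groups = defaultdict(list)
    for event in events:
        groups[event[by]].append(event[field])
    return [
        {by: key, f"avg({field})": sum(values) / len(values)}
        for key, values in groups.items()
    ]


def eventstats_avg(events, field, by):
    """Like `eventstats avg(field) by <by>`: every event kept, aggregate added."""
    averages = {row[by]: row[f"avg({field})"] for row in stats_avg(events, field, by)}
    return [{**event, f"avg({field})": averages[event[by]]} for event in events]


print(len(stats_avg(events, "bytes", "host")))       # one row per host
print(len(eventstats_avg(events, "bytes", "host")))  # one row per original event
```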
20: Explain the Time Zone property in Splunk.
A web browser takes its current Time Zone from the system that it is running on, and Splunk sets the default Time Zone for a user from the browser settings. If the user searches for events under a Time Zone other than the one in which they were recorded, the search may return no relevant results. Hence, the Time Zone property is important when searching for events, particularly from a security standpoint.
21: What is Sourcetype in Splunk?
Sourcetype is a way of identifying data in Splunk. It refers to the default field that helps to identify the data structure of an incoming event. Sourcetype determines how Splunk Enterprise formats the data during the indexing process. Thus, it is important to ensure that the data has been assigned the right Sourcetype. You need to provide accurate timestamps and event breaks for the indexed data to make searching easier. Sourcetype must be set at the forwarder level for indexer extraction to identify various data formats.
22: Explain Btool in Splunk.
It is a command-line tool that helps to troubleshoot configuration file issues. Moreover, it lets you know the values being used by a user’s Splunk Enterprise installation in the existing environment.
23: Do you know the command for restarting the Splunk web server?
You can restart the Splunk web server with the splunk restart splunkweb command.
24: Explain disabling Splunk boot-start.
We can use the $SPLUNK_HOME/bin/splunk disable boot-start command to disable Splunk boot-start.
25: Give the commands to start and stop Splunk service.
For starting the Splunk service, we can use the ./splunk start command. By using the ./splunk stop command, you can stop the Splunk service.
26: What is the difference between Splunk App and Add-on?
Splunk Add-ons have only inbuilt configurations. They don’t have dashboards or reports. On the other hand, Splunk App is the collection of dashboards, alerts, field extractions, lookups, and reports.
27: Give the command for restarting Splunk Daemon.
We use the splunk restart splunkd command to restart Splunk Daemon.
28: Explain how to clear the Splunk search history.
It is very simple to clear the Splunk search history. To do so, you need to delete the searches.log file from the Splunk server. It can be found in $SPLUNK_HOME/var/log/splunk.
29: How to check the running Splunk processes on Linux?
We can use the following command to check the running Splunk processes on a Linux system:
ps aux | grep splunk
We can use the same command for checking the running Splunk processes on a Unix system.
30: What are Splunk Alerts? What are the options for setting up Splunk Alerts?
Splunk Alerts help users to get notified of any errors on their system. There are various options for setting up Splunk Alerts. These are:
- Additional details via CSV or PDF files - Users can add results in CSV or PDF formats or inline with the message body to help the recipient(s) understand the conditions that triggered the alert and the actions that have been taken to deal with the same.
- Creating a webhook - Allows users to write to GitHub or HipChat. Users can also send emails to a group of machines.
- Ticket creation - It is possible to create tickets and throttle Splunk Alerts on the basis of different conditions. Users can control these alerts from the alert window.
31: Explain Fishbucket Index.
Fishbucket stores CRCs and seek pointers for the indexed files. This helps splunkd to check which files have already been read. It is an index directory that is present at /opt/splunk/var/lib/splunk. Users can access it via the GUI by searching for index=_thefishbucket.
32: Explain the Dispatch Directory.
The Dispatch Directory contains a directory for each search that is running or has completed. It is stored at $SPLUNK_HOME/var/run/splunk/dispatch. Each search directory contains a CSV file with the search results and a search.log file with details about search execution, in addition to other important information. In the default configuration, the directory is deleted ten minutes after the search completes. If the user saves the search results, however, they are deleted after seven days.
33: How will you set the default search time in Splunk 6?
To set the default search time in Splunk 6, we need to use the ui-prefs.conf file. It is available in $SPLUNK_HOME/etc/system/local. To set the default time range, you need to set the dispatch.earliest_time and dispatch.latest_time fields.
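As a rough illustration, the Python sketch below renders the two fields into an INI-style stanza with the standard configparser module. The function name is hypothetical, and the [search] stanza name and the sample time values are assumptions made for this example; check the ui-prefs.conf specification for your Splunk version before using them.

```python
import configparser
import io


def render_ui_prefs(earliest, latest, stanza="search"):
    """Render the default time range as INI-style text (stanza name is assumed)."""
    config = configparser.ConfigParser(interpolation=None)
    config[stanza] = {
        "dispatch.earliest_time": earliest,
        "dispatch.latest_time": latest,
    }
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


# Sample values: search the last 24 hours, snapped to the hour, up to now.
print(render_ui_prefs("-24h@h", "now"))
```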
34: Explain the process of adding folder access logs from a Windows system to Splunk.
Following are the steps to add folder access logs from a Windows machine to Splunk:
- Go to Group Policy and enable Object Access Audit on the Windows system where the folder is located.
- Enable auditing for the folder for which you wish to monitor access logs.
- Next, install Splunk Universal Forwarder.
- Finally, configure the Splunk Universal Forwarder for sending security logs to the Splunk Indexer.
Advanced-Level Splunk Interview Questions
35: Compare search head pooling and search head clustering.
Search head pooling and search head clustering both provide high availability for Splunk search heads when any search head goes down. A search head cluster is managed by a captain, which coordinates the other cluster members. While search head clustering is the newer feature, search head pooling is an old feature that serves the same purpose. However, search head clustering is more efficient and reliable than search head pooling. As such, search head pooling will be removed in subsequent releases of Splunk.
36: How does Splunk avoid duplicate indexing of logs?
To avoid duplicate indexing of logs, Splunk Indexer tracks all the indexed files in the fishbucket directory. It contains seek pointers and CRCs for all the files that are being indexed. If a file's CRC and seek pointer show that the data has already been read, Splunk does not index it again.
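The idea of pairing a checksum with a seek pointer can be sketched as follows. This is a toy model, not Splunk's implementation: the class name is invented, the checksum is Python's zlib.crc32 over the first crc_length bytes of the file, and the 256-byte default is an assumption of this sketch.

```python
import zlib


class FishbucketSketch:
    """Toy model: maps a CRC of a file's head to the offset already read."""

    def __init__(self, crc_length=256):
        self.crc_length = crc_length
        self.seek_pointers = {}  # head CRC -> number of bytes already consumed

    def read_new(self, content: bytes) -> bytes:
        """Return only the bytes that have not been seen for this file."""
        key = zlib.crc32(content[: self.crc_length])
        offset = self.seek_pointers.get(key, 0)
        new_data = content[offset:]
        self.seek_pointers[key] = len(content)
        return new_data


# A short crc_length keeps the example small. A file shorter than crc_length
# would get a different key as it grows, which this sketch does not handle.
bucket = FishbucketSketch(crc_length=8)
log = b"line one\nline two\n"
print(bucket.read_new(log))                  # first read: everything is new
print(bucket.read_new(log))                  # unchanged file: nothing new
print(bucket.read_new(log + b"line three\n"))  # appended file: only the tail
```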
37: Explain the MapReduce algorithm.
The MapReduce algorithm is what facilitates faster data searching in Splunk. Technically, it is an algorithm for batch-based large-scale parallelization. Inspired by the map() and reduce() functions of functional programming, it is also used by popular big data processing tools like Hadoop and Apache Spark.
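A minimal Python sketch of the pattern follows, using the built-in map() and functools.reduce(). The chunks stand in for data held on separate nodes and are invented for this example; the code shows the general map-then-reduce shape, not Splunk's internal search pipeline.

```python
from collections import Counter
from functools import reduce


def map_phase(chunk):
    """Map: count status codes within one chunk of events."""
    return Counter(event["status"] for event in chunk)


def reduce_phase(left, right):
    """Reduce: merge two partial counts into one."""
    return left + right


# Hypothetical chunks, as if each lived on a different node.
chunks = [
    [{"status": 200}, {"status": 404}],
    [{"status": 200}, {"status": 500}, {"status": 200}],
]

partial_counts = map(map_phase, chunks)
totals = reduce(reduce_phase, partial_counts, Counter())
print(dict(totals))
```

Because each chunk is counted independently, the map phase could run in parallel; only the small partial counts need to be combined.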
38: Explain Search Factor and Replication Factor.
Search Factor or SF is the number of searchable copies of data maintained by the Indexer cluster. On the other hand, the Replication Factor or RF is the number of copies of data maintained by the Indexer cluster. The Indexer cluster has both SF and RF, while the Search Head cluster has only a replication factor. Search Factor must always be less than or equal to the Replication Factor.
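The constraint between the two values can be expressed as a small check. The function below is a hypothetical helper written for this article, not part of any Splunk tooling.

```python
def validate_cluster_factors(replication_factor, search_factor):
    """Raise ValueError unless 1 <= search_factor <= replication_factor."""
    if replication_factor < 1 or search_factor < 1:
        raise ValueError("both factors must be at least 1")
    if search_factor > replication_factor:
        raise ValueError(
            f"search factor {search_factor} exceeds "
            f"replication factor {replication_factor}"
        )


validate_cluster_factors(replication_factor=3, search_factor=2)  # accepted

try:
    validate_cluster_factors(replication_factor=2, search_factor=3)
except ValueError as error:
    print("rejected:", error)
```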
39: How is Splunk SDK different from Splunk App Framework?
Splunk App Framework allows users to customize the Splunk Web UI and develop Splunk apps with the Splunk web server. It resides within the Splunk web server. Splunk Framework is an indispensable part of Splunk, and it doesn’t come with a separate license. Splunk SDK, on the other hand, allows users to develop apps on their own and doesn’t necessitate Splunk Web or any other component of the Splunk App Framework. Moreover, it is licensed separately from Splunk.
40: Enumerate the types of search modes in Splunk.
There are three types of modes in Splunk that are as follows:
- Fast Mode: This mode limits the type of data returned and accelerates the search result.
- Verbose Mode: This mode is comparatively slower than the fast mode. However, it returns as much event information as possible.
- Smart Mode: This mode provides maximum results in a short span of time by toggling between various modes and search behaviors.
41: List the types of dashboards in Splunk.
The following are the different types of dashboards available in Splunk:
- Real-time dashboards
- Dynamic form-based dashboards
- Dashboards for scheduled reports
42: Can you state the difference between sort+ and sort-?
We use sort+ to display search results in ascending order and sort- to display them in descending order.
43: Which command will you use to remove duplicate events having common values?
We can remove duplicate events having common values using the dedup command.
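The effect of dedup on a single field can be imitated in Python as shown below. The helper and the sample events are hypothetical. The sketch keeps the first event seen for each value and drops events that lack the field, which is an assumption of this example rather than a statement about every dedup option.

```python
def dedup(events, field):
    """Keep the first event for each distinct value of field."""
    seen = set()
    kept = []
    for event in events:
        if field not in event:
            continue  # assumption of this sketch: events without the field are dropped
        value = event[field]
        if value in seen:
            continue
        seen.add(value)
        kept.append(event)
    return kept


# Hypothetical events, invented for illustration.
events = [
    {"user": "alice", "action": "login"},
    {"user": "bob", "action": "login"},
    {"user": "alice", "action": "logout"},
]
print(dedup(events, "user"))
```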
44: What features does Splunk free lack?
Splunk free does not include the following features:
- Authentication and scheduled searches
- Distributed search
- Forwarding in TCP/HTTP
- Deployment management
45: How will you disable the Splunk launch message?
To disable the Splunk launch message, we need to set the value OFFENSIVE=Less in splunk-launch.conf.
46: Mention the precedence of the .conf files in Splunk.
The following is the precedence order of the .conf files in Splunk:
- System local directory: highest priority
- App local directories
- App default directories
- System default directory: lowest priority
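One way to picture this ordering is as a layered merge in which higher-priority directories override lower ones. The Python sketch below uses made-up layer names and setting names purely for illustration and ignores details such as per-stanza merging.

```python
# Layers listed from lowest to highest priority, mirroring the list above.
LAYERS_LOW_TO_HIGH = ["system_default", "app_default", "app_local", "system_local"]


def effective_settings(layers):
    """Merge the layers so that higher-priority values win."""
    merged = {}
    for name in LAYERS_LOW_TO_HIGH:
        merged.update(layers.get(name, {}))
    return merged


# Hypothetical settings, invented for illustration.
layers = {
    "system_default": {"setting_a": "default", "setting_b": "default"},
    "app_default": {"setting_b": "from app default"},
    "app_local": {"setting_b": "from app local"},
    "system_local": {"setting_a": "from system local"},
}
print(effective_settings(layers))
```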
47: How will you reset the Splunk admin password?
To reset the Splunk admin password, we first need to log in to the server on which Splunk is installed. Then, we need to go to the following location, rename the password file, and restart Splunk.
$SPLUNK_HOME/etc/passwd
After that, log in to Splunk using the username and the new password.
48: How many roles are there in Splunk?
There are three roles in Splunk that are as follows:
- Admin
- Power
- User
49: What different layout options are available in Splunk for search results?
The following are different layout options for search results in Splunk:
- List
- Table
- Row
50: Explain different boolean operators in Splunk.
Splunk supports the following three boolean operators:
- AND: It is implied between two terms. For example, web server is the same as web AND server. If you search for web AND server, the search result should contain both these terms.
- OR: If you search for web OR server, the resulting records can contain any of the search terms.
- NOT: This operator helps you exclude a specific word from your search.
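The three operators can be mimicked with plain Python predicates over whitespace-separated terms. The helpers and sample events below are hypothetical, and real Splunk term matching is more involved than a simple split.

```python
def has_term(event, term):
    """True when term appears as a whole word in event (case-insensitive)."""
    return term.lower() in event.lower().split()


def match_and(event, left, right):
    return has_term(event, left) and has_term(event, right)


def match_or(event, left, right):
    return has_term(event, left) or has_term(event, right)


def match_not(event, term):
    return not has_term(event, term)


# Hypothetical events, invented for illustration.
events = ["web server restarted", "web cache cleared", "database server down"]

print([e for e in events if match_and(e, "web", "server")])
print([e for e in events if match_or(e, "web", "server")])
print([e for e in events if match_not(e, "web")])
```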
Conclusion
These were some of the best Splunk interview questions that will give you an idea about the questions that you might face in an IoT-based job interview. Even if you are not preparing for an IoT interview, you can use these questions to check how well you know Splunk.
Also, if you have been to any Splunk interviews, do let us know in the comments section below about other questions that you have encountered.