What is the use of Splunk?

Splunk is a software application used to monitor, search, analyze, and visualize machine-generated data in real time. It captures, indexes, and correlates real-time data in a searchable repository and generates visualizations.

What are the different versions of Splunk?

The three different versions of Splunk are: Splunk Enterprise, Splunk Light, and Splunk Cloud.

Is it easy to learn Splunk?

With proper resources, you can easily learn Splunk. A plethora of courses on Splunk is available on the internet. You can choose the one that fits your learning requirements and budget. Also, the courses will allow you to work on some real-world Splunk projects.

What are the alternatives to Splunk?

Popular alternatives to Splunk include SolarWinds Log Analyzer, Sumo Logic, Fluentd, ELK Stack, and LogFaces.

Which companies employ Splunk?

Some popular companies that make use of Splunk are IBM, Cisco, Facebook, Bosch, Walmart, Salesforce, Visa, Adidas, Adobe, and PepsiCo.

50+ Best Splunk Interview Questions and Answers to Prepare for Interviews

Splunk is a popular extensible data platform that finds use in the internet of things. In this article, we are going to discuss some of the most important Splunk interview questions. But before that, let’s learn a little more about Splunk.

What is Splunk?

Splunk is a popular tool for storing and analyzing machine-generated data. It comes in both free and paid versions. In fact, it is available with many licenses. It uses machine data to identify patterns, offer metrics, diagnose issues, and provide intelligence for business operations.

Moreover, Splunk is not just a tool but also a platform that offers many features, which makes it easy for organizations to deal with machine-generated data. It is used for application management, business analytics, compliance, security, and web analytics.

Splunk Interview Questions and Answers

Here we have categorized Splunk interview questions in three sections listed below:

Beginner-Level Interview Questions

1: What is Splunk?

Splunk is the Google Search of machine-generated data. It helps users to monitor, report, search and visualize enterprise data. With alerts, charts, reports, and so on, Splunk helps in getting real-time insights from the collected data.

2: Enumerate some important features of Splunk.

Some of the most notable features of Splunk are:

Splunk turns hidden patterns within data into real-time business insights that help in making informed business decisions.
It offers operational visibility.
Splunk monitors systems in real time, which helps to detect issues and vulnerabilities in them.

3: What is the difference between Splunk and Spark?

Parameter	Splunk	Spark
Type	Splunk is a platform that allows users to analyze machine-generated data.	Apache Spark is an open-source unified analytics engine for large-scale data processing.
Used for	Gathering and analyzing vast amounts of machine-generated data.	In-memory processing and iterative applications.
License	Proprietary	Open-source
Working mode	Streaming mode	Batch and streaming modes

4: Can you list some of the common port numbers used by Splunk?

Splunk uses a plethora of port numbers. Some of the most common ones are as follows:

514 - Network input port, commonly used to receive syslog data over UDP
8000 - Splunk Web port
8080 - Splunk Index Replication port
8089 - Splunk Management port
8191 - KV Store
9997 - Splunk Indexing port

5: Enumerate the components of Splunk.

There are four components of Splunk:

Search Head - It provides a graphical user interface for searching.
Indexer - It indexes the machine data.
Forwarder - It forwards logs to the Indexer.
Deployment Server - It helps to manage Splunk components in a distributed environment.

6: What are the functions of Splunk Indexer?

Splunk Indexer is responsible for creating and managing indexes. It indexes incoming data and searches the indexed data.

7: Explain the types of Splunk Forwarders.

The Splunk Forwarder works as an intermediate forwarder, data filter, and remote collector. It is of two types:

Heavy Forwarder (HF) - It is a full instance of Splunk with advanced features. It can parse, filter, and route data before forwarding it.
Universal Forwarder (UF) - It is the Splunk agent installed on a non-Splunk system for gathering data locally. It is unable to index or parse data.

Because the heavy forwarder parses data, it consumes more resources than the universal forwarder and is generally not recommended for installation on production systems.

8: What is the summary index in Splunk?

A summary index stores the results of scheduled searches so that reports over long time ranges run faster. If no other summary index is specified, Splunk Enterprise uses the default summary index, which is named summary. Users must create additional summary indexes if they want to run many summary index reports.

9: Explain Splunk DB Connect.

Splunk DB Connect is a SQL database plugin for Splunk. It allows users to integrate database information with Splunk reports and queries easily.

10: Can you name some important Splunk search commands?

abstract
addtotals
anomalies
erex
filldown
rename

11: What is a license violation in Splunk?

When the user exceeds the licensed daily indexing volume, a license warning is displayed on the dashboard, and it remains for 14 days. In the free version of Splunk, users get only three warnings. For commercial licenses, users have five warnings in a rolling 30-day window. After that, the license is in violation: searching is disabled, so search results and reports will not run, although the Indexer continues to index data.
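The rolling-window rule can be sketched in a few lines of Python. This is only an illustration of the counting logic described above, not Splunk's licensing code; the function name, the default limits, and the choice to treat reaching the limit as a violation are assumptions.

```python
from datetime import date, timedelta

def in_violation(warning_dates, today, max_warnings=5, window_days=30):
    """Return True once the warnings inside the rolling window reach the limit.

    The threshold handling (>=) is an assumption made for this sketch.
    """
    window_start = today - timedelta(days=window_days)
    recent = [d for d in warning_dates if window_start < d <= today]
    return len(recent) >= max_warnings

# Hypothetical warning history: one old warning plus five recent ones.
today = date(2021, 6, 30)
warnings = [date(2021, 5, 1)] + [date(2021, 6, d) for d in (2, 9, 16, 23, 30)]
print(in_violation(warnings, today))                  # commercial-style limit of five
print(in_violation(warnings[:3], today, max_warnings=3))  # free-style limit of three
```

Warnings that fall outside the window, such as the one from May 1 in the sample data, no longer count toward the limit.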

12: What are some alternatives to Splunk?

Loggly, LogLogic, Logstash, and Sumo Logic are some of the best alternatives to Splunk.

13: Enumerate the important configuration files in Splunk.

indexes.conf
inputs.conf
props.conf
server.conf
transforms.conf

14: How many licenses does Splunk have?

Splunk is available in a variety of licenses to meet different user requirements. These are as follows:

Free license
Beta license
Enterprise license
Forwarder license
Licenses for cluster members/index replication
Licenses for search heads/distributed search

15: What is the latest version of Splunk?

At the time this article was written, the latest version of Splunk Enterprise was 8.2, released in May 2021.

16: Do you know where the Splunk default configuration is stored?

$SPLUNK_HOME/etc/system/default

17: How will you extract an IP address from logs?

The following are the two ways to extract an IP address from logs:

rex field=_raw  "(?<ip_address>\d+\.\d+\.\d+\.\d+)"

rex field=_raw  "(?<ip_address>([0-9]{1,3}[\.]){3}[0-9]{1,3})"
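The same pattern can be tried outside Splunk before it goes into a search. The following is a minimal Python sketch, assuming a made-up log line; note that Python writes named groups as (?P<name>...), while rex uses (?<name>...). Neither pattern validates octet ranges, so a value such as 999.1.1.1 would also match.

```python
import re

# Python equivalent of the first rex pattern; the sample line is hypothetical.
IP_PATTERN = re.compile(r"(?P<ip_address>\d+\.\d+\.\d+\.\d+)")

def extract_ip(raw_event):
    """Return the first IPv4-looking string in the event, or None."""
    match = IP_PATTERN.search(raw_event)
    return match.group("ip_address") if match else None

sample = "Jan 12 10:15:01 web01 sshd[411]: Failed password from 10.20.30.40 port 22"
print(extract_ip(sample))
```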

Intermediate-Level Splunk Interview Questions

18: Explain Buckets and Splunk Bucket Lifecycle.

Splunk stores indexed data in directories, which are referred to as buckets. A bucket is a directory that stores events from a certain period. By default, a bucket can grow to 750 MB (the auto value of the maxDataSize setting); the auto_high_volume value raises the limit to 10 GB on 64-bit systems. The buckets of the default index reside in $SPLUNK_HOME/var/lib/splunk/defaultdb/db, which holds the hot and warm buckets. A bucket goes through four phases during its life, known as the Splunk Bucket Lifecycle. These are:

Hot - A hot bucket is open for writing and contains newly indexed data. For each index, there can be more than one hot bucket.
Warm - A warm bucket has data rolled out from a hot bucket.
Cold - A cold bucket is one that stores data rolled out from a warm bucket.
Frozen - A bucket that has data rolled out from a cold bucket is a frozen bucket. Data available in a frozen bucket is not searchable. Although the indexer deletes frozen buckets by default, we can archive them. This archived data can be thawed at some later time.
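The lifecycle above can be pictured as a simple one-way sequence. The sketch below is a toy model in Python, not how the indexer decides when to roll a bucket; the names STAGES, SEARCHABLE, and roll are hypothetical.

```python
# Stage order follows the list above; frozen data is the only non-searchable stage.
STAGES = ["hot", "warm", "cold", "frozen"]
SEARCHABLE = {"hot", "warm", "cold"}

def roll(stage):
    """Return the stage a bucket moves to next, or None once it is frozen."""
    position = STAGES.index(stage)
    return STAGES[position + 1] if position + 1 < len(STAGES) else None

# Walk one bucket through its whole life.
stage = "hot"
while stage is not None:
    print(stage, "searchable" if stage in SEARCHABLE else "not searchable")
    stage = roll(stage)
```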

19: How is the stats command different from the eventstats command?

Both the stats and eventstats commands calculate summary statistics over the fields in the search results. The stats command returns only a table of the aggregated results. The eventstats command, in contrast, keeps the original events and adds the computed statistics to each of them as new fields.
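The difference is easiest to see on a tiny data set. The following Python sketch imitates the two behaviors for an average grouped by one field; the events, the field names, and the output field name avg are made up for illustration and do not reproduce Splunk's exact output format.

```python
from collections import defaultdict

# Hypothetical events; field names are illustrative only.
events = [
    {"host": "web01", "bytes": 100},
    {"host": "web01", "bytes": 300},
    {"host": "db01", "bytes": 50},
]

def stats_avg(rows, field, by):
    """Like stats avg(field) by <by>: returns one summary row per group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[by]].append(row[field])
    return [{by: key, "avg": sum(vals) / len(vals)} for key, vals in groups.items()]

def eventstats_avg(rows, field, by):
    """Like eventstats avg(field) by <by>: keeps every event, adds the group average."""
    averages = {r[by]: r["avg"] for r in stats_avg(rows, field, by)}
    return [{**row, "avg": averages[row[by]]} for row in rows]

print(stats_avg(events, "bytes", "host"))       # two summary rows
print(eventstats_avg(events, "bytes", "host"))  # three events, each with an avg field
```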

20: Explain the Time Zone property in Splunk.

A web browser takes its current Time Zone from the system that it is running on, and Splunk picks up the default Time Zone for a user from the browser settings. If events are indexed or searched with the wrong Time Zone, their timestamps are offset, and a search over a given time range may return no relevant results. Hence, the Time Zone property is important when searching for events, particularly from a security standpoint.

21: What is Sourcetype in Splunk?

Sourcetype is a way of identifying data in Splunk. It refers to the default field that helps to identify the data structure of an incoming event. Sourcetype determines how Splunk Enterprise formats the data during the indexing process. Thus, it is important to ensure that the data has been assigned the right Sourcetype. Accurate timestamps and event breaks make the indexed data easier to search. Sourcetype should be set at the forwarder level so that the indexer can identify the various data formats during extraction.

22: Explain Btool in Splunk.

It is a command-line tool that helps to troubleshoot configuration file issues. Moreover, it lets you know the values being used by a user’s Splunk Enterprise installation in the existing environment.

23: Do you know the command for restarting the Splunk web server?

You can restart the Splunk web server with the splunk restart splunkweb command.

24: Explain disabling Splunk boot-start.

We can use the $SPLUNK_HOME/bin/splunk disable boot-start command to disable Splunk boot-start.

25: Give the commands to start and stop Splunk service.

For starting the Splunk service, we can use the ./splunk start command. By using the ./splunk stop command, you can stop the Splunk service.

26: What is the difference between Splunk App and Add-on?

Splunk Add-ons have only inbuilt configurations. They don’t have dashboards or reports. On the other hand, Splunk App is the collection of dashboards, alerts, field extractions, lookups, and reports.

27: Give the command for restarting Splunk Daemon.

We use the splunk restart splunkd command to restart Splunk Daemon.

28: Explain how to clear the Splunk search history.

It is very simple to clear the Splunk search history. To do so, you need to delete the searches.log file from the Splunk server. It can be found in $SPLUNK_HOME/var/log/splunk.

29: How to check the running Splunk processes on Linux?

We can use the following command to check the running Splunk processes on a Linux system:

ps aux | grep splunk

We can use the same command for checking the running Splunk processes on a Unix system.

30: What are Splunk Alerts? What are the options for setting up Splunk Alerts?

Splunk Alerts help users to get notified of any errors on their system. There are various options for setting up Splunk Alerts. These are:

Additional details via CSV or PDF files - Users can attach results in CSV or PDF format, or include them inline in the message body, to help the recipient(s) understand the conditions that triggered the alert and the actions that have been taken in response.
Creating a webhook - Allows users to post alert details to services such as GitHub or HipChat. Users can also send an email to several recipients.
Ticket creation - It is possible to create tickets and throttle Splunk Alerts on the basis of different conditions. Users can control these alerts from the alert window.

31: Explain Fishbucket Index.

Fishbucket stores CRCs and seek pointers for the indexed files. This helps splunkd determine which files have already been read. It is an index directory that is located by default at /opt/splunk/var/lib/splunk. Users can access it via the GUI by searching for index=_thefishbucket.

32: Explain the Dispatch Directory.

The Dispatch Directory contains a subdirectory for each search that is running or has completed. It is stored at $SPLUNK_HOME/var/run/splunk/dispatch. Each search directory contains a CSV file with the search results and a search.log file with details about the search execution, in addition to other important information. In the default configuration, a search directory is deleted ten minutes after the search completes. If the user saves the search results, however, they are deleted after seven days.

33: How will you set the default search time in Splunk 6?

To set the default search time in Splunk 6, we need to use the ui-prefs.conf file. It is available in $SPLUNK_HOME/etc/system/local. To set the default time range, you need to set the dispatch.earliest_time and dispatch.latest_time fields.
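For illustration, the stanza can be generated with Python's standard configparser module. This is a sketch only: the [search] stanza name and the time values (-24h@h and now) are assumptions chosen for the example, so check them against the ui-prefs.conf documentation for your version before using them.

```python
import configparser
import io

# Stanza name and time values are assumptions for illustration.
config = configparser.ConfigParser()
config["search"] = {
    "dispatch.earliest_time": "-24h@h",
    "dispatch.latest_time": "now",
}

# Render the stanza as text, as it would appear in ui-prefs.conf.
buffer = io.StringIO()
config.write(buffer)
stanza_text = buffer.getvalue()
print(stanza_text)
```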

34: Explain the process of adding folder access logs from a Windows system to Splunk.

Following are the steps to add folder access logs from a Windows machine to Splunk:

Go to Group Policy and enable Object Access Audit on the Windows system where the folder is located.
Enable auditing for the folder for which you wish to monitor access logs.
Next, install Splunk Universal Forwarder.
Finally, configure the Splunk Universal Forwarder for sending security logs to the Splunk Indexer.

Advanced-Level Splunk Interview Questions

35: Compare search head pooling and search head clustering.

Search head pooling and search head clustering both provide high availability for Splunk search heads, so that searching continues when any search head goes down. A search head cluster is managed by a captain that coordinates the other cluster members. While search head clustering is the newer feature, search head pooling is an older feature that serves the same purpose. However, search head clustering is more efficient and reliable than search head pooling. As such, search head pooling has been deprecated and will be removed in subsequent releases of Splunk.

36: How does Splunk avoid duplicate indexing of logs?

To avoid duplicate indexing of logs, Splunk tracks the files it has already read in the fishbucket directory. It contains seek pointers and CRCs for all the files that are being indexed. If the CRC of a file matches one that has already been recorded, Splunk resumes from the stored seek pointer instead of indexing the same data again.
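The idea of pairing a CRC with a seek pointer can be sketched as follows. This is a toy model under stated assumptions, not Splunk's actual fishbucket format: the class name SeenFiles, the crc_length parameter, and the use of zlib.crc32 over the first bytes of a file are all choices made for this example.

```python
import zlib

class SeenFiles:
    """Toy tracker keyed by the CRC of a file's first bytes (hypothetical design)."""

    def __init__(self, crc_length=256):
        self.crc_length = crc_length
        self.seek_pointers = {}  # CRC of file head -> number of bytes already read

    def read_new_data(self, content: bytes) -> bytes:
        """Return only the bytes that have not been read before."""
        crc = zlib.crc32(content[: self.crc_length])
        offset = self.seek_pointers.get(crc, 0)
        self.seek_pointers[crc] = len(content)
        return content[offset:]

# A small head size is used so that the short sample file has a stable head.
tracker = SeenFiles(crc_length=8)
log = b"line 1\nline 2\n"
print(tracker.read_new_data(log))                # whole file on the first read
print(tracker.read_new_data(log))                # nothing new on the second read
print(tracker.read_new_data(log + b"line 3\n"))  # only the appended line
```

Because the key is computed from the head of the file, appending data leaves the key unchanged, and only the bytes after the stored pointer are returned.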

37: Explain the MapReduce algorithm.

The MapReduce algorithm is what facilitates faster data searching in Splunk. Technically, it is an algorithm for batch-based large-scale parallelization. Inspired by the map() and reduce() functions of functional programming, it is also used by popular big data processing tools like Hadoop and Apache Spark.
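A minimal sketch of the map and reduce steps in plain Python may help. It counts log levels across chunks that stand in for data held by different workers; the sample data and function names are hypothetical, and the example says nothing about how Splunk distributes work internally.

```python
from collections import Counter
from functools import reduce

# Hypothetical chunks of log data, as if they were held by different workers.
chunks = [
    ["ERROR", "INFO", "INFO"],
    ["WARN", "ERROR"],
    ["INFO"],
]

def map_chunk(chunk):
    """Map step: count log levels within one chunk independently."""
    return Counter(chunk)

def merge(left, right):
    """Reduce step: combine two partial counts into one."""
    return left + right

partial_counts = map(map_chunk, chunks)
totals = reduce(merge, partial_counts, Counter())
print(dict(totals))
```

Each chunk can be mapped without knowledge of the others, which is what makes the map step easy to run in parallel; only the merge needs to see the partial results.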

38: Explain Search Factor and Replication Factor.

Search Factor or SF is the number of searchable copies of data maintained by the Indexer cluster. On the other hand, the Replication Factor or RF is the total number of copies of data maintained by the Indexer cluster. A Search Head cluster has only a Replication Factor, which applies to search artifacts, while the Indexer cluster has both SF and RF. Search Factor must always be either less than or equal to the Replication Factor.

39: How is Splunk SDK different from Splunk App Framework?

Splunk App Framework allows users to customize the Splunk Web UI and develop Splunk apps with the Splunk web server. It resides within the Splunk web server. The Splunk App Framework is an integral part of Splunk, and it doesn’t come with a separate license. Splunk SDK, on the other hand, allows users to develop apps on their own and doesn’t necessitate Splunk Web or any other component of the Splunk App Framework. Moreover, it is licensed separately from Splunk.

40: Enumerate the types of search modes in Splunk.

There are three types of modes in Splunk that are as follows:

Fast Mode: This mode limits the type of data returned and thereby accelerates the search.
Verbose Mode: This mode is comparatively slower than the fast mode. However, it returns as much event information as possible.
Smart Mode: This is the default mode. It toggles the search behavior between fast and verbose depending on the type of search, aiming for the most useful results in a short span of time.

41: List the types of dashboards in Splunk.

The following are the different types of dashboards available in Splunk:

Real-time dashboards
Dynamic form-based dashboards
Dashboards for scheduled reports

42: Can you state the difference between sort+ and sort-?

We use sort+ to display search results in ascending order and sort- to display them in descending order.

43: Which command will you use to remove duplicate events having common values?

We can remove duplicate events having common values using the dedup command.
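The effect of dedup on a single field can be sketched in Python as follows. The function and the sample events are hypothetical; dropping events that lack the field is an assumption made to mirror the command's default behavior as we understand it.

```python
def dedup(events, field):
    """Keep the first event seen for each distinct value of `field`."""
    seen = set()
    kept = []
    for event in events:
        if field not in event:
            continue  # assumption: events without the field are dropped
        value = event[field]
        if value not in seen:
            seen.add(value)
            kept.append(event)
    return kept

# Hypothetical events in the order the search returned them.
events = [
    {"user": "alice", "action": "login"},
    {"user": "bob", "action": "login"},
    {"user": "alice", "action": "logout"},
    {"action": "heartbeat"},
]
print(dedup(events, "user"))
```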

44: What features does Splunk free lack?

Splunk free does not include the following features:

Authentication and scheduled searches
Distributed search
Forwarding in TCP/HTTP
Deployment management

45: How will you disable the Splunk launch message?

To disable the Splunk launch message, we need to set the value OFFENSIVE=Less in splunk-launch.conf.

46: Mention the precedence of the .conf files in Splunk.

The following is the precedence order of the .conf files in Splunk:

System local directory: highest priority
App local directories
App default directories
System default directory: lowest priority
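The effect of this ordering can be sketched as a layered merge in which higher-priority directories overwrite lower ones. The Python example below is a simplified illustration; the setting names and values are placeholders, and real resolution happens per stanza inside each .conf file.

```python
# Layers listed from lowest to highest priority, following the order above.
# Setting names and values are hypothetical placeholders.
layers = [
    ("system default", {"maxDataSize": "auto", "frozenTimePeriodInSecs": "188697600"}),
    ("app default", {"maxDataSize": "auto_high_volume"}),
    ("app local", {"frozenTimePeriodInSecs": "2592000"}),
    ("system local", {"maxDataSize": "1000"}),
]

def resolve(layers):
    """Merge settings so that later (higher-priority) layers win."""
    merged = {}
    origin = {}
    for name, settings in layers:
        for key, value in settings.items():
            merged[key] = value
            origin[key] = name
    return merged, origin

merged, origin = resolve(layers)
for key in merged:
    print(f"{key} = {merged[key]}  (from {origin[key]})")
```

Keeping track of the origin of each value is similar in spirit to what a troubleshooting tool such as Btool reports when asked where a setting comes from.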

47: How will you reset the Splunk admin password?

To reset the Splunk admin password, first log in to the server on which Splunk is installed. Then go to the following location, rename the password file, and restart Splunk.

$SPLUNK_HOME/etc/passwd

After that, log in to Splunk using the username and the new password.

48: How many roles are there in Splunk?

There are three roles in Splunk that are as follows:

Admin
Power
User

49: What different layout options are available in Splunk for search results?

The following are different layout options for search results in Splunk:

List
Table
Raw

50: Explain different boolean operators in Splunk.

Splunk supports the following three boolean operators:

AND: It requires both terms to be present and is implied between terms. For example, web server is the same as web AND server. If you search for web AND server, the results must contain both of these terms.
OR: If you search for web OR server, the results can contain either of the search terms.
NOT: This operator helps you exclude a specific word from your search.
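The three operators can be imitated with a small Python filter. This is only a sketch of the logic on whole-word, case-insensitive matching; the function name and its parameters are hypothetical, and Splunk's own term matching is more involved.

```python
def matches(event_text, all_of=(), any_of=(), none_of=()):
    """Return True if the event satisfies the AND, OR, and NOT term lists."""
    words = set(event_text.lower().split())
    # AND: every term must be present.
    if any(term.lower() not in words for term in all_of):
        return False
    # OR: at least one term must be present (when any are given).
    if any_of and not any(term.lower() in words for term in any_of):
        return False
    # NOT: none of the excluded terms may be present.
    return not any(term.lower() in words for term in none_of)

print(matches("web server started", all_of=["web", "server"]))    # web AND server
print(matches("proxy server started", any_of=["web", "server"]))  # web OR server
print(matches("web server started", none_of=["server"]))          # NOT server
```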

51: What is a forwarder in Splunk, how many types are there, and when would you choose one over the other?

A forwarder collects data, such as logs or events, from various sources and sends it to Splunk to be indexed.

Universal Forwarder (UF): It is lightweight, uses minimal system resources, and sends data unparsed. It is ideal when data must be collected from a large number of machines without slowing them down.

Heavy Forwarder (HF): It parses, transforms, or filters data before forwarding it. It is ideal for dropping unwanted events, modifying event content (for example, masking sensitive data), or routing data based on conditions.

Choose the UF if you only need to send raw logs with a small footprint; choose the HF when you need to parse, filter, or route data before it is sent.

52: What is an index in Splunk? What is a bucket, and how does it move through the hot, warm, cold, and frozen stages?

An index in Splunk is a repository that stores processed event data, keeping the data organized so that searches run faster.

A bucket is a directory inside an index that stores a set of events. As data ages, or according to Splunk’s retention policies, it moves between bucket stages:

Hot: currently being written to with new data

Warm: closed to writing but still searchable; recently indexed data

Cold: older data that is accessed less frequently

Frozen: data that is archived or deleted according to retention settings

Understanding the bucket lifecycle makes it easier to manage storage and retention; this matters most in large deployments, where good bucket management keeps searches fast.

53: What is the architecture of Splunk? What are its key components, and how do they interact (forwarder, indexer, search head, and so on)?

The main Splunk components are forwarders, which collect data; indexers, which parse, index, and store it; and search heads, which run searches and present the results through a graphical interface. In distributed deployments, you may also find deployment servers, license managers, and cluster masters that handle coordination.

Data flows from forwarders to indexers, which store it in buckets. When a user runs a search, the search head sends it to these indexers, known as search peers, to retrieve the data, and then presents the results as reports or dashboards.

In large enterprise deployments, clustering of indexers or search heads, together with load balancing, provides scalability and high availability when individual components fail.

54: What is Search Processing Language (SPL) in Splunk, and how does it differ from SQL?

SPL is Splunk’s query language for searching logs and time-stamped events. Unlike SQL, which operates on relational tables, SPL operates on streams of events. It can filter results by time range and aggregate data, including in real time. It offers statistical commands such as stats, eventstats, and timechart, each with a different purpose. Commands can be chained with pipes or combined with subsearches. Joins and lookups against external data are also supported.

SPL also handles unstructured logs and JSON data well, since fields can be extracted at search time; SQL is less suited to this because it is designed for structured tables.

55: What is the difference between a Splunk “app” and a Splunk “add-on”?

A Splunk app is a package that contains dashboards and saved reports along with UI components, scripts, and whatever else is needed for a specific use case or type of data.

A Splunk add-on handles data collection, field extractions, and configurations that normalize data from specific sources. It usually has no dashboards or reports, because its main job is to help Splunk ingest data correctly: it connects to source systems, adapts formats, and aligns timestamps, while visualization is left to apps that build on the ingested data.

Conclusion

These were some of the best Splunk interview questions, which should give you an idea of the questions you might face in an IoT-based job interview. Even if you are not preparing for an IoT interview, you can use these questions to check how well you know Splunk.

Also, if you have attended any Splunk interviews, do let us know in the comments section below about other questions that you have encountered.
