Abstract
You already run a local LLM, for example via
OpenWebUI or AnythingLLM, and you would
like to audit your system for security breaches. Learn how
to leverage LLMs for automated Linux system auditing. This tutorial
covers importing /var/log data into an AI
workspace and using targeted queries to detect security breaches,
verify system integrity, and generate executive-level audit
summaries.
Turns out, feeding ten years of /var/log/ into an LLM is a massive
data-cleansing headache, but the RAG results are incredible. I've
essentially built a security advisor that has the 'memory' of my
entire server history. It's time to conduct a regular Linux
security breach audit using your LLM. Here is how I did it.
Overview of Steps involved
Generate a security staging area of your /var/log for the LLM:
Switch to admin (root) : su -
Create staging area dir : DIR="$(date +%Y%m%d_%H%M)_security-audit" && mkdir -p ~/$DIR && cd ~/$DIR
Clone logs into staging : cp -r /var/log/* .
Pre-Check (count files) : find . -type f | wc -l
Move files to root dir : find . -mindepth 2 -type f -exec mv --backup=numbered {} . \;
Remove dotfiles : find . -mindepth 1 -name ".*" -exec rm -rf {} +
Remove empty dirs : find . -mindepth 1 -type d -empty -delete
Remove zero byte files : find . -type f -size 0 -delete
Remove tmp files : find . -type f -name "*~*" -delete
Remove redundant files : find . -name "*\([0-9]*" -exec rm -f {} \;
Remove job control files: find . -name "nohup.out" -exec rm -f {} \;
Unzip all .gz files : gunzip -f *.gz
Show broken links : find . -xtype l
rm broken links : find . -xtype l -delete
Generate ASCII journal : journalctl > alljournal.txt
Find binary data : find . -type f -exec file {} + | grep -v "text"
rm binary data : find . -type f -exec file {} + | grep -v "text" | cut -d: -f1 | xargs rm
rm binary journal files : rm *.journal
Add .txt extension : for f in *; do mv "$f" "$f.txt"; done
Post-Check (count files): find . -type f | wc -l
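If you run this audit regularly, the staging steps above can be collected into one reusable function. A minimal sketch, assuming GNU coreutils/findutils; the function name stage_logs and its two parameters are illustrative, and the journalctl export and binary-file removal steps are omitted for brevity:

```shell
#!/bin/sh
# Sketch: clone a log tree into a flat staging directory and sanitize it
# for RAG ingestion. Usage: stage_logs SRC DEST (both names illustrative).
# Run as root so every file under /var/log is readable.
stage_logs() (
    src=$1
    dest=$2
    mkdir -p "$dest"
    cp -r "$src"/. "$dest"/ 2>/dev/null              # clone the log tree
    cd "$dest" || exit 1
    find . -mindepth 2 -type f -exec mv --backup=numbered {} . \;  # flatten
    find . -mindepth 1 -name ".*" -exec rm -rf {} + 2>/dev/null    # dotfiles
    find . -mindepth 1 -type d -empty -delete        # empty dirs
    find . -type f -size 0 -delete                   # zero-byte files
    find . -type f -name "*~*" -delete               # tmp/backup files
    gunzip -f ./*.gz 2>/dev/null                     # expand rotated logs
    find . -xtype l -delete                          # broken symlinks
    for f in *; do [ -f "$f" ] && mv "$f" "$f.txt"; done  # tag as plain text
    find . -type f | wc -l                           # post-check file count
)
```

Call it e.g. as stage_logs /var/log ~/"$(date +%Y%m%d_%H%M)_security-audit"; the final line prints the post-check file count for comparison with the pre-check.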
Create a workspace and set the model parameters
Create a new workspace in your LLM and name it accordingly, e.g.
"<HOSTNAME> Security Audit from <DATE>". Set the
model parameters of the workspace, e.g. "Act as a system
security advisor". Also set the other parameters, such as the
capabilities and features of the workspace. For example, disable
the image generation and OCR capabilities in order to keep your
model lean.
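As a sketch, a workspace system prompt along these lines works well; the exact settings dialog differs between OpenWebUI and AnythingLLM, and the wording is only illustrative:

```
Act as a system security advisor. The attached documents are the
/var/log history of the host <HOSTNAME>. For every finding, name the
log file and timestamp it came from, rate its severity as Low, Medium
or High, and say explicitly when the logs contain no evidence for a
claim instead of speculating.
```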
Start RAG Ingestion (ETL)
Launch the data import of the staging area (RAG ingestion) into
your workspace. This may take a while; without a GPU, anywhere
from a couple of minutes to several hours.
Query your LLM (examples)
Query: "Analyze auth.log or secure logs for an unusual volume of
'Failed password' or 'Invalid user' attempts. Group them by source
IP address and identify any IPs that have more than 20 failures.
Are there any successful logins immediately following a series of
failures from the same IP?"
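It pays to cross-check what the LLM reports against the raw data. A minimal shell sketch mirroring this query; the function name failed_by_ip and the optional threshold argument are illustrative, and it reads sshd log lines on stdin (e.g. from the staged auth.log.txt):

```shell
# Sketch: tally 'Failed password' attempts per source IP and flag IPs
# above a threshold (default 20, matching the query). Reads sshd log
# lines on stdin; failed_by_ip is an illustrative name.
failed_by_ip() {
    grep "Failed password" \
        | grep -oE 'from ([0-9]{1,3}\.){3}[0-9]{1,3}' \
        | awk '{print $2}' \
        | sort | uniq -c | sort -rn \
        | awk -v min="${1:-20}" '$1 > min {print $1, $2}'
}
```

Usage: failed_by_ip < auth.log.txt, then ask the LLM about the IPs it prints.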
Query: "Scan the logs for sudo command usage. Identify any
users who attempted to run commands as root but were denied ('not
in the sudoers list'). Also, highlight any successful sudo
sessions that occurred at unusual hours (e.g., between 12 AM and 5
AM)."
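This query can also be sanity-checked in one awk pass. A sketch, assuming classic syslog timestamps ("Mon DD HH:MM:SS"); it matches both the 'not in the sudoers list' wording from the query and sudo's actual "NOT in sudoers" message, and the name sudo_audit is illustrative:

```shell
# Sketch: flag denied sudo attempts and successful sudo sessions logged
# between 12 AM and 5 AM. Reads syslog-style lines on stdin; assumes
# field 3 is the HH:MM:SS timestamp.
sudo_audit() {
    awk '
        tolower($0) ~ /not in (the )?sudoers/ { print "DENIED: " $0; next }
        /sudo:/ && /COMMAND=/ {
            split($3, t, ":")          # $3 is the HH:MM:SS field
            if (t[1] + 0 < 5)          # hours 00-04, i.e. 12 AM to 5 AM
                print "NIGHT: " $0
        }
    '
}
```

Usage: sudo_audit < auth.log.txt (Debian/Ubuntu) or sudo_audit < secure.txt (RHEL-family).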
Query: "Review the access.log for HTTP status codes
in the 400-500 range. Specifically, look for 'Directory Traversal'
attempts (e.g., strings like ../ or /etc/passwd)
or unusual POST requests to non-existent scripts. Are there any
suspicious outbound connection attempts recorded in the system
logs?"
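The web-log portion of this query is likewise easy to verify by hand. A sketch, assuming the standard combined access-log format where field 9 is the HTTP status code; web_audit is an illustrative name, and the outbound-connection part of the query is not covered here:

```shell
# Sketch: print lines containing common directory-traversal strings and
# tally 4xx/5xx responses in a combined-format access log read on stdin.
web_audit() {
    awk '
        /\.\.\/|\/etc\/passwd|%2e%2e/ { print "TRAVERSAL: " $0 }
        $9 >= 400 && $9 < 600         { err[$9]++ }
        END { for (c in err) print "STATUS " c ": " err[c] }
    '
}
```

Usage: web_audit < access.log.txt, then compare the tallies with what the LLM claims.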
Query: "Based on all the ingested /var/log data, generate a 3-5
sentence Security Breach Audit Summary. Identify any confirmed
indicators of compromise (IoCs), categorize the severity of found
anomalies (Low, Medium, High), and state whether the system's
integrity appears intact or compromised. Conclude with a 'Pass' or
'Fail' assessment regarding current unauthorized access
attempts."
$Id: linux-llm-security-audit.html,v 1.7 2026/02/27 08:01:37 gloor Exp $