Overview
This SOP describes how to run the Python script that calls the Signal Sciences API, collects the logs in JSON format, and sends them to a GCS bucket.
Prerequisites
- Administrator login credentials are required.
- The Storage Object Viewer permission is required in GCP so the data in the bucket can be accessed.
Signal Sciences API to GCS Bucket
Generating the API key
1. Go to My Profile > API Access Tokens.
2. Under “API Access Tokens” click on Add API access token.
3. Enter the Name ‘Chronicle SecOps’ for the access token and click Create API access token.
4. The new token will be displayed. Record the token in a secure location.
5. Click I understand to finish creating the token.
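Optionally, sanity-check the new token before deploying the script. The snippet below is a minimal sketch that reuses the API host and headers from the script later in this SOP and assumes the corps listing endpoint; the email and token values are placeholders to replace with your own.
import requests

# Sanity check for the new API access token (placeholder email/token values).
api_host = 'https://dashboard.signalsciences.net'
email = 'user@domain.com'                      # Signal Sciences account email
token = 'XXXXXXXX-XXXX-XXX-XXXX-XXXXXXXXXXXX'  # the token recorded above

headers = {
    'Content-Type': 'application/json',
    'x-api-user': email,
    'x-api-token': token
}

# A 200 response confirms the token authenticates against the API.
response = requests.get(f"{api_host}/api/v0/corps", headers=headers)
print(response.status_code, response.reason)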
Deployment of the Script on the Linux Box:
*The script requires the following Python library to copy the WAF JSON to a GCS bucket:
pip install google-cloud-storage
*The workload needs a GCP service account with permission to copy the data to the bucket. Set this environment variable to point to the JSON key file that contains the GCP credentials:
export GOOGLE_APPLICATION_CREDENTIALS="path/to/your/service-account-key.json"
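As a quick check that the service account key and bucket permissions are in place, the following sketch uploads a small test object using the same upload call the script uses. The bucket name 'sig_sci' matches the one used later in this SOP; adjust it if yours differs.
from google.cloud import storage

# Quick write test: upload a small marker object with the configured credentials.
storage_client = storage.Client()
bucket = storage_client.bucket('sig_sci')  # bucket name used later in this SOP
bucket.blob('connectivity_check.txt').upload_from_string('ok', content_type='text/plain')
print("Service account can write to the bucket.")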
*The script can extract JSON for multiple sites or a single site:
site_names = [ 'site123', 'site345']
*In production, the script reads the following environment variables, because this information should not be hardcoded:
email = os.environ.get('SIGSCI_EMAIL') # Signal Sciences account email
token = os.environ.get('SIGSCI_TOKEN') # API token for authentication
corp_name = os.environ.get('SIGSCI_CORP') # Corporation name in Signal Sciences
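For example, the variables can be exported in the shell before running the script (placeholder values shown, matching the sample values used in the script below):
export SIGSCI_EMAIL="user@domain.com"
export SIGSCI_TOKEN="XXXXXXXX-XXXX-XXX-XXXX-XXXXXXXXXXXX"
export SIGSCI_CORP="Domain"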
*The script includes an environment variable checker that you should uncomment once the environment variables are set:
if 'SIGSCI_EMAIL' not in os.environ or 'SIGSCI_TOKEN' not in os.environ or 'SIGSCI_CORP' not in os.environ:
    print("ERROR: You need to define SIGSCI_EMAIL, SIGSCI_TOKEN, and SIGSCI_CORP environment variables.")
    print("Please fix and run again. Exiting...")
    sys.exit(1)  # Exit if environment variables are not set
*The script needs the GCS bucket that the JSON data will be copied to, as well as the name of the file to create in the bucket:
bucket_name = 'sig_sci' # Replace with your GCS bucket name
output_file_name = 'signal_sciences_logs.json'
Below is the script that needs to be executed on the Linux server.
_________________________________________________________________________________________________
import sys
import requests
import os
import calendar
import json
from datetime import datetime, timedelta
from google.cloud import storage
# Check if all necessary environment variables are set
# if 'SIGSCI_EMAIL' not in os.environ or 'SIGSCI_TOKEN' not in os.environ or 'SIGSCI_CORP' not in os.environ:
#     print("ERROR: You need to define SIGSCI_EMAIL, SIGSCI_TOKEN, and SIGSCI_CORP environment variables.")
#     print("Please fix and run again. Exiting...")
#     sys.exit(1)  # Exit if environment variables are not set
# Define the Google Cloud Storage bucket name and output file name
bucket_name = 'sig_sci' # Replace with your GCS bucket name
output_file_name = 'signal_sciences_logs.json'
# Initialize Google Cloud Storage client
storage_client = storage.Client()
# Function to upload data to Google Cloud Storage
def upload_to_gcs(bucket_name, data, destination_blob_name):
    bucket = storage_client.bucket(bucket_name)
    blob = bucket.blob(destination_blob_name)
    blob.upload_from_string(data, content_type='application/json')
    print(f"Data uploaded to {destination_blob_name} in bucket {bucket_name}")
# Signal Sciences API information
api_host = 'https://dashboard.signalsciences.net'
email = 'user@domain.com' # Signal Sciences account email
token = 'XXXXXXXX-XXXX-XXX-XXXX-XXXXXXXXXXXX' # API token for authentication
corp_name = 'Domain' # Corporation name in Signal Sciences
# List of comma-delimited sites that you want to extract data from
site_names = ['site123', 'site345']  # Replace with your actual site names
#email = os.environ.get('SIGSCI_EMAIL') # Signal Sciences account email
#token = os.environ.get('SIGSCI_TOKEN') # API token for authentication
#corp_name = os.environ.get('SIGSCI_CORP') # Corporation name in Signal Sciences
# Calculate the start and end timestamps for the previous hour in UTC
until_time = datetime.utcnow().replace(minute=0, second=0, microsecond=0)
from_time = until_time - timedelta(hours=1)
until_time = calendar.timegm(until_time.utctimetuple())
from_time = calendar.timegm(from_time.utctimetuple())
# Prepare HTTP headers for the API request
headers = {
    'Content-Type': 'application/json',
    'x-api-user': email,
    'x-api-token': token
}
# Collect logs for each site
collected_logs = []
for site_name in site_names:
    url = f"{api_host}/api/v0/corps/{corp_name}/sites/{site_name}/feed/requests?from={from_time}&until={until_time}"
    while True:
        response = requests.get(url, headers=headers)
        if response.status_code != 200:
            print(f"Error fetching logs: {response.text}", file=sys.stderr)
            break

        # Parse the JSON response
        data = response.json()
        collected_logs.extend(data['data'])  # Add the log messages to our list

        # Pagination: check if there is a next page
        next_url = data.get('next', {}).get('uri')
        if not next_url:
            break
        url = api_host + next_url
# Convert the collected logs to a newline-delimited JSON string
json_data = '\n'.join(json.dumps(log) for log in collected_logs)
# Save the newline-delimited JSON data to a GCS bucket
upload_to_gcs(bucket_name, json_data, output_file_name)
_________________________________________________________________________________________________
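Because the script collects the previous full hour of logs, it is typically scheduled to run once per hour. The crontab entry below is a sketch only; the script path, log path, and run time are assumptions and should be adjusted for your environment.
# Run a few minutes past the hour so the previous hour's window has fully elapsed
5 * * * * /usr/bin/python3 /opt/sigsci/sigsci_to_gcs.py >> /var/log/sigsci_to_gcs.log 2>&1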
Chronicle feed
1. Go to Settings -> FEEDS -> click ADD NEW.
2. Enter the following fields:
FEED NAME: Signal Sciences WAF
SOURCE TYPE: Choose ‘Google Cloud Storage’
LOG TYPE: Choose ‘Signal Sciences WAF’
CHRONICLE SERVICE ACCOUNT: Click ‘GET A SERVICE ACCOUNT’ and the service account is added automatically.
3. Click Next.
4. Enter the following fields:
STORAGE BUCKET URI: Enter the GCS storage bucket URI (for example, gs://sig_sci/)
URI IS A: Choose ‘Directory which includes subdirectories’
SOURCE DELETION OPTION: Choose ‘Never Delete files’
ASSET NAMESPACE: Provide ‘Environment Name’ (Site name)
INGESTION LABELS: Provide as below.
5. Click ‘Next’ and then ‘Submit’.
Once the configuration is complete, validate the logs in Chronicle using the regular expression (".*") or a specific hostname; this shows the log source types that are being ingested into Chronicle. See the screenshot below for reference.