Overview
This SOP describes how to run the Python script that calls the Signal Sciences API, collects the logs in JSON format, and sends them to a GCS bucket.
Prerequisites
- Administrator login credentials are required.
- The Storage Object Viewer permission is required in GCP so the data in the bucket can be accessed.
Signal Sciences API to GCS Bucket
Generating the API key
1. Go to My Profile > API Access Tokens.
2. Under “API Access Tokens” click on Add API access token.
3. Enter the Name ‘Chronicle SecOps’ for the access token and click Create API access token.
4. The new token will be displayed. Record the token in a secure location.
5. Click I understand to finish creating the token.
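Optionally, sanity-check the new token before deploying the script. The snippet below is a minimal sketch that reuses the API host and headers from the script later in this SOP and assumes the corps listing endpoint; the email and token values are placeholders to replace with your own.
import requests

# Sanity check for the new API access token (placeholder email/token values).
api_host = 'https://dashboard.signalsciences.net'
email = 'user@domain.com'                      # Signal Sciences account email
token = 'XXXXXXXX-XXXX-XXX-XXXX-XXXXXXXXXXXX'  # the token recorded above

headers = {
    'Content-Type': 'application/json',
    'x-api-user': email,
    'x-api-token': token
}

# A 200 response confirms the token authenticates against the API.
response = requests.get(f"{api_host}/api/v0/corps", headers=headers)
print(response.status_code, response.reason)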
Deployment of the Script on the Linux Box:
*The script requires the following Python library to copy the WAF JSON to a GCS bucket:
pip install google-cloud-storage
*The workload needs a GCP service account with permission to copy the data to the bucket. Set this environment variable to point to the JSON key file that contains the GCP credentials:
export GOOGLE_APPLICATION_CREDENTIALS="path/to/your/service-account-key.json"
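As a quick check that the service account key and bucket permissions are in place, the following sketch uploads a small test object using the same upload call the script uses. The bucket name 'sig_sci' matches the one used later in this SOP; adjust it if yours differs.
from google.cloud import storage

# Quick write test: upload a small marker object with the configured credentials.
storage_client = storage.Client()
bucket = storage_client.bucket('sig_sci')  # bucket name used later in this SOP
bucket.blob('connectivity_check.txt').upload_from_string('ok', content_type='text/plain')
print("Service account can write to the bucket.")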
*The script can extract JSON for multiple sites or a single site:
site_names = [ 'site123', 'site345']
*In production, the script reads the following environment variables, because this information should not be hardcoded:
email = os.environ.get('SIGSCI_EMAIL') # Signal Sciences account email
token = os.environ.get('SIGSCI_TOKEN') # API token for authentication
corp_name = os.environ.get('SIGSCI_CORP') # Corporation name in Signal Sciences
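For example, the variables can be exported in the shell before running the script (placeholder values shown, matching the sample values used in the script below):
export SIGSCI_EMAIL="user@domain.com"
export SIGSCI_TOKEN="XXXXXXXX-XXXX-XXX-XXXX-XXXXXXXXXXXX"
export SIGSCI_CORP="Domain"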
*The script includes an environment variable checker that you should uncomment once the environment variables are set:
if 'SIGSCI_EMAIL' not in os.environ or 'SIGSCI_TOKEN' not in os.environ or 'SIGSCI_CORP' not in os.environ:
    print("ERROR: You need to define SIGSCI_EMAIL, SIGSCI_TOKEN, and SIGSCI_CORP environment variables.")
    print("Please fix and run again. Exiting...")
    sys.exit(1)  # Exit if environment variables are not set
*The script needs the GCS bucket that the JSON data will be copied to, as well as the name of the file to create in the bucket:
bucket_name = 'sig_sci' # Replace with your GCS bucket name
output_file_name = 'signal_sciences_logs.json'
Below is the script that needs to be executed on the Linux server.
_________________________________________________________________________________________________
import sys
import requests
import os
import calendar
import json
from datetime import datetime, timedelta
from google.cloud import storage
# Check if all necessary environment variables are set
# if 'SIGSCI_EMAIL' not in os.environ or 'SIGSCI_TOKEN' not in os.environ or 'SIGSCI_CORP' not in os.environ:
#     print("ERROR: You need to define SIGSCI_EMAIL, SIGSCI_TOKEN, and SIGSCI_CORP environment variables.")
#     print("Please fix and run again. Exiting...")
#     sys.exit(1)  # Exit if environment variables are not set
# Define the Google Cloud Storage bucket name and output file name
bucket_name = 'sig_sci' # Replace with your GCS bucket name
output_file_name = 'signal_sciences_logs.json'
# Initialize Google Cloud Storage client
storage_client = storage.Client()
# Function to upload data to Google Cloud Storage
def upload_to_gcs(bucket_name, data, destination_blob_name):
    bucket = storage_client.bucket(bucket_name)
    blob = bucket.blob(destination_blob_name)
    blob.upload_from_string(data, content_type='application/json')
    print(f"Data uploaded to {destination_blob_name} in bucket {bucket_name}")
# Signal Sciences API information
api_host = 'https://dashboard.signalsciences.net'
email = 'user@domain.com' # Signal Sciences account email
token = 'XXXXXXXX-XXXX-XXX-XXXX-XXXXXXXXXXXX' # API token for authentication
corp_name = 'Domain' # Corporation name in Signal Sciences
# List of comma-delimited sites that you want to extract data from
site_names = ['site123', 'site345']  # Replace with your actual site names
#email = os.environ.get('SIGSCI_EMAIL') # Signal Sciences account email
#token = os.environ.get('SIGSCI_TOKEN') # API token for authentication
#corp_name = os.environ.get('SIGSCI_CORP') # Corporation name in Signal Sciences
# Calculate the start and end timestamps for the previous hour in UTC
until_time = datetime.utcnow().replace(minute=0, second=0, microsecond=0)
from_time = until_time - timedelta(hours=1)
until_time = calendar.timegm(until_time.utctimetuple())
from_time = calendar.timegm(from_time.utctimetuple())
# Prepare HTTP headers for the API request
headers = {
    'Content-Type': 'application/json',
    'x-api-user': email,
    'x-api-token': token
}
# Collect logs for each site
collected_logs = []
for site_name in site_names:
    url = f"{api_host}/api/v0/corps/{corp_name}/sites/{site_name}/feed/requests?from={from_time}&until={until_time}"
    while True:
        response = requests.get(url, headers=headers)
        if response.status_code != 200:
            print(f"Error fetching logs: {response.text}", file=sys.stderr)
            break

        # Parse the JSON response
        data = response.json()
        collected_logs.extend(data['data'])  # Add the log messages to our list

        # Pagination: check if there is a next page
        next_url = data.get('next', {}).get('uri')
        if not next_url:
            break
        url = api_host + next_url
# Convert the collected logs to a newline-delimited JSON string
json_data = '\n'.join(json.dumps(log) for log in collected_logs)
# Save the newline-delimited JSON data to a GCS bucket
upload_to_gcs(bucket_name, json_data, output_file_name)
_________________________________________________________________________________________________
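Because the script collects the previous full hour of logs, it is typically scheduled to run once per hour. The crontab entry below is a sketch only; the script path, log path, and run time are assumptions and should be adjusted for your environment.
# Run a few minutes past the hour so the previous hour's window has fully elapsed
5 * * * * /usr/bin/python3 /opt/sigsci/sigsci_to_gcs.py >> /var/log/sigsci_to_gcs.log 2>&1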
Chronicle feed
1. Go to Settings -> FEEDS -> click ADD NEW.
2. Enter the following fields:
FEED NAME: Signal Sciences WAF
SOURCE TYPE: Choose ‘Google Cloud Storage’
LOG TYPE: Choose ‘Signal Sciences WAF’
CHRONICLE SERVICE ACCOUNT: Click ‘GET A SERVICE ACCOUNT’ and the service account is added automatically.
3. Click Next.
4. Enter the following fields:
STORAGE BUCKET URI: Enter the GCS storage bucket URI (for example, gs://sig_sci/)
URI IS A: Choose ‘Directory which includes subdirectories’
SOURCE DELETION OPTION: Choose ‘Never Delete files’
ASSET NAMESPACE: Provide ‘Environment Name’ (Site name)
INGESTION LABELS: Provide as below.
5. Click ‘Next’ and then ‘Submit’.
Once the configuration is complete, validate the logs in Chronicle using the regular expression (".*") or a specific hostname; this shows the log source types that are being ingested into Chronicle. See the screenshot below for reference.