Starting and Stopping the Radar

Manual Start-up

To make starting the radar easier, there is a script called steamed_hams.py. The name is a joking reference to a scene in an episode of The Simpsons in which Principal Skinner claims there is an aurora happening in his house. The script takes three arguments and can be invoked as follows:

$BOREALISPATH/scripts/steamed_hams.py experiment_name code_environment scheduling_mode

An example invocation to run twofsound in release mode during common time would be:

/home/radar/borealis/scripts/steamed_hams.py twofsound release common

Another example invocation running normalscan in debug mode during discretionary time:

/home/radar/borealis/scripts/steamed_hams.py normalscan debug discretionary

Another example invocation running epopsound in debug mode during special time would be:

/home/radar/borealis/scripts/steamed_hams.py epopsound debug special

The experiment name must match an experiment in the src/borealis_experiments folder, and does not include the .py extension. The code environment is the build type that was compiled with scons, such as release or debug. The scheduling mode is one of common, special, or discretionary, depending on the DARN-SWG schedule (see the scheduling working group page). NOTE This script will kill the Borealis software if it is currently running, before it starts it anew.
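The argument rules above can be checked before launching the radar. The following is a minimal sketch and not part of Borealis: the function name check_steamed_hams_args is hypothetical, and it assumes BOREALISPATH is set and that experiments live in $BOREALISPATH/src/borealis_experiments as described above. The code environment is not checked, because the available scons build types depend on the installation, and any extra flags after the three arguments are ignored.

```shell
# Hypothetical pre-flight check for steamed_hams.py arguments (not part of Borealis).
# Usage: check_steamed_hams_args EXPERIMENT CODE_ENV SCHEDULING_MODE [FLAGS...]
check_steamed_hams_args() {
    if [ "$#" -lt 3 ]; then
        echo "expected: experiment_name code_environment scheduling_mode" >&2
        return 2
    fi
    # The experiment name is given without the .py extension.
    if [ ! -f "${BOREALISPATH}/src/borealis_experiments/$1.py" ]; then
        echo "no experiment named '$1' in src/borealis_experiments" >&2
        return 1
    fi
    # The scheduling mode must be one of the three DARN-SWG time types.
    case "$3" in
        common|special|discretionary) ;;
        *) echo "invalid scheduling mode '$3'" >&2; return 1 ;;
    esac
    return 0
}
```

One way to use it is to chain it in front of the real call, so the radar is only restarted when the arguments look valid: check_steamed_hams_args twofsound release common && $BOREALISPATH/scripts/steamed_hams.py twofsound release common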

The script will boot all the radar processes in a detached screen window that runs in the background. This window can be reattached in any terminal window locally or over ssh (screen -r) to track any outputs if needed.
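As a small sketch of the reattach step, the helper below looks for the Borealis session before attaching to it. The function name attach_borealis_screen is hypothetical, and the session name borealis is an assumption taken from the way stop_radar.sh addresses the session (screen -X -S borealis quit).

```shell
# Hypothetical helper: reattach to the Borealis screen session if one exists.
# Assumes the session name contains "borealis", as in stop_radar.sh.
attach_borealis_screen() {
    if screen -ls | grep -q borealis; then
        screen -r borealis
    else
        echo "no Borealis screen session found" >&2
        return 1
    fi
}
```

Once attached, ctrl-A followed by d detaches again and leaves the radar processes running.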

To start the radar without the optional realtime module, pass the flag --realtime-off and the module will not be run. For example:

/home/radar/borealis/scripts/steamed_hams.py normalscan release discretionary --realtime-off

To start the radar in normal operation according to the schedule, use the helper script start_radar.sh, shown in the Scripts section below.

Automated Start-up

In order to start the radar automatically, the script start_radar.sh should be added to a startup script of the Borealis computer. It can also be called manually by the non-root user (typically radar).

The scheduling Python script, remote_server.py, is responsible for automating the control of the radar to follow the schedule, and is started via the start_radar.sh script (shown below) with the appropriate arguments.

This script should be added to the control computer boot-up scripts so that it generates a new set of scheduled commands.

Automated Restarts

Occasionally, the Borealis software stops due to some software or computer issue. To automatically restart the radar software when this occurs, and to avoid lengthy downtimes, the scripts restart_borealis.daemon and restart_borealis.py were created.

restart_borealis.py finds the directory Borealis writes to and checks the file most recently written to. If the file hasn’t been written to within a specified time period, the script assumes the radar has stopped running and tries to restart it using stop_radar.sh and start_radar.sh.
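The staleness test described above can be sketched in a few lines of shell. This is an illustration only and not the code in restart_borealis.py: the function name data_is_stale, the choice to treat an empty directory as stale, and the use of GNU find are all assumptions.

```shell
# Hypothetical sketch of the staleness test (the real logic lives in restart_borealis.py).
# Usage: data_is_stale DATA_DIR MAX_AGE_SECONDS
# Returns 0 (stale) if the newest file under DATA_DIR was modified more than
# MAX_AGE_SECONDS ago, or if DATA_DIR contains no files. Requires GNU find.
data_is_stale() {
    # Modification time (seconds since the epoch) of the most recently written file.
    newest=$(find "$1" -type f -printf '%T@\n' 2>/dev/null | sort -n | tail -n 1)
    if [ -z "$newest" ]; then
        return 0
    fi
    now=$(date +%s)
    age=$(( now - ${newest%.*} ))   # drop the fractional seconds
    [ "$age" -gt "$2" ]
}
```

A caller would then run stop_radar.sh followed by start_radar.sh whenever data_is_stale returns success for the data directory and the chosen time limit.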

restart_borealis.daemon runs continuously, periodically executing restart_borealis.py. If the radar has to be restarted several consecutive times, an alert is sent to our group’s Slack workspace to notify us that the radar likely has a problem requiring manual intervention. For more information on integrating Slack alerts, see here.
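The idea of alerting only after repeated restarts can be sketched as a small counter that is reset whenever a check finds the radar healthy. This is not the daemon's actual code: the function name record_restart_result, the counter file, and the threshold are hypothetical, and the alert itself is reduced to printing the word alert.

```shell
# Hypothetical counter for consecutive restarts (not the actual daemon code).
# Usage: record_restart_result COUNT_FILE THRESHOLD restarted|healthy
# Prints "alert" once the number of consecutive restarts reaches THRESHOLD.
record_restart_result() {
    count=0
    if [ -f "$1" ]; then
        count=$(cat "$1")
    fi
    # Treat a missing, empty, or corrupted counter as zero.
    case "$count" in
        ''|*[!0-9]*) count=0 ;;
    esac
    if [ "$3" = "restarted" ]; then
        count=$(( count + 1 ))
    else
        count=0   # a healthy check breaks the streak
    fi
    echo "$count" > "$1"
    if [ "$count" -ge "$2" ]; then
        echo "alert"
    fi
}
```

Keeping the count in a file means the streak survives between separate runs of the check, which matters when the check is launched periodically rather than kept in memory.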

To set up the daemon using systemd, add a .service file within /usr/lib/systemd/system/ (for openSUSE). For example,

[Unit]
Description=Restart borealis daemon

[Service]
User=radar
ExecStart=/home/radar/borealis/scripts/restart_borealis.daemon
Restart=always

[Install]
WantedBy=multi-user.target

Then, enable and start the daemon using the systemctl commands.
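A minimal sketch of those systemctl calls follows. It assumes the unit file above was saved as /usr/lib/systemd/system/restart_borealis.service (the service name used by the UPS scripts later in this section) and that the user has sudo rights. The wrapper function enable_restart_daemon is hypothetical and only groups the three commands.

```shell
# Hypothetical wrapper around the systemctl calls; assumes the unit file is
# /usr/lib/systemd/system/restart_borealis.service.
enable_restart_daemon() {
    sudo systemctl daemon-reload &&                     # pick up the new unit file
    sudo systemctl enable restart_borealis.service &&   # start on every boot
    sudo systemctl start restart_borealis.service       # start now
}
```

Afterwards, systemctl status restart_borealis.service shows whether the daemon is active.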

Alternatively, restart_borealis.py can be run via crontab, as shown below:

*/10 * * * * . $HOME/.profile; /usr/bin/python3 /home/radar/borealis/scripts/restart_borealis.py >> /home/radar/borealis/restart_log.txt 2>&1

Stopping the Radar

There are several ways to stop the Borealis radar. They are ranked here from most acceptable to last-resort:

  1. Run the script stop_radar.sh from the Borealis scripts/ directory. This script kills the scheduling server, removes all scheduled entries from the at queue and kills the screen session running the Borealis software modules. stop_radar.sh is shown below.

  2. While viewing the screen session running the Borealis software modules, type ctrl-A followed by ctrl-\ (screen's quit command). This will kill the screen session and all software modules running within it.

  3. Restart the Borealis computer. NOTE Under normal circumstances, the Borealis software will start again once the computer reboots.

  4. Shut down the Borealis computer.
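After stopping the radar with either of the first two methods, it can be useful to confirm that nothing was left behind. The sketch below repeats the three checks that stop_radar.sh performs (scheduling server, at queue, screen session). The function name radar_is_stopped is hypothetical, and the check assumes it is run as the same user that owns the at jobs and the screen session.

```shell
# Hypothetical check mirroring the tests in stop_radar.sh.
# Returns 0 only if no scheduling server, no at jobs, and no Borealis screen remain.
radar_is_stopped() {
    if pgrep -f remote_server.py > /dev/null; then
        echo "remote_server.py is still running" >&2
        return 1
    fi
    if [ -n "$(atq)" ]; then
        echo "at queue is not empty" >&2
        return 1
    fi
    if screen -ls | grep -q borealis; then
        echo "Borealis screen session still exists" >&2
        return 1
    fi
    return 0
}
```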

UPS & Power Outages

To protect the Borealis computer from power outages and ensure it can shut down safely, the computer should be powered by an Uninterruptible Power Supply (UPS). Additionally, powering the Borealis hardware (N200s, octoclocks, and network equipment) from the UPS will mitigate radar restarts caused by brownouts or short power outages. In this scenario, the UPS should temporarily turn off the radar while the Borealis equipment is on battery power during the outage (since the transmitters will be powered off). This can be done as follows:

  1. Use apcupsd to communicate between the radar computer and the UPS. Follow the APCUPSD User Manual to install and configure for your setup.

  2. Copy the offbattery and onbattery scripts from borealis/scripts/apcupsd/ to /etc/apcupsd/. These scripts will be executed when each event occurs on the UPS:

    1. onbattery: This occurs when the power outage starts. This script will schedule the radar to turn off via stop_radar.sh, and stop the restart_borealis.service daemon so the radar doesn’t restart during the power outage.

    2. offbattery: This occurs when the power outage ends. This script will cancel the scheduled stop_radar.sh script call, and restart the restart_borealis.service daemon.

Scripts

start_radar.sh
#!/bin/bash
source "/home/radar/.profile"
source "${BOREALISPATH}/borealis_env${PYTHON_VERSION}/bin/activate"
LOGFILE="/home/radar/logs/start_stop.log"

# Stop current remote_server.py process
/usr/bin/pkill -9 -f remote_server.py

# Start new remote_server.py process
nohup python3 $BOREALISPATH/scheduler/remote_server.py \
		--scd-dir=/home/radar/borealis_schedules \
		>> /home/radar/logs/remote_server.log 2>&1 &

PID=$!	# Get pid of remote_server.py process
sleep 3

NOW=$(date +'%Y-%m-%d %H:%M:%S')
if ! ps -p $PID &> /dev/null; then	 # Check if remote_server.py process still running
	echo "${NOW} START: FAIL - remote_server.py failed to start." | tee -a $LOGFILE
	exit 1
fi

if [[ -z $(atq) ]]; then		# Check if atq is empty
	echo "${NOW} START: FAIL - atq is empty. No radar processes scheduled." | tee -a $LOGFILE
	exit 1
fi

echo "${NOW} START: SUCCESS - radar processes scheduled." | tee -a $LOGFILE

stop_radar.sh
#!/bin/bash
source "/home/radar/.profile"
LOGFILE="/home/radar/logs/stop_radar.log"

NOW=$(date +'%Y-%m-%d %H:%M:%S')

# Kill remote_server.py process
PID=$(pgrep -f remote_server.py) # Get PID of remote_server.py process
if [[ -z $PID ]]; then
	echo "$NOW STOP: NOTE - remote_server.py is not running"
fi
pkill -9 -f remote_server.py

# Remove all scheduled experiments from at queue
for i in $(atq | awk '{print $1}')
do
	atrm $i
done

# Check if Borealis screen is still running
retVal=0
if screen -ls | grep -q borealis; then
	# Kill Borealis processes
	screen -X -S borealis quit
	retVal=$?
else
	echo "$NOW STOP: FAIL - Radar not running, no Borealis screens found"
	exit 1
fi

sleep 1
if ps -p $PID &> /dev/null; then	# Check if remote_server.py process still running
	echo "${NOW} STOP: FAIL - could not kill remote_server.py process." | tee -a $LOGFILE
	exit 1
fi

if [[ -n $(atq) ]]; then			# Check if atq is not empty
	echo "${NOW} STOP: FAIL - could not clear atq. Radar processes still scheduled." | tee -a $LOGFILE
	exit 1
fi

if [[ $retVal -ne 0 ]]; then
	echo "${NOW} STOP: FAIL - could not kill Borealis screen." | tee -a $LOGFILE
	exit 1
fi

echo "${NOW} STOP: SUCCESS - radar processes stopped." | tee -a $LOGFILE