Starting and Stopping the Radar

Manual Start-up

The script steamed_hams.py makes it easier to start the radar manually. Its name is a goofy reference to a scene in an episode of The Simpsons in which Principal Skinner claims there is an aurora happening in his house. The script takes an experiment name and a code environment, optionally followed by a scheduling mode, and can be invoked as follows:

$BOREALISPATH/scripts/steamed_hams.py experiment_name code_environment scheduling_mode

An example invocation to run twofsound in release mode would be:

/home/radar/borealis/scripts/steamed_hams.py twofsound release

Another example invocation running normalscan in debug mode:

/home/radar/borealis/scripts/steamed_hams.py normalscan debug

Another example invocation running epopsound in debug mode during special time would be:

/home/radar/borealis/scripts/steamed_hams.py epopsound debug special

The experiment name must match an experiment in the src/borealis_experiments folder and must not include the .py extension. The code environment is the compilation environment that was built using scons, such as release or debug. The scheduling mode is one of common, special, or discretionary, depending on the DARN-SWG schedule (see the scheduling working group page). NOTE This script will kill the Borealis software if it is currently running before starting it anew.
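The argument rules above can be sketched as a small parser. This is only an illustration and not the parser used by steamed_hams.py: the function names build_parser and parse_invocation are hypothetical, and the default of None for an omitted scheduling mode is an assumption, since the real default is not stated here.

```python
import argparse

# The three scheduling modes named in the text above.
SCHEDULING_MODES = ("common", "special", "discretionary")


def build_parser():
    """Build an illustrative parser (hypothetical, not the real script's)."""
    parser = argparse.ArgumentParser(
        description="Illustrative argument parser for a steamed_hams.py-style call"
    )
    parser.add_argument("experiment_name",
                        help="experiment name, without the .py extension")
    parser.add_argument("code_environment",
                        help="scons build environment, e.g. release or debug")
    # Assumption: an omitted scheduling mode is represented as None.
    parser.add_argument("scheduling_mode", nargs="?",
                        choices=SCHEDULING_MODES, default=None,
                        help="one of common, special, discretionary")
    return parser


def parse_invocation(argv):
    """Parse the arguments and reject experiment names ending in .py."""
    args = build_parser().parse_args(argv)
    if args.experiment_name.endswith(".py"):
        raise ValueError("experiment name must not include the .py extension")
    return args
```

For example, parse_invocation(["epopsound", "debug", "special"]) yields a scheduling_mode of "special", while parse_invocation(["twofsound", "release"]) leaves it as None.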

The script will boot all the radar processes in a detached screen window that runs in the background. This window can be reattached in any terminal window locally or over ssh (screen -r) to track any outputs if needed.

If starting the radar in normal operation according to the schedule, there is a helper script called start_radar.sh.

Automated Start-up

In order to start the radar automatically, the script start_radar.sh should be added to a startup script of the Borealis computer. It can also be called manually by the non-root user (typically radar).

The scheduling Python script, remote_server.py, is responsible for automating the control of the radar to follow the schedule, and is started via the start_radar.sh script (shown below) with the appropriate arguments.

Adding start_radar.sh to the control computer's boot-up scripts ensures that a new set of scheduled commands is generated each time the computer starts.

Automated Restarts

Occasionally, the Borealis software stops due to some software or computer issue. To automatically restart the radar software when this occurs, and to avoid lengthy downtimes, the scripts restart_borealis.daemon and restart_borealis.py were created.

restart_borealis.py finds the directory Borealis writes to and checks the file most recently written to. If the file hasn’t been written to within a specified time period, the script assumes the radar has stopped running and tries to restart it using stop_radar.sh and start_radar.sh.
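The staleness check described above can be sketched as follows. This is a minimal illustration, not the code in restart_borealis.py: the function names are hypothetical, and treating an empty directory as stalled is an assumption made for the sketch.

```python
import os
import time


def newest_mtime(directory):
    """Return the latest modification time among files in directory, or None."""
    with os.scandir(directory) as entries:
        mtimes = [entry.stat().st_mtime for entry in entries if entry.is_file()]
    return max(mtimes) if mtimes else None


def radar_looks_stalled(directory, max_age_seconds, now=None):
    """Return True if no file in directory was written within max_age_seconds."""
    now = time.time() if now is None else now
    latest = newest_mtime(directory)
    if latest is None:
        # Assumption: a directory with no files is treated as a stalled radar.
        return True
    return (now - latest) > max_age_seconds
```

A caller would pass the data directory and a time limit, and run stop_radar.sh followed by start_radar.sh only when radar_looks_stalled returns True.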

restart_borealis.daemon runs continuously, periodically executing restart_borealis.py. If the radar is restarted several consecutive times, an alert is sent to our group’s Slack workspace to notify us that the radar likely has a problem requiring manual intervention. For more information on integrating Slack alerts, see here.
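The consecutive-restart bookkeeping could look something like the sketch below. The class name RestartTracker and the threshold of 3 are hypothetical; the number of consecutive restarts that triggers an alert is not stated here, and sending the Slack message itself is left out.

```python
class RestartTracker:
    """Count consecutive restarts and report when an alert is due."""

    def __init__(self, alert_threshold=3):
        # Assumption: the threshold value is illustrative only.
        self.alert_threshold = alert_threshold
        self.consecutive_restarts = 0

    def record_check(self, restarted):
        """Record one periodic check; return True when an alert should be sent."""
        if restarted:
            self.consecutive_restarts += 1
        else:
            # A check that needed no restart resets the streak.
            self.consecutive_restarts = 0
        return self.consecutive_restarts >= self.alert_threshold
```

Resetting the counter on a healthy check means isolated restarts never raise an alert; only an unbroken run of them does.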

To set up the daemon using systemd, add a .service file within /usr/lib/systemd/system/ (for openSUSE). For example:

[Unit]
Description=Restart borealis daemon

[Service]
User=radar
ExecStart=/home/radar/borealis/scripts/restart_borealis.daemon
Restart=always

[Install]
WantedBy=multi-user.target

Then enable and start the daemon with systemctl enable and systemctl start, each followed by the name of the .service file.

Alternatively, restart_borealis.py can be run via crontab, as shown below:

*/10 * * * * . $HOME/.profile; /usr/bin/python3 /home/radar/borealis/scripts/restart_borealis.py >> /home/radar/borealis/restart_log.txt 2>&1

Stopping the Radar

There are several ways to stop the Borealis radar. They are ranked here from most preferred to last resort:

  1. Run the script stop_radar.sh from the Borealis scripts/ directory. This script kills the scheduling server, removes all entries from the schedule and kills the screen session running the Borealis software modules. stop_radar.sh is shown below.

  2. While viewing the screen session running the Borealis software modules, press Ctrl-A followed by Ctrl-\. This will kill the screen session and all software modules running within it.

  3. Restart the Borealis computer. NOTE Under normal circumstances, the Borealis software will start back up again once the computer reboots.

  4. Shut down the Borealis computer.

Scripts

start_radar.sh
#!/bin/bash
source "/home/radar/.profile"
source "${BOREALISPATH}/borealis_env${PYTHON_VERSION}/bin/activate"
LOGFILE="/home/radar/logs/start_stop.log"

# Stop current remote_server.py process
/usr/bin/pkill -9 -f remote_server.py

# Start new remote_server.py process
nohup python3 $BOREALISPATH/scheduler/remote_server.py \
		--scd-dir=/home/radar/borealis_schedules \
		>> /home/radar/logs/scd.out 2>&1 &

PID=$!	# Get pid of remote_server.py process
sleep 1

NOW=$(date +'%Y%m%d %H:%M:%S')
if ! ps -p $PID &> /dev/null; then	 # Check if remote_server.py process still running
	echo "${NOW} START: FAIL - remote_server.py failed to start." | tee -a $LOGFILE
	exit 1
fi

if [[ -z $(atq) ]]; then		# Check if atq is empty
	echo "${NOW} START: FAIL - atq is empty. No radar processes scheduled." | tee -a $LOGFILE
	exit 1
fi

echo "${NOW} START: SUCCESS - radar processes scheduled." | tee -a $LOGFILE

stop_radar.sh
#!/bin/bash
source "/home/radar/.profile"
LOGFILE="/home/radar/logs/start_stop.log"

# Kill remote_server.py process
PID=$(pgrep -f remote_server.py) # Get PID of remote_server.py process
pkill -9 -f remote_server.py

# Remove all scheduled experiments from at queue
for i in $(atq | awk '{print $1}')
do
	atrm $i
done

# Check if Borealis screen is still running
retVal=0
if screen -ls | grep -q borealis; then
	# Kill Borealis processes
	screen -X -S borealis quit
	retVal=$?
fi

sleep 1
NOW=$(date +'%Y%m%d %H:%M:%S')
if ps -p $PID &> /dev/null; then	 # Check if remote_server.py process still running
	echo "${NOW} STOP: FAIL - could not kill remote_server.py process." | tee -a $LOGFILE
	exit 1
fi

if [[ -n $(atq) ]]; then		# Check if atq is not empty
	echo "${NOW} STOP: FAIL - could not clear atq. Radar processes still scheduled." | tee -a $LOGFILE
	exit 1
fi

if [[ $retVal -ne 0 ]]; then
	echo "${NOW} STOP: FAIL - could not kill Borealis screen." | tee -a $LOGFILE
	exit 1
fi

echo "${NOW} STOP: SUCCESS - radar processes stopped." | tee -a $LOGFILE
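
Both scripts append lines of the form "YYYYMMDD HH:MM:SS START|STOP: SUCCESS|FAIL - message" to start_stop.log. As a rough sketch of how those lines could be read back for monitoring, the helper below parses one line into its parts. The function name parse_log_line is hypothetical and is not part of Borealis.

```python
import re
from datetime import datetime

# Matches the line format written by the echo commands in both scripts.
LOG_LINE = re.compile(
    r"^(?P<stamp>\d{8} \d{2}:\d{2}:\d{2}) "
    r"(?P<action>START|STOP): "
    r"(?P<status>SUCCESS|FAIL) - "
    r"(?P<message>.*)$"
)


def parse_log_line(line):
    """Return a dict describing one log line, or None if it does not match."""
    match = LOG_LINE.match(line.strip())
    if match is None:
        return None
    return {
        # Same timestamp format as the date call in the scripts.
        "time": datetime.strptime(match["stamp"], "%Y%m%d %H:%M:%S"),
        "action": match["action"],
        "status": match["status"],
        "message": match["message"],
    }
```

Lines that do not follow this format return None, so a caller can skip them without raising an error.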