KBMOD Demo#

[ ]:
import os
import numpy as np

from pathlib import Path

from kbmod.configuration import SearchConfiguration
from kbmod.fake_data.demo_helper import make_demo_data
from kbmod.run_search import *
from kbmod.search import *
from kbmod.work_unit import WorkUnit

Setup file paths#

In order to run KBMOD you need the location of the input data and a res_filepath that points to the directory where the output results will be stored. Input data can come from a variety of formats including Rubin's Butler, FITS files, and WorkUnit files. In this demo we use a WorkUnit file, which is KBMOD's internal storage format. For more information on generating a WorkUnit from the Butler or from FITS files, see the standardizer notebooks.

If you already have data files, you can use those. Below we create and use data in data/demo_data.fits.

[ ]:
input_filename = "../data/demo_data.fits"

# Create the fake data using a helper function.
if not Path(input_filename).is_file():
    make_demo_data(filename=input_filename)
[ ]:
res_filepath = "./fake_results"
if not Path(res_filepath).is_dir():
    print(f"Directory {res_filepath} does not exist. Creating.")
    os.mkdir(res_filepath)

Run KBMOD#

The standard approach to running KBMOD is to perform a grid search over all starting pixels and a grid of velocities. The velocities are defined by steps in velocity space (in pixels per day) and angles. Let’s do a grid search with:

  • 21 different velocity steps from 0 pixels per day to 20 pixels per day

  • 11 different angles, from 0.5 radians below the default angle (computed from the ecliptic) to 0.5 radians above it (see the sketch below).
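
As a rough sketch, these choices correspond to the grid below (using the numpy imported above); the actual grid is built internally by the trajectory generator configured later, so this is purely illustrative:

# Approximate velocity and angle grids implied by the bullets above.
velocities = np.linspace(0.0, 20.0, 21)  # pixels per day: 0, 1, ..., 20
angles = np.linspace(-0.5, 0.5, 11)      # radian offsets around the ecliptic angle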

KBMOD needs a series of configuration parameters to specify all the information about the search. In this notebook we explicitly provide the configuration parameters as a dictionary so users can see what is being specified. However, most users will want to use the SearchConfiguration class. A SearchConfiguration object uses reasonable defaults when created:

config = SearchConfiguration()

Users can then override values one at a time or by passing a dictionary:

d = {"result_filename": "Here", "encode_num_bytes": 2}
config.set_multiple(d)

More importantly, a SearchConfiguration can be read from or written to a YAML file:

config = SearchConfiguration.from_file(file_path)

This allows users to define a per-task configuration and version control it.
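
For example, the current configuration can be written out for later reuse (a sketch assuming your KBMOD version also provides the matching to_file method):

config.to_file("my_search_config.yaml")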

You will not need to change most of the parameters; they are included to provide fine-grained control over the search.

[ ]:
results_suffix = "DEMO"

# The demo data has an object moving at x_v=10 px/day
# and y_v=0 px/day. So we search velocities in [0.0, 20.0] px/day
# using 21 evenly spaced steps,
v_min = 0.0
v_max = 20.0
v_steps = 21
v_arr = [v_min, v_max, v_steps]

# and angles in [-0.5, 0.5] radians around the ecliptic using 11 steps.
ang_min = -0.5
ang_max = 0.5
ang_steps = 11
ang_arr = [ang_min, ang_max, ang_steps]

input_parameters = {
    # Use search parameters (including a forced ecliptic angle of 0.0)
    # to match what we know is in the demo data.
    "generator_config": {
        "name": "EclipticCenteredSearch",
        "angles": [-0.5, 0.5, 11],
        "velocities": [0.0, 20.0, 21],
        "angle_units": "radian",
        "given_ecliptic": 0.0,
    },
    # Output parameters
    "result_filename": "./fake_results/results.ecsv",
    # Basic filtering (always applied)
    "num_obs": 15,  # <-- Filter anything with fewer than 15 observations
    "lh_level": 10.0,  # <-- Filter anything with a likelihood < 10.0
    # SigmaG clipping parameters
    "sigmaG_lims": [15, 60],  # <-- Clipping parameters (lower and upper percentile)
    "gpu_filter": True,  # <-- Apply clipping and filtering on the GPU
    "clip_negative": True,
}
config = SearchConfiguration.from_dict(input_parameters)

To see the full list of parameters used and their values, you can simply print the configuration.

[ ]:
print(config)

We load the data as a WorkUnit object. In general, WorkUnits include a copy of their own configuration so they have all the information they need for a full run. Here we overwrite the stored configuration with the one we defined above.

[ ]:
input_data = WorkUnit.from_fits(input_filename)
input_data.config = config

KBMOD uses Python's logging library for output during the run. If you are interested in running with debug output (verbose mode), you can set the level of Python's logger to WARNING (for warning messages only), INFO (for moderate output), or DEBUG (for comprehensive output). By default, logging is set to the WARNING level.

[ ]:
import logging

logging.basicConfig(level=logging.INFO)

# Or for a LOT of detail
# logging.basicConfig(level=logging.DEBUG)

Once we have defined the search parameters, we can create a SearchRunner and use one of the run_search functions. In this case we use run_search_from_work_unit which uses the WorkUnit to define both the image data and the configuration information.

[ ]:
rs = SearchRunner()
results = rs.run_search_from_work_unit(input_data)

We then check that results were written to the output directory. The configuration parameters above specify that KBMOD should write the following output files:

  1. A combined serialized Results saved as a .ecsv file ("result_filename": "./fake_results/results.ecsv").

  2. (Legacy format) A series of individual output files ("res_filepath": res_filepath). Currently this is just the results file (trajectory information) and a copy of the final configuration used. Recent versions of KBMOD have removed older files, such as the psi and phi curves, that were not being used, but files can easily be added back if they prove useful.

Users can shut off these outputs by passing None to the corresponding configuration options.
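
For example (a sketch using the set method, which also appears in the debugging section below):

config.set("result_filename", None)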

[ ]:
if os.path.exists(res_filepath):
    files = os.listdir(res_filepath)
    print(files)
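
If you want to inspect the saved results outside of KBMOD, the .ecsv file can be read back with astropy. A minimal sketch, with the path matching the result_filename set above:

from astropy.table import Table

saved = Table.read("./fake_results/results.ecsv")
print(saved.colnames)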

Analyzing Results#

The run function we used returns a Results object containing the individual results of the run. We can perform basic actions on this data structure such as sorting it, extracting individual results, or performing additional filtering.

[ ]:
print(f"Search found {len(results)} results.")
print(results)
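
For example, the underlying astropy Table is exposed as results.table (used again later in this notebook), so standard table operations apply. A minimal sketch that pulls out the highest-likelihood rows without modifying the results in place (the column names other than x and y are assumptions about this demo's output):

# Indices of the results sorted by decreasing likelihood.
order = np.argsort(results.table["likelihood"])[::-1]
best = results.table[order[:5]]
print(best["x", "y", "vx", "vy", "likelihood"])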

We can also plot the psi and phi curves for the first result.

[ ]:
import matplotlib.pyplot as plt

fig, axs = plt.subplots(2, 1)
axs[0].plot(results["psi_curve"][0])
axs[0].set_title("Psi")

axs[1].plot(results["phi_curve"][0])
axs[1].set_title("Phi")

For additional analysis steps, including manual filtering and exploration, please refer to the kbmod_analysis_demo notebook which uses the data generated by this notebook.

Debugging Results#

What happens if there is a trajectory we expect to be in an image that we do not find? This can happen if we are using data with injected fakes or known objects. In this case, we can use KBMOD’s track_filtered setting to do additional debugging. We start by making a few changes to the configuration.

[ ]:
# Turn on filtered tracking
input_data.config.set("track_filtered", True)

# Tighten the stamp filtering. This will require 100% of the stamp's flux
# to be at the center pixel and effectively filter every candidate trajectory.
input_data.config.set("center_thresh", 1.0)

Then we rerun the search.

[ ]:
rs = SearchRunner()
results = rs.run_search_from_work_unit(input_data)
print(f"Search found {len(results)} results.")

As expected, we found no results; everything was filtered out in at least one stage. Let's see where.

[ ]:
print(results.filtered_stats)

The stamp filter removed the majority of the results. Since we set track_filtered to true, we can look at the actual results removed during that stage.

We revert all the filters and record the reason each result was filtered in its own column.

[ ]:
results.revert_filter(add_column="filtered_reason")
print(results)

Now we search for one of the expected trajectories (starting at pixel (50, 40) at the first time step) by using the table’s search functionality.

[ ]:
subset = results.table[(results.table["x"] == 50) & (results.table["y"] == 40)]
print(subset)

As we can see, all of the potential trajectories were filtered by the stamp filter. We can use this information to help tune the different filtering stages.
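
As an illustration of that tuning loop, we could relax the deliberately extreme stamp threshold and run the search again (the value below is only an example, not a recommended setting):

# Loosen the stamp filter threshold and re-run the search on the same WorkUnit.
input_data.config.set("center_thresh", 0.03)
results = SearchRunner().run_search_from_work_unit(input_data)
print(f"Search found {len(results)} results.")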
