Platform Overview

The SJI Network Explorer lets you extract subnetworks centered on your specified proteins from a large, pre-built SJI network. You can then explore these subnetworks interactively and export protein lists for downstream structural or evolutionary analysis.

The pre-built SJI network includes 51 eukaryotic and 355 bacterial UniProt reference proteomes. All proteins are represented by UniProt accessions to make protein annotation retrieval easier. As a result, queries against the pre-built network, as well as subnetwork extraction, are limited to UniProt accessions.

The package also includes the topN software and a set of Python scripts that allow users to attach their own proteins or proteomes to the pre-built network. The added proteins do not necessarily need to correspond to UniProt entries. Their potential annotations can be inferred by examining the network topology and the UniProt annotations of their neighboring proteins.

Guidance on building a custom SJI network is also provided. For these advanced topics, please refer to the end of this tutorial.

Key concept — Neighborhoods (NH)

Proteins are organized into pre-built neighborhood clusters. To reduce memory usage and simplify the initial view, the viewer displays these clusters in a collapsed state by default, with each cluster represented as a single large node. Click a cluster to expand it and view its individual protein members.

First-Time Setup (Docker)

The SJI Network Explorer is distributed as a Docker image. The image includes the web server, the subnetwork extractor, and all required runtime tools, but it does not include the data. Data are downloaded separately and mounted into the container.

  1. Install Docker Desktop. If it is not already installed, download it from:
    Link
    https://www.docker.com/products/docker-desktop/
    You may need to create or sign in to a Docker account during installation. After installation, open Docker Desktop and make sure it is running.
  2. Pull the Docker image. Open Terminal and run:
    Shell
    docker pull ghcr.io/gang-fang/sji-network-explorer:latest
  3. Download the data files. Go to the GitHub release page:
    Link
    https://github.com/gang-fang/network-viz-platform/releases/tag/qfo-reference-proteomes-data-2026
    Download all .gz files from the release assets into an empty folder. Also download install_data.sh to the same folder. Your folder should contain the downloaded .gz files and install_data.sh.
  4. Decompress and organize the data. In Terminal, go to the folder containing the downloaded files:
    Shell
    cd /path/to/your/downloaded/files
    Make the installation script executable:
    Shell
    chmod 755 install_data.sh
    Run the script:
    Shell
    ./install_data.sh
    This will create a data/ folder with the required structure.
  5. Start the server. Stay in the same folder that contains the data/ folder, then run:
    Shell
    docker run --rm \
      --name sji-network-explorer \
      -p 3000:3000 \
      -v "$PWD/data:/app/data" \
      ghcr.io/gang-fang/sji-network-explorer:latest
    When the server starts successfully, open http://localhost:3000 in Chrome or Firefox (recommended). Safari may be slower when handling large interactive network views.
  6. Stop the server. Open another Terminal window and run:
    Shell
    docker stop sji-network-explorer
  7. Restart the server. Go back to the folder that contains the data/ folder and run:
    Shell
    docker run --rm \
      --name sji-network-explorer \
      -p 3000:3000 \
      -v "$PWD/data:/app/data" \
      ghcr.io/gang-fang/sji-network-explorer:latest
Optional — Shortcut command

You can create your own shell alias to start the server more easily:

Shell
alias start-sji='docker run --rm --name sji-network-explorer -p 3000:3000 -v "$PWD/data:/app/data" ghcr.io/gang-fang/sji-network-explorer:latest'

Then start the server by running:

Shell
start-sji

Run this shortcut from the folder that contains the data/ folder.
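
If you would rather not depend on the current folder, the same command can be assembled from a script. The following is a minimal sketch, not part of the package: the function name build_docker_command and its parameters are hypothetical, while the image name, container name, container port, and mount target are taken from the steps above.

```python
import shlex
from pathlib import Path

IMAGE = "ghcr.io/gang-fang/sji-network-explorer:latest"

def build_docker_command(data_dir, host_port=3000, name="sji-network-explorer"):
    """Return the `docker run` argument list used in the setup steps above.

    data_dir is resolved to an absolute path, so the command can be run
    from any folder. host_port is the port on your machine; the container
    side stays at 3000 as in the tutorial.
    """
    data_path = Path(data_dir).expanduser().resolve()
    return [
        "docker", "run", "--rm",
        "--name", name,
        "-p", f"{host_port}:3000",
        "-v", f"{data_path}:/app/data",
        IMAGE,
    ]

if __name__ == "__main__":
    cmd = build_docker_command("./data")
    # Print the command for inspection; to start the server instead,
    # pass `cmd` to subprocess.run(cmd, check=True).
    print(shlex.join(cmd))
```

Printing the command first makes it easy to check the mounted path before anything is started.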

Browser note

Chrome or Firefox is recommended for large interactive network views. Safari may be slower with larger subnetworks.

Run from a Git Clone

Use this path if you want to run the server directly from the repository, modify the source code, or use the advanced preprocessing scripts. The Docker path above is usually simpler for first-time use.

  1. Clone the repository.
    Shell
    git clone https://github.com/gang-fang/network-viz-platform.git
    cd network-viz-platform
  2. Install Node.js dependencies.
    Shell
    npm install
  3. Install Python dependencies. These are needed for subnetwork extraction and the preprocessing workflows when running from a cloned repository. They are not installed by npm install.
    Shell
    python3 -m venv .venv
    source .venv/bin/activate
    pip install -r requirements.txt
  4. Build topN. The preprocessing workflows expect the compiled executable under tools/bin/topn.
    Shell
    cd tools/preprocessing/topN_cpp
    make
    cd ../../..
  5. Download and organize the release data. Follow the data-download steps in the Docker setup section above: download all .gz files and install_data.sh into an empty folder, then run:
    Shell
    chmod 755 install_data.sh
    ./install_data.sh
    This creates the data/ folder used by the local server.
  6. Configure the environment.
    Shell
    cp .env.example .env
    Edit .env as needed. If you created the virtual environment above, set PYTHON_COMMAND=.venv/bin/python.
  7. Ingest the data and start the server.
    Shell
    npm run ingest
    npm start
    Or combine ingestion and startup:
    Shell
    npm run start:ingest-and-serve
    Then open http://localhost:3000 in Chrome or Firefox.
Dependency note

The Docker image installs only the minimal runtime dependencies. A cloned repository uses requirements.txt so that both subnetwork extraction and preprocessing scripts are available.

The Landing Page

Once the server is up, http://localhost:3000 shows the SJI Network Explorer landing page. It has two parts: the saved networks on the left, and the extraction form on the right, where you build a focused subnetwork from seed proteins.

Landing page of SJI Network Explorer with navigation and extraction form circled
Landing page. Left: open your saved subnetworks. Right: the subnetwork extraction form. The orange Extract Subnetwork button submits the form once you fill it in.

You can add or delete subnetwork files directly in /data/subnetworks. To open them, select Open existing networks in viewer in the left panel, which takes you directly to the viewer. See the Advanced Topics section below for details.


Extracting a Sub-Network

From the landing page, fill in the form on the right. The platform expands each seed protein outward through the SJI network and returns a focused subnetwork in CSV format, ready for visualization.

Subnetwork extraction form with each field annotated
Extraction form. Fill in Subnetwork name, pick a Domain (Bacteria or Eukaryotes — circled), set Max neighbors (500–1000 recommended), paste up to 10 UniProt accessions in Seed proteins, then click Extract Subnetwork.
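
Because extraction only accepts UniProt accessions, it can help to check a seed list before pasting it into the form. The sketch below is an illustration rather than the platform's own validation: the function name parse_seeds is hypothetical, and splitting on commas or whitespace is an assumption about how you keep your list. The regular expression is the accession pattern published by UniProt, and the limit of 10 comes from the form description above.

```python
import re

# Accession pattern published by UniProt (6- or 10-character accessions).
UNIPROT_AC = re.compile(
    r"[OPQ][0-9][A-Z0-9]{3}[0-9]|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2}"
)
MAX_SEEDS = 10  # limit stated for the Seed proteins field

def parse_seeds(text):
    """Split pasted text on commas or whitespace; return (valid, invalid)."""
    tokens = [t.upper() for t in re.split(r"[,\s]+", text.strip()) if t]
    unique = list(dict.fromkeys(tokens))  # drop duplicates, keep order
    valid = [t for t in unique if UNIPROT_AC.fullmatch(t)]
    invalid = [t for t in unique if not UNIPROT_AC.fullmatch(t)]
    if len(valid) > MAX_SEEDS:
        raise ValueError(
            f"{len(valid)} seeds given; the form accepts up to {MAX_SEEDS}"
        )
    return valid, invalid
```

A pattern match only confirms that a string looks like an accession; it does not confirm that the protein is present in the pre-built network.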

The extraction may take from a few seconds to several minutes, depending on your computer. Once the extraction is complete, the server saves a CSV file under your mounted directory, such as data/networks/, and the Open in Viewer button appears. Click this button to inspect the subnetwork interactively. Your seed proteins are automatically filled into the Highlight Proteins box in the viewer.

To highlight your seed proteins, click Highlight Proteins (UniProt AC), choose a color, and then click anywhere outside the color palette to close it. You can expand or collapse clusters to examine the subnetwork in greater detail. Keep a note of any proteins of interest; you can always enter them in the Highlight Proteins box to locate them within the subnetwork.

Tip: For large subnetworks, network detection may be delayed. If the number of nodes in your subnetwork appears incomplete, refreshing the browser should resolve the issue.

What the SUB-NETWORK EXTRACTION algorithm does

Each seed first gets a guaranteed local neighborhood. The remaining budget is then filled globally by merit score, with nodes reached by multiple seeds receiving a boost. The result is a single connected subnetwork centered on your seeds.
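
To make the two phases concrete, here is a toy sketch of one way such a selection could work. It is not the platform's implementation: the function name extract_subnetwork, the parameters guaranteed and depth, and the merit definition (best path product of edge weights, summed over seeds) are all assumptions chosen for illustration. The sketch also does not trim the guaranteed neighborhoods to fit the budget and does not enforce connectivity.

```python
import heapq
from collections import defaultdict

def extract_subnetwork(graph, seeds, max_nodes, guaranteed=2, depth=2):
    """graph: {node: {neighbor: weight}}. Returns the set of selected nodes."""
    selected = set(seeds)
    merit = defaultdict(float)

    for seed in seeds:
        # Phase 1: reserve each seed's strongest direct neighbors.
        top = heapq.nlargest(
            guaranteed, graph.get(seed, {}).items(), key=lambda kv: kv[1]
        )
        selected.update(node for node, _ in top)

        # Expand outward for `depth` rounds, keeping the best path score
        # (product of edge weights) found from this seed to each node.
        best = {seed: 1.0}
        frontier = [seed]
        for _ in range(depth):
            next_frontier = []
            for u in frontier:
                for v, w in graph.get(u, {}).items():
                    score = best[u] * w
                    if score > best.get(v, 0.0):
                        best[v] = score
                        next_frontier.append(v)
            frontier = next_frontier

        # Summing over seeds boosts nodes reached by several seeds.
        for node, score in best.items():
            merit[node] += score

    # Phase 2: spend the remaining budget on the highest-merit candidates.
    candidates = sorted(
        (n for n in merit if n not in selected), key=lambda n: (-merit[n], n)
    )
    remaining = max(0, max_nodes - len(selected))
    selected.update(candidates[:remaining])
    return selected
```

In this toy version, a node linked moderately to two seeds can outrank a node linked more strongly to only one, which is the kind of boost described above.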

Loading a Network in the Viewer

The viewer's left panel — Controls — is used to load, edit, and manage networks.

Viewer left control panel with network selector, Load Network, Edit & Save, Highlight, and Clear buttons annotated
Controls panel. Pick a CSV from Select Network and click Load Network. The status line below confirms the loaded counts (e.g. "Loaded: 61 nodes, 1364 edges"). Use Edit & Save Network to modify a network. For example, when a subnetwork is too large or contains clearly unrelated proteins, you can remove irrelevant clusters or proteins and save the edited version as a new subnetwork. Use Highlight by Species and Clear Highlights to locate proteins of interest. Note that you need to know the protein identifiers; by default, these are UniProt ACs.

The Load Network button renders a physics-based force layout that runs briefly and then settles. The status line under the dropdown confirms how many nodes and edges loaded.

Refresh the network list

If you have just extracted a subnetwork and it does not appear in the dropdown list, click the small button next to the dropdown (red circle) to refresh the list without reloading the page.

Highlighting Proteins

You can color-mark proteins of interest in two ways: by UniProt accessions (the input field in the left panel) or by species (next section).

Network with proteins highlighted in yellow on the right side
Highlight applied. The proteins matching the highlight query (or the selected species) glow with a colored ring. Use Clear Highlights in the left panel to remove all colored layers at once.

Filtering by Species

Click Highlight by Species on the left panel to open a draggable taxonomic tree. Drill down through the taxonomy, check the species you care about, then pick a color and click Highlight.

Note that the species panel includes Select All and Deselect All buttons. For example, if you want to retain only human and mouse proteins in your subnetwork and remove proteins from all other species, first click Select All, then uncheck only human and mouse. Next, click Highlight so that proteins from all other species are selected. You can then click Edit & Save Network to generate a new species-focused subnetwork.
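
The selection here is inverted: you mark what should go, not what should stay. A minimal sketch of that logic, assuming a hypothetical mapping from accession to species name (the viewer derives this from its own taxonomy data):

```python
def proteins_to_remove(species_of, keep_species):
    """Return accessions whose species is NOT in keep_species.

    species_of: {accession: species name} (hypothetical input format).
    This mirrors Select All followed by unchecking the species to keep.
    """
    keep = set(keep_species)
    return sorted(acc for acc, sp in species_of.items() if sp not in keep)
```

The returned accessions correspond to the nodes that end up highlighted after unchecking the species you want to keep.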

Highlight by Species panel showing taxonomic tree with E. coli K-12 highlighted in the network
Species panel. Expand the tree by clicking the ▾ arrows. Numbers in parentheses show how many proteins in your network match each taxon. Pick a color swatch (bottom-left) and click Highlight (red circle) to color all matching nodes. Hover any node to see its UniProt ID and species (tooltip shown for P30750 in E. coli K-12).
Tip — Move the panel out of the way

The species panel is draggable — grab its header to move it anywhere on screen so it doesn't cover the graph.

Highlighting SJI Edges

Click Highlight SJI Edges in the left control panel to color edges whose SJI values fall within a selected range. This is useful when you want to visually inspect strong or weak relationship bands without editing the network.

Highlight SJI Edges popup with min/max range, color swatches, and matching edges highlighted in the network
SJI edges highlighted. Set a Min and Max range, pick a color, and click Highlight to draw matching edges in the selected color with a thicker stroke.
Tip — Highlight before editing

SJI edge highlighting is visual only. To permanently remove weak or strong edges and save a new network, use Edit & Save Network and its Edges by SJI Weight controls.

In dense networks, highlighting low-SJI edges can be hard to interpret because weak edges may appear throughout the graph. A more useful check is to highlight edges above your chosen threshold. If the edges between two communities remain unhighlighted, those communities are only weakly connected.
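
The same check can be run outside the viewer on an edge list. The sketch below assumes edges are available as (source, target, sji) tuples and that community labels are known; both input formats and both function names are hypothetical, and treating the range as inclusive at both ends is an assumption.

```python
def edges_in_range(edges, sji_min, sji_max):
    """Keep (source, target, sji) tuples with sji_min <= sji <= sji_max."""
    return [e for e in edges if sji_min <= e[2] <= sji_max]

def cross_community_edges(edges, community_of):
    """Return the edges whose endpoints lie in different communities."""
    return [e for e in edges if community_of[e[0]] != community_of[e[1]]]
```

If cross_community_edges returns nothing for the edges above your threshold, only weak edges link the communities.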

Edit and Save Networks

Use Edit & Save Network to remove proteins or edges from the current network view and save the edited result as a new network file. This is useful when you want to focus on a smaller biological subset, for example by removing less relevant species, trimming weak SJI edges, or reducing the size of an overly large network.

Edit and Save Network popup showing selection, SJI edge filtering, UniProt AC, species, and save controls
Edit & Save Network. The popup summarizes visible proteins and edges, then groups editing tools into Selection, Edges by SJI Weight, Protein UniProt ACs, Species, and Save Edited Network sections.
Keep the original network

Save edits under a new name rather than overwriting the source network. This keeps the original extraction available if you need to revisit the full topology later.

Export & Analysis Panel

In the viewer panel, you interact with the network, while the right-side panel displays the proteins selected for detailed analysis. The system therefore has two modes: viewer mode and export mode. These modes are controlled by the blue star button at the top right of the viewer panel.

Both individual protein nodes and entire clusters can be exported. Toggle the blue star button so that it turns gold; this switches the system to export mode. In this mode, clicking a node no longer collapses or expands a cluster. Instead, it selects or deselects the node.

In export mode, click the arrow immediately below the blue star button to send the selected nodes to the export panel.

From the export panel, you can sort proteins by degree centrality, open direct UniProt links, select different UniProt annotation categories, download the protein list, or send the proteins for batch analysis.

Export and Analysis panel with UniProt section buttons; Sort by ProtDC and Expression button annotated
Export & Analysis panel. The top box lists all selected proteins and clusters. When valid entries are available, proteins are automatically linked to their corresponding UniProt records. The UniProt section buttons below (Function, Names & Taxonomy, Expression, …) re-target every link in the list to that section of UniProt, which is handy for quickly comparing, say, expression data across the whole selection. Click Function to point the links back to the top of each entry.
Tooltip Sort proteins by ProtDC over the right-arrow icon
Sort by ProtDC. The right-pointing arrow icon (red circle) sorts the protein list by ProtDC, so proteins most central to their neighborhood appear first. Click again to toggle between ascending and descending order. Use the download icon to save the list as a CSV; the 🗑 icon clears the panel.
What is ProtDC?

ProtDC (Protein Degree Centrality) measures how well-connected a protein is within its neighborhood. Higher ProtDC = more central within its cluster, so sorting top-down surfaces likely "hub" candidates first.
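
The tutorial does not give the exact ProtDC formula, so the following sketch uses standard normalized degree centrality restricted to each cluster as a stand-in: the number of neighbors a protein has inside its own cluster, divided by the cluster size minus one. The function name and input formats are hypothetical.

```python
from collections import defaultdict

def degree_centrality_within_clusters(edges, cluster_of):
    """edges: iterable of (a, b) pairs; cluster_of: {protein: cluster id}.

    Returns {protein: score in [0, 1]}, counting only neighbors that
    belong to the same cluster as the protein.
    """
    members = defaultdict(set)
    for node, cluster in cluster_of.items():
        members[cluster].add(node)

    neighbors = defaultdict(set)
    for a, b in edges:
        same_cluster = (
            a in cluster_of and b in cluster_of and cluster_of[a] == cluster_of[b]
        )
        if a != b and same_cluster:
            neighbors[a].add(b)
            neighbors[b].add(a)

    scores = {}
    for node, cluster in cluster_of.items():
        size = len(members[cluster])
        scores[node] = len(neighbors[node]) / (size - 1) if size > 1 else 0.0
    return scores
```

Sorting the resulting scores from high to low gives the same kind of hub-first ordering that the Sort by ProtDC button produces, although the values themselves may differ from the platform's.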

Batch & Comparative Analysis

The Batch & Comparative Analysis popup (accessible from the UniProt section buttons) lets you define named groups of proteins and save them as separate files for cross-group comparison.

Batch and Comparative Analysis popup for creating UniProt accession groups and saving group files
Batch & Comparative Analysis. Create named groups, paste UniProt ACs from the Export panel, add more groups as needed, and save each group as a text file for downstream comparison.
Example workflows

The popup contains links to two Colab notebooks: one for pairwise structural alignment with USalign, and one for multiple sequence alignment and DIVERGE v4 analysis of evolutionary shifts.

Structural alignment workflow

Multiple sequence alignment and DIVERGE workflow

URL Parameters & Easy Access

The pages accept query parameters for deep-linking and automation.

viewer.html

Parameter | Example | Effect
network | ?network=my_proteins.csv | Loads and displays this network automatically on page open
seeds | &seeds=P04637,O15151 | Pre-populates the highlight search field with these accessions and executes the search immediately

Example: open the viewer on a specific network with two proteins pre-filled in the highlight box:

URL
http://localhost:3000/viewer.html?network=my_proteins.csv&seeds=P04637,O15151

Tip — Bookmarking sessions

Bookmark the viewer URL containing ?network= (optionally with &seeds=) to reopen a specific network with its seed proteins already entered in the highlight box.
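
For automation, such links can be generated rather than typed. A minimal sketch using Python's standard library follows; the function name viewer_url is hypothetical, and the base address assumes the default local server from the setup steps.

```python
from urllib.parse import urlencode

def viewer_url(network, seeds=(), base="http://localhost:3000"):
    """Build a viewer.html deep link from a network file name and seeds."""
    params = {"network": network}
    if seeds:
        params["seeds"] = ",".join(seeds)
    # safe="," keeps the comma-separated accession list readable in the URL.
    return f"{base}/viewer.html?{urlencode(params, safe=',')}"
```

Calling viewer_url("my_proteins.csv", ["P04637", "O15151"]) reproduces the example URL shown above.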

Build Your Own SJI Network

Get the source

This workflow runs outside the Docker container. Clone the repository to access and edit the pipeline scripts:

Shell
git clone https://github.com/gang-fang/network-viz-platform.git

The package includes a template workflow for building a custom SJI network from a selected set of UniProt reference proteomes. The detailed scripts are located in:

Path
tools/preprocessing/pipelines/build_your_net_scripts

Run the workflow in the order indicated by the script names. Each script validates the expected inputs from earlier steps and writes outputs used by later steps.

Configure paths before running

Before starting, open the scripts and update the editable variables for your environment. In particular, set WORK_ROOT, input proteome locations, output folders, Python environment activation, and script paths such as MAKE_2D, SON, SJI_NET, Leiden scripts, CT_ATTR, and PREPROCESS_GRAPH. Supporting tools live in the cloned repository under directories such as tools/bin and tools/preprocessing, but the analysis data directories should point to your own mounted volume or working filesystem.

Large-scale runs

For large proteome sets, expect to adapt the template workflow to your HPC or AWS environment. The most expensive steps are usually all-against-all topN and son_spectral_final.py. These steps may need array jobs, job batching, or other parallel execution strategies that match your scheduler and storage layout.

Monitor memory

Memory usage can vary strongly with proteome size and the number of intermediate similarity records. If son_spectral_final.py becomes memory-limited, splitting proteomes or 2D inputs into smaller chunks can substantially improve throughput in parallel-computing environments. The best chunking strategy depends on the local HPC or AWS configuration.
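
As one illustration of chunking, the sketch below splits FASTA text into a fixed number of chunks by distributing records round-robin. It is an assumption that your proteome inputs are FASTA files and that the downstream scripts accept chunked input; check the pipeline scripts before relying on it. The function name split_fasta is hypothetical.

```python
def split_fasta(text, n_chunks):
    """Split FASTA text into up to n_chunks lists of records (round-robin).

    Each record is returned as one string: the header line followed by
    its sequence lines. Empty chunks are dropped.
    """
    records, current = [], []
    for line in text.splitlines():
        if line.startswith(">") and current:
            records.append("\n".join(current))
            current = []
        if line.strip():
            current.append(line)
    if current:
        records.append("\n".join(current))

    chunks = [records[i::n_chunks] for i in range(n_chunks)]
    return [chunk for chunk in chunks if chunk]
```

Round-robin balances the number of records per chunk, not total sequence length, so chunks may still differ in cost when protein lengths vary widely.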

Annotate Your Own Proteins

Get the source

This workflow runs outside the Docker container. Clone the repository to access and edit the pipeline scripts:

Shell
git clone https://github.com/gang-fang/network-viz-platform.git

A second workflow lets you attach your own proteins to a pre-built SJI network. This is useful when you have proteins of interest that are not part of the SJI network and want to infer possible annotations from how they connect to annotated proteins in the network.

The detailed workflow is located in:

Path
tools/preprocessing/pipelines/annotate_proteins_scripts
Interpretation

The resulting values should be treated as pseudo-SJI scores. Standard SJI estimation is most reliable when a complete proteome is available. For individually supplied proteins, only one-way SJI relationships can be computed, so the scores are useful for annotation guidance but should not be interpreted as fully equivalent to standard SJI values.

Run the workflow in the order indicated by the script names.

Configure paths before running

This workflow starts from your own query FASTA file. Before running the scripts, update the editable variables for your environment, including DB, QUERY, OUT_FOLDER, TOPN_OUT, TWO_D_OUT, QUERY_DIR, SIGNAL_DIR, Python environment activation, and script paths such as MAKE_2D, SON, and SJI_NET. Supporting tools live in the cloned repository under tools/bin and tools/preprocessing, but the input and output directories should point to your own mounted volume or working filesystem.

Use the result

The output network can be loaded with the viewer like other network CSV files. Inspect where your query proteins connect, review neighboring UniProt annotations, and use highlighting or export tools to compare candidate functional contexts.

Annotate Your Own Proteome

Get the source

This workflow runs outside the Docker container. Clone the repository to access and edit the pipeline scripts:

Shell
git clone https://github.com/gang-fang/network-viz-platform.git

This workflow lets you attach an entire user-supplied proteome to a pre-built SJI network. It is useful when you want to infer potential annotations for many proteins by examining how the added proteome connects to proteins already present in the pre-built SJI network.

The detailed workflow is located in:

Path
tools/preprocessing/pipelines/annotate_wholeProteome_scripts

Run the workflow in the order indicated by the script names.

Configure paths before running

Before running the scripts, update the editable variables for your environment. Common variables include PROTEOME_ID, WORK_ROOT, DB, TWO_D_HOME, OUT_FOLDER, TOPN_OUT, TWO_D_ROOT, VENV_ACT, and script paths such as MAKE_2D, SON, and SJI_NET. Supporting tools live in the cloned repository under tools/bin and tools/preprocessing, but the input and output directories should point to your own mounted volume or working filesystem.

Large-scale runs

For large proteomes, expect to adapt the workflow to your AWS or HPC environment. Computationally intensive steps such as topN and son_spectral_final.py may require array jobs, batching, or other parallel execution strategies that match your scheduler and storage layout.

Monitor memory

Memory usage should be monitored carefully. In some cases, splitting proteomes into smaller chunks and providing those smaller inputs to son_spectral_final.py can substantially improve performance in parallel-computing environments. The best optimization strategy depends strongly on the specific AWS or HPC configuration, so this tutorial provides general guidance rather than cluster-specific scripts.