Tutorial — ProtDC Network Platform

Platform Overview

The SJI Network Explorer lets you extract subnetworks centered on your specified proteins from a large, pre-built SJI network. You can then explore these subnetworks interactively and export protein lists for downstream structural or evolutionary analysis.

The pre-built SJI network includes 51 eukaryotic and 355 bacterial UniProt reference proteomes. All proteins are represented by UniProt accessions to make protein annotation retrieval easier. As a result, queries against the pre-built network, as well as subnetwork extraction, are limited to UniProt accessions.

The package also includes the topN software and a set of Python scripts that allow users to attach their own proteins or proteomes to the pre-built network. The added proteins do not necessarily need to correspond to UniProt entries. Their potential annotations can be inferred by examining the network topology and the UniProt annotations of their neighboring proteins.

Guidance on building a custom SJI network is also provided. For these advanced topics, please refer to the end of this tutorial.

Key concept — Neighborhoods (NH)

Proteins are organized into pre-built neighborhood clusters. To reduce memory usage and simplify the initial view, the viewer displays these clusters in a collapsed state by default, with each cluster represented as a single large node. Click a cluster to expand it and view its individual protein members.

First-Time Setup (Docker)

The SJI Network Explorer is distributed as a Docker image. The image includes the web server, the subnetwork extractor, and all required runtime tools, but it does not include the data. Data are downloaded separately and mounted into the container.

Install Docker Desktop. If Docker Desktop is not already installed, download it from:
Link
```
https://www.docker.com/products/docker-desktop/
```
You may need to create or sign in to a Docker account during installation. After installation, open Docker Desktop and make sure it is running.
Pull the Docker image. Open Terminal and run:
Shell
```
docker pull ghcr.io/gang-fang/sji-network-explorer:latest
```
Download the data files. Go to the GitHub release page:
Link
```
https://github.com/gang-fang/network-viz-platform/releases/tag/qfo-reference-proteomes-data-2026
```
Download all .gz files from the release assets into an empty folder. Also download install_data.sh to the same folder. Your folder should contain the downloaded .gz files and install_data.sh.
Decompress and organize the data. In Terminal, go to the folder containing the downloaded files:
Shell
```
cd /path/to/your/downloaded/files
```
Make the installation script executable:
Shell
```
chmod 755 install_data.sh
```
Run the script:
Shell
```
./install_data.sh
```
This will create a data/ folder with the required structure.
Start the server. Stay in the same folder that contains the data/ folder, then run:
Shell
```
docker run --rm \
  --name sji-network-explorer \
  -p 3000:3000 \
  -v "$PWD/data:/app/data" \
  ghcr.io/gang-fang/sji-network-explorer:latest
```
When the server starts successfully, open http://localhost:3000 in Chrome or Firefox.
Stop the server. To stop the server, open another Terminal window and run:
Shell
```
docker stop sji-network-explorer
```

Restart the server. To restart the server, go back to the folder that contains the data/ folder and run:

Shell
```
docker run --rm \
  --name sji-network-explorer \
  -p 3000:3000 \
  -v "$PWD/data:/app/data" \
  ghcr.io/gang-fang/sji-network-explorer:latest
```

Optional — Shortcut command

You can create your own shell alias to start the server more easily:

Shell
```
alias start-sji='docker run --rm --name sji-network-explorer -p 3000:3000 -v "$PWD/data:/app/data" ghcr.io/gang-fang/sji-network-explorer:latest'
```

Then start the server by running:

Shell
```
start-sji
```

Run this shortcut from the folder that contains the data/ folder.

Browser note

Chrome or Firefox is recommended for large interactive network views. Safari may be slower with larger subnetworks.

Run from a Git Clone

Use this path if you want to run the server directly from the repository, modify the source code, or use the advanced preprocessing scripts. The Docker path above is usually simpler for first-time use.

Clone the repository.

Shell
```
git clone https://github.com/gang-fang/network-viz-platform.git
cd network-viz-platform
```

Install Node.js dependencies.
Shell
```
npm install
```
Install Python dependencies. These are needed for subnetwork extraction and the preprocessing workflows when running from a cloned repository. They are not installed by npm install.
Shell
```
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
```
Build topN. The preprocessing workflows expect the compiled executable under tools/bin/topn.
Shell
```
cd tools/preprocessing/topN_cpp
make
cd ../../..
```
Download and organize the release data. Follow the data-download steps in the Docker setup section above: download all .gz files and install_data.sh into an empty folder, then run:
Shell
```
chmod 755 install_data.sh
./install_data.sh
```
This creates the data/ folder used by the local server.
Configure the environment.
Shell
```
cp .env.example .env
```
Edit .env as needed. If you created the virtual environment above, set PYTHON_COMMAND=.venv/bin/python.
Ingest the data and start the server.
Shell
```
npm run ingest
npm start
```
Or combine ingestion and startup:
Shell
```
npm run start:ingest-and-serve
```
Then open http://localhost:3000 in Chrome or Firefox.

Dependency note

The Docker image installs only the minimal runtime dependencies. A cloned repository uses requirements.txt so that both subnetwork extraction and preprocessing scripts are available.

The Landing Page

Once the server is up, http://localhost:3000 shows the SJI Network Explorer landing page. The page has two parts: the saved networks on the left, and the extraction form on the right, where you build a focused subnetwork from seed proteins.

Landing page of SJI Network Explorer with navigation and extraction form circled

Landing page. Left: open your saved subnetworks. Right: the subnetwork extraction form. The orange Extract Subnetwork button submits the form once you fill it in.

You can add or delete subnetwork files directly in /data/subnetworks, then select Open existing networks in viewer in the left panel to open them in the viewer. See the Advanced Topics section below for details.

Extracting a Sub-Network

From the landing page, fill in the form on the right. The platform expands each seed protein outward through the SJI network and returns a focused subnetwork in CSV format, ready for visualization.

Subnetwork extraction form with each field annotated

Extraction form. Fill in Subnetwork name, pick a Domain (Bacteria or Eukaryotes — circled), set Max neighbors (500–1000 recommended), paste up to 10 UniProt accessions in Seed proteins, then click Extract Subnetwork.

1. Subnetwork name: letters, numbers, _, -, and . only. The .csv extension is added automatically. Max 80 characters.
2. Domain: select Bacteria or Eukaryotes based on where your proteins live. This selects the correct pre-built index. See the Advanced Topics section below for instructions on building custom indexes.
3. Max neighbors: total nodes in the resulting subnetwork (cap 2,500; 500–1,000 makes the visualization readable).
4. Seed proteins: up to 10 UniProt ACs separated by spaces, commas, semicolons, or new lines.
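
The field rules above can be checked in a script before you submit the form. The following is a minimal sketch, not the platform's own validation code: the function and constant names are hypothetical, the lower bound of 1 and the ASCII-only reading of "letters" are assumptions, and only the limits themselves (80 characters, 10 seeds, a cap of 2,500) come from this tutorial.

Python
```python
import re

# Limits quoted in this tutorial; all names below are hypothetical.
MAX_NAME_LENGTH = 80
MAX_SEEDS = 10
MAX_NEIGHBORS_CAP = 2500
NAME_PATTERN = re.compile(r"[A-Za-z0-9_.-]+")  # assumes ASCII letters only


def validate_name(name: str) -> str:
    """Return the output file name, or raise ValueError if the name is invalid."""
    if not 1 <= len(name) <= MAX_NAME_LENGTH:
        raise ValueError("name must be 1 to 80 characters long")
    if not NAME_PATTERN.fullmatch(name):
        raise ValueError("name may contain only letters, numbers, _, -, and .")
    return name + ".csv"  # the form appends the extension automatically


def parse_seeds(text: str) -> list:
    """Split seed input on spaces, commas, semicolons, or new lines."""
    seeds = [token for token in re.split(r"[\s,;]+", text) if token]
    if not 1 <= len(seeds) <= MAX_SEEDS:
        raise ValueError("provide between 1 and 10 UniProt accessions")
    return seeds


def check_max_neighbors(value: int) -> int:
    """Reject node budgets outside 1..2500 (the lower bound is an assumption)."""
    if not 1 <= value <= MAX_NEIGHBORS_CAP:
        raise ValueError("max neighbors must be between 1 and 2500")
    return value
```

For example, parse_seeds("P04637, O15151;P30750") accepts mixed separators, while validate_name("bad name") raises ValueError because of the space.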

The extraction may take from a few seconds to several minutes, depending on your computer. Once it completes, the server saves a CSV file under your mounted data directory (for example, data/networks/), and the Open in Viewer button appears. Click this button to inspect the subnetwork interactively. Your seed proteins are entered automatically in the Highlight Proteins box in the viewer.

To highlight your seed proteins, click Highlight Proteins (UniProt AC), choose a color, and then click anywhere outside the color palette to close it. You can expand or collapse clusters to examine the subnetwork in greater detail. Keep a note of any proteins of interest; you can always enter them in the Highlight Proteins box to locate them within the subnetwork.

Tip: For large subnetworks, network detection may be delayed. If the number of nodes in your subnetwork appears incomplete, refreshing the browser should resolve the issue.

What the subnetwork extraction algorithm does

Each seed first gets a guaranteed local neighborhood, then remaining budget is filled globally by merit score — nodes reached by multiple seeds are boosted. The result is a single connected subnetwork centered on your seeds.
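
To make the two phases concrete, here is a toy sketch of a seed-centered, budgeted expansion. It is not the platform's extractor: the function name, the quota parameter, and the merit formula (summed SJI weight to already-selected nodes, multiplied by the number of seeds that reach the candidate) are all assumptions chosen for illustration. It also rebuilds the candidate list on every round, which is fine for a small example but too slow for a real network.

Python
```python
def extract_subnetwork(graph, seeds, max_nodes, quota=2):
    """Toy two-phase expansion. graph maps node -> {neighbor: SJI weight}."""
    # reached_by[node] = set of seeds whose expansion has reached that node
    reached_by = {seed: {seed} for seed in seeds if seed in graph}

    # Phase 1: every seed keeps its `quota` strongest neighbors (guaranteed part).
    for seed in list(reached_by):
        ranked = sorted(graph[seed].items(), key=lambda item: (-item[1], item[0]))
        for neighbor, _weight in ranked[:quota]:
            if neighbor not in reached_by and len(reached_by) >= max_nodes:
                continue  # budget exhausted
            reached_by.setdefault(neighbor, set()).add(seed)

    # Phase 2: spend the remaining budget on the best-scoring frontier node.
    while len(reached_by) < max_nodes:
        candidates = {}  # node -> [summed weight to selected nodes, seeds reaching it]
        for node, node_seeds in reached_by.items():
            for neighbor, weight in graph.get(node, {}).items():
                if neighbor in reached_by:
                    continue
                entry = candidates.setdefault(neighbor, [0.0, set()])
                entry[0] += weight
                entry[1] |= node_seeds
        if not candidates:
            break  # nothing left to add
        # Assumed merit: weight sum boosted by the number of seeds reaching the node.
        best = max(candidates, key=lambda n: (candidates[n][0] * len(candidates[n][1]), n))
        reached_by[best] = candidates[best][1]

    return list(reached_by)


toy_graph = {
    "A": {"B": 0.9, "C": 0.5, "X": 0.4},
    "B": {"A": 0.9, "X": 0.3},
    "C": {"A": 0.5},
    "D": {"E": 0.8, "X": 0.4},
    "E": {"D": 0.8},
    "X": {"A": 0.4, "B": 0.3, "D": 0.4, "Y": 0.9},
    "Y": {"X": 0.9},
}
print(extract_subnetwork(toy_graph, ["A", "D"], max_nodes=5, quota=1))
```

In this toy graph, node X is picked ahead of C in the second phase because both seeds reach it, which is the kind of boost described above. Every added node is adjacent to an already-selected node, but the sketch does not by itself guarantee that two distant seeds end up in one connected component.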

Loading a Network in the Viewer

The viewer's left panel — Controls — is used to load, edit, and manage networks.

Viewer left control panel with network selector, Load Network, Edit & Save, Highlight, and Clear buttons annotated

Controls panel. Pick a CSV from Select Network and click Load Network. The status line below confirms the loaded counts (e.g. "Loaded: 61 nodes, 1364 edges"). Use Edit & Save Network to modify a network. For example, when a subnetwork is too large or contains clearly unrelated proteins, you can remove irrelevant clusters or proteins and save the edited version as a new subnetwork. Use Highlight by Species and Clear Highlights to locate proteins of interest. Note that you need to know the protein identifiers; by default, these are UniProt ACs.

The Load Network button renders a physics-based force layout that runs briefly and then settles. The status line under the dropdown confirms how many nodes and edges loaded.
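
If you want to cross-check the counts in the status line outside the viewer, a few lines of Python are enough. This is a sketch under stated assumptions: the tutorial does not document the CSV layout, so the code assumes one edge per row, the two node identifiers in the first two columns, and a header row. Adjust has_header and the column positions to match your files.

Python
```python
import csv
import io


def count_nodes_edges(lines, has_header=True):
    """Count distinct nodes and edge rows in an edge-list CSV (assumed layout)."""
    reader = csv.reader(lines)
    if has_header:
        next(reader, None)  # skip the assumed header row
    nodes, edges = set(), 0
    for row in reader:
        if len(row) < 2:
            continue  # ignore blank or malformed rows
        nodes.add(row[0].strip())
        nodes.add(row[1].strip())
        edges += 1
    return len(nodes), edges


# Hypothetical three-column sample: source, target, SJI value.
sample = io.StringIO("source,target,sji\nP1,P2,0.8\nP2,P3,0.4\n")
print("Loaded: %d nodes, %d edges" % count_nodes_edges(sample))
```

To run it on a real file, pass an open file object instead of the in-memory sample, for example open("data/networks/my_proteins.csv", newline="").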

Refresh the network list

If you have just extracted a subnetwork and it does not appear in the dropdown list, click the small ↻ button next to the dropdown (red circle) to refresh the list without reloading the page.

Navigation & Cluster Expansion

The graph starts with every neighborhood collapsed — each large gray node is a cluster of related proteins. Click any cluster to expand its members.

Collapsed network with three clusters and Expand All button highlighted

Collapsed view. The three large gray nodes are neighborhoods (NH) — node size scales with the number of members. Click the orange Expand All sun-icon (red circle, bottom-right) to expand every cluster at once. Use +/− to zoom and the ↺ button to fit-to-screen.

Control	Action
Scroll wheel	Zoom in / out
Click + drag (background)	Pan the canvas
+ / − buttons	Zoom in / out
↺ Reset button	Fit all nodes to viewport
☀️ Expand All / 🌙 Collapse All	Expand or collapse every cluster at once
Click cluster node	Expand that single cluster into its protein members
Click protein node	Collapse the cluster it belongs to
Drag any node	Pin its position; double-click to release

Highlighting Proteins

You can color-mark proteins of interest in two ways: by UniProt accessions (the input field in the left panel) or by species (next section).

1. Type one or more UniProt ACs into Highlight Proteins (UniProt AC), separated by spaces, commas, or semicolons.
2. Press Enter or click the 🔍 icon. A color picker appears; pick a swatch to color the matching nodes.
3. Multiple searches can be performed using distinct colors. Each search generates an independent visualization layer. When members of a cluster are assigned to multiple layers, the corresponding cluster node is displayed as a pie chart. For example, after highlighting a set of UniProt accessions, you may want to identify which proteins are of human origin. To do so, add a new layer by selecting Highlight by Species and assigning a different color. The corresponding nodes will then be displayed with pie-chart patterns.
4. Use Clear Highlights to remove all highlights and selections. Note that this button is often overlooked, which may lead to unintended selections.

Network with proteins highlighted in yellow on the right side

Highlight applied. The proteins matching the highlight query (or the selected species) glow with a colored ring. Use Clear Highlights in the left panel to remove all colored layers at once.

Filtering by Species

Click Highlight by Species on the left panel to open a draggable taxonomic tree. Drill down through the taxonomy, check the species you care about, then pick a color and click Highlight.

Note that the species panel includes Select All and Deselect All buttons. For example, if you want to retain only human and mouse proteins in your subnetwork and remove proteins from all other species, first click Select All, then uncheck only human and mouse. Next, click Highlight so that proteins from all other species are selected. You can then click Edit & Save Network to generate a new species-focused subnetwork.

Highlight by Species panel showing taxonomic tree with E. coli K-12 highlighted in the network

Species panel. Expand the tree by clicking the ▾ arrows. Numbers in parentheses show how many proteins in your network match each taxon. Pick a color swatch (bottom-left) and click Highlight (red circle) to color all matching nodes. Hover any node to see its UniProt ID and species (tooltip shown for P30750 in E. coli K-12).

Tip — Move the panel out of the way

The species panel is draggable — grab its header to move it anywhere on screen so it doesn't cover the graph.

Highlighting SJI Edges

Click Highlight SJI Edges in the left control panel to color edges whose SJI values fall within a selected range. This is useful when you want to visually inspect strong or weak relationship bands without editing the network.

1. Open Highlight SJI Edges. The popup starts with Min set to 0 and Max set to the largest SJI value available in the loaded network.
2. Enter a range from 0 to 1, inclusive. For example, Min 0.5 and Max 1 highlights edges with SJI values between 0.5 and 1. If an invalid value is entered, the fields reset to the full 0 to 1 range.
3. Choose a color swatch and click Highlight. Matching edges are drawn with the selected color and a thicker line so the highlight remains visible even when the original edges are thin.
4. Use Clear Highlights to remove both protein highlights and SJI edge highlights, returning the view to its original visual state.

SJI edges highlighted. Set a Min and Max range, pick a color, and click Highlight to draw matching edges in the selected color with a thicker stroke.
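
The range logic can be summarized in a short sketch. This is an illustration rather than the viewer's code: the function names are hypothetical, and what counts as an invalid value (non-numeric text, values outside 0 to 1, or Min greater than Max) is an assumption based on the description above.

Python
```python
def normalize_range(min_text, max_text):
    """Return (low, high); fall back to the full 0-1 range on invalid input."""
    try:
        low, high = float(min_text), float(max_text)
    except (TypeError, ValueError):
        return 0.0, 1.0
    if not (0.0 <= low <= high <= 1.0):  # also rejects NaN
        return 0.0, 1.0
    return low, high


def edges_in_range(edges, low, high):
    """Select (source, target, sji) tuples whose SJI lies in [low, high]."""
    return [edge for edge in edges if low <= edge[2] <= high]


edges = [("P1", "P2", 0.5), ("P2", "P3", 0.49), ("P3", "P4", 1.0)]
low, high = normalize_range("0.5", "1")
print(edges_in_range(edges, low, high))
```

Both bounds are inclusive here, matching the "0 to 1, inclusive" wording in step 2.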

Tip — Highlight before editing

SJI edge highlighting is visual only. To permanently remove weak or strong edges and save a new network, use Edit & Save Network and its Edges by SJI Weight controls.

In dense networks, highlighting low-SJI edges can be hard to interpret because weak edges may appear throughout the graph. A more useful check is to highlight edges above your chosen threshold. If the edges between two communities remain unhighlighted, those communities are only weakly connected.

Edit and Save Networks

Use Edit & Save Network to remove proteins or edges from the current network view and save the edited result as a new network file. This is useful when you want to focus on a smaller biological subset, remove less relevant species, trim weak SJI edges, or reduce the size of an overly large network.

Edit & Save Network. The popup summarizes visible proteins and edges, then groups editing tools into Selection, Edges by SJI Weight, Protein UniProt ACs, Species, and Save Edited Network sections.

1. Open the popup with the Edit & Save Network button in the left control panel.
2. Use the Selection section to remove proteins from the network. First, confirm that the currently highlighted or previously selected proteins are the ones you intend to remove. If not, click Clear Highlights at the bottom of the left control panel to clear the current selection.

   Next, toggle the blue ★ button in the network viewer; its color will change to gold. You can then click to select clusters or individual proteins for removal. You can also drag the mouse to draw a selection box and select multiple nodes at once.

   Finally, click Remove Selected to remove the selected nodes. To undo the changes and restore the network to its previous state, click Restore Edits.
3. Use Edges by SJI Weight to trim weak links. For example, Remove Edges Below 0.5 removes edges with SJI below 0.5, while Restore Edges Above 0.3 can restore previously removed edges with SJI above 0.3.
4. Use Protein UniProt ACs or Species when you want to remove or restore proteins by accession number or taxonomic group.
5. Enter a new name in Save Edited Network and click Save. The edited CSV is written to the mounted network folder and can be loaded from the viewer like any other network.
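
The interplay between Remove Edges Below and Restore Edges Above in step 3 can be pictured as moving edges between a visible list and a removed list. The sketch below is only a model of that behavior: the class name is hypothetical, and the handling of edges that sit exactly on a threshold is an assumption.

Python
```python
class EdgeEditor:
    """Minimal model of trimming and restoring edges by SJI weight."""

    def __init__(self, edges):
        self.visible = list(edges)  # (source, target, sji) tuples currently shown
        self.removed = []           # edges trimmed so far

    def remove_below(self, threshold):
        """Move visible edges with SJI strictly below the threshold to `removed`."""
        self.removed += [e for e in self.visible if e[2] < threshold]
        self.visible = [e for e in self.visible if e[2] >= threshold]

    def restore_above(self, threshold):
        """Bring back removed edges with SJI strictly above the threshold."""
        self.visible += [e for e in self.removed if e[2] > threshold]
        self.removed = [e for e in self.removed if e[2] <= threshold]


editor = EdgeEditor([("P1", "P2", 0.2), ("P2", "P3", 0.4), ("P3", "P4", 0.6)])
editor.remove_below(0.5)   # trims the 0.2 and 0.4 edges
editor.restore_above(0.3)  # brings the 0.4 edge back
print(sorted(e[2] for e in editor.visible))
```

After both calls only the 0.2 edge stays removed, which mirrors the example in step 3.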

Keep the original network

Save edits under a new name rather than overwriting the source network. This keeps the original extraction available if you need to revisit the full topology later.

Export & Analysis Panel

In the viewer panel, you interact with the network, while the right-side panel displays the proteins selected for detailed analysis. The system therefore has two modes: viewer mode and export mode. These modes are controlled by the blue star ★ button at the top right of the viewer panel.

Both individual protein nodes and entire clusters can be exported. Toggle the blue star ★ button so that it turns gold; this switches the system to export mode. In this mode, clicking a node no longer collapses or expands a cluster. Instead, it selects or deselects the node.

In export mode, click the ➡ arrow immediately below the blue star ★ button to send the selected nodes to the export panel.

From the export panel, you can sort proteins by degree centrality, open direct UniProt links, select different UniProt annotation categories, download the protein list, or send the proteins for batch analysis.

Export and Analysis panel with UniProt section buttons; Sort by ProtDC and Expression button annotated

Export & Analysis panel. The top box lists all selected proteins and clusters. When valid entries are available, proteins are automatically linked to their corresponding UniProt records. The UniProt section buttons below (Function, Names & Taxonomy, Expression, …) re-target every link in the list to that section of UniProt, which is handy for quickly comparing, say, expression data across the whole selection. Click Function to point the links back to the top of each entry.

Tooltip Sort proteins by ProtDC over the right-arrow icon

Sort by ProtDC. The → right-pointing arrow icon (red circle) sorts the protein list by ProtDC centrality — proteins most central to their neighborhood appear first. Click again to toggle ascending/descending. Use the ⬇ icon to download the list as a CSV; the 🗑 icon clears the panel.

What is ProtDC?

ProtDC (Protein Degree Centrality) measures how well-connected a protein is within its neighborhood. Higher ProtDC = more central within its cluster, so sorting top-down surfaces likely "hub" candidates first.
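
As an illustration of the idea, the sketch below computes a normalized degree centrality restricted to each protein's own cluster. The exact ProtDC formula is not given in this tutorial, so treat this as one plausible reading: the function name, the input structures, and the normalization by cluster size minus one are assumptions.

Python
```python
from collections import defaultdict


def degree_centrality_by_cluster(edges, cluster_of):
    """Within-cluster degree divided by (cluster size - 1), per protein.

    edges: iterable of unique undirected (protein_a, protein_b) pairs.
    cluster_of: dict mapping each protein to its neighborhood label.
    """
    cluster_size = defaultdict(int)
    for cluster in cluster_of.values():
        cluster_size[cluster] += 1

    degree = defaultdict(int)
    for a, b in edges:
        # Count only edges whose two ends share a cluster.
        if a != b and a in cluster_of and cluster_of[a] == cluster_of.get(b):
            degree[a] += 1
            degree[b] += 1

    scores = {}
    for protein, cluster in cluster_of.items():
        size = cluster_size[cluster]
        scores[protein] = degree[protein] / (size - 1) if size > 1 else 0.0
    return scores


clusters = {"A": "NH1", "B": "NH1", "C": "NH1", "D": "NH2"}
scores = degree_centrality_by_cluster([("A", "B"), ("A", "C"), ("A", "D")], clusters)
# Sort top-down so the most central proteins come first.
print(sorted(scores.items(), key=lambda item: -item[1]))
```

Sorting the scores in descending order corresponds to the top-down view in the export panel, where likely hub candidates come first.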

Batch & Comparative Analysis

The Batch & Comparative Analysis popup (accessible from the UniProt section buttons) lets you define named groups of proteins and save them as separate files for cross-group comparison.

1. Click Batch & Comparative Analysis in the right panel section controls. A draggable popup appears.
2. Enter a group name (letters, numbers, _, -, and . only; max 16 characters) and paste UniProt ACs from the Export panel into the accessions area. ProtDC values in parentheses are stripped automatically.
3. Click + to add more groups. Groups must have unique names.
4. Click Save. The server writes one text file per group to data/exports/. You can then load these files into the example Colab notebooks for structural alignment or phylogenetic analysis.

Batch & Comparative Analysis. Create named groups, paste UniProt ACs from the Export panel, add more groups as needed, and save each group as a text file for downstream comparison.
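
If you prepare group lists in a script rather than in the popup, the same clean-up can be reproduced as follows. This is a hedged sketch: the function names are hypothetical, and the assumed paste format is one accession per line followed by its ProtDC value in parentheses, such as "P04637 (0.83)".

Python
```python
import re

GROUP_NAME = re.compile(r"[A-Za-z0-9_.-]{1,16}")  # rule quoted in step 2


def clean_accessions(pasted_text):
    """Drop parenthesized ProtDC values and return the bare accessions."""
    without_scores = re.sub(r"\([^)]*\)", " ", pasted_text)
    return [token for token in re.split(r"[\s,;]+", without_scores) if token]


def validate_group_names(names):
    """Raise ValueError on an invalid or repeated group name."""
    seen = set()
    for name in names:
        if not GROUP_NAME.fullmatch(name):
            raise ValueError("invalid group name: %r" % name)
        if name in seen:
            raise ValueError("duplicate group name: %r" % name)
        seen.add(name)
    return list(names)


pasted = "P04637 (0.83)\nO15151 (0.50)\nP30750"
print(clean_accessions(pasted))
```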

Example workflows

The popup contains links to two Colab notebooks: one for pairwise structural alignment with USalign, and one for multiple sequence alignment and DIVERGE v4 analysis of evolutionary shifts.

Structural alignment workflow

Multiple sequence alignment and DIVERGE workflow

URL Parameters & Easy Access

The pages accept query parameters for deep-linking and automation.

viewer.html

Parameter	Example	Effect
network	`?network=my_proteins.csv`	Loads and displays this network automatically on page open
seeds	`&seeds=P04637,O15151`	Pre-populates the highlight search field with these accessions and executes the search immediately

Example: open the viewer on a specific network with two proteins pre-filled in the highlight box:

URL
```
http://localhost:3000/viewer.html?network=my_proteins.csv&seeds=P04637,O15151
```
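
When generating many such links from a script, it is safer to let a URL library do the encoding. The helper below is a small sketch using Python's standard urllib.parse; the function name viewer_url is hypothetical, and only the network and seeds parameters described in the table above are used.

Python
```python
from urllib.parse import urlencode


def viewer_url(network, seeds=(), base="http://localhost:3000/viewer.html"):
    """Build a viewer deep link for a network file and optional seed accessions."""
    params = {"network": network}
    if seeds:
        params["seeds"] = ",".join(seeds)
    # safe="," keeps the comma-separated seed list readable in the URL.
    return base + "?" + urlencode(params, safe=",")


print(viewer_url("my_proteins.csv", ["P04637", "O15151"]))
```

With these arguments the helper reproduces the example URL shown above.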

Tip — Bookmarking sessions

Bookmark or copy the viewer URL containing ?network= (and, optionally, &seeds=) to return to a specific network with its seed proteins already entered in the highlight box.

Build Your Own SJI Network

Get the source

This workflow runs outside the Docker container. Clone the repository to access and edit the pipeline scripts:

Shell
```
git clone https://github.com/gang-fang/network-viz-platform.git
```

The package includes a template workflow for building a custom SJI network from a selected set of UniProt reference proteomes. The detailed scripts are located in:

Path
```
tools/preprocessing/pipelines/build_your_net_scripts
```

Run the workflow in the order indicated by the script names. Each script validates the expected inputs from earlier steps and writes outputs used by later steps.

1. 01_download_proteome_list.sh downloads and prepares the proteome FASTA files listed in your proteome ID file.
2. 02_run_topn_all_vs_all.sh runs all-against-all topN comparisons. Build or install the topn executable from the cloned repository before running this step.
3. 03_run_make_2d_all_vs_all.sh converts the topN outputs into two-dimensional similarity inputs for signal detection.
4. 04_run_son_spectral_cluster_all_vs_all.sh runs son_spectral_final.py to separate signal from noise for the 2D similarity outputs.
5. 05_run_build_network_all_vs_all.sh builds the SJI network edge table from the detected signal sets.
6. 06.1_leiden_01_preprocess.sh preprocesses the SJI edge table for Leiden clustering by converting node IDs into zero-based integer IDs.
7. 06.2_leiden_02_global.sh runs global Leiden clustering.
8. 06.3_leiden_03_refine.sh refines neighborhoods and optionally parallelizes refinement jobs.
9. 06.4_leiden_04_restore_uniprot_ids.sh restores UniProt accessions after integer-ID clustering.
10. 07_build_uAC_taxonomy_mapping.sh builds the UniProt accession to taxonomy mapping used by node attributes, along with two files required by the package: commontree.txt and NCBI_txID.csv. Copy these two files to /data/NCBI_txID/.
11. 08_run_ct_attr.sh creates the final .nodes.attr file for the viewer. Move this file to /data/nodes_attr/.
12. 09_run_preprocess_graph.sh preprocesses the graph into binary index files used by fast subnetwork extraction. Move these files to /data/indexes/.
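
The integer-ID round trip in the Leiden steps (06.1 and 06.4) follows a common pattern: map each accession to a zero-based integer, cluster on the integers, then map the results back. The sketch below illustrates only that pattern with in-memory data. The function names and data layout are hypothetical and do not reflect the file formats used by the actual scripts.

Python
```python
def to_integer_ids(edges):
    """Replace node names with zero-based integers in order of first appearance."""
    index = {}
    int_edges = []
    for source, target, weight in edges:
        for node in (source, target):
            if node not in index:
                index[node] = len(index)
        int_edges.append((index[source], index[target], weight))
    return int_edges, index


def restore_ids(membership, index):
    """Map integer-keyed cluster assignments back to the original accessions."""
    name_of = {number: name for name, number in index.items()}
    return {name_of[number]: cluster for number, cluster in membership.items()}


edges = [("P04637", "O15151", 0.9), ("O15151", "P30750", 0.4)]
int_edges, index = to_integer_ids(edges)
# Pretend a clustering step returned these labels for nodes 0, 1, and 2.
clusters = restore_ids({0: "NH1", 1: "NH1", 2: "NH2"}, index)
print(int_edges, clusters)
```

Keeping the index alongside the clustering output is what makes the final restore step possible, so it should be written to disk together with the integer edge table.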

Configure paths before running

Before starting, open the scripts and update the editable variables for your environment. In particular, set WORK_ROOT, input proteome locations, output folders, Python environment activation, and script paths such as MAKE_2D, SON, SJI_NET, Leiden scripts, CT_ATTR, and PREPROCESS_GRAPH. Supporting tools live in the cloned repository under directories such as tools/bin and tools/preprocessing, but the analysis data directories should point to your own mounted volume or working filesystem.

Large-scale runs

For large proteome sets, expect to adapt the template workflow to your HPC or AWS environment. The most expensive steps are usually all-against-all topN and son_spectral_final.py. These steps may need array jobs, job batching, or other parallel execution strategies that match your scheduler and storage layout.

Monitor memory

Memory usage can vary strongly with proteome size and the number of intermediate similarity records. If son_spectral_final.py becomes memory-limited, splitting proteomes or 2D inputs into smaller chunks can substantially improve throughput in parallel-computing environments. The best chunking strategy depends on the local HPC or AWS configuration.
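
One simple way to split a proteome is to cut its FASTA records into fixed-size chunks. The sketch below shows the idea on an in-memory example. The function names are hypothetical, and how the resulting chunks are passed to topN or son_spectral_final.py depends on your copy of the pipeline scripts, which this sketch does not cover.

Python
```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, parts = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(parts)
            header, parts = line[1:], []
        elif header is not None:
            parts.append(line)
    if header is not None:
        yield header, "".join(parts)


def chunk_records(records, chunk_size):
    """Group records into lists of at most chunk_size items."""
    chunk = []
    for record in records:
        chunk.append(record)
        if len(chunk) == chunk_size:
            yield chunk
            chunk = []
    if chunk:
        yield chunk


fasta_text = ">a\nMK\nLV\n>b\nAA\n>c\nGG\n"
for number, chunk in enumerate(chunk_records(read_fasta(fasta_text.splitlines()), 2)):
    print(number, [header for header, _sequence in chunk])
```

Both functions are generators, so a large proteome can be streamed from disk without holding every sequence in memory at once.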

Annotate Your Own Proteins

Get the source

This workflow runs outside the Docker container. Clone the repository to access and edit the pipeline scripts:

Shell
```
git clone https://github.com/gang-fang/network-viz-platform.git
```

A second workflow lets you attach your own proteins to a pre-built SJI network. This is useful when you have proteins of interest that are not part of the SJI network and want to infer possible annotations from how they connect to annotated proteins in the network.

The detailed workflow is located in:

Path
```
tools/preprocessing/pipelines/annotate_proteins_scripts
```

Interpretation

The resulting values should be treated as pseudo-SJI scores. Standard SJI estimation is most reliable when a complete proteome is available. For individually supplied proteins, only one-way SJI relationships can be computed, so the scores are useful for annotation guidance but should not be interpreted as fully equivalent to standard SJI values.

Run the workflow in the order indicated by the script names:

1. 01_addP_topn_targets.sh runs topN from your query protein FASTA against the target reference proteome files. Build or install the topn executable from the cloned repository before running this step.
2. 02_run_make_2d.sh converts the topN output into two-dimensional similarity inputs for each query protein.
3. 03_run_son_spectral.sh runs the spectral clustering script son_spectral_final.py to identify signal proteins for the query proteins.
4. 04_run_build_network_oneway.sh builds one-way SJI-style edges between your query proteins and proteins already represented in the SJI network. Each query protein has an SJI edge report that lists its neighbors in the pre-built network, ordered from closest to farthest. Use the closest neighbors as seeds to extract subnetworks, then concatenate the edge reports with the extracted subnetworks using the cat command or any plain-text editor to attach your proteins to the networks. Move your custom networks to /data/networks and load them from viewer.html. Enter your own protein IDs in the Highlight Proteins box to locate them in the networks.
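
The concatenation in the last step can also be scripted. The sketch below behaves like cat, with one addition: it inserts a line break when a file does not end with one, so the last row of one file is not fused with the first row of the next. The function name and the file names in the usage note are hypothetical, and the sketch assumes the edge report and the extracted subnetwork share the same column layout, which you should verify for your own files.

Python
```python
from pathlib import Path


def concatenate_files(input_paths, output_path):
    """Write the inputs to output_path in order, one after another."""
    with open(output_path, "w", encoding="utf-8") as out:
        for path in input_paths:
            text = Path(path).read_text(encoding="utf-8")
            out.write(text)
            # Guard against a missing final newline in the source file.
            if text and not text.endswith("\n"):
                out.write("\n")
```

A call such as concatenate_files(["my_subnetwork.csv", "my_query_edges.csv"], "my_custom_network.csv") would produce a single file that can then be moved into the networks folder.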

Configure paths before running

This workflow starts from your own query FASTA file. Before running the scripts, update the editable variables for your environment, including DB, QUERY, OUT_FOLDER, TOPN_OUT, TWO_D_OUT, QUERY_DIR, SIGNAL_DIR, Python environment activation, and script paths such as MAKE_2D, SON, and SJI_NET. Supporting tools live in the cloned repository under tools/bin and tools/preprocessing, but the input and output directories should point to your own mounted volume or working filesystem.

Use the result

The output network can be loaded with the viewer like other network CSV files. Inspect where your query proteins connect, review neighboring UniProt annotations, and use highlighting or export tools to compare candidate functional contexts.

Annotate Your Own Proteome

Get the source

This workflow runs outside the Docker container. Clone the repository to access and edit the pipeline scripts:

Shell
```
git clone https://github.com/gang-fang/network-viz-platform.git
```

This workflow lets you attach an entire user-supplied proteome to a pre-built SJI network. It is useful when you want to infer potential annotations for many proteins by examining how the added proteome connects to proteins already present in the pre-built SJI network.

The detailed workflow is located in:

Path
```
tools/preprocessing/pipelines/annotate_wholeProteome_scripts
```

Run the workflow in the order indicated by the script names:

1. 01_download_proteome_run_topn_targets.sh downloads the query proteome, normalizes FASTA headers, and runs topN from the query proteome against the reference target proteomes.
2. 02_run_make_2d_proteome.sh converts the query-proteome topN outputs into 2D similarity inputs.
3. 03_run_son_spectral_proteome.sh runs the spectral clustering script son_spectral_final.py to identify signal proteins for the query proteome.
4. 04_prepare_ProT_download_fastas.sh identifies target proteins needed for the reverse side of the comparison and prepares their FASTA files.
5. 05_run_topn_T_against_query_proteome.sh runs topN from those target proteins back against the query proteome.
6. 06_run_make_2d_T.sh converts the reverse topN results into 2D similarity inputs.
7. 07_merge_existing_2d_with_new_T_2d.sh merges existing reference 2D files with the newly computed reverse-comparison 2D files.
8. 08_run_son_spectral_T.sh runs the spectral clustering script son_spectral_final.py again for the target-side signal sets.
9. 09_run_build_network_addProteome.sh builds the final add-proteome network edges connecting the query proteome to the pre-built SJI network. Each query protein has an SJI edge report that lists its neighbors in the pre-built network, ordered from nearest to farthest. You can use a strategy similar to the one described in the protein annotation section to identify the closest neighbors of your proteins of interest and then examine each protein individually. The subnetwork extraction tool, tools/runtime/extract_subnetwork.py, can also be used as a standalone script, allowing this workflow to be automated for proteome annotation by training a parameter-selection system.

Configure paths before running

Before running the scripts, update the editable variables for your environment. Common variables include PROTEOME_ID, WORK_ROOT, DB, TWO_D_HOME, OUT_FOLDER, TOPN_OUT, TWO_D_ROOT, VENV_ACT, and script paths such as MAKE_2D, SON, and SJI_NET. Supporting tools live in the cloned repository under tools/bin and tools/preprocessing, but the input and output directories should point to your own mounted volume or working filesystem.

Large-scale runs

For large proteomes, expect to adapt the workflow to your AWS or HPC environment. Computationally intensive steps such as topN and son_spectral_final.py may require array jobs, batching, or other parallel execution strategies that match your scheduler and storage layout.

Monitor memory

Memory usage should be monitored carefully. In some cases, splitting proteomes into smaller chunks and providing those smaller inputs to son_spectral_final.py can substantially improve performance in parallel-computing environments. The best optimization strategy depends strongly on the specific AWS or HPC configuration, so this tutorial provides general guidance rather than cluster-specific scripts.

From network topology and neighborhood structure to biology insight
