Tools for Code and Analysis

The applications you use in the Enclave will depend on your tasks and prior experience. Refer to the following recommendations.

Figure: Tools are optimized for different analysis and source types

Analytic and Operations Applications

In the Enclave, the left sidebar contains “Applications”. From here you can access all applications related to analytics and code operations. Here are some relevant applications for the average N3C researcher:

  • Code Workbook

  • Code Repository

  • Contour

  • Fusion

  • Notebook

  • Reports

Figure: Find all Applications in the Enclave

Manual Data Entry

Fusion

A spreadsheet-like tool for data analysis, limited to importing up to 2,000 rows of an Enclave table.

Use for:

  • Manual data entry into smaller datasets, such as curating lists of concept sets.

  • Keeping track of developed concept sets and easily inputting them into Logic Liaison Templates.

  • Creating datasets based on your spreadsheets. You can either sync a whole sheet to a dataset or select a table range to be synced. After the data is successfully synced to a dataset in Foundry, it can be imported into any other Enclave application.


Code Based Analysis and Visualization

The Code Workbook and Code Repository provide tools for discovering, exploring, and analyzing clinical data. Researchers can request additional packages via the N3C Support Desk.

R & Python: Fully supported, with pre-installed packages such as tidyverse (R) and pandas and scikit-learn (Python). The Code Workbook offers a graphical interface for data analysis and workflow management. See the Foundry documentation for details.

Apache Spark: Spark SQL handles and queries structured data, supporting filtering, joining, and aggregating of large datasets. It can be used natively or through R (SparkR) and Python (PySpark). See the Foundry documentation for details.
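As a rough, self-contained illustration of that filter/join/aggregate pattern, the PySpark sketch below counts patients by birth decade for a single condition concept. The table shapes mimic OMOP's person and condition_occurrence tables, but the rows and the concept ID are made up for the example; this is not a prescribed N3C query.

```python
# Standalone PySpark sketch of filtering, joining, and aggregating.
# Table shapes follow OMOP conventions; the rows and the concept ID
# used in the filter are illustrative only.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("spark_pattern_demo").getOrCreate()

person = spark.createDataFrame(
    [(1, 1952), (2, 1988), (3, 2001)],
    ["person_id", "year_of_birth"],
)
condition_occurrence = spark.createDataFrame(
    [(1, 11111), (2, 11111), (2, 22222), (3, 22222)],
    ["person_id", "condition_concept_id"],
)

# Filter to one concept, join in demographics, aggregate by birth decade.
target = condition_occurrence.filter(F.col("condition_concept_id") == 11111)
summary = (
    target.join(person, on="person_id", how="inner")
    .withColumn("birth_decade", (F.col("year_of_birth") / 10).cast("int") * 10)
    .groupBy("birth_decade")
    .agg(F.countDistinct("person_id").alias("n_patients"))
)
summary.show()
```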

Code Workbook:

Prepare original analytic datasets from raw OMOP tables. Multiple transformations can be strung together to create an analysis pipeline using SQL, R, Python, or a mix of these.

Use for:

Workbooks allow you to import and transform datasets using available code templates for various purposes (a sketch of the transform pattern follows this list):

  • Cleaning and joining raw data from external sources to produce curated datasets.

  • Analyzing processed data to derive useful insights.

  • Training and applying models for predictive analysis, e.g., investigating the results of a clinical trial by testing different significance thresholds.

  • Creating parameterized visualizations for reports to share with others.

  • One-time capture of data that is then used in another analytical application.
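
Below is a minimal sketch of that pattern, assuming hypothetical dataset and column names: in a Code Workbook, each Python transform is a function whose arguments are upstream datasets (received as Spark DataFrames) and whose return value becomes a new dataset that the next transform can consume.

```python
# Sketch of chained Code Workbook transforms; dataset and column names
# are hypothetical stand-ins for OMOP-style inputs.
from pyspark.sql import functions as F

def cleaned_conditions(condition_occurrence):
    # Step 1: drop rows missing fields that downstream steps rely on.
    return condition_occurrence.dropna(
        subset=["person_id", "condition_concept_id", "condition_start_date"]
    )

def patient_condition_counts(cleaned_conditions):
    # Step 2: aggregate the cleaned data into one row per patient.
    return (
        cleaned_conditions
        .groupBy("person_id")
        .agg(F.countDistinct("condition_concept_id").alias("n_distinct_conditions"))
    )
```

Each function corresponds to one node in the workbook's graph, which is how SQL, R, and Python steps can be mixed along the same pipeline.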


Code Repository:

Like the Code Workbook, the Code Repository can be used to prepare original analytic datasets from raw OMOP tables, for example for the team's statistician. The repository is best used, however, to share code across multiple Code Workbooks or projects, or to develop a robust production pipeline (a sketch follows the list below).

Use for:

  • A daily pipeline at high data scale which requires incremental compute.

  • A high-visibility pipeline with strict governance requiring the ability to revert to previous versions of historical code, or to gate code changes on successful unit tests.
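
For illustration, here is a minimal sketch of a repository-style transform written against Foundry's transforms API; the dataset paths are placeholders, not real N3C paths.

```python
# Sketch of a Code Repository transform. Because repository code is
# version-controlled, a transform like this can be code-reviewed,
# gated on unit tests, and reverted to earlier versions.
from transforms.api import transform_df, Input, Output
from pyspark.sql import functions as F

@transform_df(
    Output("/My_Project/derived/patient_condition_counts"),   # placeholder path
    conditions=Input("/My_Project/cleaned/condition_occurrence"),  # placeholder path
)
def compute(conditions):
    # One output row per patient, counting distinct condition concepts.
    return (
        conditions
        .groupBy("person_id")
        .agg(F.countDistinct("condition_concept_id").alias("n_distinct_conditions"))
    )
```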

NOTE: There is no restriction on downloading code that you developed in the Enclave, as long as your code does not contain patient data (raw or derived) embedded in it (e.g., in code comments).

 



Use Logic Liaison Code Templates in your Code Workbook

Logic Liaison code templates accelerate N3C analysis by providing commonly used variables and methods to quickly add custom elements. To find these templates, enter “Logic Liaison Templates” into the N3C Knowledge Store search field.

These code templates can be added to your Project Workspace or used directly in your Code Workbooks. There are two types of templates:

→ Logic Liaison Facts Templates

are used to generate the base fact tables. Other fact templates utilize the day-level and person-level datasets of the base fact templates to efficiently generate additional derived variables.

→ Logic Liaison Quality Control (QC) Templates

are used to assess available data, missing or sparse data by site, and overall data quality in the Enclave.
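
As a loose illustration of how a template's output might be consumed downstream, the sketch below filters a person-level fact table inside a Code Workbook transform. The input name and indicator columns are hypothetical; the actual variables depend on which templates you configure.

```python
# Hypothetical downstream use of a Logic Liaison person-level fact table.
from pyspark.sql import functions as F

def covid_diabetes_cohort(person_level_fact_table):
    # Both indicator columns are invented examples of derived variables;
    # real templates define their own column names.
    return (
        person_level_fact_table
        .filter((F.col("COVID_positive_indicator") == 1)
                & (F.col("DIABETES_indicator") == 1))
        .select("person_id")
    )
```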





Code-Free Analysis and Dashboards

Contour

Contour offers a point-and-click, programming-free interface for developing data analysis pipelines on tables at scale. These analyses can generate data summaries and visualizations, which can be integrated into interactive, dynamically updating dashboards.

Use for:

  • Filtering, merging, and modifying datasets through a user-friendly graphical interface. Useful for initial data filtering and preprocessing.

  • Organizing complex analyses into analytical paths.

  • Creating interactive dashboards to share findings.

  • Producing basic visualizations like histograms and heat maps.

  • Leveraging the Contour expression language for more advanced transformations and aggregations.

  • Handling Apache Spark DataFrame operations automatically, resulting in tables for further analysis.

  • Saving analysis results as a new dataset for use in other Foundry tools; indicated for one-time capture of data that is then used in another analytical application.


Report Findings

Notepad

Notepad is a tool within the Enclave used to consolidate various research artifacts, such as summary datasets, statistical analyses, and visualizations, into a single coherent document. It allows users to embed formatted tables, charts, and images from multiple sources, add titles and captions, create sections, and provide narrative structure using Markdown, all through a point-and-click interface.

It is used to report results for secure team dissemination within the Enclave environment.

The main difference between Notepad and a Contour Dashboard is that Notepad provides a static report with figures that cannot be dynamically changed by the reader.



Other Applications

Data Lineage (Monocle)

The Data Lineage tool (Monocle) provides details on dataset schemas, build dates, and the code that generated them, facilitating build scheduling and verification of data curation methods. The application allows you to:

  • find datasets

  • visualize data pipelines in real time

  • assess the origins and relationships of datasets through an intuitive, color-coded interface
