rsynthbio is an R package that provides a convenient
interface to the Synthesize
Bio API, allowing users to generate realistic gene expression data
based on specified biological conditions. This package enables
researchers to easily access AI-generated transcriptomic data for
various modalities including bulk RNA-seq and single-cell RNA-seq.
Alternatively, you can generate datasets through our web platform.

You can install rsynthbio from CRAN:

install.packages("rsynthbio")
To get the development version, install it from GitHub with the
remotes package:

# Install remotes first if it is not already available
if (!requireNamespace("remotes", quietly = TRUE)) {
  install.packages("remotes")
}
remotes::install_github("synthesizebio/rsynthbio")

Once installed, load the package:

library(rsynthbio)
Before using the Synthesize Bio API, you need to set up your API token. The package provides a secure way to handle authentication:
# Securely prompt for and store your API token
# The token will not be visible in the console
set_synthesize_token()
# You can also store the token in your system keyring for persistence
# across R sessions (requires the 'keyring' package)
set_synthesize_token(use_keyring = TRUE)

Loading your API key for a session.
# In future sessions, load the stored token
load_synthesize_token_from_keyring()
# Check if a token is already set
has_synthesize_token()

You can manually set the token, but don’t commit it to version control!
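One common way to keep a manually set token out of version control is to store it in an environment variable (for example in `~/.Renviron`, which is not part of your project) rather than in a script. The sketch below only checks whether such a variable is populated. The variable name `SYNTHESIZE_API_KEY` and the helper `token_available()` are assumptions made for illustration, so confirm the name the package actually reads in `?set_synthesize_token`.

```r
# Assumption: the token is kept in an environment variable named
# SYNTHESIZE_API_KEY (e.g. defined in ~/.Renviron, outside the repository).
# The name is illustrative; check the package documentation to confirm it.
token_var <- "SYNTHESIZE_API_KEY"

# Hypothetical helper: TRUE when the variable holds a non-empty value
token_available <- function(var = token_var) {
  nzchar(Sys.getenv(var, unset = ""))
}

if (!token_available()) {
  message(
    "No token found in ", token_var,
    "; run set_synthesize_token() in an interactive session."
  )
}
```

Because the check uses only base R, it can run at the top of a script without exposing the token value itself.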
You can obtain an API token by registering at Synthesize Bio.
Synthesize Bio provides several types of models for different use cases:
Generate synthetic gene expression data from metadata alone. You describe the biological conditions (tissue type, disease state, perturbations, etc.) and the model generates realistic expression profiles.
gem-1-bulk: Bulk RNA-seq baseline model
gem-1-sc: Single-cell RNA-seq baseline model
See the Baseline Models vignette for detailed usage.
Generate expression data conditioned on a real reference sample. This allows you to “anchor” to an existing expression profile while applying perturbations or modifications.
gem-1-bulk_reference-conditioning: Bulk RNA-seq reference conditioning model
gem-1-sc_reference-conditioning: Single-cell RNA-seq reference conditioning model
See the Reference Conditioning vignette for detailed usage.
Infer metadata from observed expression data. Given a gene expression profile, predict the likely biological characteristics (cell type, tissue, disease state, etc.).
gem-1-bulk_predict-metadata: Bulk RNA-seq metadata prediction model
gem-1-sc_predict-metadata: Single-cell RNA-seq metadata prediction model
See the Metadata Prediction vignette for detailed usage.
Only baseline models are available to all users. To check programmatically
which models are available to you, call list_models().
Contact us at support@synthesize.bio if you have any questions.
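A minimal sketch of the model check described above. It calls `list_models()` only when the package is installed and a token is set, and it inspects the result with `str()` rather than assuming a particular return structure. The variable name `available` is illustrative.

```r
# Sketch: query the available models only when the package and a token exist.
available <- NULL

if (requireNamespace("rsynthbio", quietly = TRUE) &&
    rsynthbio::has_synthesize_token()) {
  available <- tryCatch(
    rsynthbio::list_models(),
    error = function(e) {
      # Network or authentication problems end up here
      message("Could not list models: ", conditionMessage(e))
      NULL
    }
  )
}

# The shape of the returned object is not assumed; inspect it directly
if (!is.null(available)) {
  str(available)
}
```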
Here’s a quick example using a baseline model:
# Get an example query structure
query <- get_example_query(model_id = "gem-1-bulk")$example_query
# Submit the query and get results
result <- predict_query(query, model_id = "gem-1-bulk")
# Access the results
metadata <- result$metadata
expression <- result$expression

For more detailed examples and advanced usage, see the model-specific vignettes linked above.
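Before analysing the output of the quick example, it can help to confirm that the metadata and expression objects line up. The sketch below assumes both are tabular with one row per sample, which you should verify against a real result. `summarise_result()` is a hypothetical helper, and `mock_result` is made-up data (with placeholder gene names) used only so the example runs without an API call.

```r
# Hypothetical helper: report basic dimensions of a result list.
# Assumption: result$metadata and result$expression are tabular,
# with one row per sample.
summarise_result <- function(result) {
  stopifnot(
    is.list(result),
    all(c("metadata", "expression") %in% names(result))
  )
  list(
    n_samples = NROW(result$metadata),
    expression_dim = dim(as.matrix(result$expression))
  )
}

# Mock data for illustration only; not real model output
mock_result <- list(
  metadata = data.frame(sample_id = c("s1", "s2")),
  expression = data.frame(
    GENE_A = c(10, 0),
    GENE_B = c(3, 7),
    GENE_C = c(5, 1)
  )
)

summary_info <- summarise_result(mock_result)
str(summary_info)
```

With a real result, pass the object returned by `predict_query()` in place of `mock_result`.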