SyGra 2.0.0 introduces Studio, an interactive environment that turns synthetic data generation into a transparent, visual craft. Instead of juggling YAML files and the terminal, you create flows directly on a canvas, preview datasets before committing, adjust prompts with inline variable hints, and monitor execution streams live, all from a single pane. Under the hood it's the same platform: every visual operation generates a corresponding SyGra-compatible graph configuration and task execution script.
What you can do with Studio
- Configure and validate models using guided forms (OpenAI, Azure OpenAI, Ollama, Vertex, Bedrock, vLLM, custom endpoints).
- Connect to Hugging Face, file system, or ServiceNow data sources and preview rows before execution.
- Configure nodes by selecting a model, writing prompts (with auto-suggested variables), and defining outputs or structured schemas.
- Shape downstream outputs with shared state variables and mappings powered by Pydantic.
- Run flows end-to-end and instantly see generated results along with node-level progress.
- Debug with inline logging, breakpoints, a Monaco-assisted code editor, and autosaved drafts.
- Monitor per-execution token cost, latency, and guardrail results via execution history stored in .executions/.
Let’s walk through this experience step by step.
Step 1: Configure the data source
When you open Studio and click Create Flow, the start/end nodes automatically appear. Before adding anything else:
- Select your connector (Hugging Face, Disk, or ServiceNow).
- Enter parameters such as repo_id, split, or file path, and click Preview to fetch sample rows.
- Column names automatically become state variables (e.g. {prompt}, {genre}), so you know exactly what you can reference in prompts and processors.
Once validation is complete, Studio keeps your configuration in sync and pipes these variables throughout your flow. No manual wiring or guesswork required.
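To make the idea concrete, here is a rough sketch of what a generated data source block could look like. The key names and the repo_id are illustrative placeholders, not the exact SyGra schema; inspect the YAML Studio writes for the real shape.

```yaml
# Hypothetical sketch of a Studio-generated data source block;
# key names are illustrative and may differ from the real schema.
data_config:
  source:
    type: hf                      # or disk / servicenow
    repo_id: your-org/your-dataset   # placeholder dataset
    split: train
```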
Step 2: Build your flow visually
Drag the required block from the palette. For the story generation pipeline:
- Drop an LLM node named “Story Generator”, select a configured model (such as gpt-4o-mini), write a prompt, and save the result to story_body.
- Add a second LLM node named “Story Summarizer”, reference {story_body} in its prompt, and output to story_summary.
- If you need reusable logic or branching behavior, toggle structured outputs, connect tools, and add Lambda/Subgraph nodes.
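As a rough sketch, the two nodes above would map to a graph configuration along these lines. Node and field names here are illustrative guesses, not the exact SyGra schema; the code panel shows the authoritative YAML.

```yaml
# Hypothetical shape of the generated graph config; verify keys
# against the YAML Studio actually writes to tasks/examples/.
graph_config:
  nodes:
    story_generator:
      node_type: llm
      model: gpt-4o-mini
      prompt:
        - user: "Write a short {genre} story based on: {prompt}"
      output_key: story_body
    story_summarizer:
      node_type: llm
      model: gpt-4o-mini
      prompt:
        - user: "Summarize this story: {story_body}"
      output_key: story_summary
  edges:
    - from: story_generator
      to: story_summarizer
```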
Studio’s details panel keeps everything in context: model parameters, prompt editor, tool configuration, pre/post-processing code, and even multi-LLM settings if you need parallel generation. Type { in the prompt editor to instantly display all available state variables.
Step 3: Verify and run
Open the code panel to inspect the exact YAML/JSON that Studio is generating. This is the same artifact written to tasks/examples/, so what you see is what gets committed.
When you’re ready to run:
- Click Run Workflow.
- Select the number of records, batch size, retry behavior, etc.
- Press “Run” to see streaming node status, token usage, latency, and cost in real time in the run panel.
- Detailed logging provides observability and makes debugging easier.
- All executions are written to .executions/runs/*.json.
After a run, download the output, compare it to previous runs, and get metadata for latency and usage details.
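Because execution records are plain JSON files, you can also aggregate them yourself. The sketch below assumes a minimal record shape with `tokens` and `latency_ms` fields; those names are hypothetical, so check an actual file under .executions/runs/ for the real schema.

```python
import glob
import json

def summarize_runs(records):
    """Aggregate token usage and latency across execution records.

    The field names (``tokens``, ``latency_ms``) are illustrative;
    inspect a real .executions/runs/*.json file for the actual schema.
    """
    total_tokens = sum(r.get("tokens", 0) for r in records)
    avg_latency = (
        sum(r.get("latency_ms", 0) for r in records) / len(records)
        if records
        else 0.0
    )
    return {"total_tokens": total_tokens, "avg_latency_ms": avg_latency}

def load_runs(pattern=".executions/runs/*.json"):
    """Load every stored execution record matching the glob pattern."""
    records = []
    for path in glob.glob(pattern):
        with open(path) as f:
            records.append(json.load(f))
    return records

# Demo with in-memory records (no files required):
sample = [
    {"tokens": 1200, "latency_ms": 850},
    {"tokens": 900, "latency_ms": 650},
]
print(summarize_runs(sample))
# → {'total_tokens': 2100, 'avg_latency_ms': 750.0}
```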
Let’s see it in action!
Executing an existing workflow
Run the Glaive Code Assistant workflow
SyGra Studio can also run existing workflows within tasks. For example, the tasks/examples/glaive_code_assistant/ workflow ingests the glaiveai/glaive-code-assistant-v2 dataset, drafts an answer, critiques it, and loops until the critique returns “NO MORE FEEDBACK”.
Inside Studio you’ll notice:
- Canvas layout – two LLM nodes (generate_answer and critique_answer) linked by a conditional edge that loops back for further revisions or exits to END once the critique is satisfied.
- Adjustable inputs – execution modals let you toggle dataset splitting, adjust batch size, cap records, or change temperature without touching YAML.
- Observable execution – both nodes light up in turn, so you can inspect intermediate critiques and monitor status in real time.
- Generated output – synthetic data ready for model training, evaluation pipelines, or annotation tools.
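The conditional edge's routing decision boils down to a simple string check. The sketch below expresses it in Python purely for illustration; in SyGra the branch lives in the graph configuration, and the END sentinel and node name used here are assumptions.

```python
# Illustrative sketch of the conditional-edge logic in the
# glaive_code_assistant flow; in SyGra this branch is expressed
# in the YAML graph config, not hand-written Python.
END = "__end__"  # hypothetical sentinel for the flow's exit node

def route_after_critique(critique_text: str) -> str:
    """Return the next node: loop back for revision, or exit."""
    if "NO MORE FEEDBACK" in critique_text.upper():
        return END
    return "generate_answer"

print(route_after_critique("Looks good. NO MORE FEEDBACK"))  # exits the loop
print(route_after_critique("The regex misses edge cases."))  # another revision
```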
Let’s get started
git clone https://github.com/ServiceNow/SyGra.git
cd SyGra && make studio
SyGra Studio turns your synthetic data workflow into a visual, user-friendly experience. Configure once, then build with confidence, run with full observability, and generate data without ever leaving the canvas.

