SyGra 2.0.0 introduces Studio, an interactive environment that turns synthetic data generation into a transparent, visual craft. Instead of juggling YAML files and the terminal, you create flows directly on a canvas, preview datasets before committing, adjust prompts with inline variable hints, and monitor execution streams live, all from a single pane. Under the hood it's the same platform: every visual operation generates a corresponding SyGra-compatible graph configuration and task execution script.
What you can do with Studio
- Configure and validate models using guided forms (OpenAI, Azure OpenAI, Ollama, Vertex, Bedrock, vLLM, custom endpoints).
- Connect to Hugging Face, file system, or ServiceNow data sources and preview rows before execution.
- Configure nodes by selecting a model, writing prompts (with auto-suggested variables), and defining outputs or structured schemas.
- Shape downstream outputs with shared state variables and mappings powered by Pydantic.
- Run flows end-to-end and instantly see generated results along with node-level progress.
- Debug with inline logging, breakpoints, a Monaco-assisted code editor, and autosaved drafts.
- Monitor per-execution token cost, latency, and guardrail results via execution history stored in .executions/.
Let’s walk through this experience step by step.
Step 1: Configure the data source
When you open Studio and click Create Flow, the start/end nodes automatically appear. Before adding anything else:
- Select your connector (Hugging Face, Disk, or ServiceNow).
- Enter parameters such as repo_id, split, or file path, and click Preview to fetch sample rows.
- Column names automatically become state variables (e.g. {prompt}, {genre}), so you know exactly what you can reference in prompts and processors.
Once validation is complete, Studio keeps your configuration in sync and pipes these variables throughout your flow. No manual wiring or guesswork required.
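To make the idea concrete, here is a rough sketch of what a generated data source block could look like. The key names and the repo_id are illustrative placeholders, not the exact SyGra schema; inspect the YAML Studio writes for the real shape.

```yaml
# Hypothetical sketch of a Studio-generated data source block;
# key names are illustrative and may differ from the real schema.
data_config:
  source:
    type: hf                      # or disk / servicenow
    repo_id: your-org/your-dataset   # placeholder dataset
    split: train
```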
Step 2: Build your flow visually
Drag the required block from the palette. For the story generation pipeline:
- Drop an LLM node named “Story Generator”, select a configured model (such as gpt-4o-mini), write a prompt, and save the result to story_body.
- Add a second LLM node named “Story Summarizer”, reference {story_body} in its prompt, and output to story_summary.
- If you need reusable logic or branching behavior, toggle structured outputs, connect tools, and add Lambda/Subgraph nodes.
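As a rough sketch, the two nodes above would map to a graph configuration along these lines. Node and field names here are illustrative guesses, not the exact SyGra schema; the code panel shows the authoritative YAML.

```yaml
# Hypothetical shape of the generated graph config; verify keys
# against the YAML Studio actually writes to tasks/examples/.
graph_config:
  nodes:
    story_generator:
      node_type: llm
      model: gpt-4o-mini
      prompt:
        - user: "Write a short {genre} story based on: {prompt}"
      output_key: story_body
    story_summarizer:
      node_type: llm
      model: gpt-4o-mini
      prompt:
        - user: "Summarize this story: {story_body}"
      output_key: story_summary
  edges:
    - from: story_generator
      to: story_summarizer
```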
Studio’s details panel keeps everything in context: model parameters, prompt editor, tool configuration, pre/post-processing code, and even multi-LLM settings if you need parallel generation. Type { in the prompt editor to instantly display all available state variables.
Step 3: Verify and run
Open the code panel to inspect the exact YAML/JSON that Studio is generating. This is the same artifact written to tasks/examples/, so what you see is what gets committed.
When you’re ready to run:
- Click Run Workflow.
- Select the number of records, batch size, retry behavior, etc.
- Press “Run” to see streaming node status, token usage, latency, and cost in real time in the run panel.
- Detailed logging provides observability and makes debugging easier.
- All executions are written to .executions/runs/*.json.
After a run, download the output, compare it to previous runs, and get metadata for latency and usage details.
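Because execution records are plain JSON files, you can also aggregate them yourself. The sketch below assumes a minimal record shape with `tokens` and `latency_ms` fields; those names are hypothetical, so check an actual file under .executions/runs/ for the real schema.

```python
import glob
import json

def summarize_runs(records):
    """Aggregate token usage and latency across execution records.

    The field names (``tokens``, ``latency_ms``) are illustrative;
    inspect a real .executions/runs/*.json file for the actual schema.
    """
    total_tokens = sum(r.get("tokens", 0) for r in records)
    avg_latency = (
        sum(r.get("latency_ms", 0) for r in records) / len(records)
        if records
        else 0.0
    )
    return {"total_tokens": total_tokens, "avg_latency_ms": avg_latency}

def load_runs(pattern=".executions/runs/*.json"):
    """Load every stored execution record matching the glob pattern."""
    records = []
    for path in glob.glob(pattern):
        with open(path) as f:
            records.append(json.load(f))
    return records

# Demo with in-memory records (no files required):
sample = [
    {"tokens": 1200, "latency_ms": 850},
    {"tokens": 900, "latency_ms": 650},
]
print(summarize_runs(sample))
# → {'total_tokens': 2100, 'avg_latency_ms': 750.0}
```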
Let’s see it in action!
Executing an existing workflow
Run the Glaive Code Assistant workflow
SyGra Studio can also run existing workflows within tasks. For example, the tasks/examples/glaive_code_assistant/ workflow ingests the glaiveai/glaive-code-assistant-v2 dataset, drafts an answer, critiques it, and loops until the critique returns “NO MORE FEEDBACK”.
Inside Studio you’ll notice:
- Canvas layout – two LLM nodes (generate_answer and critique_answer) linked by a conditional edge that loops back for further revisions or exits to END once the critique is satisfied.
- Adjustable inputs – execution modals let you toggle dataset splitting, adjust batch size, cap records, or change temperature without touching YAML.
- Observable execution – both nodes light up in turn, so you can inspect intermediate critiques and monitor status in real time.
- Generated output – synthetic data ready for model training, evaluation pipelines, or annotation tools.
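The conditional edge's routing decision boils down to a simple string check. The sketch below expresses it in Python purely for illustration; in SyGra the branch lives in the graph configuration, and the END sentinel and node name used here are assumptions.

```python
# Illustrative sketch of the conditional-edge logic in the
# glaive_code_assistant flow; in SyGra this branch is expressed
# in the YAML graph config, not hand-written Python.
END = "__end__"  # hypothetical sentinel for the flow's exit node

def route_after_critique(critique_text: str) -> str:
    """Return the next node: loop back for revision, or exit."""
    if "NO MORE FEEDBACK" in critique_text.upper():
        return END
    return "generate_answer"

print(route_after_critique("Looks good. NO MORE FEEDBACK"))  # exits the loop
print(route_after_critique("The regex misses edge cases."))  # another revision
```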
Let’s get started
git clone https://github.com/ServiceNow/SyGra.git
cd SyGra && make studio
SyGra Studio turns your synthetic data workflow into a visual, user-friendly experience. Configure once, then build with confidence, run with full observability, and generate data without ever leaving the canvas.

