
Programmatic video creation

The Video Creation API renders finished MP4 videos from a timeline you describe in XML. You assemble clips, images, text, transitions, filters, and animations into layers, send the timeline to WeVideo, and poll for a signed download URL when the render is done.

This is a server-to-server API. There is no editor UI involved — the timeline is data, you build it however you like, and the API turns it into a video.

The flow at a glance

  1. Submit a timeline. POST /public/videos with your timeline XML returns a jobId.
  2. Poll for status. GET /public/videos/{jobId} reports PROCESSING until the render finishes, then returns COMPLETED with a signed url to the MP4 and a thumbnailUrl.
  3. Download the video from the signed URL, or hand it to the end user.

Render time scales with timeline length, resolution, and queue load. Short clips at 720p typically finish in well under a minute; long, high-resolution renders can take several minutes. The API is asynchronous by design, so always treat the first response as "queued" and rely on the status endpoint to drive what happens next.
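
The submit-then-poll flow above can be sketched as a small loop. This is a minimal sketch, not an official client: get_status is a function you supply that calls GET /public/videos/{jobId} and returns the parsed JSON, and the 5-second interval and 10-minute timeout are assumed values, not documented limits.

```python
import time

def wait_for_render(get_status, interval_s=5.0, timeout_s=600.0, sleep=time.sleep):
    """Poll until the job leaves PROCESSING and return the final status body.

    get_status is caller-supplied (hypothetical): it performs
    GET /public/videos/{jobId} and returns the parsed JSON as a dict.
    """
    waited = 0.0
    while True:
        body = get_status()
        if body.get("status") != "PROCESSING":
            return body  # COMPLETED, FAILED, or NOT_FOUND
        if waited >= timeout_s:
            raise TimeoutError("render still PROCESSING after %.0f s" % waited)
        sleep(interval_s)
        waited += interval_s
```

Passing sleep in as a parameter keeps the loop easy to exercise without real delays.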

Quickstart

The smallest useful request is a single image held on screen for five seconds.

Submit a render
curl https://www.wevideo.com/api/5/public/videos \
  -X POST \
  -H 'Content-Type: application/json' \
  -H 'Authorization: WEVSIMPLE your-api-secret-here' \
  -d '{
    "version": "1",
    "resolution": "720p",
    "content": "<timeline version=\"1\"><layers><layer><image src=\"https://wevideo-static.s3.us-east-1.amazonaws.com/assets/apidemo/images/field.jpg\" duration=\"5000\" /></layer></layers></timeline>"
  }'
Response
{ "jobId": "2026_05_12_a1b2c3d4-e5f6-7890-abcd-ef1234567890" }

Poll the status endpoint until status flips to COMPLETED:

Check status
curl https://www.wevideo.com/api/5/public/videos/2026_05_12_a1b2c3d4-e5f6-7890-abcd-ef1234567890 \
  -H 'Authorization: WEVSIMPLE your-api-secret-here'
Completed response
{
  "status": "COMPLETED",
  "url": "https://wevideo-export.s3.amazonaws.com/.../output.mp4?...",
  "thumbnailUrl": "https://wevideo-export.s3.amazonaws.com/.../thumb.jpg?..."
}

The signed url points to the MP4. Copy the video to your own storage as soon as possible, because the signed URL expires after a limited time.
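
If you are calling the API from Python rather than curl, the same submission might look like the sketch below. It uses only the standard library. The function names are illustrative, and letting json.dumps serialize the payload avoids escaping the quotes in the timeline XML by hand.

```python
import json
import urllib.request

API_BASE = "https://www.wevideo.com/api/5"

def build_render_request(api_secret, timeline_xml, resolution="720p"):
    """Build (but do not send) the POST /public/videos request."""
    payload = {"version": "1", "resolution": resolution, "content": timeline_xml}
    return urllib.request.Request(
        API_BASE + "/public/videos",
        data=json.dumps(payload).encode("utf-8"),  # escapes the XML quotes for us
        headers={
            "Content-Type": "application/json",
            "Authorization": "WEVSIMPLE " + api_secret,
        },
        method="POST",
    )

def submit_render(request):
    """Send the request and return the jobId from the JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)["jobId"]
```

Keeping the build step separate from the send step lets you inspect or log the exact body before it goes over the wire.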

Request parameters

The full schema lives in the API reference. The fields you'll touch most often:

| Field | Required | Description |
| --- | --- | --- |
| version | yes | Timeline schema version. Use "1". |
| content | yes | The timeline XML, as a string. See Timeline XML format below. |
| resolution | no | Preset (240p, 360p, 480p, 720p, 1080p, 2160p) or WIDTHxHEIGHT (e.g. 1080x1920 for vertical). Width and height must be even. Default: 480p. |
| fps | no | Output frame rate. Default: 25. |
| crf | no | H.264 quality. Lower is better quality at a larger file size; sane range is roughly 18–28. Default: 20. |
| thumbnailTime | no | Milliseconds into the timeline to grab the poster frame from. Defaults to the start. |
| fromMs / toMs | no | Render only a sub-range of the timeline. Useful for quick proofs before committing to a long export. |
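
As an illustration of combining these fields, the sketch below assembles a cheap proof render of the first three seconds. The helper name is hypothetical, and sending fps, crf, fromMs, and toMs as JSON numbers is an assumption; check the API reference for the exact types.

```python
def render_payload(timeline_xml, resolution=None, fps=None, crf=None,
                   from_ms=None, to_ms=None):
    """Return a request body containing only the fields that were set."""
    payload = {"version": "1", "content": timeline_xml}
    optional = {
        "resolution": resolution,
        "fps": fps,
        "crf": crf,
        "fromMs": from_ms,  # assumed to be milliseconds, like timeline durations
        "toMs": to_ms,
    }
    payload.update({key: value for key, value in optional.items() if value is not None})
    return payload

# Low resolution and a short sub-range keep the proof fast.
proof = render_payload('<timeline version="1"><layers /></timeline>',
                       resolution="360p", from_ms=0, to_ms=3000)
```

Omitted fields fall back to the defaults listed in the table.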

Timeline XML format

The timeline is a tree:

<timeline>
└── <layers>
    └── <layer>            ← one or more, stacked top-to-bottom
        ├── media          ← <video>, <image>, <audio>, <text>, <html>, <motionTitle>
        ├── transitions    ← between media items in the same layer
        ├── filters        ← per-item effects
        └── animations     ← per-item motion

Two ground rules:

  • Durations are milliseconds. Everywhere. duration="5000" means five seconds.
  • Layers stack top to bottom. The first <layer> paints on top of the next. Put background music or a base image last; put titles and overlays first.

Layers

Every timeline needs at least one layer. Layers run in parallel — items inside a layer play sequentially.

<timeline version="1">
  <layers>
    <layer>
      <text duration="3000" fontSize="60" color="#ffffff">Hello, world</text>
    </layer>
    <layer>
      <video src="https://wevideo-static.s3.us-east-1.amazonaws.com/assets/apidemo/videos/cats.mp4" duration="3000" />
    </layer>
  </layers>
</timeline>

The text floats over the video because the text layer comes first.
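
Because the timeline is plain XML, you can generate it with any XML library instead of concatenating strings. Here is a minimal sketch using Python's standard xml.etree.ElementTree. The build_timeline helper and its tuple format are illustrative, not part of the API.

```python
import xml.etree.ElementTree as ET

def build_timeline(layers):
    """layers: list of layers, topmost first. Each layer is a list of
    (tag, attributes, text) tuples that play in sequence."""
    timeline = ET.Element("timeline", version="1")
    layers_el = ET.SubElement(timeline, "layers")
    for items in layers:
        layer_el = ET.SubElement(layers_el, "layer")
        for tag, attrs, text in items:
            item = ET.SubElement(layer_el, tag, attrs)
            item.text = text  # None for items without a body
    return ET.tostring(timeline, encoding="unicode")

timeline_xml = build_timeline([
    # First layer paints on top: the title.
    [("text", {"duration": "3000", "fontSize": "60", "color": "#ffffff"}, "Hello, world")],
    # Last layer sits underneath: the video.
    [("video", {"src": "https://wevideo-static.s3.us-east-1.amazonaws.com/assets/apidemo/videos/cats.mp4",
                "duration": "3000"}, None)],
])
```

The library escapes attribute values and text, so characters such as & or < in your copy cannot break the document.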

Video, image, and audio

All three share the same core attributes:

| Attribute | Applies to | Description |
| --- | --- | --- |
| src | all | Publicly reachable URL (HTTPS). The renderer downloads it server-side. |
| duration | all | How long the item occupies the layer, in ms. |
| inPoint | video, audio | Offset into the source media to start from, in ms. Defaults to 0. |
| loopable | video | Set to "true" to loop the clip when duration exceeds its native length. |
| volume | video, audio | 0 (muted) to 1 (full). Can also be animated; see Volume. |

Visual items (video, image) also accept positioning attributes. By default they fit to the canvas (while maintaining the aspect ratio); specify any of left, top, right, bottom, width, height to override.

<image src="https://wevideo-static.s3.us-east-1.amazonaws.com/assets/apidemo/images/goldenGate.jpg"
       duration="4000"
       width="200" height="200"
       right="40" top="40" />

Text

Text is rendered server-side using fonts that ship with the renderer, custom TTF files, or Google Fonts.

<text duration="3000"
      font="Roboto" fontSize="72" color="#ffffff"
      align="center" valign="center">
  Coming up next
</text>

| Attribute | Description |
| --- | --- |
| font | Built-in font name, or the name of a custom font declared in a <fonts> block. |
| fontSize | Pixels at 1080p; scales with the output resolution. |
| color | Hex color, e.g. #ffffff. |
| backgroundColor | Hex color for the text box background. |
| align | left, center, right. |
| valign | top, center, bottom. |
| width / height | Text box size; defaults to the canvas. |

For rich content (HTML markup, line breaks, mixed styles), set processHtml="true" and wrap the body in CDATA:

<text duration="3000" fontSize="60" processHtml="true">
<![CDATA[
<span style="color:#ff6b6b;">Bold</span> moves<br/>
<em>start here</em>
]]>
</text>
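
When the markup comes from user input, a literal ]]> inside it would close the CDATA section early. A minimal sketch of a wrapper that guards against this (helper names are illustrative) uses the standard XML trick of splitting the sequence across two CDATA sections:

```python
import xml.etree.ElementTree as ET

def cdata(markup):
    """Wrap markup in CDATA, splitting any literal ']]>' so it cannot
    terminate the section early."""
    return "<![CDATA[" + markup.replace("]]>", "]]]]><![CDATA[>") + "]]>"

def html_text_item(markup, duration_ms, font_size):
    """Return a <text> item with processHtml enabled and a CDATA body."""
    return ('<text duration="%d" fontSize="%d" processHtml="true">%s</text>'
            % (duration_ms, font_size, cdata(markup)))

item = html_text_item('<em>start</em> ]]> here', 3000, 60)
# Round-trip through a parser to confirm the body survives intact.
parsed = ET.fromstring(item)
```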

Built-in fonts: 2Dumb, 3Dumb, AlfaSlabOne, Alike, AnticSlab, Asap, Bangers, BEARPAW, Cabin, CantataOne, Cinzel, College, Distant Galaxy, DoppioOne, Electrolize, Existence, Existence Stencil, Flux Architect, Habibi, LearningCurve, Neo Retro, Playball, PROMESH, Roboto, Secret Typewriter, True Crimes.

Custom fonts: declare them once inside a <fonts> block on the timeline and reference by name. TTF files load via the src attribute; Google Fonts load via href.

<timeline version="1">
  <fonts>
    <font name="Ravi Prakash" href="https://fonts.googleapis.com/css?family=Ravi+Prakash" />
  </fonts>
  <layers>
    <layer>
      <text duration="3000" font="Ravi Prakash">A Google Font</text>
    </layer>
  </layers>
</timeline>
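
The example above covers the href case. For a mix of TTF files and Google Fonts, a generator might pick the attribute per font, as in this sketch. Choosing src whenever the URL ends in .ttf is a heuristic assumed here, and the TTF URL is a placeholder.

```python
import xml.etree.ElementTree as ET

def fonts_block(fonts):
    """fonts: list of (name, url) pairs. URLs ending in .ttf are attached
    as src; anything else is treated as a Google Fonts stylesheet (href)."""
    block = ET.Element("fonts")
    for name, url in fonts:
        attribute = "src" if url.lower().endswith(".ttf") else "href"
        ET.SubElement(block, "font", {"name": name, attribute: url})
    return block

block = fonts_block([
    ("Ravi Prakash", "https://fonts.googleapis.com/css?family=Ravi+Prakash"),
    ("My Font", "https://example.com/fonts/MyFont.ttf"),  # placeholder URL
])
```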

Motion titles

Motion titles are pre-built animated text sequences (lower thirds, openers, callouts) that you parameterize with your own copy, colors, and fonts. They behave like any other media item — drop them in a layer, reference the template by id, and give it a duration. Lines go inside a <lines> wrapper; an optional <colors> block overrides the template's palette. See the Motion Titles listing endpoint for available templates and parameters.

<motionTitle id="12345" duration="4000">
  <lines>
    <line font="Roboto" color="#ffffff">Field report</line>
    <line font="Roboto" color="#ffffff">Vienna, May 2026</line>
  </lines>
  <colors>
    <color key="primary" value="#ff6b6b" />
  </colors>
</motionTitle>

Transitions

A <transition> placed between two media items in the same layer fades, wipes, or swaps the cut. The renderer automatically trims the surrounding clips by the transition's duration — you do not need to subtract it yourself.

<layer>
  <image src="https://wevideo-static.s3.us-east-1.amazonaws.com/assets/apidemo/images/field.jpg" duration="3000" />
  <transition type="crossFade" duration="500" />
  <image src="https://wevideo-static.s3.us-east-1.amazonaws.com/assets/apidemo/images/goldenGate.jpg" duration="3000" />
</layer>

Add fadeAudio="true" to fade audio in tandem on transitions between clips with sound.

Available types (grouped roughly by feel):

  • Cross-fades: crossFade, crossDissolve, crossHatch, crossZoom, fadeThroughBlack, fadeThroughWhite
  • Wipes & slides: wipeLeft, wipeRight, wipeUp, wipeDown, slide, swap, directionalWipe, ripple, radial
  • 3D: 3dCubeHorizontal, 3dCubeVertical, 3dFlipHorizontal, 3dFlipVertical, 3dTileHorizontal, 3dTileVertical
  • Patterned: mosaic, burn, atmospheric, pageCurl, patternSquare, patternDiamond, patternBlock, patternBubble, patternCross, patternLine, patternOriental, patternStare, patternDotSmall, patternDotLarge
  • Grids & geometric: gridCircle, gridDiagonal, gridFalling, gridRandom, gridSequential, gridSliding, puzzleRight, flyEye
  • Decorative animations: animationBirds, animationRainbow, animationOrnamentalForm, animationCurtains, animationParasol, animationFilmstrip, animationPaperPlane
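
Since a transition sits between two items in a layer, a slideshow generator only has to interleave one before every image except the first. A minimal sketch follows; the helper name and default values are illustrative.

```python
import xml.etree.ElementTree as ET

def slideshow_layer(image_urls, hold_ms=3000, transition="crossFade", transition_ms=500):
    """Build a <layer> of images with the same transition between each pair."""
    layer = ET.Element("layer")
    for index, url in enumerate(image_urls):
        if index > 0:
            # Transitions go between items, never before the first one.
            ET.SubElement(layer, "transition", type=transition, duration=str(transition_ms))
        ET.SubElement(layer, "image", src=url, duration=str(hold_ms))
    return layer

layer = slideshow_layer([
    "https://wevideo-static.s3.us-east-1.amazonaws.com/assets/apidemo/images/field.jpg",
    "https://wevideo-static.s3.us-east-1.amazonaws.com/assets/apidemo/images/goldenGate.jpg",
], transition="wipeLeft")
```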

Filters

Filters attach to a single media item and run for its full duration unless you constrain them with startTime/endTime (in ms, relative to the item).

<image src="https://wevideo-static.s3.us-east-1.amazonaws.com/assets/apidemo/images/field.jpg" duration="4000">
  <filter type="colorCorrection" brightness="10" saturation="-20" temperature="7800" />
  <filter type="vignette" alpha="0.6" radius="0.5" softness="0.3" />
</image>

| Filter | Key parameters |
| --- | --- |
| colorCorrection | brightness, contrast, saturation (-100 to 100); hue (-180 to 180); temperature (3000–25000, default 6500); tint (-100 to 100). |
| blur | strength (default 0.03). |
| sepia | |
| sharpen | |
| blackAndWhite | |
| vignette | alpha, radius, softness. |
| scanlines | |
| invertColors | |
| dream | |
| speed | value between 0.1 and 10. Remember to adjust the item's duration to match. |
| colorOverlay | color (hex RGBA), blendingMode (normal, multiply, screen, overlay, darken, lighten, colordodge, colorburn, hardlight, softlight, difference, exclusion). |
| chromaKey | color (hex). Removes that color from the source, e.g. for green-screen footage. |
| mask | src (URL to a black/white image or video). Add invert="true" to flip the mask. |
| audioDucking | Beta. threshold, duckingLevel (0–1), attack, decay, sensitivity. Lowers background audio when a foreground track is speaking. |
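
For the speed filter, the duration adjustment is simple arithmetic if you assume value is a playback-rate multiplier (2 plays twice as fast). That reading is an assumption, not something the table states, so verify it with a short proof render. Under that assumption:

```python
def duration_for_speed(source_span_ms, speed):
    """Item duration needed to play source_span_ms of media at the given
    speed, assuming speed is a playback-rate multiplier."""
    if not 0.1 <= speed <= 10:
        raise ValueError("speed must be between 0.1 and 10")
    return round(source_span_ms / speed)

fast = duration_for_speed(10000, 2)    # 10 s of source at double speed
slow = duration_for_speed(3000, 0.5)   # 3 s of source at half speed
```

Here fast is 5000 and slow is 6000, which are the values you would put in the item's duration attribute.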

Animations

Animate position, scale, and opacity over the lifetime of an item.

Position: animate left, right, top, or bottom individually. Each animation runs on a single axis between startValue and endValue, from startTime to endTime (ms, relative to the item). Combine two animations to move on both axes.

<image src="https://wevideo-static.s3.us-east-1.amazonaws.com/assets/apidemo/images/goldenGate.jpg" duration="3000" width="600" height="400" top="100">
  <animation type="left" startValue="-600" endValue="200"
             startTime="0" endTime="600" easing="cubicEaseOut" />
</image>
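
To move an item on both axes at once, attach one animation per axis with matching times. A sketch that generates the pair is shown below. The move helper is illustrative, and treating the values as pixel offsets is an assumption based on the example above.

```python
import xml.etree.ElementTree as ET

def move(item, start, end, start_ms, end_ms, easing="cubicEaseOut"):
    """Attach a left and a top animation so the item travels from
    start to end, each given as a (left, top) pair."""
    for axis, begin, finish in (("left", start[0], end[0]), ("top", start[1], end[1])):
        ET.SubElement(item, "animation", type=axis,
                      startValue=str(begin), endValue=str(finish),
                      startTime=str(start_ms), endTime=str(end_ms),
                      easing=easing)
    return item

image = ET.Element("image", src="https://wevideo-static.s3.us-east-1.amazonaws.com/assets/apidemo/images/goldenGate.jpg",
                   duration="3000", width="600", height="400")
move(image, start=(-600, 100), end=(200, 300), start_ms=0, end_ms=600)
```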

Scale — Ken Burns-style zoom:

<image src="https://wevideo-static.s3.us-east-1.amazonaws.com/assets/apidemo/images/field.jpg" duration="5000">
  <animation type="scale" startValue="1.0" endValue="1.2" easing="linear" />
</image>

For non-uniform scale, use startWidth/endWidth and startHeight/endHeight. Add unit="px" for pixel-based scaling, or gravity="NorthWest|North|NorthEast|West|Center|East|SouthWest|South|SouthEast" to anchor the scale to a specific corner, edge, or the center.

Easing functions: linear, plus In/Out/InOut variants of bounce, back, circ, cubic, elastic, expo, quad, quart, quint, and sine (e.g. cubicEaseInOut, bounceEaseOut).

Opacity

Opacity is a keyframed graph, not a single value. Each <entry> is a time (ms, relative to the item) and an alpha value between 0 and 1. The renderer linearly interpolates between entries.

<image src="https://wevideo-static.s3.us-east-1.amazonaws.com/assets/apidemo/images/goldenGate.jpg" duration="4000">
  <opacity>
    <entry time="0" value="0" />
    <entry time="500" value="1" />
    <entry time="3500" value="1" />
    <entry time="4000" value="0" />
  </opacity>
</image>
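
To preview what a keyframed graph will do at a given moment, you can reproduce the linear interpolation locally. This is a sketch of the interpolation as described above. Holding the first and last values outside the keyframed range is an assumption about the renderer, not documented behavior.

```python
def value_at(entries, t):
    """Linearly interpolate a keyframe graph.

    entries: list of (time_ms, value) pairs sorted by time.
    Outside the keyframed range the nearest entry's value is held
    (an assumption about renderer behavior).
    """
    if t <= entries[0][0]:
        return entries[0][1]
    for (t0, v0), (t1, v1) in zip(entries, entries[1:]):
        if t <= t1:
            return v0 + (v1 - v0) * (t - t0) / (t1 - t0)
    return entries[-1][1]

# The fade-in / fade-out graph from the example above.
fade = [(0, 0), (500, 1), (3500, 1), (4000, 0)]
halfway_in = value_at(fade, 250)    # midpoint of the fade-in
halfway_out = value_at(fade, 3750)  # midpoint of the fade-out
```

Both halfway_in and halfway_out evaluate to 0.5.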

Volume

Volume works the same way on <audio> and <video> items: a single volume attribute for a constant level, or a keyframed graph for fades and ducks.

<audio src="https://wevideo-static.s3.us-east-1.amazonaws.com/assets/apidemo/audio/audio_1808fbf07a.mp3" duration="30000">
  <volume>
    <entry time="0" value="0" />
    <entry time="2000" value="0.8" />
    <entry time="28000" value="0.8" />
    <entry time="30000" value="0" />
  </volume>
</audio>

You can also set a layer-level volume attribute to scale everything in the layer.
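
A fade-in and fade-out envelope like the one in the audio example is easy to generate for any track length. Here is a minimal sketch with illustrative helper names:

```python
import xml.etree.ElementTree as ET

def fade_envelope(duration_ms, fade_ms, level):
    """Keyframes for a symmetric fade in and out around a steady level."""
    if 2 * fade_ms > duration_ms:
        raise ValueError("fades are longer than the item")
    return [(0, 0), (fade_ms, level), (duration_ms - fade_ms, level), (duration_ms, 0)]

def volume_element(entries):
    """Turn (time_ms, value) keyframes into a <volume> element."""
    volume = ET.Element("volume")
    for time_ms, value in entries:
        ET.SubElement(volume, "entry", time=str(time_ms), value=str(value))
    return volume

envelope = fade_envelope(30000, 2000, 0.8)
volume_xml = ET.tostring(volume_element(envelope), encoding="unicode")
```

With these arguments the envelope reproduces the four entries in the audio example.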

YouTube publishing

Add a youtube object to the request to upload the finished video to YouTube in addition to storing it on WeVideo. The completed status response includes the YouTube URL under destinations.youtube.

{
  "version": "1",
  "content": "<timeline ...>",
  "youtube": {
    "title": "Field report — Vienna",
    "description": "Behind the scenes.",
    "categoryId": 22,
    "keywords": "travel, vienna, behind the scenes",
    "privacy": "unlisted"
  }
}

categoryId is YouTube's numeric category ID (e.g. 22 for People & Blogs). privacy is one of public, private, or unlisted.
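
In client code this comes down to adding one object to the request and reading one extra field from the completed status. The sketch below assumes destinations.youtube holds the URL directly as a string, and the helper names are illustrative.

```python
PRIVACY_VALUES = ("public", "private", "unlisted")

def with_youtube(payload, title, description="", category_id=22,
                 keywords="", privacy="unlisted"):
    """Return a copy of the render payload with a youtube object added."""
    if privacy not in PRIVACY_VALUES:
        raise ValueError("privacy must be one of %s" % ", ".join(PRIVACY_VALUES))
    result = dict(payload)
    result["youtube"] = {
        "title": title,
        "description": description,
        "categoryId": category_id,
        "keywords": keywords,
        "privacy": privacy,
    }
    return result

def youtube_url(status_body):
    """Read destinations.youtube from a completed status, or None if absent.
    Assumes the field is the URL string itself."""
    return status_body.get("destinations", {}).get("youtube")
```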

Status responses

In progress
{ "status": "PROCESSING" }
Done
{
  "status": "COMPLETED",
  "url": "https://wevideo-export-videos.s3.amazonaws.com/.../output.mp4?...",
  "thumbnailUrl": "https://wevideo-export-videos.s3.amazonaws.com/.../thumb.jpg?..."
}
Failed
{
  "status": "FAILED",
  "errors": [ { /* ... */ } ],
  "message": "Source media at https://example.com/missing.mp4 could not be downloaded."
}

| Status | What it means |
| --- | --- |
| PROCESSING | Queued or actively rendering. Keep polling. |
| COMPLETED | Render finished. url and thumbnailUrl are signed and time-limited, so download or cache promptly. |
| FAILED | Render failed. Inspect errors (structured) or message (human-readable) to diagnose. |
| NOT_FOUND | The jobId doesn't match any known job, or it has aged out. Re-submit the request. |

Signed URLs expire — re-fetch the status endpoint to mint a fresh one if you need to download again later.
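
One way to turn the four statuses into control flow is a small function that returns the download URL when it is ready, returns None while the job is still rendering, and raises otherwise. The exception choices here are illustrative, not prescribed by the API.

```python
class RenderFailed(Exception):
    """Raised when the status endpoint reports FAILED."""

def download_url(body):
    """Map a status response to a URL, None (keep polling), or an error."""
    status = body.get("status")
    if status == "COMPLETED":
        return body["url"]
    if status == "PROCESSING":
        return None
    if status == "FAILED":
        # Prefer the human-readable message, fall back to the structured errors.
        raise RenderFailed(body.get("message") or str(body.get("errors")))
    if status == "NOT_FOUND":
        raise LookupError("job unknown or aged out; re-submit the request")
    raise ValueError("unexpected status: %r" % status)
```

Calling this on a freshly fetched status each time you need the file also gives you a freshly signed URL.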

Errors and limits

Common submission errors:

| Status | Likely cause |
| --- | --- |
| 400 | Malformed timeline XML, missing required fields (version, content), or invalid resolution. |
| 401 | Missing, malformed, or wrong Authorization header. See Authentication. |
| 403 | Credentials are valid but the user does not have permission to render in this Instance. |
| 5xx | Transient server error. Retry with backoff; if it persists, contact support. |
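
A retry policy that matches the table retries only 5xx responses and returns 4xx responses immediately, since resubmitting the same bad request will not help. Below is a sketch with exponential backoff. send is a caller-supplied function returning (http_status, body), and the attempt count and delays are assumed values.

```python
import time

def submit_with_retry(send, attempts=4, base_delay_s=1.0, sleep=time.sleep):
    """Call send() until it returns a non-5xx status or attempts run out.

    send is caller-supplied (hypothetical) and returns (http_status, body).
    Delays double after each failed attempt: 1 s, 2 s, 4 s, ...
    """
    for attempt in range(attempts):
        status, body = send()
        if status < 500:
            return status, body  # success, or a 4xx that a retry cannot fix
        if attempt < attempts - 1:
            sleep(base_delay_s * 2 ** attempt)
    return status, body  # still 5xx after the final attempt
```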

Resolution constraints: custom dimensions must be even numbers and within 200–4000 px on each side. Aspect ratios outside 16:9 are fine — 1080x1920 (vertical) and 1080x1080 (square) are common.
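
Checking a resolution string client-side saves a round trip that would end in a 400. The sketch below applies the rules stated above (preset names, even dimensions, 200 to 4000 px per side); the function name and the message wording are illustrative.

```python
import re

PRESETS = {"240p", "360p", "480p", "720p", "1080p", "2160p"}

def resolution_problems(value):
    """Return a list of problems with a resolution string (empty if valid)."""
    if value in PRESETS:
        return []
    match = re.fullmatch(r"(\d+)x(\d+)", value)
    if not match:
        return ["not a preset or WIDTHxHEIGHT"]
    problems = []
    for name, side in zip(("width", "height"), map(int, match.groups())):
        if side % 2:
            problems.append(f"{name} {side} is odd")
        if not 200 <= side <= 4000:
            problems.append(f"{name} {side} is outside 200-4000")
    return problems
```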

Source media: every src must be reachable from the public internet over HTTPS. Pre-signed S3 URLs work as long as they have not expired by the time the renderer fetches them.
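
You cannot verify public reachability from inside your own network, but you can at least catch non-HTTPS sources before submitting. A minimal sketch that scans every src attribute in a timeline follows; the function name is illustrative.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

def non_https_sources(timeline_xml):
    """Return every src value in the timeline whose scheme is not https."""
    root = ET.fromstring(timeline_xml)
    return [
        element.get("src")
        for element in root.iter()
        if element.get("src") is not None
        and urlparse(element.get("src")).scheme != "https"
    ]
```

This covers media items and anything else that carries a src, such as a mask filter.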

Next steps