Bria Fibo

Text-to-image models have mastered imagination - but not control. FIBO changes that.

FIBO is trained on structured JSON captions that can exceed 1,000 words and is designed to understand and control visual parameters such as lighting, composition, color, and camera settings, enabling precise and reproducible outputs.
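To make the idea of a structured prompt concrete, here is a minimal sketch of what such a JSON caption might look like. The field names below are illustrative assumptions, not FIBO's actual schema; in practice you would generate the real structure with the prompt-to-JSON models described below rather than write it by hand.

```python
import json

# Hypothetical illustration: these field names are NOT the official FIBO
# schema. Real structured prompts should be produced with
# FIBO-VLM-prompt-to-JSON (or the Gemini variant), not handwritten.
structured_prompt = {
    "subject": "a golden retriever resting in a meadow",
    "lighting": "soft golden-hour backlight",
    "composition": "rule of thirds, subject left of center",
    "color": "warm, muted earth tones",
    "camera": {"lens": "85mm", "aperture": "f/2.0", "angle": "eye level"},
}

# The pipeline expects the structured prompt serialized as a JSON string.
json_prompt = json.dumps(structured_prompt)
print(json_prompt)
```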

With only 8 billion parameters, FIBO delivers a new level of image quality, prompt adherence, and professional control.

FIBO is trained exclusively on structured prompts and will not work with freeform text prompts. Use the FIBO-VLM-prompt-to-JSON model or the FIBO-gemini-prompt-to-JSON model to convert a freeform text prompt into a structured JSON prompt.

Avoid passing freeform text prompts directly to FIBO; doing so produces poor results.

Refer to the Bria Fibo Hugging Face page to learn more.

Usage

As the model is gated, before using it with diffusers you first need to go to the Bria Fibo Hugging Face page, fill in the form, and accept the gate. Once you have access, log in so that your system knows you've accepted the gate.

Use the command below to log in:
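The command itself is missing from this page; the standard Hugging Face CLI login (assuming the `huggingface_hub` CLI is installed) is:

```shell
# Prompts for a Hugging Face access token with read permission.
hf auth login
```

On older versions of `huggingface_hub`, the equivalent command is `huggingface-cli login`.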

BriaFiboPipeline

class diffusers.BriaFiboPipeline


( transformer: BriaFiboTransformer2DModel
scheduler: typing.Union[diffusers.schedulers.scheduling_flow_match_euler_discrete.FlowMatchEulerDiscreteScheduler, diffusers.schedulers.scheduling_utils.KarrasDiffusionSchedulers]
vae: AutoencoderKLWan
text_encoder: SmolLM3ForCausalLM
tokenizer: AutoTokenizer )


__call__


( prompt: typing.Union[str, typing.List[str]] = None
height: typing.Optional[int] = None
width: typing.Optional[int] = None
num_inference_steps: int = 30
timesteps: typing.List[int] = None
guidance_scale: float = 5
negative_prompt: typing.Union[str, typing.List[str], NoneType] = None
num_images_per_prompt: typing.Optional[int] = 1
generator: typing.Union[torch._C.Generator, typing.List[torch._C.Generator], NoneType] = None
latents: typing.Optional[torch.FloatTensor] = None
prompt_embeds: typing.Optional[torch.FloatTensor] = None
negative_prompt_embeds: typing.Optional[torch.FloatTensor] = None
output_type: typing.Optional[str] = 'pil'
return_dict: bool = True
joint_attention_kwargs: typing.Optional[typing.Dict[str, typing.Any]] = None
callback_on_step_end: typing.Optional[typing.Callable[[int, int, typing.Dict], NoneType]] = None
callback_on_step_end_tensor_inputs: typing.List[str] = ['latents']
max_sequence_length: int = 3000
do_patching = False ) → ~pipelines.flux.BriaFiboPipelineOutput or tuple


Returns

~pipelines.flux.BriaFiboPipelineOutput or tuple

~pipelines.flux.BriaFiboPipelineOutput if `return_dict` is True, otherwise a tuple. When returning a tuple, the first element is a list with the generated images.

Function invoked when calling the pipeline for generation.

Example:

import torch
from diffusers import BriaFiboPipeline
from diffusers.modular_pipelines import ModularPipeline

torch.set_grad_enabled(False)
vlm_pipe = ModularPipeline.from_pretrained("briaai/FIBO-VLM-prompt-to-JSON", trust_remote_code=True)

pipe = BriaFiboPipeline.from_pretrained(
    "briaai/FIBO",
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
)
pipe.enable_model_cpu_offload()

with torch.inference_mode():
    output = vlm_pipe(prompt="a beautiful dog")
    json_prompt_generate = output.values["json_prompt"]

    results_generate = pipe(prompt=json_prompt_generate, num_inference_steps=50, guidance_scale=5)
    results_generate.images[0].save("image_generate.png")

encode_prompt


( prompt: typing.Union[str, typing.List[str]]
device: typing.Optional[torch.device] = None
num_images_per_prompt: int = 1
guidance_scale: float = 5
negative_prompt: typing.Union[str, typing.List[str], NoneType] = None
prompt_embeds: typing.Optional[torch.FloatTensor] = None
negative_prompt_embeds: typing.Optional[torch.FloatTensor] = None
max_sequence_length: int = 3000
lora_scale: typing.Optional[float] = None )

