<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Tech &#8211; Firethering</title>
	<atom:link href="https://firethering.com/tech/feed/" rel="self" type="application/rss+xml" />
	<link>https://firethering.com</link>
	<description>Firethering is Your Hub for AI, Open Source and Tech That Actually Matters</description>
	<lastBuildDate>Sat, 18 Apr 2026 20:28:30 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>https://wordpress.org/?v=6.6.5</generator>

<image>
	<url>https://firethering.com/wp-content/uploads/2024/10/cropped-firethering-FTR-favicon-32x32.png</url>
	<title>Tech &#8211; Firethering</title>
	<link>https://firethering.com</link>
	<width>32</width>
	<height>32</height>
</image> 
	<item>
		<title>Nucleus-Image: 17B Open-Source MoE Image Model Delivering GPT-Image Level Performance</title>
		<link>https://firethering.com/nucleus-image-open-source-moe-diffusion-model/</link>
					<comments>https://firethering.com/nucleus-image-open-source-moe-diffusion-model/#respond</comments>
		
		<dc:creator><![CDATA[Mohit Geryani]]></dc:creator>
		<pubDate>Sat, 18 Apr 2026 20:28:28 +0000</pubDate>
				<category><![CDATA[Tech]]></category>
		<category><![CDATA[AI Models]]></category>
		<category><![CDATA[AI]]></category>
		<category><![CDATA[AI Image Model]]></category>
		<guid isPermaLink="false">https://firethering.com/?p=6303</guid>

					<description><![CDATA[The mixture-of-experts trick changed how people think about LLMs. Instead of running every parameter on every token, you activate a small fraction of the network per forward pass and somehow the quality stays competitive while the compute drops. It's the reason models like Mixtral punched above their weight. Everyone in the LLM space understood it immediately. Nobody had done it openly for image generation. Until now.

Nucleus-Image is a 17B parameter diffusion transformer that activates roughly 2B parameters per forward pass. It beats Imagen4 on OneIG-Bench, sits at number one on DPG-Bench overall, and matches Qwen-Image on GenEval. 

It's also a base model. No fine-tuning, no reinforcement learning, no human preference tuning. What you're seeing in those benchmarks is raw pre-training performance. That's either impressive or a caveat depending on what you need it for. Probably both.
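
To make the mixture-of-experts idea concrete, here is a minimal sketch of top-k expert routing in plain Python. The expert count, top-k value, and dimensions are illustrative assumptions, not Nucleus-Image's actual configuration.

import numpy as np

# Minimal top-k MoE routing sketch. The sizes below (8 experts,
# route to top 2, 16-dim features) are illustrative assumptions,
# not Nucleus-Image's real configuration.
rng = np.random.default_rng(0)
n_experts, top_k, d_model = 8, 2, 16

experts = [rng.standard_normal((d_model, d_model)) for _ in range(n_experts)]
router = rng.standard_normal((d_model, n_experts))

def moe_forward(x):
    logits = x @ router                # score every expert for this token
    top = np.argsort(logits)[-top_k:]  # keep only the top-k experts
    gate = np.exp(logits[top])
    gate = gate / gate.sum()           # softmax over the winners
    # Only top_k of the n_experts weight matrices are touched, so
    # compute scales with active parameters, not total parameters.
    return sum(w * (x @ experts[i]) for w, i in zip(gate, top))

out = moe_forward(rng.standard_normal(d_model))
print(out.shape)  # (16,)]]></description>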
		
					<wfw:commentRss>https://firethering.com/nucleus-image-open-source-moe-diffusion-model/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>ERNIE-Image: Open-Source 8B Text-to-Image Model for Posters, Comics &#038; Structured Generation</title>
		<link>https://firethering.com/ernie-image-open-source-text-to-image/</link>
					<comments>https://firethering.com/ernie-image-open-source-text-to-image/#respond</comments>
		
		<dc:creator><![CDATA[Mohit Geryani]]></dc:creator>
		<pubDate>Thu, 16 Apr 2026 19:32:05 +0000</pubDate>
				<category><![CDATA[Tech]]></category>
		<category><![CDATA[AI Models]]></category>
		<category><![CDATA[AI]]></category>
		<category><![CDATA[AI Image Model]]></category>
		<guid isPermaLink="false">https://firethering.com/?p=6279</guid>

					<description><![CDATA[Text rendering in open source AI image generation has been broken for a long time. Ask most models to put readable words on a poster, lay out a comic panel, or generate anything where the text actually has to make sense, and only a few can do it accurately. From the rest you get something that looks like it was written by someone who learned the alphabet from a fever dream.

ERNIE-Image is Baidu's answer to that specific problem. It's an 8B open-weight text-to-image model built on a Diffusion Transformer, and it's genuinely good at dense text, structured layouts, posters, infographics and multi-panel compositions.

It runs on a 24GB consumer GPU, it's on Hugging Face right now, and it comes in two versions: a full-quality model and a turbo variant that gets there in 8 steps instead of 50.
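
If you want to poke at it, loading should look something like the standard diffusers flow sketched below. The repo id is a placeholder assumption; the diffusers calls themselves are real library API, but check the model card for the actual usage.

import torch
from diffusers import DiffusionPipeline

# ASSUMPTION: "baidu/ERNIE-Image" is a placeholder repo id, not the
# confirmed Hugging Face path. The diffusers calls are standard API.
pipe = DiffusionPipeline.from_pretrained(
    "baidu/ERNIE-Image",
    torch_dtype=torch.bfloat16,
)
pipe.enable_model_cpu_offload()  # trades speed for fitting in 24GB of VRAM

image = pipe(
    prompt="A conference poster titled OPEN WEIGHTS, clean grid layout",
    num_inference_steps=50,  # the turbo variant would use 8 instead
).images[0]
image.save("poster.png")]]></description>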
		
					<wfw:commentRss>https://firethering.com/ernie-image-open-source-text-to-image/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>MOSS-TTS-Nano: Real-Time Voice AI on CPU, Part of an Open-Source Stack Rivaling Gemini</title>
		<link>https://firethering.com/moss-tts-nano-open-source-tts/</link>
					<comments>https://firethering.com/moss-tts-nano-open-source-tts/#respond</comments>
		
		<dc:creator><![CDATA[Mohit Geryani]]></dc:creator>
		<pubDate>Wed, 15 Apr 2026 08:51:04 +0000</pubDate>
				<category><![CDATA[Tech]]></category>
		<category><![CDATA[AI Models]]></category>
		<category><![CDATA[AI]]></category>
		<category><![CDATA[TTS]]></category>
		<guid isPermaLink="false">https://firethering.com/?p=6258</guid>

					<description><![CDATA[Most text-to-speech tools fall into two camps. The ones that sound good need serious hardware. The ones that run on anything sound robotic. MOSS-TTS-Nano is trying to be neither.

It's a 100 million parameter model that runs on a regular CPU, and it actually sounds good. Good enough that the team behind it built an entire family of speech models around the same core technology.

It dropped on April 10th as the newest addition to the MOSS-TTS family, a collection of five open source speech models from MOSI.AI and the OpenMOSS team. The family doesn't just cover lightweight local deployment. One of its models, MOSS-TTSD, has gone head to head with Gemini 2.5 Pro and ElevenLabs in benchmarks and come out ahead on speaker similarity. Another generates voices purely from text descriptions, with no reference audio needed. And one is built specifically for real-time voice agents, with 180ms first-byte latency.

Nano is the entry point. The family is the story.]]></description>
		
					<wfw:commentRss>https://firethering.com/moss-tts-nano-open-source-tts/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		<enclosure url="https://openmoss.github.io/MOSS-TTS-Nano-Demo/assets/%F0%9F%87%BA%F0%9F%87%B8%20A%20Gentle%20Reminder.wav" length="6051918" type="audio/wav" />

			</item>
		<item>
		<title>Gen-Searcher: An Open Source AI That Searches the Web Before Generating Images</title>
		<link>https://firethering.com/gen-searcher-open-source-image-generation/</link>
					<comments>https://firethering.com/gen-searcher-open-source-image-generation/#respond</comments>
		
		<dc:creator><![CDATA[Mohit Geryani]]></dc:creator>
		<pubDate>Mon, 13 Apr 2026 15:25:12 +0000</pubDate>
				<category><![CDATA[Tech]]></category>
		<category><![CDATA[AI Models]]></category>
		<category><![CDATA[AI]]></category>
		<category><![CDATA[Image Generation Model]]></category>
		<guid isPermaLink="false">https://firethering.com/?p=6237</guid>

					<description><![CDATA[Your image generator has never seen today. It was trained months ago, maybe longer, and everything it draws comes from that frozen snapshot of the world. Ask it to generate a current news moment, a product that launched last month, or anything that requires knowing what's happening right now and it fills in the gaps with a confident guess. Sometimes that guess is close. Often it isn't.

Gen-Searcher does something none of the mainstream tools do. Before it draws a single pixel, it goes and looks things up. It searches the web. It browses sources. It pulls visual references. Then it generates. The result is an image grounded in actual current information.

It's open source, the weights are on Hugging Face, and the team released everything: code, training data, the benchmark, the lot.
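
The flow is easiest to see as a sketch. Every function below is a hypothetical stand-in for a stage the release describes (search, browse, gather references, generate); none of it is Gen-Searcher's actual API.

# Search-then-generate flow. All functions are hypothetical
# placeholders for the pipeline stages, not the project's real API.

def search_web(prompt: str) -> list[str]:
    # ASSUMPTION: returns source URLs relevant to the prompt.
    return ["https://example.com/launch-coverage"]

def fetch_references(urls: list[str]) -> list[bytes]:
    # ASSUMPTION: browses each source and pulls visual references.
    return [b"reference image bytes" for _ in urls]

def generate_image(prompt: str, refs: list[bytes]) -> bytes:
    # ASSUMPTION: the generator conditions on the retrieved references.
    return b"image grounded in current information"

def grounded_generate(prompt: str) -> bytes:
    urls = search_web(prompt)            # 1. look things up first
    refs = fetch_references(urls)        # 2. pull visual references
    return generate_image(prompt, refs)  # 3. only then draw pixels

grounded_generate("the phone that launched last month, on a desk")]]></description>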
		
					<wfw:commentRss>https://firethering.com/gen-searcher-open-source-image-generation/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>MiniMax M2.7: The Agentic Model That Helped Build Itself</title>
		<link>https://firethering.com/minimax-m2-7-agentic-model/</link>
					<comments>https://firethering.com/minimax-m2-7-agentic-model/#respond</comments>
		
		<dc:creator><![CDATA[Mohit Geryani]]></dc:creator>
		<pubDate>Sun, 12 Apr 2026 08:29:30 +0000</pubDate>
				<category><![CDATA[Tech]]></category>
		<category><![CDATA[AI Models]]></category>
		<category><![CDATA[Trends]]></category>
		<category><![CDATA[AI]]></category>
		<category><![CDATA[Tech News]]></category>
		<guid isPermaLink="false">https://firethering.com/?p=6218</guid>

					<description><![CDATA[MiniMax handed an internal version of M2.7 a programming scaffold and let it run unsupervised. Over 100 rounds it analyzed its own failures, modified its own code, ran evaluations, and decided what to keep and what to revert. The result was a 30% performance improvement with nobody directing each step. That is not a benchmark result. That is a different way of thinking about how AI models get built.

M2.7 is now available on HuggingFace with weights you can download and deploy. NVIDIA is offering free API access if you want to try it without the hardware overhead. The license has a commercial limitation worth knowing about; we will get to that.
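
The loop MiniMax describes has a simple keep-or-revert shape. Here is a structural sketch with toy stand-in functions; it mirrors the described process (analyze failures, patch, evaluate, keep or revert), not MiniMax's actual scaffold.

# Keep-or-revert self-improvement loop. run_evals and propose_patch
# are toy stand-ins for illustration, not MiniMax's real scaffold.

def run_evals(codebase: str) -> float:
    return float(len(codebase))  # ASSUMPTION: toy score

def propose_patch(codebase: str, failures) -> str:
    return codebase + " +patch"  # ASSUMPTION: model edits its own code

best = "seed scaffold"
best_score = run_evals(best)
for round_number in range(100):  # the reported run used 100 rounds
    candidate = propose_patch(best, failures=None)
    score = run_evals(candidate)
    if score > best_score:
        best, best_score = candidate, score  # keep the change
    # else: revert, by simply not adopting the candidate

print(best_score)]]></description>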
		
					<wfw:commentRss>https://firethering.com/minimax-m2-7-agentic-model/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>Marco MoE Uses 5% of Its Parameters but Outperforms Models 3× Its Size</title>
		<link>https://firethering.com/marco-moe-nano-mini/</link>
					<comments>https://firethering.com/marco-moe-nano-mini/#respond</comments>
		
		<dc:creator><![CDATA[Mohit Geryani]]></dc:creator>
		<pubDate>Sat, 11 Apr 2026 19:59:54 +0000</pubDate>
				<category><![CDATA[Tech]]></category>
		<category><![CDATA[AI Models]]></category>
		<category><![CDATA[AI]]></category>
		<guid isPermaLink="false">https://firethering.com/?p=6203</guid>

					<description><![CDATA[Most AI models are what they appear to be. A 12B parameter model uses 12B parameters. What you see is what runs.

Marco MoE does not work that way. Alibaba built two models, Marco Nano and Marco Mini, that carry billions of parameters but wake up only a tiny fraction of them for each request. Marco Nano activates 0.6 billion out of 8 billion, about 7.5%. Marco Mini activates 0.86 billion out of 17.3 billion, just under 5%. Only a sliver of either model is actually working at any moment.

The part that makes this worth paying attention to is what that sliver manages to do against models running at full capacity.
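
The active fractions fall straight out of the numbers above; a two-line check:

# Active-parameter fractions from the figures in this post.
nano_active, nano_total = 0.6, 8.0    # billions of parameters
mini_active, mini_total = 0.86, 17.3

print(f"Marco Nano: {nano_active / nano_total:.1%} active")  # 7.5%
print(f"Marco Mini: {mini_active / mini_total:.1%} active")  # 5.0%]]></description>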
		
					<wfw:commentRss>https://firethering.com/marco-moe-nano-mini/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>VoxCPM2 lets you create voices just by describing them and it is open source</title>
		<link>https://firethering.com/voxcpm2-voice-cloning/</link>
					<comments>https://firethering.com/voxcpm2-voice-cloning/#respond</comments>
		
		<dc:creator><![CDATA[Mohit Geryani]]></dc:creator>
		<pubDate>Fri, 10 Apr 2026 20:47:50 +0000</pubDate>
				<category><![CDATA[Tech]]></category>
		<category><![CDATA[AI Models]]></category>
		<category><![CDATA[AI]]></category>
		<guid isPermaLink="false">https://firethering.com/?p=6164</guid>

					<description><![CDATA[Most AI voice tools give you two options. Clone an existing voice or pick from a list of defaults. If neither works for what you need, you are stuck.

VoxCPM2 adds a third option. You describe what you want. A young woman, gentle tone, slightly slow pace. A deep male voice with a formal cadence. Whatever you can put into words, it generates from scratch, no recording needed.

That alone would make it interesting. But it also does voice cloning, supports 30 languages without needing a language tag, outputs 48kHz audio, runs on 8GB of VRAM, and ships under Apache 2.0. The whole thing is two billion parameters and installs with a single pip command.

I tried the audio samples, and the results are genuinely good. Not fully human, but natural enough that you stop noticing the model and start paying attention to what it is saying. Mixed languages, different emotions, and you can steer all of it.
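
For a feel of how simple the happy path is meant to be, here is a usage sketch. The package name, class, and method signatures are assumptions modeled on typical open TTS releases, not VoxCPM2's confirmed API; check the repo before copying anything.

# pip install voxcpm  <- ASSUMPTION based on the "single pip command"
# claim; the real package name may differ. The whole API below is a
# hypothetical sketch, not VoxCPM2's documented interface.
from voxcpm import VoxCPM2TTS  # ASSUMPTION: hypothetical import

tts = VoxCPM2TTS.from_pretrained("openbmb/VoxCPM2")  # hypothetical repo id

# Voice design: describe the voice in words, no reference audio needed.
audio = tts.generate(
    text="Welcome back. Let's pick up where we left off.",
    voice_description="young woman, gentle tone, slightly slow pace",
    sample_rate=48_000,  # the model outputs 48kHz audio
)
audio.save("designed_voice.wav")]]></description>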
		
					<wfw:commentRss>https://firethering.com/voxcpm2-voice-cloning/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		<enclosure url="https://openbmb.github.io/voxcpm2-demopage/audio/voice_design/vd_acoustic_asmr.wav" length="1213484" type="audio/wav" />

			</item>
		<item>
		<title>Meta’s Muse Spark: A Closed Bet on Multimodal, Multi-Agent AI</title>
		<link>https://firethering.com/meta-muse-spark-multimodal-ai/</link>
					<comments>https://firethering.com/meta-muse-spark-multimodal-ai/#respond</comments>
		
		<dc:creator><![CDATA[Mohit Geryani]]></dc:creator>
		<pubDate>Thu, 09 Apr 2026 13:30:36 +0000</pubDate>
				<category><![CDATA[Tech]]></category>
		<category><![CDATA[Trends]]></category>
		<category><![CDATA[AI]]></category>
		<category><![CDATA[AI Models]]></category>
		<guid isPermaLink="false">https://firethering.com/?p=6159</guid>

					<description><![CDATA[Meta has a new AI model and for the first time in years it is not called Llama.

Muse Spark launched yesterday under Meta Superintelligence Labs, a new internal division Meta quietly formed by bringing together researchers from Google DeepMind and other frontier labs. It is natively multimodal, supports multi-agent reasoning, and is available right now at meta.ai. It is also not being released as open weights.

That last part is worth sitting with for a second. Meta built one of the most trusted brands in open source AI through Llama. Developers built on it, researchers published with it. Muse Spark continues none of that. No weights, no HuggingFace release, private API preview only.

What you get instead is a genuinely capable multimodal model with some benchmark numbers that are hard to ignore and a new reasoning mode called Contemplating that puts it in conversation with Gemini Deep Think and GPT Pro. Whether that trade is worth it depends entirely on what you were using Meta AI for in the first place.]]></description>
		
					<wfw:commentRss>https://firethering.com/meta-muse-spark-multimodal-ai/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>GLM 5.1: The open source model that gets better the longer you run it</title>
		<link>https://firethering.com/glm-5-1-open-source-agentic-model/</link>
					<comments>https://firethering.com/glm-5-1-open-source-agentic-model/#respond</comments>
		
		<dc:creator><![CDATA[Mohit Geryani]]></dc:creator>
		<pubDate>Wed, 08 Apr 2026 16:23:40 +0000</pubDate>
				<category><![CDATA[Tech]]></category>
		<category><![CDATA[AI Models]]></category>
		<category><![CDATA[AI]]></category>
		<category><![CDATA[AI Agents]]></category>
		<guid isPermaLink="false">https://firethering.com/?p=6149</guid>

					<description><![CDATA[Give an AI agent a hard problem and it usually figures out the easy wins fast. After that, more time does not help. It just sits there, trying the same things.

ZhipuAI ran GLM-5.1 on a vector database optimization problem and let it go for 600 iterations. It did not run out of ideas. At iteration 50 it was sitting at roughly the same performance as the best single-session result any model had achieved. By iteration 600 it had reached 21,500 queries per second. The previous best was 3,547.

That gap is not incremental improvement. It is a different category of result. GLM-5.1 is open source, MIT licensed, and the weights are on HuggingFace right now. It works with Claude Code, vLLM, and SGLang. If you are building anything that runs agents over long tasks, this one is worth understanding.
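
Since vLLM support is listed, local serving plausibly follows the standard flow below. The vLLM calls are real library API; the model id is an assumption, so check the actual HuggingFace repo name.

from vllm import LLM, SamplingParams

# ASSUMPTION: "zai-org/GLM-5.1" is a placeholder model id; verify the
# real HuggingFace repo. The vLLM calls are standard library API.
llm = LLM(model="zai-org/GLM-5.1")

params = SamplingParams(temperature=0.7, max_tokens=512)
outputs = llm.generate(
    ["Propose an indexing strategy for a vector database under heavy load."],
    params,
)
print(outputs[0].outputs[0].text)]]></description>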
		
					<wfw:commentRss>https://firethering.com/glm-5-1-open-source-agentic-model/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>Bonsai 8B: A 1-Bit LLM That Delivers 8B-Class Performance at 1/14th the Size</title>
		<link>https://firethering.com/bonsai-8b-1bit-llm/</link>
					<comments>https://firethering.com/bonsai-8b-1bit-llm/#respond</comments>
		
		<dc:creator><![CDATA[Mohit Geryani]]></dc:creator>
		<pubDate>Tue, 07 Apr 2026 20:11:07 +0000</pubDate>
				<category><![CDATA[Tech]]></category>
		<category><![CDATA[AI Models]]></category>
		<category><![CDATA[AI]]></category>
		<guid isPermaLink="false">https://firethering.com/?p=6128</guid>

					<description><![CDATA[Nobody expected a 1.15 GB model to score competitively against full precision 8B models. That is not how this usually goes.

PrismML released Bonsai 8B last month and the headline number is almost absurd. The whole model, weights and all, fits in 1.15 GB. For context, the standard FP16 version of a comparable 8B model sits at around 16 GB. Bonsai beats or matches several of them on benchmarks while being 14 times smaller. It runs on a phone. There is literally an iPhone build.

I want to be clear that these numbers come from PrismML's own evaluations, not independent third-party testing. But even with that caveat, this is worth paying attention to.
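
The size claim survives back-of-envelope math. Treating the shipped 1.15 GB as an effective bit rate over 8B weights:

# Back-of-envelope sizing for an 8B model. The ~1.15 bits/weight
# figure is inferred from the shipped size, not an official spec.
params = 8e9

fp16_gb = params * 2 / 1e9           # 2 bytes per weight -> 16 GB
onebit_gb = params * 1.15 / 8 / 1e9  # ~1.15 bits per weight -> 1.15 GB

print(f"FP16:  {fp16_gb:.2f} GB")    # 16.00 GB
print(f"1-bit: {onebit_gb:.2f} GB")  # 1.15 GB
print(f"Ratio: {fp16_gb / onebit_gb:.0f}x")  # 14x]]></description>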
		
					<wfw:commentRss>https://firethering.com/bonsai-8b-1bit-llm/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
	</channel>
</rss>
