<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Firethering</title>
	<atom:link href="https://firethering.com/feed/" rel="self" type="application/rss+xml" />
	<link>https://firethering.com</link>
	<description>Firethering is Your Hub for AI, Open Source and Tech That Actually Matters</description>
	<lastBuildDate>Thu, 16 Apr 2026 19:59:25 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	<generator>https://wordpress.org/?v=6.6.5</generator>

<image>
	<url>https://firethering.com/wp-content/uploads/2024/10/cropped-firethering-FTR-favicon-32x32.png</url>
	<title>Firethering</title>
	<link>https://firethering.com</link>
	<width>32</width>
	<height>32</height>
</image> 
	<item>
		<title>ERNIE-Image: Open-Source 8B Text-to-Image Model for Posters, Comics &#038; Structured Generation</title>
		<link>https://firethering.com/ernie-image-open-source-text-to-image/</link>
					<comments>https://firethering.com/ernie-image-open-source-text-to-image/#respond</comments>
		
		<dc:creator><![CDATA[Mohit Geryani]]></dc:creator>
		<pubDate>Thu, 16 Apr 2026 19:32:05 +0000</pubDate>
				<category><![CDATA[Tech]]></category>
		<category><![CDATA[AI Models]]></category>
		<category><![CDATA[AI]]></category>
		<category><![CDATA[AI Image Model]]></category>
		<guid isPermaLink="false">https://firethering.com/?p=6279</guid>

					<description><![CDATA[Text rendering in open source AI image generation has been broken for a long time. Ask most models to put readable words on a poster, lay out a comic panel, or generate anything where the text actually has to make sense, and only a few models can do it accurately; from the rest you get something that looks like it was written by someone who learned the alphabet from a fever dream.

ERNIE-Image is Baidu's answer to that specific problem. It's an 8B open-weight text-to-image model built on a Diffusion Transformer, and it's genuinely good at dense text, structured layouts, posters, infographics and multi-panel compositions. 

It can run on a 24GB consumer GPU, it's on Hugging Face right now, and it comes in two versions: a full-quality model and a turbo variant that gets there in 8 steps instead of 50.]]></description>
		
					<wfw:commentRss>https://firethering.com/ernie-image-open-source-text-to-image/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>MOSS-TTS-Nano: Real-Time Voice AI on CPU, Part of an Open-Source Stack Rivaling Gemini</title>
		<link>https://firethering.com/moss-tts-nano-open-source-tts/</link>
					<comments>https://firethering.com/moss-tts-nano-open-source-tts/#respond</comments>
		
		<dc:creator><![CDATA[Mohit Geryani]]></dc:creator>
		<pubDate>Wed, 15 Apr 2026 08:51:04 +0000</pubDate>
				<category><![CDATA[Tech]]></category>
		<category><![CDATA[AI Models]]></category>
		<category><![CDATA[AI]]></category>
		<category><![CDATA[TTS]]></category>
		<guid isPermaLink="false">https://firethering.com/?p=6258</guid>

					<description><![CDATA[Most text-to-speech tools fall into two camps. The ones that sound good need serious hardware. The ones that run on anything sound robotic. MOSS-TTS-Nano is trying to be neither.

It's a 100-million-parameter model that runs on a regular CPU and actually sounds good. Good enough that the team behind it built an entire family of speech models around the same core technology, one of which has gone head to head with Gemini 2.5 Pro and ElevenLabs and come out ahead on speaker similarity.

It just dropped on April 10th as the newest addition to the MOSS-TTS family, a collection of five open source speech models from MOSI.AI and the OpenMOSS team. The family doesn't just cover lightweight local deployment. MOSS-TTSD is the model behind that speaker-similarity win over Gemini 2.5 Pro and ElevenLabs. Another generates voices purely from text descriptions with no reference audio needed. And one is built specifically for real-time voice agents with a 180ms first-byte latency.

Nano is the entry point. The family is the story.]]></description>
		
					<wfw:commentRss>https://firethering.com/moss-tts-nano-open-source-tts/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		<enclosure url="https://openmoss.github.io/MOSS-TTS-Nano-Demo/assets/%F0%9F%87%BA%F0%9F%87%B8%20A%20Gentle%20Reminder.wav" length="6051918" type="audio/wav" />

			</item>
		<item>
		<title>Gen-Searcher: An Open Source AI That Searches the Web Before Generating Images</title>
		<link>https://firethering.com/gen-searcher-open-source-image-generation/</link>
					<comments>https://firethering.com/gen-searcher-open-source-image-generation/#respond</comments>
		
		<dc:creator><![CDATA[Mohit Geryani]]></dc:creator>
		<pubDate>Mon, 13 Apr 2026 15:25:12 +0000</pubDate>
				<category><![CDATA[Tech]]></category>
		<category><![CDATA[AI Models]]></category>
		<category><![CDATA[AI]]></category>
		<category><![CDATA[Image Generation Model]]></category>
		<guid isPermaLink="false">https://firethering.com/?p=6237</guid>

					<description><![CDATA[Your image generator has never seen today. It was trained months ago, maybe longer, and everything it draws comes from that frozen snapshot of the world. Ask it to generate a current news moment, a product that launched last month, or anything that requires knowing what's happening right now and it fills in the gaps with a confident guess. Sometimes that guess is close. Often it isn't.

Gen-Searcher does something none of the mainstream tools do. Before it draws a single pixel, it goes and looks things up. It searches the web. It browses sources. It pulls visual references. Then it generates. The result is an image grounded in actual current information.

It's open source, the weights are on Hugging Face, and the team released everything: code, training data, the benchmark, the lot.]]></description>
		
					<wfw:commentRss>https://firethering.com/gen-searcher-open-source-image-generation/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>File Converter Pro: Offline File Converter for Images, Audio, Video, and Documents</title>
		<link>https://firethering.com/file-converter-pro-offline-file-converter-images-audio-video-documents/</link>
					<comments>https://firethering.com/file-converter-pro-offline-file-converter-images-audio-video-documents/#respond</comments>
		
		<dc:creator><![CDATA[Firethering Team]]></dc:creator>
		<pubDate>Mon, 13 Apr 2026 10:17:44 +0000</pubDate>
				<category><![CDATA[Software]]></category>
		<category><![CDATA[Utilities]]></category>
		<category><![CDATA[Windows]]></category>
		<guid isPermaLink="false">https://firethering.com/?p=6228</guid>

					<description><![CDATA[Most file converters still push you to upload your files somewhere. Even for basic stuff like changing a PDF or converting an image. It works, but it’s not something you feel great about, especially with random files.

File Converter Pro works like a simple offline converter. You drop files in, pick what you want, and it converts everything locally. No uploads, no server.

The UI isn’t just functional, it actually looks like someone cared. Smooth startup, proper dark mode, small touches that make it feel like a real app instead of a side project.

There’s also some extra stuff like stats and achievements. Sounds gimmicky, but it kind of works. You start noticing how often you use it. It’s not lightweight though. And if you want audio or video conversions, you’ll need FFmpeg. But once that’s sorted, you’re done setting things up.]]></description>
		
					<wfw:commentRss>https://firethering.com/file-converter-pro-offline-file-converter-images-audio-video-documents/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>MiniMax M2.7: The Agentic Model That Helped Build Itself</title>
		<link>https://firethering.com/minimax-m2-7-agentic-model/</link>
					<comments>https://firethering.com/minimax-m2-7-agentic-model/#respond</comments>
		
		<dc:creator><![CDATA[Mohit Geryani]]></dc:creator>
		<pubDate>Sun, 12 Apr 2026 08:29:30 +0000</pubDate>
				<category><![CDATA[Tech]]></category>
		<category><![CDATA[AI Models]]></category>
		<category><![CDATA[Trends]]></category>
		<category><![CDATA[AI]]></category>
		<category><![CDATA[Tech News]]></category>
		<guid isPermaLink="false">https://firethering.com/?p=6218</guid>

					<description><![CDATA[MiniMax handed an internal version of M2.7 a programming scaffold and let it run unsupervised. Over 100 rounds it analyzed its own failures, modified its own code, ran evaluations, and decided what to keep and what to revert. The result was a 30% performance improvement with nobody directing each step. That is not a benchmark result. That is a different way of thinking about how AI models get built.

M2.7 is now available on HuggingFace with weights you can download and deploy. NVIDIA is offering free API access if you want to try it without the hardware overhead. The license has a commercial limitation worth knowing about; we will get to that.]]></description>
		
					<wfw:commentRss>https://firethering.com/minimax-m2-7-agentic-model/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>Marco MoE Uses 5% of Its Parameters but Outperforms Models 3× Its Size</title>
		<link>https://firethering.com/marco-moe-nano-mini/</link>
					<comments>https://firethering.com/marco-moe-nano-mini/#respond</comments>
		
		<dc:creator><![CDATA[Mohit Geryani]]></dc:creator>
		<pubDate>Sat, 11 Apr 2026 19:59:54 +0000</pubDate>
				<category><![CDATA[Tech]]></category>
		<category><![CDATA[AI Models]]></category>
		<category><![CDATA[AI]]></category>
		<guid isPermaLink="false">https://firethering.com/?p=6203</guid>

					<description><![CDATA[Most AI models are what they appear to be. A 12B parameter model uses 12B parameters. What you see is what runs.

Marco MoE does not work that way. Alibaba built two models, Marco Nano and Marco Mini, that carry billions of parameters but wake up only a tiny fraction of them for each request. Marco Nano activates 0.6 billion out of 8 billion. Marco Mini activates 0.86 billion out of 17.3 billion. Less than 5% of either model is actually working at any moment.

The part that makes this worth paying attention to is what that 5% manages to do against models running at full capacity.]]></description>
		
					<wfw:commentRss>https://firethering.com/marco-moe-nano-mini/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>DockDoor: macOS App for Window Previews and Alt+Tab Switching</title>
		<link>https://firethering.com/dockdoor-macos-window-previews-alt-tab-switching/</link>
					<comments>https://firethering.com/dockdoor-macos-window-previews-alt-tab-switching/#respond</comments>
		
		<dc:creator><![CDATA[Firethering Team]]></dc:creator>
		<pubDate>Sat, 11 Apr 2026 09:56:42 +0000</pubDate>
				<category><![CDATA[Software]]></category>
		<category><![CDATA[Productivity]]></category>
		<category><![CDATA[Utilities]]></category>
		<category><![CDATA[macOS utilities]]></category>
		<guid isPermaLink="false">https://firethering.com/?p=6183</guid>

					<description><![CDATA[macOS looks clean until you have five Safari windows open and no clue which one actually has the tab you need. DockDoor fixes that in the simplest way possible. Hover over an app in the dock, and it shows you every open window right there. You just click the one you want. That’s it.

It also adds a proper Alt+Tab experience. Not the macOS version that switches apps, but actual window switching with previews, the way Windows users are used to. Once you try it, going back feels weird.]]></description>
		
					<wfw:commentRss>https://firethering.com/dockdoor-macos-window-previews-alt-tab-switching/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>VoxCPM2 lets you create voices just by describing them and it is open source</title>
		<link>https://firethering.com/voxcpm2-voice-cloning/</link>
					<comments>https://firethering.com/voxcpm2-voice-cloning/#respond</comments>
		
		<dc:creator><![CDATA[Mohit Geryani]]></dc:creator>
		<pubDate>Fri, 10 Apr 2026 20:47:50 +0000</pubDate>
				<category><![CDATA[Tech]]></category>
		<category><![CDATA[AI Models]]></category>
		<category><![CDATA[AI]]></category>
		<guid isPermaLink="false">https://firethering.com/?p=6164</guid>

					<description><![CDATA[Most AI voice tools give you two options. Clone an existing voice or pick from a list of defaults. If neither works for what you need, you are stuck.

VoxCPM2 adds a third option. You describe what you want. A young woman, gentle tone, slightly slow pace. A deep male voice with a formal cadence. Whatever you can put into words, it generates from scratch, no recording needed.

That alone would make it interesting. But it also does voice cloning, supports 30 languages without needing a language tag, outputs 48kHz audio, runs on 8GB of VRAM, and ships under Apache 2.0. The whole thing is two billion parameters and installs with a single pip command.

I tried the audio samples and the results are genuinely good. Not fully human, but natural enough that you stop noticing the model and start paying attention to what it is saying. Mixed languages, different emotions, and you can steer all of it.]]></description>
		
					<wfw:commentRss>https://firethering.com/voxcpm2-voice-cloning/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		<enclosure url="https://openbmb.github.io/voxcpm2-demopage/audio/voice_design/vd_acoustic_asmr.wav" length="1213484" type="audio/wav" />

			</item>
		<item>
		<title>Meta’s Muse Spark: A Closed Bet on Multimodal, Multi-Agent AI</title>
		<link>https://firethering.com/meta-muse-spark-multimodal-ai/</link>
					<comments>https://firethering.com/meta-muse-spark-multimodal-ai/#respond</comments>
		
		<dc:creator><![CDATA[Mohit Geryani]]></dc:creator>
		<pubDate>Thu, 09 Apr 2026 13:30:36 +0000</pubDate>
				<category><![CDATA[Tech]]></category>
		<category><![CDATA[Trends]]></category>
		<category><![CDATA[AI]]></category>
		<category><![CDATA[AI Models]]></category>
		<guid isPermaLink="false">https://firethering.com/?p=6159</guid>

					<description><![CDATA[Meta has a new AI model and for the first time in years it is not called Llama.

Muse Spark launched yesterday under Meta Superintelligence Labs, a new internal division Meta quietly formed by bringing together researchers from Google DeepMind and other frontier labs. It is natively multimodal, supports multi-agent reasoning, and is available right now at meta.ai. It is also not being released as open weights.

That last part is worth sitting with for a second. Meta built one of the most trusted brands in open source AI through Llama. Developers built on it, researchers published with it. Muse Spark continues none of that. No weights, no HuggingFace release, private API preview only.

What you get instead is a genuinely capable multimodal model with some benchmark numbers that are hard to ignore and a new reasoning mode called Contemplating that puts it in conversation with Gemini Deep Think and GPT Pro. Whether that trade is worth it depends entirely on what you were using Meta AI for in the first place.]]></description>
		
					<wfw:commentRss>https://firethering.com/meta-muse-spark-multimodal-ai/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>GLM 5.1: The open source model that gets better the longer you run it</title>
		<link>https://firethering.com/glm-5-1-open-source-agentic-model/</link>
					<comments>https://firethering.com/glm-5-1-open-source-agentic-model/#respond</comments>
		
		<dc:creator><![CDATA[Mohit Geryani]]></dc:creator>
		<pubDate>Wed, 08 Apr 2026 16:23:40 +0000</pubDate>
				<category><![CDATA[Tech]]></category>
		<category><![CDATA[AI Models]]></category>
		<category><![CDATA[AI]]></category>
		<category><![CDATA[AI Agents]]></category>
		<guid isPermaLink="false">https://firethering.com/?p=6149</guid>

					<description><![CDATA[Give an AI agent a hard problem and it usually figures out the easy wins fast. After that, more time does not help. It just sits there, trying the same things.

ZhipuAI ran GLM-5.1 on a vector database optimization problem and let it go for 600 iterations. It did not run out of ideas. At iteration 50 it was sitting at roughly the same performance as the best single-session result any model had achieved. By iteration 600 it had reached 21,500 queries per second. The previous best was 3,547.

That gap is not incremental improvement. It is a different category of result. GLM-5.1 is open source, MIT licensed, and the weights are on HuggingFace right now. It works with Claude Code, vLLM, and SGLang. If you are building anything that runs agents over long tasks, this one is worth understanding.]]></description>
		
					<wfw:commentRss>https://firethering.com/glm-5-1-open-source-agentic-model/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
	</channel>
</rss>
