Meta needs to win over AI developers at its first LlamaCon


On Tuesday, Meta is set to host its first-ever LlamaCon AI developer conference at its Menlo Park headquarters, where the company will try to pitch developers on building applications with its open Llama AI models. Just a year ago, that wasn’t a hard sell.

In recent months, however, Meta has struggled to keep pace with both “open” AI labs like DeepSeek and closed commercial competitors such as OpenAI in the rapidly evolving AI race. LlamaCon comes at a critical moment in Meta’s quest to build a sprawling Llama ecosystem.

Winning over developers may be as simple as shipping better open models. But that could be tougher to achieve than it sounds.

A promising early start

Meta’s launch of Llama 4 earlier this month underwhelmed developers, with a number of benchmark scores coming in below those of models like DeepSeek’s R1 and V3. It was a far cry from what Llama once was: a frontier model family.

When Meta launched its Llama 3.1 405B model last summer, CEO Mark Zuckerberg touted it as a big win. In a blog post, Meta called Llama 3.1 405B “the most capable openly available foundation model,” with performance rivaling OpenAI’s best model at the time, GPT-4o.

It was an impressive model, to be sure, and so were the other models in Meta’s Llama 3 family. Jeremy Nixon, who has hosted hackathons at San Francisco’s AGI House over the last few years, called the Llama 3 launches “historic moments.”

Llama 3 made Meta a darling among AI developers, delivering top-tier performance along with the freedom to host the models wherever they chose. Even today, Meta’s Llama 3.3 is downloaded more often than Llama 4, Hugging Face’s head of product and growth, Jeff Boudier, said in an interview.

Contrast that with the reception of Meta’s Llama 4 family, and the difference is stark. But Llama 4 was controversial from the start.

Benchmarking shenanigans

Meta optimized a version of one of its Llama 4 models, Llama 4 Maverick, for “conversationality,” which helped it claim a top spot on the crowdsourced benchmark LM Arena. Meta never released this model, however; the version of Maverick that was widely released ended up performing much worse on LM Arena.

The group behind LM Arena said Meta should have been “clearer” about the discrepancy. Ion Stoica, LM Arena co-founder and a UC Berkeley professor who has also co-founded companies including Anyscale and Databricks, told TechCrunch that the incident harmed Meta’s standing with the developer community.

“[Meta] should have been more explicit that the Maverick model that was on [LM Arena] was different from the model that was released,” Stoica said in an interview with TechCrunch. “When this happens, it’s a little bit of a loss of trust with the community. Of course, they can recover that by releasing better models.”

No reasoning model

A glaring omission from the Llama 4 family was an AI reasoning model. Reasoning models can work carefully through questions before answering them, and over the last year, much of the AI industry has released such models, which tend to perform better on certain benchmarks.

Meta has teased a Llama 4 reasoning model, but the company hasn’t indicated when to expect it.

Nathan Lambert, a researcher at AI2, says the fact that Meta didn’t release a reasoning model alongside Llama 4 suggests the company may have rushed the launch.

“Everyone releases a reasoning model, and it makes their models look so good,” Lambert said. “Why couldn’t [Meta] wait to do that? I don’t have the answer to that question. It seems like normal company weirdness.”

Lambert noted that rival open models are closer to the frontier than ever before and now come in more shapes and sizes, greatly increasing the pressure on Meta. For example, on Monday, Alibaba released a collection of models, Qwen 3, which reportedly outperforms some of the best coding models from OpenAI and Google on Codeforces, a programming benchmark.

To regain the open model lead, Meta simply needs to deliver better models, according to Ravid Shwartz-Ziv, an AI researcher at NYU’s Center for Data Science. That may involve taking more risks, such as employing new techniques, he told TechCrunch.

Whether Meta is in a position to take big risks right now is unclear. Current and former employees previously told Fortune that Meta’s AI research lab is “dying a slow death.” The lab’s VP of AI Research, Joelle Pineau, announced this month that she is leaving.

LlamaCon is Meta’s chance to show what it has been cooking up to beat upcoming releases from AI labs like OpenAI, Google, xAI, and others. If it fails to deliver, the company could fall even further behind in an ultra-competitive space.
