Like other code-generating models, Codestral is designed to help developers write and interact with code. It was trained on more than 80 programming languages, including Python, Java, C++, and JavaScript, Mistral explains in a blog post. Codestral can complete functions, write tests, "fill in" partial code, and answer questions about a codebase in English.
Mistral describes the model as "open," but the startup's licence prohibits the use of Codestral and its outputs for any commercial activity. There is a carve-out for "development," but even that comes with caveats: the licence explicitly bans "any internal usage by employees in the context of the company's business activities."
The reason could be that Codestral was trained partly on copyrighted content. Mistral neither confirmed nor denied this in the blog post, but it would not be surprising; there is evidence that the startup's previous training datasets contained copyrighted material.
Codestral may not be worth the trouble in any case. With 22 billion parameters, the model requires a powerful PC to run. (Parameters essentially define the skill of an AI model on a problem, such as analysing and generating text.) And although it beats the competition on some benchmarks (which can be unreliable), it does not do so by a wide margin.
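To give a rough sense of why 22 billion parameters calls for serious hardware, the sketch below estimates the memory needed just to hold the model's weights at a few common numeric precisions. The precisions listed are illustrative assumptions, not formats Mistral is stated to ship, and the figures ignore the additional memory used for activations and context while the model runs.

```python
# Back-of-the-envelope estimate of the memory needed to store model weights.
# The parameter count comes from the article; the precisions are assumptions
# chosen for illustration, and runtime overhead is not included.

PARAMS = 22e9  # 22 billion parameters

# Bytes used to store one parameter at each (assumed) precision.
BYTES_PER_PARAM = {
    "32-bit float": 4,
    "16-bit float": 2,
    "8-bit integer": 1,
    "4-bit quantised": 0.5,
}


def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Return the weight storage size in gigabytes (1 GB = 10**9 bytes)."""
    return num_params * bytes_per_param / 1e9


if __name__ == "__main__":
    for name, size in BYTES_PER_PARAM.items():
        print(f"{name}: about {weight_memory_gb(PARAMS, size):.0f} GB")
```

At 16-bit precision the weights alone come to about 44 GB, more than the memory on typical consumer graphics cards, which is consistent with the point that the model is out of reach for most developers' machines.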
Impractical for most developers and offering only incremental performance gains, Codestral is nonetheless sure to fuel the debate over the wisdom of relying on code-generating models as programming assistants.
Developers are certainly adopting generative AI tools for at least some of their coding tasks. In a Stack Overflow survey from June 2023, 44% of developers said they already use AI tools in their development process, while 26% plan to do so soon. Yet these tools have clear shortcomings.
A GitClear analysis of more than 150 million lines of code committed to project repositories over the past few years found that generative AI development tools are leading to more flawed code being pushed to codebases. Security researchers, meanwhile, have warned that such tools can amplify existing bugs and security vulnerabilities in software projects. And according to research from Purdue, more than half of the answers OpenAI's ChatGPT gives to programming questions are wrong.