PaperCodex

Code Generation

MFTCoder: Boost Code LLM Performance with Multi-Task Fine-Tuning—Faster, Smarter, and Open Source

If you’re building or maintaining AI-powered coding assistants, you’ve likely faced a frustrating trade-off: fine-tune a model for one specific…

01/13/2026 | Code Generation, Model Fine-tuning, Multi-task Learning

ICPC-Eval: Stress-Test LLM Reasoning with Real-World Competitive Programming Challenges

Evaluating the true reasoning capabilities of large language models (LLMs) in coding has long been hampered by benchmarks that are…

01/09/2026 | Algorithmic Reasoning, Code Generation, Model Evaluation

DiffuCoder: Generate Better Code with Iterative, Non-Autoregressive Diffusion Models

If you’re evaluating next-generation code generation tools, you’ve likely worked with autoregressive (AR) large language models—systems that build code one…

01/09/2026 | Code Generation, Diffusion Language Models, Reinforcement Learning For Code

CodeGeeX: Open-Source Multilingual Code Generation That Boosts Developer Productivity Across 23 Languages

For software teams working across multiple programming languages—or developers tired of vendor lock-in with proprietary AI coding tools—CodeGeeX offers a…

12/27/2025 | Code Generation, Code Translation, Multilingual Programming

PRIME: Boost LLM Reasoning with Token-Level Rewards—No Step-by-Step Labels Needed

If you’re working to improve large language models (LLMs) on hard reasoning tasks—like math problem solving or competitive programming—you’ve likely…

12/27/2025 | Code Generation, Mathematical Reasoning, Reinforcement Learning

DeepCode: Turn Research Papers and Text into Production-Ready Code—Faster Than Human Experts

Imagine being able to feed a research paper, a technical specification, or even a rough product description into a system—and…

12/26/2025 | Agentic AI, Code Generation, Research Reproduction

aiXcoder-7B: High-Accuracy Code Completion in a Lightweight 7B Model for Real-Time Developer Workflows

aiXcoder-7B is a 7-billion-parameter open-source large language model (LLM) purpose-built for code processing. Unlike larger models that trade inference speed…

12/26/2025 | Code Completion, Code Generation, Fill-in-the-middle

DeepSeek-V3: A High-Performance, Cost-Efficient MoE Language Model That Delivers Closed-Source Power with Open-Source Flexibility

For technical decision-makers evaluating large language models (LLMs) for real-world applications, balancing raw capability, inference cost, training efficiency, and deployment…

12/26/2025 | Code Generation, Mathematical Reasoning, Multilingual Language Modeling

WizardCoder: Open-Source Code LLM That Outperforms ChatGPT and Gemini in Code Generation

WizardCoder is a state-of-the-art open-source Code Large Language Model (Code LLM) that delivers exceptional performance on code generation tasks—often surpassing…

12/26/2025 | Code Generation, Instruction Tuning, Programming Assistance

SWE-Lancer: Benchmark Real-World Freelance Coding Tasks to Measure LLMs’ True Engineering Value

Evaluating large language models (LLMs) on synthetic coding benchmarks often fails to reflect their real-world utility. Enter SWE-Lancer—a rigorously constructed…

12/22/2025 | Code Generation, Software Engineering Evaluation, Technical Decision-making

Copyright © 2026 PaperCodex.